Skip to content

Live Sessions

Everything in Aura Live revolves around a Live Session. A session is a temporary bridge between your audio source and the transcription engine.

Before starting a session, you should identify:

  1. Language: What language is being spoken? Use the iso_639_1 code (e.g., en, fr).
  2. Model: Which ASR model should process the audio? (e.g., tiny, medium, large-v3).

Use these endpoints to discover available options:

  • List Languages: GET /languages/
  • List Models: GET /models/
  • List Translations: GET /translations/

To start a session, send a POST request to create a live session.

  1. Pending: The session is created but no audio is being received.
  2. Active: Audio is being streamed and transcribed.
  3. Finished: The connection has been closed by either the client or the server.
  4. Error: Something went wrong with the transcription engine.

Once a session is created, connect to the WebSocket endpoint to stream audio and receive transcription results in real time.

[!TIP] Use the models’ vram field to understand the resource requirements. Smaller models like tiny are faster but less accurate, whereas large-v3 provides the highest accuracy but requires more compute.