Setup
Install the package:
Credentials
Get your Soniox API key from the Soniox Console and set it as the SONIOX_API_KEY environment variable.
Usage
Basic transcription
Transcribe an audio file with SonioxDocumentLoader, then generate a summary of the transcript with an LLM.
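A minimal sketch of that flow. It assumes the loader can be imported from a langchain_soniox module (the import path is an assumption, not confirmed on this page) and that the transcript text is in the standard LangChain page_content field. The llm argument is a hypothetical stand-in for whichever model call you use.

```python
import os
from typing import Callable


def transcribe(file_path: str) -> str:
    """Transcribe a local audio file and return the transcript text."""
    # Assumed import path; imported lazily so the sketch loads without the package.
    from langchain_soniox import SonioxDocumentLoader

    loader = SonioxDocumentLoader(
        file_path=file_path,
        api_key=os.environ["SONIOX_API_KEY"],  # same variable the loader reads by default
    )
    # lazy_load() yields a single Document for the whole transcript.
    document = next(iter(loader.lazy_load()))
    return document.page_content


def summarize_transcript(transcript: str, llm: Callable[[str], str]) -> str:
    """Send the transcript to any LLM exposed as a prompt -> text callable."""
    prompt = f"Summarize the following transcript in a few sentences:\n\n{transcript}"
    return llm(prompt)
```

With a LangChain chat model, the llm argument could be something like lambda prompt: model.invoke(prompt).content.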
Async transcription
For async operations, use aload() or alazy_load().
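One reason to use the async methods is to transcribe several files concurrently. The sketch below assumes each item in loaders is an object with an awaitable aload() that returns a list of Documents, as LangChain loaders do. The helper name transcribe_all is illustrative.

```python
import asyncio


async def transcribe_all(loaders) -> list[str]:
    """Run several loaders concurrently and return one transcript per loader."""
    # gather() awaits every aload() call at once and preserves input order.
    results = await asyncio.gather(*(loader.aload() for loader in loaders))
    # Each result is a list holding the single transcript Document.
    return [docs[0].page_content for docs in results]
```

From synchronous code, this can be driven with asyncio.run(transcribe_all(loaders)).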
Advanced usage
Language hints
Soniox automatically detects and transcribes speech in 60+ languages. When you know which languages are likely to appear in your audio, provide language_hints to improve accuracy by biasing recognition toward those languages.
Language hints do not restrict recognition—they only bias the model toward the specified languages, while still allowing other languages to be detected if present.
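Because hints only bias recognition, other languages can still appear in the output. The sketch below builds options with hints and then counts tokens whose detected language was not hinted. It assumes SonioxTranscriptionOptions is importable from langchain_soniox and that tokens are dicts with a language key (present when language identification is enabled). Both helper names are illustrative.

```python
from collections import Counter


def make_hinted_options(language_hints: list[str]):
    """Build transcription options that bias recognition toward the given languages."""
    # Assumed import path; imported lazily so the sketch loads without the package.
    from langchain_soniox import SonioxTranscriptionOptions

    return SonioxTranscriptionOptions(
        language_hints=language_hints,
        enable_language_identification=True,  # so each token reports its language
    )


def languages_outside_hints(tokens: list[dict], language_hints: list[str]) -> Counter:
    """Count tokens whose detected language was not among the hints."""
    hinted = set(language_hints)
    return Counter(
        token["language"]
        for token in tokens
        if token.get("language") and token["language"] not in hinted
    )
```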
Speaker diarization
Set enable_speaker_diarization to identify and distinguish between different speakers.
Language identification
Set enable_language_identification to turn on automatic language detection and identification.
Context for improved accuracy
Provide domain-specific context to improve transcription accuracy. Context helps the model understand your domain, recognize important terms, and apply custom vocabulary. The context object supports four optional sections.
Translation
Translate from any detected language to a target language. You can also use the two_way translation type. Learn more about translation here.
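When translation is enabled, each token in the document metadata carries a translation_status (see Return value). The sketch below separates spoken text from translated text using that field. It assumes tokens are dicts and that each token text already includes any leading whitespace, so the pieces can be concatenated directly. Neither assumption is confirmed on this page, and the helper name is illustrative.

```python
def split_translation(tokens: list[dict]) -> tuple[str, str]:
    """Return (spoken_text, translated_text) from a list of token dicts."""
    spoken_parts = []
    translated_parts = []
    for token in tokens:
        if token.get("translation_status") == "translated":
            translated_parts.append(token["text"])
        else:
            # "original", "none", or a missing status all count as spoken text.
            spoken_parts.append(token["text"])
    # Assumes token text carries its own spacing; adjust the join if it does not.
    return "".join(spoken_parts), "".join(translated_parts)
```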
API reference
Constructor parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| file_path | str | No* | None | Path to local audio file to transcribe |
| file_data | bytes | No* | None | Binary data of audio file to transcribe |
| file_url | str | No* | None | URL of audio file to transcribe |
| api_key | str | No | SONIOX_API_KEY env var | Soniox API key |
| base_url | str | No | https://api.soniox.com/v1 | API base URL (see regional endpoints) |
| options | SonioxTranscriptionOptions | No | SonioxTranscriptionOptions() | Transcription options |
| polling_interval_seconds | float | No | 1.0 | Time between status polls (seconds) |
| timeout_seconds | float | No | 300.0 (5 minutes) | Maximum time to wait for transcription (seconds) |
| http_request_timeout_seconds | float | No | 60.0 | Timeout for individual HTTP requests (seconds) |
* One of file_path, file_data, or file_url must be provided.
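To show how polling_interval_seconds and timeout_seconds relate, here is a generic polling loop with an overall deadline. It is a sketch of the general pattern, not the loader's actual implementation. The get_status callable and the status strings "completed" and "error" are hypothetical.

```python
import time
from typing import Callable


def wait_until_done(
    get_status: Callable[[], str],
    polling_interval_seconds: float = 1.0,
    timeout_seconds: float = 300.0,
) -> str:
    """Poll get_status() until a terminal status or until the deadline passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in ("completed", "error"):  # hypothetical terminal statuses
            return status
        # Stop if the next poll would land after the overall deadline.
        if time.monotonic() + polling_interval_seconds > deadline:
            raise TimeoutError(f"not finished after {timeout_seconds} seconds")
        time.sleep(polling_interval_seconds)
```

A shorter interval notices completion sooner at the cost of more status requests. The timeout bounds the total wait regardless of the interval.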
Transcription options
The SonioxTranscriptionOptions class supports these parameters:
| Parameter | Type | Description |
|---|---|---|
| model | str | Async model to use (see available models) |
| language_hints | list[str] | Language hints for transcription (ISO language codes) |
| language_hints_strict | bool | Enforce strict language hints |
| enable_speaker_diarization | bool | Enable speaker identification |
| enable_language_identification | bool | Enable language detection |
| translation | TranslationConfig | Translation configuration |
| context | StructuredContext | Context for improved accuracy |
| client_reference_id | str | Custom reference ID for your records |
| webhook_url | str | Webhook URL for completion notifications |
| webhook_auth_header_name | str | Custom auth header name for webhook |
| webhook_auth_header_value | str | Custom auth header value for webhook |
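The options object is passed to the loader through its options parameter. The sketch below combines several parameters from the two tables. It assumes both classes are importable from langchain_soniox. The helper name, reference ID, and timing values are illustrative, not recommendations.

```python
def make_loader(audio_url: str):
    """Build a loader for a remote file with diarization and language identification."""
    # Assumed import path; imported lazily so the sketch loads without the package.
    from langchain_soniox import SonioxDocumentLoader, SonioxTranscriptionOptions

    options = SonioxTranscriptionOptions(
        language_hints=["en", "es"],
        enable_speaker_diarization=True,
        enable_language_identification=True,
        client_reference_id="meeting-001",  # hypothetical ID for your own records
    )
    return SonioxDocumentLoader(
        file_url=audio_url,  # one of file_path, file_data, or file_url
        options=options,
        polling_interval_seconds=2.0,
        timeout_seconds=600.0,  # allow longer than the 300.0 default for long audio
    )
```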
Return value
The lazy_load() and alazy_load() methods yield a single Document object. The tokens array in its metadata includes detailed information for each transcribed word:
- text: The transcribed text
- start_ms: Start time in milliseconds
- end_ms: End time in milliseconds
- speaker: Speaker ID (if diarization enabled), for example "1", "2", etc.
- language: Detected language (if identification enabled), for example "en", "fr", etc.
- translation_status: Translation status ("original", "translated" or "none")
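The token fields above are enough to rebuild a speaker-labeled transcript. The sketch below merges consecutive tokens from the same speaker into turns. It assumes tokens are dicts with the listed keys and that token text carries its own spacing. Both are assumptions, and the helper name is illustrative.

```python
def speaker_turns(tokens: list[dict]) -> list[dict]:
    """Merge consecutive tokens from the same speaker into timed turns."""
    turns: list[dict] = []
    for token in tokens:
        speaker = token.get("speaker")  # None when diarization is disabled
        if turns and turns[-1]["speaker"] == speaker:
            # Same speaker continues: extend the current turn.
            turns[-1]["text"] += token["text"]
            turns[-1]["end_ms"] = token["end_ms"]
        else:
            # Speaker changed (or first token): start a new turn.
            turns.append(
                {
                    "speaker": speaker,
                    "start_ms": token["start_ms"],
                    "end_ms": token["end_ms"],
                    "text": token["text"],
                }
            )
    return turns
```

Each turn can then be printed as, for example, f"[{turn['start_ms']} ms] Speaker {turn['speaker']}: {turn['text'].strip()}".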

