Setup
Install the package with npm or yarn.
Credentials
Get your Soniox API key from the Soniox Console and set it as the SONIOX_API_KEY environment variable.
Usage
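Before constructing the loader, it can help to fail fast when the key is missing. The sketch below is one way to do that; `resolveSonioxApiKey` is a hypothetical helper and not part of the package, and it only assumes the SONIOX_API_KEY variable name documented in the API reference.

```typescript
// Hypothetical helper (not part of the package): prefer an explicitly passed key,
// otherwise fall back to the SONIOX_API_KEY environment variable.
function resolveSonioxApiKey(
  explicitKey?: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const key = (explicitKey ?? env.SONIOX_API_KEY ?? "").trim();
  if (key === "") {
    throw new Error(
      "Missing Soniox API key: set SONIOX_API_KEY or pass apiKey explicitly.",
    );
  }
  return key;
}
```

The resolved value could then be passed as the apiKey constructor parameter, although the loader already falls back to the environment variable on its own.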
Basic transcription
Use SonioxAudioTranscriptLoader to transcribe an audio file, then pass the transcript to an LLM to generate a summary.
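The flow described above can be sketched without committing to a particular import path or LLM client. In this sketch, `TranscriptLoader`, `TranscriptDocument`, and `Summarizer` are structural stand-ins defined locally (assumptions, not the package's exported types), and `pageContent` is assumed to hold the transcribed text.

```typescript
// Structural stand-ins for the loader and document shapes (assumptions for this sketch).
interface TranscriptDocument {
  pageContent: string; // assumed to hold the transcribed text
  metadata: Record<string, unknown>;
}

interface TranscriptLoader {
  load(): Promise<TranscriptDocument[]>;
}

// Any function that turns a prompt into text, e.g. a thin wrapper around an LLM call.
type Summarizer = (prompt: string) => Promise<string>;

// Load the transcript, then hand it to the summarizer as a single prompt.
async function transcribeAndSummarize(
  loader: TranscriptLoader,
  summarize: Summarizer,
): Promise<string> {
  const docs = await loader.load();
  const transcript = docs.map((doc) => doc.pageContent).join("\n").trim();
  if (transcript === "") {
    throw new Error("Transcript is empty; nothing to summarize.");
  }
  return summarize(`Summarize the following transcript:\n\n${transcript}`);
}
```

Any object with a compatible `load()` method, such as a SonioxAudioTranscriptLoader instance, should fit the `loader` argument under these assumptions.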
Translation
Translate from any detected language to a target language. A two_way translation type is also available. Learn more about translation in the Soniox documentation.
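As a rough illustration, the translation option might be built as below. The field names (`target_language`, `language_a`, `language_b`) and the `one_way` type name are assumptions that this page does not confirm, so check them against the Soniox API reference before relying on them.

```typescript
// Assumed shapes for the `translation` option; field names are NOT confirmed by this page.
type TranslationConfig =
  | { type: "one_way"; target_language: string }
  | { type: "two_way"; language_a: string; language_b: string };

// Translate everything that is detected into a single target language.
function oneWayTranslation(targetLanguage: string): TranslationConfig {
  return { type: "one_way", target_language: targetLanguage };
}

// Translate between two languages in both directions.
function twoWayTranslation(languageA: string, languageB: string): TranslationConfig {
  if (languageA === languageB) {
    throw new Error("two_way translation needs two different languages");
  }
  return { type: "two_way", language_a: languageA, language_b: languageB };
}

// Hypothetical options object that would be passed to the loader.
const translationOptions = { translation: oneWayTranslation("fr") };
```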
Language hints
Soniox automatically detects and transcribes speech in 60+ languages. When you know which languages are likely to appear in your audio, provide language_hints to improve accuracy by biasing recognition toward those languages.
Language hints do not restrict recognition—they only bias the model toward the specified languages, while still allowing other languages to be detected if present.
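A minimal sketch of preparing hints before passing them along is shown below. `normalizeLanguageHints` is a hypothetical helper, and the assumption that hints are lowercase language codes such as "en" is not stated on this page.

```typescript
// Hypothetical helper: trim, lowercase, and de-duplicate hints while keeping their order.
function normalizeLanguageHints(hints: string[]): string[] {
  const seen = new Set<string>();
  for (const hint of hints) {
    const code = hint.trim().toLowerCase();
    if (code !== "") {
      seen.add(code);
    }
  }
  return Array.from(seen);
}

// Options object using the documented language_hints key.
const hintOptions = {
  language_hints: normalizeLanguageHints(["en", "ES", " en "]),
};
```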
Speaker diarization
Enable speaker identification to distinguish between different speakers.
Language identification
Enable automatic language detection and identification.
Context for improved accuracy
Provide domain-specific context to improve transcription accuracy.
API reference
Constructor parameters
SonioxLoaderParams (required)
| Parameter | Type | Required | Description |
|---|---|---|---|
| audio | Uint8Array \| string | Yes | Audio file as buffer or URL |
| audioFormat | SonioxAudioFormat | No | Audio file format |
| apiKey | string | No | Soniox API key (defaults to SONIOX_API_KEY env var) |
| apiBaseUrl | string | No | API base URL (defaults to https://api.soniox.com/v1) |
| pollingIntervalMs | number | No | Polling interval in ms (min: 1000, default: 1000) |
| pollingTimeoutMs | number | No | Polling timeout in ms (default: 180000) |
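To make the two polling parameters concrete, here is a generic polling loop. It is a sketch and not the loader's actual implementation: `pollUntilDone` and the injectable `sleep` are hypothetical, clamping the interval to 1000 ms is an assumption about how the documented minimum is applied, and the timeout is tracked as accumulated wait time instead of wall-clock time.

```typescript
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Call `check` repeatedly until it yields a value or the timeout budget is used up.
async function pollUntilDone<T>(
  check: () => Promise<T | undefined>,
  pollingIntervalMs = 1000,
  pollingTimeoutMs = 180000,
  sleep: Sleep = realSleep,
): Promise<T> {
  // Assumption: values below the documented minimum are raised to 1000 ms.
  const interval = Math.max(1000, pollingIntervalMs);
  let waited = 0; // accumulated wait time, not wall-clock time
  while (true) {
    const result = await check();
    if (result !== undefined) {
      return result;
    }
    if (waited + interval > pollingTimeoutMs) {
      throw new Error(`Polling timed out after ${waited} ms`);
    }
    await sleep(interval);
    waited += interval;
  }
}
```

Passing a custom `sleep` keeps the loop testable without real delays.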
SonioxLoaderOptions (optional)
| Parameter | Type | Description |
|---|---|---|
| model | SonioxTranscriptionModelId | Model to use (default: "stt-async-v4") |
| translation | object | Translation configuration |
| language_hints | string[] | Language hints for transcription |
| language_hints_strict | boolean | Enforce strict language hints |
| enable_speaker_diarization | boolean | Enable speaker identification |
| enable_language_identification | boolean | Enable language detection |
| context | object | Context for improved accuracy |
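The options in the table can be mirrored as a local type, which is a convenient way to see speaker diarization, language identification, and context combined. `LoaderOptionsSketch` and `withDefaults` are hypothetical (the package exports its own types), and the keys inside `context` are purely illustrative.

```typescript
// Local mirror of the options table; the real package exports its own types.
interface LoaderOptionsSketch {
  model?: string;
  translation?: Record<string, unknown>;
  language_hints?: string[];
  language_hints_strict?: boolean;
  enable_speaker_diarization?: boolean;
  enable_language_identification?: boolean;
  context?: Record<string, unknown>;
}

const DEFAULT_MODEL = "stt-async-v4"; // default documented in the table

// Fill in the documented default model when none is given.
function withDefaults(options: LoaderOptionsSketch = {}): LoaderOptionsSketch {
  return { ...options, model: options.model ?? DEFAULT_MODEL };
}

const meetingOptions = withDefaults({
  enable_speaker_diarization: true,
  enable_language_identification: true,
  // Illustrative only: the actual keys accepted inside `context` are not listed on this page.
  context: { terms: ["Soniox", "diarization"] },
});
```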
Supported audio formats
- aac - Advanced Audio Coding
- aiff - Audio Interchange File Format
- amr - Adaptive Multi-Rate
- asf - Advanced Systems Format
- flac - Free Lossless Audio Codec
- mp3 - MPEG Audio Layer III
- ogg - Ogg Vorbis
- wav - Waveform Audio File Format
- webm - WebM Audio
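When the format is not known up front, it could be guessed from the file name or URL. The helper below is a hypothetical sketch; it assumes that the listed identifiers are also the string values accepted for audioFormat, which this page does not state.

```typescript
// Identifiers taken from the list of supported formats above.
const SUPPORTED_FORMATS = [
  "aac", "aiff", "amr", "asf", "flac", "mp3", "ogg", "wav", "webm",
] as const;

type AudioFormatSketch = (typeof SUPPORTED_FORMATS)[number];

// Hypothetical helper: guess the format from a file extension, ignoring query strings and fragments.
function inferAudioFormat(fileNameOrUrl: string): AudioFormatSketch | undefined {
  const path = fileNameOrUrl.split(/[?#]/)[0] ?? "";
  const dot = path.lastIndexOf(".");
  if (dot === -1) {
    return undefined;
  }
  const extension = path.slice(dot + 1).toLowerCase();
  const known = (SUPPORTED_FORMATS as readonly string[]).indexOf(extension) !== -1;
  return known ? (extension as AudioFormatSketch) : undefined;
}
```

An `undefined` result signals that the extension is missing or not in the list, in which case audioFormat can simply be left unset since it is optional.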
Return value
The load() method returns an array containing a single Document object.
For the full response structure, see the SonioxTranscriptResponse type in the Soniox REST API Reference.
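With speaker diarization enabled, a common next step is to group the transcript into speaker turns. The sketch below assumes that the response exposes a list of tokens, each with `text` and an optional `speaker` field; those field names are assumptions, and `TokenSketch` and `groupBySpeaker` are hypothetical.

```typescript
// Assumed token shape; verify the field names against the SonioxTranscriptResponse type.
interface TokenSketch {
  text: string;
  speaker?: string;
}

interface SpeakerTurn {
  speaker: string;
  text: string;
}

// Merge consecutive tokens from the same speaker into one turn.
function groupBySpeaker(tokens: TokenSketch[]): SpeakerTurn[] {
  const turns: SpeakerTurn[] = [];
  for (const token of tokens) {
    const speaker = token.speaker ?? "unknown";
    const last = turns[turns.length - 1];
    if (last !== undefined && last.speaker === speaker) {
      last.text += token.text;
    } else {
      turns.push({ speaker, text: token.text });
    }
  }
  return turns;
}
```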

