Deepgram integrations often already separate provider code into “send audio, receive transcript” or “send text, receive audio” adapters. With that structure in place, moving to Gradium is mostly an endpoint, auth-header, and field-name swap. Use Gradium’s STT endpoints when you are replacing Deepgram Listen, and Gradium’s TTS endpoints when you are replacing Deepgram Speak.
Endpoint swap
| Flow | Deepgram | Gradium |
|---|---|---|
| Pre-recorded STT | POST https://api.deepgram.com/v1/listen | POST https://api.gradium.ai/api/post/speech/asr |
| Streaming STT | wss://api.deepgram.com/v1/listen | wss://api.gradium.ai/api/speech/asr |
| One-shot TTS | POST https://api.deepgram.com/v1/speak | POST https://api.gradium.ai/api/post/speech/tts |
| Streaming TTS | wss://api.deepgram.com/v1/speak | wss://api.gradium.ai/api/speech/tts |
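If your adapter stores the Deepgram URL in configuration, the table above can be applied as a simple lookup. The following is a minimal sketch: `ENDPOINT_MAP` and `to_gradium_url` are hypothetical names, and the sketch assumes Deepgram query parameters (such as `model`) should be dropped instead of forwarded, since Gradium uses different field names.

```python
from urllib.parse import urlsplit

# Deepgram endpoint -> Gradium endpoint, copied from the table above.
ENDPOINT_MAP = {
    "https://api.deepgram.com/v1/listen": "https://api.gradium.ai/api/post/speech/asr",
    "wss://api.deepgram.com/v1/listen": "wss://api.gradium.ai/api/speech/asr",
    "https://api.deepgram.com/v1/speak": "https://api.gradium.ai/api/post/speech/tts",
    "wss://api.deepgram.com/v1/speak": "wss://api.gradium.ai/api/speech/tts",
}


def to_gradium_url(deepgram_url: str) -> str:
    """Return the Gradium endpoint for a Deepgram URL (hypothetical helper).

    The query string is discarded on purpose: Deepgram parameters do not
    map one-to-one onto Gradium fields.
    """
    parts = urlsplit(deepgram_url)
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    try:
        return ENDPOINT_MAP[base]
    except KeyError:
        raise ValueError(f"no Gradium equivalent known for {base}") from None
```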
Speech-to-text POST
For pre-recorded audio, keep sending the audio bytes in the request body and switch the URL to Gradium. Gradium streams newline-delimited JSON back; collect the text messages to build the transcript.
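A minimal sketch of that flow, using only the Python standard library. The function names are hypothetical, the `audio/wav` content type is an assumed example value, and joining text fragments with a single space is an assumption about how Gradium segments its output; adjust both to what the STT REST guide specifies.

```python
import json
import urllib.request
from typing import Iterable, Union

GRADIUM_STT_URL = "https://api.gradium.ai/api/post/speech/asr"


def collect_transcript(ndjson_lines: Iterable[Union[bytes, str]]) -> str:
    """Join the text of every {"type": "text"} message in an NDJSON stream."""
    parts = []
    for raw in ndjson_lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line:
            continue  # tolerate blank keep-alive lines
        message = json.loads(line)
        if message.get("type") == "text":
            parts.append(message.get("text", ""))
    # Assumption: fragments are separated by a single space.
    return " ".join(part for part in parts if part)


def transcribe_file(path: str, api_key: str, content_type: str = "audio/wav") -> str:
    """POST a complete audio file and return the collected transcript."""
    with open(path, "rb") as audio_file:
        audio = audio_file.read()
    request = urllib.request.Request(
        GRADIUM_STT_URL,
        data=audio,
        headers={"x-api-key": api_key, "Content-Type": content_type},
        method="POST",
    )
    # The response object yields one line of bytes per iteration.
    with urllib.request.urlopen(request) as response:
        return collect_transcript(response)
```

Keeping `collect_transcript` separate from the HTTP call means the same collector can be reused, and tested, without a network connection.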
Speech-to-text WebSocket
Deepgram Listen streaming accepts audio over the WebSocket connection. Gradium does the same, but each audio frame sent directly over the WebSocket is a JSON message carrying base64-encoded audio.
| Deepgram Listen concept | Gradium STT message |
|---|---|
| WebSocket connect to /v1/listen | Connect to wss://api.gradium.ai/api/speech/asr |
| Audio stream | {"type":"audio","audio":"base64_encoded_audio"} |
| Transcript result | {"type":"text","text":"..."} |
| Close/finalize stream | {"type":"end_of_stream"} |
Set input_format to match the audio you send: pcm, wav, opus, ulaw_8000, alaw_8000, or another supported Gradium input format.
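The message shapes in the table above can be produced and consumed with a few lines of standard-library Python. This is a sketch only: `audio_messages` and `handle_message` are hypothetical helpers, the chunk size is arbitrary, and the sketch deliberately leaves out the WebSocket client and however input_format is conveyed at connection time, since neither is specified here.

```python
import base64
import json
from typing import Iterator, List


def audio_messages(audio: bytes, chunk_size: int = 3840) -> Iterator[str]:
    """Yield Gradium STT messages for a buffer of audio bytes.

    Each chunk becomes {"type": "audio", "audio": <base64>}; the final
    message is {"type": "end_of_stream"}.
    """
    for start in range(0, len(audio), chunk_size):
        chunk = audio[start:start + chunk_size]
        encoded = base64.b64encode(chunk).decode("ascii")
        yield json.dumps({"type": "audio", "audio": encoded})
    yield json.dumps({"type": "end_of_stream"})


def handle_message(raw: str, transcript: List[str]) -> None:
    """Append the text of a {"type": "text"} message; ignore other types."""
    message = json.loads(raw)
    if message.get("type") == "text":
        transcript.append(message["text"])
```

With any WebSocket client, the sending side would pass each string from `audio_messages(...)` to the client's send call, and the receiving side would pass each incoming text frame to `handle_message`.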
Text-to-speech POST
Deepgram Speak uses the model query parameter for voice/model selection and sends text in the JSON body. Gradium uses voice_id in the JSON body and returns raw audio bytes when only_audio is true.
| Deepgram concept | Gradium field |
|---|---|
| model query parameter for Speak | voice_id plus optional model_name |
| text | text |
| Output encoding options | output_format |
| Authorization: Token ... | x-api-key |
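As an illustration of the field mapping above, the sketch below builds a one-shot TTS request with the standard library. `build_tts_request` and `synthesize` are hypothetical names; placing only_audio in the JSON body and using `wav` as the default output_format are assumptions, so confirm both against the TTS REST guide.

```python
import json
import urllib.request
from typing import Optional

GRADIUM_TTS_URL = "https://api.gradium.ai/api/post/speech/tts"


def build_tts_request(
    text: str,
    voice_id: str,
    api_key: str,
    output_format: str = "wav",        # assumed example value
    model_name: Optional[str] = None,  # optional, per the table above
) -> urllib.request.Request:
    """Build a Gradium TTS POST request from the mapped fields."""
    body = {
        "text": text,
        "voice_id": voice_id,
        "output_format": output_format,
        "only_audio": True,  # assumption: sent as a JSON body field
    }
    if model_name is not None:
        body["model_name"] = model_name
    return urllib.request.Request(
        GRADIUM_TTS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, voice_id: str, api_key: str) -> bytes:
    """Send the request and return the raw audio bytes."""
    request = build_tts_request(text, voice_id, api_key)
    with urllib.request.urlopen(request) as response:
        return response.read()
```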
Text-to-speech WebSocket
For streaming TTS, replace Deepgram Speak messages with Gradium’s setup, text, and end_of_stream messages:
| Deepgram Speak message | Gradium TTS message |
|---|---|
| {"type":"Speak","text":"..."} | {"type":"text","text":"..."} after setup |
| {"type":"Flush"} | Send more text, or finish with end_of_stream for a simple utterance |
| {"type":"Close"} | {"type":"end_of_stream"} |
| Binary/audio response frames | {"type":"audio","audio":"base64..."} |
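One way to apply this mapping is a thin translation layer between the messages your code already produces for Deepgram and the ones Gradium expects. The sketch below is illustrative: all three function names are hypothetical, and the fields of the setup message (voice_id and output_format) are an assumption inferred from the TTS field mapping, not a confirmed schema.

```python
import base64
import json
from typing import Optional


def setup_message(voice_id: str, output_format: str = "wav") -> dict:
    """First message on the socket. The field set here is an assumption."""
    return {"type": "setup", "voice_id": voice_id, "output_format": output_format}


def to_gradium(deepgram_message: dict) -> Optional[dict]:
    """Translate one Deepgram Speak message; None means nothing to send."""
    kind = deepgram_message.get("type")
    if kind == "Speak":
        return {"type": "text", "text": deepgram_message["text"]}
    if kind == "Close":
        return {"type": "end_of_stream"}
    if kind == "Flush":
        # No direct equivalent in the table: keep sending text, or end
        # the stream when the utterance is complete.
        return None
    raise ValueError(f"unsupported Deepgram message type: {kind!r}")


def decode_audio(raw: str) -> bytes:
    """Return decoded audio for an audio message, or b"" for other types."""
    message = json.loads(raw)
    if message.get("type") == "audio":
        return base64.b64decode(message["audio"])
    return b""
```

Because Deepgram delivers audio as binary frames while Gradium wraps it in JSON, `decode_audio` is the piece that lets an existing playback path keep consuming raw bytes.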
Checklist
- Replace the Deepgram URL with the matching Gradium endpoint.
- Change auth to x-api-key.
- For STT, set Content-Type or input_format to match your audio.
- For TTS, pass a Gradium voice_id.
- For direct WebSocket STT, send audio as base64 inside Gradium audio messages.
- Keep your transcript collector or audio playback path.
Next steps
Gradium STT REST guide
Complete-file transcription with streamed NDJSON responses.
Gradium STT WebSocket guide
Real-time audio streaming, VAD messages, and flush.
Gradium TTS REST guide
One-shot text-to-speech with raw audio responses.
Gradium TTS WebSocket guide
Streaming text-to-speech over WebSocket.