x-api-key header, and use a Gradium voice_id or model setting.
The bigger win is that the same API covers realtime TTS, realtime STT,
semantic VAD, adaptive delay, browser-safe WebSocket tokens, and
custom voices.
Your app can keep the same shape:
- POST when you already have the full input.
- WebSocket when you want streaming input or low-latency output.
- Audio bytes or streamed chunks come back in the same places your current provider integration already handles them.
- Semantic VAD and adaptive delay give voice agents first-class turn-taking signals instead of forcing you to bolt on endpointing heuristics.
- Browser clients should use short-lived Gradium tokens instead of embedding API keys. See Browser WebSockets.
If you already wrapped ElevenLabs, Cartesia, or Deepgram behind a small
provider adapter, migrating is usually just changing the URL, auth
header, and a few field names.
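As a rough illustration of that adapter idea, the sketch below keeps the provider-specific parts (base URLs, auth header, field names) in one small config object. The `ProviderConfig` class and the internal parameter names `voice` and `format` are hypothetical; only the Gradium URLs, the `x-api-key` header, and the `voice_id` and `output_format` fields come from this guide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProviderConfig:
    """Hypothetical adapter config: everything that differs per provider."""
    rest_base: str
    ws_base: str
    auth_header: str
    # Maps the app's internal parameter names to the provider's field names.
    field_names: dict = field(default_factory=dict)

    def headers(self, api_key: str) -> dict:
        """Return the auth header for this provider."""
        return {self.auth_header: api_key}

    def translate(self, params: dict) -> dict:
        """Rename internal parameters; unknown keys pass through unchanged."""
        return {self.field_names.get(k, k): v for k, v in params.items()}


# Values taken from this guide; the internal names on the left are assumptions.
GRADIUM = ProviderConfig(
    rest_base="https://api.gradium.ai/api",
    ws_base="wss://api.gradium.ai/api",
    auth_header="x-api-key",
    field_names={"voice": "voice_id", "format": "output_format"},
)
```

With a structure like this, switching providers amounts to swapping the config object while the calling code stays the same.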
Gradium POST example
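A minimal sketch of a one-shot TTS call using only the Python standard library. The base URL, the `x-api-key` header, and the `voice_id` and `output_format` fields come from this guide. The TTS route `/post/speech/tts` is an assumption inferred by analogy with the documented STT route, and the `text` field name is also assumed; check both against the API reference before relying on them.

```python
import json
import urllib.request

REST_BASE = "https://api.gradium.ai/api"
# Assumed route, inferred from the STT route /post/speech/asr in this guide.
TTS_POST_PATH = "/post/speech/tts"


def build_tts_request(api_key: str, text: str, voice_id: str,
                      output_format: str = "wav") -> urllib.request.Request:
    """Build (but do not send) a one-shot TTS POST request."""
    body = {
        "text": text,  # assumed field name
        "voice_id": voice_id,
        "output_format": output_format,
    }
    return urllib.request.Request(
        REST_BASE + TTS_POST_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(request: urllib.request.Request, path: str) -> int:
    """Send the request and write the response body to `path`.

    Assumes the response body is the raw audio. Returns the byte count.
    """
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(path, "wb") as out:
        out.write(audio)
    return len(audio)
```

Calling `synthesize_to_file(build_tts_request(key, "Hello", voice_id), "out.wav")` would perform the network call; building the request separately keeps the payload easy to inspect and test.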
For a complete text block, send one HTTP request and write the audio response to a file.

Gradium WebSocket example
For streaming TTS, connect to the Gradium WebSocket, send setup once,
then send text. Gradium sends audio messages back with base64-encoded
audio chunks.
For the full message contract, see Text-to-Speech WebSocket.
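The message sequence above (setup, text, end of stream, then audio coming back) can be sketched as plain JSON builders that work with any WebSocket client. The `setup` and `end_of_stream` types and the `voice_id` and `output_format` fields appear in this guide; the `text` message shape and the `audio` field name on incoming messages are assumptions, so treat the linked message contract as authoritative.

```python
import base64
import json
from typing import Iterable, Iterator, Optional


def setup_message(voice_id: str, output_format: str = "wav") -> str:
    """First message on the connection."""
    return json.dumps({"type": "setup", "voice_id": voice_id,
                       "output_format": output_format})


def text_message(text: str) -> str:
    """One chunk of input text (message shape assumed)."""
    return json.dumps({"type": "text", "text": text})


def end_of_stream_message() -> str:
    """Signals that no more input will be sent."""
    return json.dumps({"type": "end_of_stream"})


def outgoing_messages(voice_id: str, chunks: Iterable[str]) -> Iterator[str]:
    """Yield the full client-side sequence in order."""
    yield setup_message(voice_id)
    for chunk in chunks:
        yield text_message(chunk)
    yield end_of_stream_message()


def decode_audio(raw: str) -> Optional[bytes]:
    """Return decoded audio bytes for audio messages, else None.

    Assumes the base64 payload lives in an "audio" field.
    """
    message = json.loads(raw)
    if message.get("type") != "audio":
        return None
    return base64.b64decode(message["audio"])
```

A client loop would send each string from `outgoing_messages(...)` over the socket and pass every received frame through `decode_audio`, appending the non-`None` results to the output buffer.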
Provider guides
ElevenLabs to Gradium
Move existing TTS calls to Gradium REST and WebSocket endpoints.
Cartesia to Gradium
Move TTS and STT adapters to Gradium request fields and message types.
Deepgram to Gradium
Replace speech adapters with Gradium STT, TTS, semantic VAD, and flush.
What usually changes
| Area | Change |
|---|---|
| Base URL | Use https://api.gradium.ai/api for REST and wss://api.gradium.ai/api for WebSockets. |
| Auth | Send x-api-key: your_api_key. |
| TTS voice | Pass a Gradium voice_id in the request body or WebSocket setup message. |
| TTS output | Use output_format, for example wav, pcm, or opus. |
| Streaming start | Send a Gradium setup message first on WebSocket connections. |
| Streaming end | Send {"type":"end_of_stream"} when you are done sending input. |
| Browser auth | Generate a temporary token with GET /api/api-keys/token, then connect with ?token=.... |
| Concurrent WebSocket requests | Use client_req_id and close_ws_on_eos: false. |
| Turn-taking | Use STT step messages, inactivity_prob, delay_in_frames, and flush. |
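The browser auth row can be sketched in two steps: a server-side request for a temporary token, and a helper that builds the WebSocket URL the browser connects to. The token route and the `?token=` query parameter come from the table above. Authenticating the token request with `x-api-key` is an assumption, and the shape of the token response is not shown here, so the sketch stops at building the request.

```python
import urllib.request
from urllib.parse import urlencode

API_HOST = "https://api.gradium.ai"
WS_BASE = "wss://api.gradium.ai/api"


def build_token_request(api_key: str) -> urllib.request.Request:
    """Server side: build (but do not send) the temporary-token request.

    Assumes the same x-api-key header used elsewhere in this guide.
    """
    return urllib.request.Request(
        API_HOST + "/api/api-keys/token",
        headers={"x-api-key": api_key},
        method="GET",
    )


def browser_ws_url(path: str, token: str) -> str:
    """Client side: WebSocket URL carrying the short-lived token."""
    return f"{WS_BASE}{path}?{urlencode({'token': token})}"
```

For example, `browser_ws_url("/speech/asr", token)` yields a URL for the live STT stream that never exposes the long-lived API key to the browser.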
Speech-to-text endpoints
If you are migrating an STT integration, use the same idea with the STT routes:

| Flow | Gradium endpoint |
|---|---|
| Complete audio file | POST https://api.gradium.ai/api/post/speech/asr |
| Live audio stream | wss://api.gradium.ai/api/speech/asr |
Production patterns
WebSocket lifecycle
Setup, ready, input, flush, end-of-stream, multiplexing, and errors.
Browser WebSockets
Issue short-lived tokens for browser and mobile WebSocket clients.