Migrate to Gradium

Most voice API migrations come down to the same small swap: point your existing request code at Gradium, send your Gradium API key in the `x-api-key` header, and use a Gradium `voice_id` or model setting. The bigger win is that the same API covers realtime TTS, realtime STT, semantic VAD, adaptive delay, browser-safe WebSocket tokens, and custom voices. Your app can keep the same shape:

POST when you already have the full input.
WebSocket when you want streaming input or low-latency output.
Audio bytes or streamed chunks come back in the same places your current provider integration already handles them.
Semantic VAD and adaptive delay give voice agents first-class turn-taking signals instead of forcing you to bolt on endpointing heuristics.
Browser clients should use short-lived Gradium tokens instead of embedding API keys. See Browser WebSockets.

If you already wrapped ElevenLabs, Cartesia, or Deepgram behind a small provider adapter, migrating is usually just changing the URL, auth header, and a few field names.

Gradium POST example

For a complete text block, send one HTTP request and write the audio response to a file:

curl -L -X POST https://api.gradium.ai/api/post/speech/tts \
  -H "x-api-key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello from Gradium.", "voice_id": "YTpq7expH9539ERJ", "output_format": "wav", "only_audio": true}' \
  > output.wav

That is the whole path for one-shot TTS: request body in, audio bytes out. For the full schema, see Text-to-Speech REST.
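The same one-shot call can be made from application code. The following is a minimal Python sketch using only the standard library; it mirrors the fields in the curl command above. The function names `build_tts_request` and `synthesize_to_file` are illustrative, not part of any Gradium SDK. Error handling is omitted, and redirect behavior may differ from `curl -L`.

```python
import json
import urllib.request

TTS_URL = "https://api.gradium.ai/api/post/speech/tts"


def build_tts_request(text, voice_id, api_key, output_format="wav"):
    """Return (url, headers, body) for a one-shot TTS call."""
    headers = {
        "x-api-key": api_key,
        "Content-Type": "application/json",
    }
    payload = {
        "text": text,
        "voice_id": voice_id,
        "output_format": output_format,
        "only_audio": True,  # same flag as in the curl example
    }
    return TTS_URL, headers, json.dumps(payload).encode("utf-8")


def synthesize_to_file(text, voice_id, api_key, path):
    """Send the request and write the returned audio bytes to `path`."""
    url, headers, body = build_tts_request(text, voice_id, api_key)
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

Keeping request construction separate from the network call makes it easy to reuse the same builder inside an existing provider adapter.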

Gradium WebSocket example

For streaming TTS, connect to the Gradium WebSocket, send setup once, then send text:

wscat -c "wss://api.gradium.ai/api/speech/tts" \
  -H "x-api-key: your_api_key"

After the connection opens, send:

{"type":"setup","voice_id":"YTpq7expH9539ERJ","model_name":"default","output_format":"wav"}
{"type":"text","text":"Hello from Gradium."}
{"type":"end_of_stream"}

Gradium streams audio messages back with base64-encoded audio chunks. For the full message contract, see Text-to-Speech WebSocket.
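The message sequence above can also be produced in code and the returned chunks decoded. The sketch below is transport-agnostic: it only builds the JSON frames and decodes one incoming message, so it works with whichever WebSocket client you already use. The helper names are hypothetical, and the shape of the audio message (a `"type": "audio"` message carrying base64 data in an `"audio"` field) is an assumption; confirm the actual field names in Text-to-Speech WebSocket.

```python
import base64
import json


def tts_messages(voice_id, texts, model_name="default", output_format="wav"):
    """Yield the JSON text frames for one TTS stream, in send order."""
    yield json.dumps({
        "type": "setup",
        "voice_id": voice_id,
        "model_name": model_name,
        "output_format": output_format,
    })
    for text in texts:
        yield json.dumps({"type": "text", "text": text})
    yield json.dumps({"type": "end_of_stream"})


def decode_audio_message(raw, audio_field="audio"):
    """Return decoded audio bytes, or None for non-audio messages.

    ASSUMPTION: audio messages look like
    {"type": "audio", "<audio_field>": "<base64 data>"}.
    """
    message = json.loads(raw)
    if message.get("type") != "audio":
        return None
    return base64.b64decode(message[audio_field])
```

A client would send each frame from `tts_messages(...)` in order, then pass every received text frame through `decode_audio_message` and append the non-`None` results to its audio buffer.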

Provider guides

ElevenLabs to Gradium

Move existing TTS calls to Gradium REST and WebSocket endpoints.

Cartesia to Gradium

Move TTS and STT adapters to Gradium request fields and message types.

Deepgram to Gradium

Replace speech adapters with Gradium STT, TTS, semantic VAD, and flush.

What usually changes

Area	Change
Base URL	Use `https://api.gradium.ai/api` for REST and `wss://api.gradium.ai/api` for WebSockets.
Auth	Send `x-api-key: your_api_key`.
TTS voice	Pass a Gradium `voice_id` in the request body or WebSocket `setup` message.
TTS output	Use `output_format`, for example `wav`, `pcm`, or `opus`.
Streaming start	Send a Gradium `setup` message first on WebSocket connections.
Streaming end	Send `{"type":"end_of_stream"}` when you are done sending input.
Browser auth	Generate a temporary token with `GET /api/api-keys/token`, then connect with `?token=...`.
Concurrent WebSocket requests	Use `client_req_id` and `close_ws_on_eos: false`.
Turn-taking	Use STT `step` messages, `inactivity_prob`, `delay_in_frames`, and `flush`.
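Two rows of the table, browser auth and concurrent WebSocket requests, can be sketched in Python as follows. The helper names are hypothetical. Placing `client_req_id` and `close_ws_on_eos` in the `setup` message is an assumption, and the token endpoint's response format is not shown on this page, so the sketch only builds the URL a browser would open once it has a token.

```python
import json
from urllib.parse import urlencode

API_HOST = "api.gradium.ai"
# Called server side with the x-api-key header to mint a temporary token.
TOKEN_URL = f"https://{API_HOST}/api/api-keys/token"


def browser_ws_url(token, route="/api/speech/tts"):
    """Build the WebSocket URL a browser opens with a temporary token."""
    query = urlencode({"token": token})
    return f"wss://{API_HOST}{route}?{query}"


def multiplexed_setup(voice_id, client_req_id, output_format="wav"):
    """Setup frame for one of several requests sharing a socket.

    ASSUMPTION: client_req_id and close_ws_on_eos travel in the setup message.
    """
    return json.dumps({
        "type": "setup",
        "voice_id": voice_id,
        "output_format": output_format,
        "client_req_id": client_req_id,
        "close_ws_on_eos": False,  # keep the socket open after end_of_stream
    })
```

The API key stays on your server; only the short-lived token ever appears in the browser's WebSocket URL.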

Speech-to-text endpoints

If you are migrating an STT integration, use the same idea with the STT routes:

Flow	Gradium endpoint
Complete audio file	`POST https://api.gradium.ai/api/post/speech/asr`
Live audio stream	`wss://api.gradium.ai/api/speech/asr`

See Speech-to-Text REST and Speech-to-Text WebSocket for the message formats.

Production patterns

WebSocket lifecycle

Setup, ready, input, flush, end-of-stream, multiplexing, and errors.

Browser WebSockets

Issue short-lived tokens for browser and mobile WebSocket clients.
