Text-to-Speech
TTS WebSocket Stream
Stream real-time Gradium text-to-speech audio over WebSocket.
Lifecycle
The server sends ready, then one or more audio and text
messages, and finally end_of_stream. See
WebSocket Lifecycle for connection
behavior, reusable sockets, browser tokens, and errors.
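The ordering described above can be checked with a small helper. This is a minimal sketch under a strict reading of the lifecycle for a single, non-multiplexed request that completes without an error; the function name `is_valid_lifecycle` is hypothetical and not part of any Gradium SDK.

```python
def is_valid_lifecycle(types: list[str]) -> bool:
    """Return True if the server message types follow
    ready -> one or more audio/text -> end_of_stream."""
    # Need at least: ready, one audio or text message, end_of_stream.
    if len(types) < 3:
        return False
    if types[0] != "ready" or types[-1] != "end_of_stream":
        return False
    # Everything between the first and last message must be audio or text.
    return all(t in ("audio", "text") for t in types[1:-1])
```

A check like this is mainly useful in tests or when debugging a recorded session; a production client would more likely react to each message as it arrives.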
Client Messages
setup
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Always "setup". |
| model_name | string | No | Model alias, defaults to "default". |
| voice_id | string | Recommended | Voice library or custom voice ID. |
| voice | string | No | Voice name fallback when voice_id is not provided. |
| output_format | string | No | wav, pcm, opus, ulaw_8000, alaw_8000, or explicit PCM rates such as pcm_16000. Defaults to wav. |
| json_config | object or string | No | Advanced TTS settings. See Voice Settings. |
| pronunciation_id | string | No | Pronunciation dictionary ID. |
| client_req_id | string | No | Correlates multiplexed requests. |
| close_ws_on_eos | boolean | No | Defaults to true; set false to keep the socket open. |
| retry_for_s | number | No | Optional setup retry window in seconds. |
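As an illustration of the setup fields, the sketch below serializes a setup message to JSON. It assumes messages are sent as JSON text frames; the helper name `build_setup` and the voice ID `example-voice-id` are placeholders, not real values.

```python
import json


def build_setup(voice_id: str, output_format: str = "wav", **optional) -> str:
    """Serialize a setup message, leaving out optional fields that are unset."""
    message = {"type": "setup", "voice_id": voice_id, "output_format": output_format}
    # Only include optional fields the caller actually provided, so the
    # server-side defaults (for example model_name "default") still apply.
    message.update({key: value for key, value in optional.items() if value is not None})
    return json.dumps(message)


# "example-voice-id" is a placeholder, not a real voice.
setup_json = build_setup(
    "example-voice-id",
    output_format="pcm_16000",
    close_ws_on_eos=False,  # keep the socket open after end_of_stream
)
```

Omitting unset fields rather than sending null keeps the message small and avoids guessing how the server treats explicit nulls.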
text
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Always "text". |
| text | string | Yes | Text chunk to synthesize. |
| client_req_id | string | No | Required when routing a multiplexed request. |
end_of_stream
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Always "end_of_stream". |
| client_req_id | string | No | End the matching multiplexed request. |
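The text and end_of_stream messages for one request can be generated together. The following is a rough sketch, assuming JSON text frames; `request_messages` and the ID `req-1` are hypothetical names chosen for the example.

```python
import json
from typing import Iterable, Iterator, Optional


def request_messages(
    chunks: Iterable[str], client_req_id: Optional[str] = None
) -> Iterator[str]:
    """Yield one text message per chunk, then a closing end_of_stream message."""
    # client_req_id is attached only when set, i.e. for multiplexed requests.
    extra = {} if client_req_id is None else {"client_req_id": client_req_id}
    for chunk in chunks:
        yield json.dumps({"type": "text", "text": chunk, **extra})
    yield json.dumps({"type": "end_of_stream", **extra})


frames = list(request_messages(["Hello, ", "world."], client_req_id="req-1"))
```

Each string in `frames` would be sent as its own WebSocket message, in order, after the setup message.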
Server Messages
ready
| Field | Type | Description |
|---|---|---|
| type | string | Always "ready". |
| request_id | string | Gradium request ID for logging and support. |
| model_name | string | Requested model alias. |
| model_ext | string | Resolved model identifier, when present. |
| sample_rate | integer | Output sample rate. |
| frame_size | integer | Output frame size in samples. |
| audio_stream_names | string[] | Named audio streams, when present. |
| text_stream_names | string[] | Named text streams, when present. |
| client_req_id | string | Present for multiplexed requests. |
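Because frame_size is expressed in samples, dividing it by sample_rate gives the duration of one frame in seconds. The sketch below assumes sample_rate is in samples per second (Hz); the numeric values in the example message are made up for illustration and are not documented defaults.

```python
def frame_duration_s(ready: dict) -> float:
    """Duration of one output frame in seconds: frame_size / sample_rate."""
    return ready["frame_size"] / ready["sample_rate"]


# Hypothetical ready message; the numbers are illustrative only.
ready = {
    "type": "ready",
    "request_id": "example",
    "sample_rate": 48000,
    "frame_size": 1920,
}
duration = frame_duration_s(ready)
```

A value like this can help size playback buffers or estimate how much audio a given number of frames represents.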
audio
| Field | Type | Description |
|---|---|---|
| type | string | Always "audio". |
| audio | string | Base64-encoded audio chunk. |
| start_s | number | Chunk start time in seconds. |
| stop_s | number | Chunk stop time in seconds. |
| stream_id | integer | Stream identifier, when present. |
| client_req_id | string | Present for multiplexed requests. |
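Since the audio field is Base64-encoded, a client needs to decode it before playback or storage. This is a minimal sketch using only the Python standard library; `decode_audio` is a hypothetical helper, and the sample payload is four arbitrary bytes rather than real audio. How the decoded bytes should be interpreted depends on the output_format chosen at setup, which this example does not assume.

```python
import base64
import json


def decode_audio(raw: str) -> tuple[bytes, float, float]:
    """Parse an audio message and return (decoded bytes, start_s, stop_s)."""
    msg = json.loads(raw)
    if msg.get("type") != "audio":
        raise ValueError(f"expected an audio message, got {msg.get('type')!r}")
    return base64.b64decode(msg["audio"]), msg["start_s"], msg["stop_s"]


# Synthetic message built locally for illustration.
sample = json.dumps(
    {
        "type": "audio",
        "audio": base64.b64encode(b"\x00\x01\x02\x03").decode("ascii"),
        "start_s": 0.0,
        "stop_s": 0.08,
    }
)
payload, start_s, stop_s = decode_audio(sample)
```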
text
| Field | Type | Description |
|---|---|---|
| type | string | Always "text". |
| text | string | Text segment associated with generated audio. |
| start_s | number | Segment start time in seconds. |
| stop_s | number | Segment stop time in seconds. |
| stream_id | integer | Stream identifier, when present. |
| client_req_id | string | Present for multiplexed requests. |
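The start_s and stop_s fields make it possible to look up which text segment corresponds to a given playback position, for example to highlight captions. The sketch below assumes the text timestamps share a timeline with the audio chunks and treats each segment as a half-open interval; both are assumptions, and `segment_at` is a hypothetical helper.

```python
from typing import Optional


def segment_at(segments: list[dict], t: float) -> Optional[str]:
    """Return the text whose [start_s, stop_s) interval contains time t."""
    for seg in segments:
        if seg["start_s"] <= t < seg["stop_s"]:
            return seg["text"]
    return None


# Synthetic text messages for illustration.
segments = [
    {"type": "text", "text": "Hello", "start_s": 0.0, "stop_s": 0.4},
    {"type": "text", "text": "world", "start_s": 0.4, "stop_s": 0.9},
]
```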
Terminal messages
| Type | Description |
|---|---|
| end_of_stream | The request is complete. |
| flushed | Reserved for flush acknowledgements. |
| error | Terminal error message; the socket closes after the error. |
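Putting the server messages together, a client loop can accumulate audio until a terminal message arrives. This is a sketch, not a definitive implementation: it works on an iterable of already-received JSON strings instead of a live socket, `collect_audio` and `StreamError` are hypothetical names, and ignoring flushed is an assumption based on it being described as reserved. The fields of the error message are not listed here, so the raw message is passed through unchanged.

```python
import base64
import json
from typing import Iterable


class StreamError(RuntimeError):
    """Raised on an error message or a stream that ends early."""


def collect_audio(raw_messages: Iterable[str]) -> bytes:
    """Concatenate decoded audio chunks until end_of_stream is seen."""
    audio = bytearray()
    for raw in raw_messages:
        msg = json.loads(raw)
        kind = msg.get("type")
        if kind == "audio":
            audio += base64.b64decode(msg["audio"])
        elif kind == "end_of_stream":
            return bytes(audio)
        elif kind == "error":
            # Error fields are not documented here; surface the raw message.
            raise StreamError(raw)
        # ready, text, and flushed are ignored in this sketch (assumption).
    raise StreamError("stream ended without end_of_stream")
```

With a real connection, the same function could be fed from whatever receive loop the chosen WebSocket library provides.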
Error
Headers
Your Gradium API key
Response
101: WebSocket connection established