Connect to this endpoint via WebSocket for real-time speech-to-text conversion with streaming audio input.
Connection URL:
- Europe: wss://eu.api.gradium.ai/api/speech/asr
- USA: wss://us.api.gradium.ai/api/speech/asr
Authentication: Include your API key in the WebSocket connection header:
x-api-key: your_api_key

| Direction | Message Type | Example |
|---|---|---|
| 🔵⬆️ Client→Server | Setup (first) | {"type": "setup", "model_name": "default", "input_format": "pcm"} |
| 🟢⬇️ Server→Client | Ready | {"type": "ready", "request_id": "uuid", "model_name": "default", "sample_rate": 24000} |
| 🔵⬆️ Client→Server | Audio | {"type": "audio", "audio": "base64..."} |
| 🟢⬇️ Server→Client | Text (result) | {"type": "text", "text": "Hello world", "start_s": 0.5} |
| 🟢⬇️ Server→Client | VAD (activity) | {"type": "step", "vad": [...], "step_idx": 5, "step_duration_s": 0.08} |
| 🟢⬇️ Server→Client | End Text | {"type": "end_text", "stop_s": 2.5} |
| 🔵⬆️ Client→Server | Flush | {"type": "flush", "flush_id": "..."} |
| 🟢⬇️ Server→Client | Flushed | {"type": "flushed", "flush_id": "..."} |
| 🔵⬆️ Client→Server | EndOfStream | {"type": "end_of_stream"} |
| 🟢⬇️ Server→Client | EndOfStream | {"type": "end_of_stream"} |
| 🔴⬇️ Server→Client | Error | {"type": "error", "message": "Error description", "code": 1008} |
Setup message. Direction: Client → Server. Format: JSON object
{
"type": "setup",
"model_name": "default",
"input_format": "pcm"
}
Fields:
- type (string, required): Must be "setup"
- model_name (string, required): The speech-to-text model to use (default: "default")
- input_format (string, required): Audio format: "pcm", "wav", or "opus"

Important: This must be the very first message sent after the connection is established. The server will close the connection if any other message is sent first.
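A small helper for building the setup message might look like this; `build_setup` is an illustrative name, and the field values come from the description above.

```python
import json


def build_setup(model_name="default", input_format="pcm"):
    """Build the setup message, which must be the first message on the socket."""
    if input_format not in ("pcm", "wav", "opus"):
        raise ValueError(f"unsupported input_format: {input_format}")
    return json.dumps({"type": "setup",
                       "model_name": model_name,
                       "input_format": input_format})
```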
Ready message. Direction: Server → Client. Format: JSON object
{
"type": "ready",
"request_id": "550e8400-e29b-41d4-a716-446655440000",
"model_name": "default",
"sample_rate": 24000,
"frame_size": 1920,
"delay_in_frames": 0,
"text_stream_names": []
}
Fields:
- type (string): Will be "ready"
- request_id (string): Unique identifier for the session
- model_name (string): The speech-to-text model being used
- sample_rate (integer): Expected sample rate in Hz (typically 24000)
- frame_size (integer): Number of samples per processing frame (typically 1920, equivalent to 80 ms at 24 kHz)
- delay_in_frames (integer): Model delay, in audio frames
- text_stream_names (array): List of text stream names

This message is sent by the server after receiving the setup message, indicating that the connection is ready to receive audio.
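The ready message carries everything the client needs to size its audio chunks. A hypothetical `parse_ready` helper, derived only from the fields above, could extract the useful values and compute the frame duration:

```python
import json


def parse_ready(raw):
    """Extract streaming parameters from the server's ready message."""
    msg = json.loads(raw)
    if msg["type"] != "ready":
        raise ValueError(f"expected ready, got {msg['type']}")
    # Duration of one model frame: frame_size samples at sample_rate Hz.
    frame_s = msg["frame_size"] / msg["sample_rate"]
    return {
        "request_id": msg["request_id"],
        "sample_rate": msg["sample_rate"],
        "frame_size": msg["frame_size"],
        "frame_s": frame_s,
        "delay_in_frames": msg["delay_in_frames"],
    }
```

With the typical values (frame_size 1920, sample_rate 24000) this yields an 80 ms frame.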
Audio message. Direction: Client → Server. Format: JSON object (with base64-encoded audio data)
{
"type": "audio",
"audio": "base64_encoded_audio_data..."
}
Fields:
- type (string, required): Must be "audio"
- audio (string, required): Base64-encoded audio data

Audio Format Requirements:
When using the "wav" input format, the audio must be a valid WAV file containing PCM data (AudioFormat = 1 in the WAV header). Supported bit depths are 16, 24, and 32 bits per sample.

When using the "opus" input format, the audio must be an Ogg-encapsulated Opus stream.
Send audio messages to be transcribed. You can send multiple audio messages in sequence. The server will stream text and VAD responses as it processes the audio.
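For PCM input, a convenient (though not required) choice is to send one model frame per message. The sketch below assumes 16-bit samples; `chunk_pcm` and `build_audio_message` are illustrative helper names.

```python
import base64
import json


def chunk_pcm(pcm_bytes, frame_size, bytes_per_sample=2):
    """Split raw PCM bytes into frame_size-sample chunks (16-bit samples assumed)."""
    step = frame_size * bytes_per_sample
    return [pcm_bytes[i:i + step] for i in range(0, len(pcm_bytes), step)]


def build_audio_message(pcm_bytes):
    """Wrap raw audio bytes into an audio message with a base64 payload."""
    return json.dumps({"type": "audio",
                       "audio": base64.b64encode(pcm_bytes).decode("ascii")})
```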
Text message. Direction: Server → Client. Format: JSON object
{
"type": "text",
"text": "Hello world",
"start_s": 0.5,
"stream_id": null
}
Fields:
- type (string): Will be "text"
- text (string): The transcribed text
- start_s (float): Start time of the transcription in seconds
- stream_id (integer or null): Stream identifier for tracking multiple concurrent streams

Text messages contain the transcribed speech. Multiple text messages are streamed as the audio is processed.
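Because the transcript arrives as a stream of small text messages, a client typically accumulates them. A minimal accumulator sketch (the `Transcript` class is illustrative, not part of the API):

```python
import json


class Transcript:
    """Accumulate streamed text messages into a running transcript."""

    def __init__(self):
        self.segments = []  # (start_s, text) pairs in arrival order

    def feed(self, raw):
        """Consume one raw server message; ignore everything but text messages."""
        msg = json.loads(raw)
        if msg["type"] == "text":
            self.segments.append((msg["start_s"], msg["text"]))

    def text(self):
        """Join all received segments into a single string."""
        return " ".join(t for _, t in self.segments)
```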
Step (VAD) message. Direction: Server → Client. Format: JSON object
{
"type": "step",
"vad": [
{
"horizon_s": 0.5,
"inactivity_prob": 0.05
},
{
"horizon_s": 1.0,
"inactivity_prob": 0.08
},
{
"horizon_s": 2.0,
"inactivity_prob": 0.12
}
],
"step_idx": 5,
"step_duration_s": 0.08,
"total_duration_s": 0.4
}
Fields:
- type (string): Will be "step"
- vad (array): List of VAD predictions at increasing future horizons
  - horizon_s (float): Lookahead duration in seconds
  - inactivity_prob (float): Probability that voice activity has ended by this horizon
- step_idx (integer): The step index (increments every 80 ms)
- step_duration_s (float): Duration of this step in seconds (typically 0.08)
- total_duration_s (float): Total duration of audio processed so far

VAD Interpretation:
- Use the inactivity_prob value from the longest horizon to determine whether the speaker has likely finished
- Higher inactivity_prob values indicate higher confidence that speaking has ended
- A common choice is to use vad[2]["inactivity_prob"] (the third prediction) as the turn-taking indicator

End Text message. Direction: Server → Client. Format: JSON object
{
"type": "end_text",
"stop_s": 2.5,
"stream_id": null
}
Fields:
- type (string): Will be "end_text"
- stop_s (float): Stop time of the last text message in seconds
- stream_id (integer or null): Stream identifier

Sent when the previous text segment has finished and its end timestamp is available.
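The turn-taking guidance for step messages (read the longest-horizon inactivity probability) fits in a one-line helper. The `turn_ended` name and the 0.5 threshold follow the note later on this page and are otherwise illustrative:

```python
def turn_ended(step_msg, threshold=0.5):
    """True when the longest-horizon VAD prediction says speech has likely ended.

    step_msg is a decoded "step" message; vad[2] is the third (longest-horizon)
    prediction, suggested above as the turn-taking indicator.
    """
    return step_msg["vad"][2]["inactivity_prob"] > threshold
```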
Flush message. Direction: Client → Server. Format: JSON object
{
"type": "flush",
"flush_id": "1"
}
Fields:
- type (string, required): Must be "flush"
- flush_id (string, required): Unique identifier for the flush request

This message can be sent by the client to request that the server flush any
buffered audio and return all outstanding text results immediately. The server
will respond with a flushed message containing the same flush_id once the
flush is complete.
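Since each flushed reply echoes its flush_id, a client can track which flush requests are still outstanding. A small bookkeeping sketch (the `FlushTracker` class is illustrative; the document does not prescribe how ids are generated, only that they are unique):

```python
import itertools
import json


class FlushTracker:
    """Issue flush requests and match the server's flushed replies."""

    def __init__(self):
        self._ids = itertools.count(1)  # simple unique-id source
        self.pending = set()

    def request(self):
        """Return a flush message to send, remembering its id."""
        flush_id = str(next(self._ids))
        self.pending.add(flush_id)
        return json.dumps({"type": "flush", "flush_id": flush_id})

    def on_flushed(self, msg):
        """Handle a decoded flushed message; True when nothing is outstanding."""
        self.pending.discard(msg["flush_id"])
        return not self.pending
```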
EndOfStream message. Direction: Client → Server and Server → Client. Format: JSON object
{
"type": "end_of_stream"
}
This message is sent by the client when it has finished sending audio. The server will then process any remaining audio and send back all outstanding text results, VAD information, and then an end_of_stream message before closing the connection.
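The tail of a session (collect remaining text until the server's end_of_stream) can be expressed as a pure function over the incoming messages. The `drain` name is illustrative; the message shapes are the ones defined above:

```python
import json


def drain(messages):
    """Process server messages until end_of_stream.

    Returns (transcript, last_stop_s), where last_stop_s comes from the
    final end_text message seen, if any.
    """
    texts, stop_s = [], None
    for raw in messages:
        msg = json.loads(raw)
        if msg["type"] == "text":
            texts.append(msg["text"])
        elif msg["type"] == "end_text":
            stop_s = msg["stop_s"]
        elif msg["type"] == "end_of_stream":
            break
    return " ".join(texts), stop_s
```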
When errors occur, the server sends an error message as JSON before closing the connection:
Error Message Format:
{
"type": "error",
"message": "Error description explaining what went wrong",
"code": 1008
}
Common Error Codes:
- 1008: Policy Violation (e.g., invalid API key, missing setup message, invalid audio format)
- 1011: Internal Server Error (unexpected server-side error)

Usage notes:
- Always send end_of_stream when done, to properly close the session.
- A simple turn-taking heuristic is turn_ended = msg["vad"][2]["inactivity_prob"] > 0.5.
- The transcript lags the audio by the delay_in_frames audio frames processed by the model. Instead of feeding silence from the speaker, the system can be made more reactive by flushing the remainder of the turn's transcript. For that, you can feed in delay_in_frames chunks of silence (vectors of zeros). If those are fed in faster than realtime, the API can also process them faster, allowing a considerably more reactive turn-around.

Prerequisites:
- Your Gradium API key
- An established WebSocket connection
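The silence-feeding trick for a faster turn-around reduces to generating delay_in_frames audio messages of zero-valued samples. A sketch assuming 16-bit PCM (the `silence_messages` helper name is illustrative):

```python
import base64
import json


def silence_messages(delay_in_frames, frame_size, bytes_per_sample=2):
    """Build delay_in_frames audio messages of zero-valued samples.

    Assumes 16-bit PCM by default, so one frame is frame_size * 2 bytes of
    zeros. Sending these faster than realtime flushes the remainder of the
    turn's transcript out of the model's delay window.
    """
    silent_frame = b"\x00" * (frame_size * bytes_per_sample)
    payload = base64.b64encode(silent_frame).decode("ascii")
    return [json.dumps({"type": "audio", "audio": payload})
            for _ in range(delay_in_frames)]
```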