Streaming use case?
For low-latency synthesis of long or generated text, streaming TTS
via the SDK gives you audio chunks as they’re produced.
Quickstart
Setting only_audio=true returns the raw audio bytes directly, which is the
easiest route for “text in, file out”.
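As a rough illustration of the "text in, file out" flow, the sketch below builds a POST request with only_audio set to true and streams the response body to a file. The endpoint URL, the bearer-token header, the "text" field name, and the "wav" format value are all placeholders assumed for this example; only only_audio and output_format are named on this page, so check the TTS POST API reference for the real values.

```python
import json
import urllib.request

# Hypothetical placeholders: substitute the real endpoint and credentials
# from the TTS POST API reference.
TTS_URL = "https://api.example.com/tts"
API_KEY = "YOUR_API_KEY"


def build_tts_request(text: str, output_format: str = "wav") -> urllib.request.Request:
    """Build (but do not send) a POST request asking for raw audio bytes."""
    body = {
        "text": text,                    # assumed field name
        "output_format": output_format,  # field named on this page; value assumed
        "only_audio": True,              # raw audio body instead of a JSON stream
    }
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
        },
        method="POST",
    )


def synthesize_to_file(text: str, path: str) -> None:
    """Send the request and write the response body to `path` unchanged."""
    request = build_tts_request(text)
    with urllib.request.urlopen(request) as resp, open(path, "wb") as out:
        # Copy in small chunks so long audio never has to fit in memory.
        while chunk := resp.read(8192):
            out.write(chunk)
```

Calling synthesize_to_file("Hello there", "hello.wav") would then write whatever bytes the server returns; because the body is already in the requested format, no decoding step is needed.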
Response modes
The only_audio field in the request body picks one of two response
shapes:
only_audio: true: the response body is the raw audio in the requested output_format (WAV, PCM, Opus, …). Save it directly to a file or pipe it to a player. The Content-Type reflects the format (audio/wav, audio/ogg, audio/pcm).
only_audio: false (or omitted): the response is a JSON stream using the same message format as the WebSocket endpoint, including audio (base64), text (with timestamps), and error. Read the body line-by-line until it closes.
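A client that may receive either shape can branch on the response's Content-Type header. The sketch below is one way to do that, using the three audio types listed above. Treating every non-audio type as the JSON stream, and falling back to ".bin" for an unlisted audio type, are assumptions of this example rather than documented behavior.

```python
from typing import Optional, Tuple

# Content types named above, mapped to conventional file extensions.
EXTENSIONS = {
    "audio/wav": ".wav",
    "audio/ogg": ".ogg",
    "audio/pcm": ".pcm",
}


def classify_response(content_type: str) -> Tuple[str, Optional[str]]:
    """Return ("audio", extension) for raw audio, else ("json_stream", None)."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mime = content_type.split(";")[0].strip().lower()
    if mime.startswith("audio/"):
        # Unlisted audio types get a generic extension (assumption).
        return "audio", EXTENSIONS.get(mime, ".bin")
    return "json_stream", None
```

With this helper, a caller can write the body straight to a file when the first element is "audio", and switch to line-by-line JSON parsing otherwise.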
Streaming the JSON response
Pass only_audio: false and read the body as it arrives. With cURL,
the -N (--no-buffer) flag prints each line as soon as the server
sends it.
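The same line-by-line reading can be sketched in Python. The example below assumes each line of the body is a standalone JSON object with a top-level "audio", "text", or "error" key; this page names those message kinds but not their exact layout, so the key names and nesting here are hypothetical and should be checked against the API reference.

```python
import base64
import json
from typing import Iterable, List, Tuple


def collect_stream(lines: Iterable[bytes]) -> Tuple[bytes, List[object]]:
    """Consume a JSON-lines stream, returning (decoded audio, text messages)."""
    audio = bytearray()
    texts: List[object] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank keep-alive or separator lines
        message = json.loads(line)
        # Assumed layout: one of these keys at the top level of each message.
        if "error" in message:
            raise RuntimeError(f"TTS stream error: {message['error']}")
        if "audio" in message:
            audio.extend(base64.b64decode(message["audio"]))
        if "text" in message:
            texts.append(message["text"])
    return bytes(audio), texts
```

Because the response object returned by urllib.request.urlopen can be iterated line by line, it can be passed to collect_stream directly. For lower latency, the audio chunks could be handed to a player inside the loop instead of being accumulated.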
Next steps
TTS POST API reference
Full request body schema, output formats, error contracts.
Streaming with the SDK
Chunked audio output, custom voices, flush, timestamps.