TTS models accept advanced options via the json_config parameter. In
the Python SDK, this is a dict mapping option name to value (float or
string). When using the REST endpoints, pass it as a URL-encoded JSON
string in the query parameters.
These options apply to both the WebSocket and
REST transports. For STT, see
Transcription Settings.
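As a minimal sketch of the two forms described above, the snippet below builds a json_config dict and serializes it as a URL-encoded JSON string for a query parameter. It uses only the Python standard library. The helper name encode_json_config is hypothetical and not part of the SDK, and the REST endpoint itself is intentionally not shown.

```python
import json
from urllib.parse import parse_qs, urlencode


def encode_json_config(options: dict) -> str:
    """Return a query string carrying `options` as URL-encoded JSON.

    Hypothetical helper for illustration; not part of the SDK.
    """
    # Compact separators keep the query string short.
    payload = json.dumps(options, separators=(",", ":"))
    return urlencode({"json_config": payload})


# In the Python SDK, the options are a plain dict of name -> value.
json_config = {"temp": 0.0, "padding_bonus": -1.0}

# For REST, the same dict travels as a URL-encoded JSON string.
query = encode_json_config(json_config)
print(query)

# Round trip: decoding the query string recovers the original dict.
assert json.loads(parse_qs(query)["json_config"][0]) == json_config
```

Append the resulting string to the request URL's query parameters; the option names and values are the same ones the SDK dict uses.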
Quick reference
| Parameter | Range | Default | Effect |
|---|---|---|---|
| temp | 0.0–1.4 | 0.7 | Sampling temperature. 0.0 is deterministic; higher values produce more diverse output. |
| cfg_coef | 1.0–4.0 | 2.0 | Voice similarity. Higher values stay closer to the target voice; very high values can introduce artifacts. |
| padding_bonus | -4.0–4.0 | 0.0 | Speech speed. Negative values are faster, positive values are slower. |
| rewrite_rules | string | none | Text-rewriting rules applied before synthesis. See Text Rewriting Rules. |
| pronunciation_id | string | none | A pronunciation dictionary ID, applied per request. See Pronunciations. |
For deterministic output, set temp to 0.0. For multi-utterance
flows on a single session, see Multiplexing.
The TTS engine recognises the <flush> and <break time="..." />
tags described in Text-to-Speech.
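The ranges in the quick reference table can be checked on the client before a request is sent. The following is a sketch only: validate_json_config is a hypothetical helper, the ranges are copied from the table above, and the service may apply its own validation differently.

```python
# Numeric ranges copied from the quick reference table.
NUMERIC_RANGES = {
    "temp": (0.0, 1.4),
    "cfg_coef": (1.0, 4.0),
    "padding_bonus": (-4.0, 4.0),
}
# Options that take a string value.
STRING_OPTIONS = {"rewrite_rules", "pronunciation_id"}


def validate_json_config(options: dict) -> dict:
    """Raise ValueError if an option is unknown, mistyped, or out of range.

    Hypothetical client-side check; not part of the SDK.
    """
    for name, value in options.items():
        if name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"{name} must be a number, got {value!r}")
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside [{low}, {high}]")
        elif name in STRING_OPTIONS:
            if not isinstance(value, str):
                raise ValueError(f"{name} must be a string, got {value!r}")
        else:
            raise ValueError(f"unknown option: {name}")
    return options


# A slightly faster, deterministic configuration passes the check.
config = validate_json_config({"temp": 0.0, "padding_bonus": -0.5})
```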
Speed control
You can control the speaking speed with the padding_bonus parameter. The default value is 0.0. Negative values (between -4.0 and -0.1) make the speaker talk faster; positive values (between 0.1 and 4.0) make the speaker talk slower.
Temperature control
The sampling temperature can be set to any value from 0.0 to 1.4. A value of 0.0 gives deterministic generation, while higher values lead to more diverse outputs. The default value is 0.7.
Voice similarity control
The cfg_coef parameter controls how closely the generated speech
matches the target voice. Values range from 1.0 to 4.0, and the
default is 2.0. The higher the value, the more closely the model
replicates the cloned voice, but larger values can lead to audio
artifacts.
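Because the best trade-off between similarity and artifacts depends on the voice, one practical approach is to synthesize the same text at several cfg_coef values and compare the results by ear. The sketch below only builds the json_config dicts for such a sweep; cfg_coef_sweep is a hypothetical helper, and temp is pinned to 0.0 on the assumption that deterministic sampling makes the differences attributable to cfg_coef alone.

```python
def cfg_coef_sweep(start: float = 1.0, stop: float = 4.0, steps: int = 4) -> list:
    """Build one json_config dict per evenly spaced cfg_coef value.

    Hypothetical helper for illustration; not part of the SDK.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if not 1.0 <= start < stop <= 4.0:
        raise ValueError("cfg_coef values must lie within 1.0 to 4.0")
    step = (stop - start) / (steps - 1)
    return [
        # temp=0.0 keeps generation deterministic across the sweep.
        {"temp": 0.0, "cfg_coef": round(start + i * step, 2)}
        for i in range(steps)
    ]


configs = cfg_coef_sweep()
for cfg in configs:
    print(cfg)
```

Each dict can then be passed as json_config on a separate request with identical input text.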
Rewrite rules
The rewrite_rules parameter can be used to pass text rewriting rules
that are applied before the text is synthesized. The rules should be
passed as a string. More details on the rules themselves can be found
in the Text Rewriting Rules guide. Values
such as "en", "fr", "de", "es", "pt" enable all the rewriting
rules for a given language.
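As a small illustration of the language shortcuts, the sketch below builds a json_config that enables all rewriting rules for one language. The helper rewrite_rules_config is hypothetical, and it accepts only the five codes quoted above; the service may support others, and custom rule strings are covered in the Text Rewriting Rules guide rather than here.

```python
# Language codes quoted in this section; the full list may be longer.
LANGUAGE_SHORTCUTS = ("en", "fr", "de", "es", "pt")


def rewrite_rules_config(language: str) -> dict:
    """Return a json_config enabling all rewriting rules for `language`.

    Hypothetical helper for illustration; not part of the SDK.
    """
    if language not in LANGUAGE_SHORTCUTS:
        raise ValueError(
            f"unsupported shortcut {language!r}; expected one of {LANGUAGE_SHORTCUTS}"
        )
    # The rules are passed as a string value under rewrite_rules.
    return {"rewrite_rules": language}


french_config = rewrite_rules_config("fr")
print(french_config)
```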