Features
- Multilingual: Our Text-to-Speech and Speech-to-Text currently support five languages: English (en), French (fr), German (de), Spanish (es), and Portuguese (pt), with more languages to come.
- Low-latency: Our servers are based in Europe and the US, with an expected time-to-first-token below 300 ms when streaming.
- Voice selection: We provide a voice library with multiple voices to choose from in different languages. You can also clone a voice instantly from a 10-second voice sample.
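
To check the streaming latency figure from your own location, one option is to time how long the first chunk of a stream takes to arrive. The sketch below is only an illustration: `time_to_first_chunk` and `fake_stream` are hypothetical names, and `fake_stream` is a local stand-in that sleeps before yielding, not a call into the Gradium SDK.

```python
import time
from typing import Iterable, Iterator, Tuple


def time_to_first_chunk(stream: Iterable[bytes]) -> Tuple[float, bytes]:
    """Return (seconds elapsed until the first chunk arrived, that chunk)."""
    start = time.perf_counter()
    first = next(iter(stream))  # blocks until the first chunk is available
    return time.perf_counter() - start, first


def fake_stream(delay_s: float = 0.05) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming response; not the Gradium SDK."""
    time.sleep(delay_s)  # simulated network and synthesis delay
    yield b"first-chunk"
    yield b"second-chunk"


latency_s, chunk = time_to_first_chunk(fake_stream())
print(f"time to first chunk: {latency_s * 1000:.0f} ms")
```

For a real measurement, the request should be sent inside the timed window (here the generator does no work until `next` is called), and the result will depend on your distance from the nearest server region.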
Installation
Get started with the Gradium Python SDK
Text-to-Speech
Convert text to natural-sounding speech
Speech-to-Text
Transcribe audio to text in real time
API Reference
Explore the full API reference