Get Started

Build voice-powered applications with the Gradium API. Our text-to-speech (TTS) models produce natural-sounding speech, and our speech-to-text (STT) models deliver best-in-class accuracy. Both run at low latency, with semantic voice activity detection (VAD) for turn-taking and adaptive delay controls for real-time agents.

Installation

Install the Python SDK and make your first API call.

Text-to-Speech

Convert text to natural-sounding speech with streaming support.

Speech-to-Text

Transcribe audio in real time with semantic VAD and flush support.

API Reference

Explore the full REST and WebSocket API reference.

Explore

Voice Library

Browse available voices or create custom voice clones.

Voice Settings

Control speed, temperature, and voice similarity.

Turn-Taking

Use semantic VAD and adaptive delay for conversational agents.