Text to Speech API Documentation

Complete reference for the multi-provider TTS API

Authentication

Pass your API key as a bearer token in the Authorization header:

Authorization: Bearer sk_live_your_key_here

POST /v1/tts/synthesize

Parameters

Parameter  Type     Required  Description
text       string   Yes       Text to convert (max 5,000 chars, or 10,000 for SSML)
provider   string   No        local, google, openai (default: local)
voice      string   No        Voice ID. Use /v1/tts/voices to list available voices
language   string   No        BCP-47 language code, e.g. en-US, ar-XA (default: en-US)
format     string   No        mp3, wav, ogg, aac (default: mp3)
speed      float    No        Speaking rate 0.25–4.0 (default: 1.0)
pitch      float    No        Pitch adjustment -20 to +20 (default: 0, Google only)
ssml       boolean  No        Treat input as SSML (Google provider)
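
As a rough illustration of how a client might apply the limits listed for these parameters, the sketch below validates the arguments and builds (but does not send) a synthesize request. The host api.example.com, the JSON request body, and the helper name build_synthesize_request are assumptions for illustration only; this reference does not specify them. Rejecting pitch and ssml for non-Google providers on the client side is likewise a design choice of the sketch, not documented server behavior.

```python
import json
import urllib.request

# Hypothetical host: this reference does not name a base URL.
BASE_URL = "https://api.example.com"

PROVIDERS = {"local", "google", "openai"}
FORMATS = {"mp3", "wav", "ogg", "aac"}


def build_synthesize_request(api_key, text, provider="local", voice=None,
                             language="en-US", fmt="mp3", speed=1.0,
                             pitch=0.0, ssml=False):
    """Check arguments against the documented limits and return an unsent Request."""
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    if not -20 <= pitch <= 20:
        raise ValueError("pitch must be between -20 and +20")
    # Client-side strictness (an assumption): pitch and SSML are Google features.
    if pitch != 0 and provider != "google":
        raise ValueError("pitch is only supported by the google provider")
    if ssml and provider != "google":
        raise ValueError("ssml is only supported by the google provider")
    limit = 10_000 if ssml else 5_000
    if len(text) > limit:
        raise ValueError(f"text exceeds {limit} characters")

    payload = {
        "text": text,
        "provider": provider,
        "language": language,
        "format": fmt,
        "speed": speed,
        "pitch": pitch,
        "ssml": ssml,
    }
    if voice is not None:
        payload["voice"] = voice

    # Assumption: the endpoint accepts a JSON body.
    return urllib.request.Request(
        BASE_URL + "/v1/tts/synthesize",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request. The response format is not described in this reference, so the sketch stops before that step.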

GET /v1/tts/voices

List available voices for a provider.

GET /v1/tts/voices?provider=openai
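
A minimal sketch of building this request from Python follows. The host api.example.com and the helper name build_voices_request are hypothetical, and sending the Authorization header on this endpoint is an assumption; the reference does not say whether voice listing requires a key.

```python
import urllib.parse
import urllib.request

# Hypothetical host: this reference does not name a base URL.
BASE_URL = "https://api.example.com"


def build_voices_request(api_key, provider=None):
    """Return an unsent GET request for /v1/tts/voices, optionally filtered by provider."""
    url = BASE_URL + "/v1/tts/voices"
    if provider is not None:
        # urlencode escapes the value so it is safe to place in the query string.
        url += "?" + urllib.parse.urlencode({"provider": provider})
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},  # assumed to be required
        method="GET",
    )
```

Calling build_voices_request(key, "openai") produces the same path and query string as the example request above.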

Providers

local

Uses macOS say or Linux espeak-ng. Free, no API key needed. Best for development.

google

Google Cloud Text-to-Speech. Neural/WaveNet voices. Requires GOOGLE_TTS_API_KEY. Supports SSML and pitch.

openai

OpenAI TTS (tts-1-hd). Six neural voices. Requires OPENAI_TTS_API_KEY. Highest quality of the three providers.
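
The key requirements above can be summarized in a small helper that reports which providers are usable in a given environment. This is only a sketch: it assumes GOOGLE_TTS_API_KEY and OPENAI_TTS_API_KEY are read from environment variables, and the function name available_providers is hypothetical.

```python
import os

# Providers that need a key, mapped to the variable named in this reference.
REQUIRED_KEYS = {
    "google": "GOOGLE_TTS_API_KEY",
    "openai": "OPENAI_TTS_API_KEY",
}


def available_providers(env=None):
    """List providers whose key is set; local needs no key, so it is always included."""
    env = os.environ if env is None else env
    providers = ["local"]
    for name, variable in REQUIRED_KEYS.items():
        if env.get(variable):  # treats a missing or empty value as not configured
            providers.append(name)
    return providers
```

Accepting a mapping argument instead of reading os.environ directly keeps the helper easy to exercise without changing the real environment.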