What it does
Converts text into spoken audio using ElevenLabs voice synthesis. The generated MP3 is attached to the conversation so it can be played back, downloaded, or handed to other tools (for example, transcribed back via elevenlabs_speech_to_text).
Key features
- Curated enum of 10 voices spanning gender, accent, and tone — pick by name, no voice IDs required
- Escape hatch for custom or cloned voices via raw voice_id
- Latest eleven_v3 model by default; opt into eleven_multilingual_v2, eleven_flash_v2_5, or eleven_turbo_v2_5 for specialized tradeoffs
- Configurable output format (MP3 at multiple bitrates, PCM, µ-law)
- Output file is attached to the thread for downstream use
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | The text to speak (up to 10,000 characters) |
| voice | enum | No | One of rachel, sarah, jessica, charlotte, lily, george, brian, daniel, will, charlie. Defaults to rachel. |
| voice_id | string | No | Raw ElevenLabs voice ID (e.g. a cloned voice). Takes precedence over voice if set. |
| model_id | enum | No | eleven_v3 (default, most expressive), eleven_multilingual_v2 (29 languages), eleven_flash_v2_5 (~75 ms latency), eleven_turbo_v2_5 (speed/quality balance) |
| output_format | enum | No | mp3_44100_128 (default), mp3_44100_192, mp3_22050_32, pcm_16000, pcm_22050, pcm_24000, pcm_44100, ulaw_8000 |
| filename | string | No | Filename for the generated file. Extension is added automatically. Defaults to tts_<timestamp>.mp3. |
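The constraints in the table above can be checked before a call is made. The following is a minimal client-side sketch, not the tool's own implementation: the helper name build_tts_args is hypothetical, and it assumes the 10,000-character limit can be approximated with Python's len().

```python
MAX_TEXT_CHARS = 10_000  # limit from the parameter table

VOICES = {"rachel", "sarah", "jessica", "charlotte", "lily",
          "george", "brian", "daniel", "will", "charlie"}
MODELS = {"eleven_v3", "eleven_multilingual_v2",
          "eleven_flash_v2_5", "eleven_turbo_v2_5"}
OUTPUT_FORMATS = {"mp3_44100_128", "mp3_44100_192", "mp3_22050_32",
                  "pcm_16000", "pcm_22050", "pcm_24000", "pcm_44100",
                  "ulaw_8000"}


def build_tts_args(text, voice=None, voice_id=None, model_id=None,
                   output_format=None, filename=None):
    """Return a validated argument dict (hypothetical helper).

    Options left as None are omitted so the tool's defaults apply.
    """
    if not text:
        raise ValueError("text is required")
    # len() counts Unicode code points; how the tool counts is an assumption.
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")

    args = {"text": text}
    if voice_id is not None:
        # voice_id takes precedence over voice, so a voice name is dropped.
        args["voice_id"] = voice_id
    elif voice is not None:
        if voice not in VOICES:
            raise ValueError(f"unknown voice: {voice!r}")
        args["voice"] = voice

    if model_id is not None:
        if model_id not in MODELS:
            raise ValueError(f"unknown model_id: {model_id!r}")
        args["model_id"] = model_id
    if output_format is not None:
        if output_format not in OUTPUT_FORMATS:
            raise ValueError(f"unknown output_format: {output_format!r}")
        args["output_format"] = output_format
    if filename is not None:
        args["filename"] = filename  # extension is added by the tool
    return args
```

Dropping voice when voice_id is present mirrors the documented precedence rule and keeps the payload unambiguous.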
Voice reference
| Voice | Gender | Accent | Best for |
|---|---|---|---|
| rachel | F | American | Calm narration, explainers |
| sarah | F | American | Soft professional voiceover |
| jessica | F | American | Expressive, upbeat reads |
| charlotte | F | British | Sultry narration |
| lily | F | British | Warm, conversational |
| george | M | British | Warm narrator, storytelling |
| brian | M | American | Deep, authoritative |
| daniel | M | British | News/authoritative |
| will | M | American | Chill, conversational |
| charlie | M | Australian | Casual |
Common use cases
Generate a short voiceover
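A sketch of what the arguments for a short voiceover might look like, written as a Python dict and serialized to JSON. The sample text and filename are illustrative, and how the payload reaches the tool depends on the host environment, which this page does not describe.

```python
import json

# Only `text` is required; `voice` and `filename` are optional.
voiceover_args = {
    "text": "Welcome back. Here is your two-minute project update.",
    "voice": "george",              # warm narrator, per the voice reference
    "filename": "project_update",   # extension is added automatically
}

print(json.dumps(voiceover_args, indent=2))
```

Leaving out model_id and output_format means the documented defaults (eleven_v3 and mp3_44100_128) apply.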
Use a specific model for low-latency playback
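A sketch of arguments that favor latency over expressiveness by selecting eleven_flash_v2_5. The sample text is illustrative, and pairing the model with a lower-bitrate output format is an assumption about what suits quick playback, not a documented recommendation.

```python
import json

low_latency_args = {
    "text": "Your order has shipped and should arrive on Thursday.",
    "voice": "sarah",
    "model_id": "eleven_flash_v2_5",   # listed at roughly 75 ms latency
    "output_format": "mp3_22050_32",   # lower bitrate, smaller file (assumed preference)
}

print(json.dumps(low_latency_args, indent=2))
```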
Use a custom cloned voice
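A sketch of arguments for a cloned voice. The voice ID below is a placeholder, not a real ID; substitute the ID of a voice from your own ElevenLabs account.

```python
import json

# Placeholder only: replace with a real voice ID from your account.
CLONED_VOICE_ID = "YOUR_VOICE_ID"

cloned_voice_args = {
    "text": "This message is read in a custom voice.",
    "voice_id": CLONED_VOICE_ID,  # takes precedence over `voice` if both are set
}

print(json.dumps(cloned_voice_args, indent=2))
```

Because voice_id overrides voice, the voice parameter can simply be left out.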
Response
Returns a generated_files attachment with the MP3, plus metadata about the voice, model, output format, and file size. The attachment is immediately available to downstream tools in the same conversation.
