ElevenLabs Text to Speech

What it does

Converts text into spoken audio using ElevenLabs voice synthesis. The generated audio file (MP3 by default) is attached to the conversation so it can be played back, downloaded, or passed to other tools (for example, transcribed back to text with elevenlabs_speech_to_text).

Key features

Curated enum of 10 voices spanning gender, accent, and tone — pick by name, no voice IDs required
Escape hatch for custom or cloned voices via raw voice_id
Latest eleven_v3 model by default; opt into eleven_multilingual_v2, eleven_flash_v2_5, or eleven_turbo_v2_5 for specialized tradeoffs
Configurable output format (MP3 at multiple bitrates, PCM, µ-law)
Output file is attached to the thread for downstream use

Parameters

Parameter	Type	Required	Description
`text`	string	Yes	The text to speak (up to 10,000 characters)
`voice`	enum	No	One of `rachel`, `sarah`, `jessica`, `charlotte`, `lily`, `george`, `brian`, `daniel`, `will`, `charlie`. Defaults to `rachel`.
`voice_id`	string	No	Raw ElevenLabs voice ID (e.g. a cloned voice). Takes precedence over `voice` if set.
`model_id`	enum	No	`eleven_v3` (default, most expressive), `eleven_multilingual_v2` (29 languages), `eleven_flash_v2_5` (~75ms latency), `eleven_turbo_v2_5` (speed/quality balance)
`output_format`	enum	No	`mp3_44100_128` (default), `mp3_44100_192`, `mp3_22050_32`, `pcm_16000`, `pcm_22050`, `pcm_24000`, `pcm_44100`, `ulaw_8000`
`filename`	string	No	Filename for the generated file. Extension is added automatically. Defaults to `tts_<timestamp>.mp3`.
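The rules in the table above (the 10,000-character limit, the enum values, the defaults, and `voice_id` taking precedence over `voice`) can be sketched as a small client-side check. This is a minimal illustration, not the tool's own validation: `build_tts_params` is a hypothetical helper, and rejecting empty text is an assumption that the table does not state.

```python
from typing import Optional

# Allowed values copied from the parameter table.
VOICES = {"rachel", "sarah", "jessica", "charlotte", "lily",
          "george", "brian", "daniel", "will", "charlie"}
MODELS = {"eleven_v3", "eleven_multilingual_v2",
          "eleven_flash_v2_5", "eleven_turbo_v2_5"}
OUTPUT_FORMATS = {"mp3_44100_128", "mp3_44100_192", "mp3_22050_32",
                  "pcm_16000", "pcm_22050", "pcm_24000", "pcm_44100",
                  "ulaw_8000"}
MAX_TEXT_CHARS = 10_000


def build_tts_params(text: str,
                     voice: Optional[str] = None,
                     voice_id: Optional[str] = None,
                     model_id: str = "eleven_v3",
                     output_format: str = "mp3_44100_128") -> dict[str, str]:
    """Hypothetical helper: check inputs and return a parameter dict."""
    if not text:
        # Assumption: empty text is treated as an error.
        raise ValueError("text must not be empty")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    if model_id not in MODELS:
        raise ValueError(f"unknown model_id: {model_id!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output_format: {output_format!r}")

    params = {"text": text, "model_id": model_id,
              "output_format": output_format}
    if voice_id:
        # A raw voice_id takes precedence over the curated voice name.
        params["voice_id"] = voice_id
    else:
        name = voice or "rachel"  # documented default voice
        if name not in VOICES:
            raise ValueError(f"unknown voice: {name!r}")
        params["voice"] = name
    return params
```

Calling `build_tts_params("Your order has shipped.", voice="will", model_id="eleven_flash_v2_5")` yields the same three fields as the low-latency use case in this document, plus the default `output_format`.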

Voice reference

`voice`	Gender	Accent	Best for
`rachel`	F	American	Calm narration, explainers
`sarah`	F	American	Soft professional voiceover
`jessica`	F	American	Expressive, upbeat reads
`charlotte`	F	British	Sultry narration
`lily`	F	British	Warm, conversational
`george`	M	British	Warm narrator, storytelling
`brian`	M	American	Deep, authoritative
`daniel`	M	British	News/authoritative
`will`	M	American	Chill, conversational
`charlie`	M	Australian	Casual
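When a voice has to be chosen programmatically, the reference table above can be encoded as data and filtered. The sketch below is illustrative only: `Voice`, `VOICE_TABLE`, and `find_voices` are hypothetical names, and the rows are copied from the table rather than fetched from ElevenLabs.

```python
from typing import NamedTuple, Optional


class Voice(NamedTuple):
    name: str
    gender: str   # "F" or "M", as in the table
    accent: str
    best_for: str


# Rows copied from the voice reference table, in the same order.
VOICE_TABLE = [
    Voice("rachel", "F", "American", "Calm narration, explainers"),
    Voice("sarah", "F", "American", "Soft professional voiceover"),
    Voice("jessica", "F", "American", "Expressive, upbeat reads"),
    Voice("charlotte", "F", "British", "Sultry narration"),
    Voice("lily", "F", "British", "Warm, conversational"),
    Voice("george", "M", "British", "Warm narrator, storytelling"),
    Voice("brian", "M", "American", "Deep, authoritative"),
    Voice("daniel", "M", "British", "News/authoritative"),
    Voice("will", "M", "American", "Chill, conversational"),
    Voice("charlie", "M", "Australian", "Casual"),
]


def find_voices(gender: Optional[str] = None,
                accent: Optional[str] = None) -> list[str]:
    """Return voice names matching the given gender and/or accent."""
    return [
        v.name for v in VOICE_TABLE
        if (gender is None or v.gender == gender)
        and (accent is None or v.accent == accent)
    ]


print(find_voices(gender="M", accent="British"))  # ['george', 'daniel']
```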

Common use cases

Generate a short voiceover

text: "Welcome to the onboarding flow. Let's get you set up."
voice: "rachel"

Use a specific model for low-latency playback

text: "Your order has shipped."
voice: "will"
model_id: "eleven_flash_v2_5"
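The model tradeoffs listed under Parameters can be captured in a small lookup so that callers state a priority instead of memorizing model names. This is a sketch under assumptions: the priority labels and `pick_model` are hypothetical, and only the `model_id` values come from this document.

```python
# Hypothetical priority labels mapped to the documented model_id values.
MODEL_BY_PRIORITY = {
    "expressive": "eleven_v3",                 # default, most expressive
    "multilingual": "eleven_multilingual_v2",  # 29 languages
    "latency": "eleven_flash_v2_5",            # ~75ms latency
    "balanced": "eleven_turbo_v2_5",           # speed/quality balance
}


def pick_model(priority: str = "expressive") -> str:
    """Return the model_id for a priority label, or raise ValueError."""
    try:
        return MODEL_BY_PRIORITY[priority]
    except KeyError:
        options = ", ".join(sorted(MODEL_BY_PRIORITY))
        raise ValueError(
            f"unknown priority {priority!r}; expected one of: {options}"
        ) from None
```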

Use a custom cloned voice

text: "This is our branded voice."
voice_id: "<your_custom_voice_id>"

Response

Returns a generated_files attachment containing the audio file (MP3 by default), plus metadata about the voice, model, output format, and file size. The attachment is immediately available to downstream tools in the same conversation.

Setup

No per-user setup. ElevenLabs is configured at the platform level — just enable the tool on your agent in Control Hub > Edit Agent under the Audio section.
