Text to Speech

How to use Text-to-Speech within the chatbot — voice selection, style customization, and multi-speaker generation.

The Text-to-Speech tool, powered by Gemini, converts written text into natural-sounding audio directly within the chat. You can turn any AI-generated response into a playable audio clip without leaving the conversation.


How to Use It

No toggle required. Simply prompt the chatbot to generate speech from any text in the conversation — including AI-generated responses.

Example prompts:

Generate speech for the above response.
Turn this into audio.
Read the previous paragraph in a calm, female voice.
Generate audio for this in a deep male voice with a formal tone.

The chatbot converts the text into an audio file that plays directly within the chat interface. Generated audio clips are saved automatically to Dashboard → History and also remain accessible in your chatbot history.
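
The example prompts above share a simple pattern: an action, a target text, and optional voice and tone descriptors. If you draft prompts programmatically before pasting them into the chat, that pattern could be sketched roughly as follows. The function name and parameters are hypothetical and are not part of the product; the chatbot accepts free-form wording, so this is only one way to phrase a request.

```python
def _article(word):
    """Pick 'a' or 'an' based on the first letter (a rough heuristic)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def build_tts_prompt(target="the above response", voice=None, tone=None):
    """Compose a plain-language speech request (hypothetical helper).

    target: which text to read, e.g. "the previous paragraph"
    voice:  optional voice description, e.g. "deep male"
    tone:   optional style instruction, e.g. "formal"
    """
    prompt = f"Generate speech for {target}"
    if voice:
        prompt += f" in {_article(voice)} {voice} voice"
    if tone:
        prompt += f" with {_article(tone)} {tone} tone"
    return prompt + "."


print(build_tts_prompt("this", voice="deep male", tone="formal"))
```

Leaving out voice and tone yields a bare request, which mirrors the first example prompt.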


Voice & Style Options

Voice library — Choose from over 30 unique voices to match your content and audience.

Style instructions — Specify tone, mood, and delivery directly in your prompt. Examples: formal, excited, storytelling, conversational, calm, authoritative.

Multi-speaker generation — If the text contains a dialogue or conversation between two or more people, prompt the chatbot to generate a realistic multi-speaker audio clip with a distinct voice for each speaker.

Example multi-speaker prompt:

Generate a multi-speaker audio clip for this dialogue —
use a professional male voice for the interviewer
and a confident female voice for the interviewee.
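
A multi-speaker prompt like the one above pairs each speaker role with a voice description. As a minimal sketch, assuming you keep those pairs in a dictionary, the prompt could be assembled like this. The helper name is hypothetical, and the wording it produces is just one phrasing the chatbot may accept.

```python
def build_multi_speaker_prompt(voices):
    """Compose a multi-speaker request (hypothetical helper).

    voices: dict mapping a speaker role to a voice description,
            e.g. {"interviewer": "professional male"}
    """
    if len(voices) < 2:
        raise ValueError("multi-speaker audio needs at least two speakers")
    # One clause per speaker, in the order the dict was written.
    parts = [f"a {desc} voice for the {role}" for role, desc in voices.items()]
    assignments = ", ".join(parts[:-1]) + " and " + parts[-1]
    return f"Generate a multi-speaker audio clip for this dialogue. Use {assignments}."


print(build_multi_speaker_prompt({
    "interviewer": "professional male",
    "interviewee": "confident female",
}))
```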

For standalone audio generation outside the chatbot, including model selection, the full voice library, and style controls, see Text-to-Speech.
