Single Speaker Mode
Single Speaker Mode is used when a single voice narrates the entire script. This mode is ideal for announcements, narrations, advertisements, and monologue-style content.
4.1 Voice Selection
Users can choose from a curated library of voice profiles. Each voice has distinct characteristics in terms of:
Tone
Pitch
Energy level
Speaking style
Voice categories may include styles such as:
Bright
Upbeat
Informative
Firm
Excitable
Youthful
Clear
Smooth
Soft
Gravelly
A search feature is available to quickly filter voices by attributes such as gender or tone.
Selecting the correct voice is essential for aligning audio output with the intended audience and message.
4.2 Language and Accent Configuration
The output language is automatically determined by the language of the input text. For example:
If the script is written in Hindi, the output audio will be in Hindi.
If written in English, the output will be in English.
Accent selection is available to refine pronunciation in languages with multiple dialects. Examples include:
Arabic (Egypt, Global)
Mandarin (China, Taiwan)
Indian regional accents (Hindi, Gujarati, Kannada, Punjabi, Sindhi, etc.)
Accent selection enhances clarity and naturalness in region-specific communication.
4.3 Style Instructions
The Style Prompt allows users to guide how the speech should be delivered. Examples of style instructions include:
Speak warmly and enthusiastically
Calm and slow-paced narration
Professional and authoritative tone
Energetic and engaging delivery
This field enables emotional and tonal customization beyond voice selection.
4.4 Script Input
Users enter or paste their content into the script input field. Single Speaker Mode supports:
Long-form narration
Promotional scripts
Announcements
Informational content
Maximum character limit: 4,000 characters.
Well-structured punctuation improves pacing and realism in output.
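For scripts longer than the 4,000-character limit, one option is to split the text into several parts and generate each part separately. The following Python sketch illustrates that idea; it is not part of Speech Generator. The function name split_script is hypothetical, and the sketch assumes the limit counts characters the same way Python's len() does, which the tool may not.

```python
import re

MAX_CHARS = 4000  # limit stated in this section


def split_script(script: str, limit: int = MAX_CHARS) -> list[str]:
    """Split a script into chunks of at most `limit` characters,
    breaking at sentence boundaries where possible."""
    # Split after ., !, ? or the Devanagari danda, followed by whitespace.
    sentences = re.split(r"(?<=[.!?।])\s+", script.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this sentence would exceed the limit: start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    parts = split_script("Welcome to the show. " * 400)
    for number, part in enumerate(parts, start=1):
        print(f"Part {number}: {len(part)} characters")
```

Each resulting part can be pasted into the script input field on its own. Splitting at sentence boundaries keeps the punctuation intact, which matters for pacing as noted above.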
4.5 Model Selection
Speech Generator offers two Gemini-powered models:
Flash Text-to-Speech: Designed for faster generation and lower credit usage. Suitable for drafts and quick iterations.
Pro Text-to-Speech: Optimized for higher quality, more expressive, and natural-sounding output. Ideal for final production use.
Model selection affects both audio realism and credit consumption.
4.6 Generating Audio
Once configuration is complete:
Click Generate.
The system processes the script.
The audio output appears below.
You can then:
Play the audio
Download it
Share it
The Reset option clears all selections and inputs.
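The settings described in sections 4.1 to 4.5 can be summarized as a simple pre-generation checklist. The Python sketch below is purely illustrative: the class SingleSpeakerSettings, its field names, and the problems method are hypothetical and do not correspond to any Speech Generator API. Only the 4,000-character limit and the two model names come from this page.

```python
from dataclasses import dataclass
from typing import Optional

MAX_CHARS = 4000  # script limit from section 4.4
MODELS = ("Flash Text-to-Speech", "Pro Text-to-Speech")  # from section 4.5


@dataclass
class SingleSpeakerSettings:
    """Hypothetical container mirroring the fields on the Single Speaker screen."""
    voice: str
    script: str
    model: str = MODELS[0]
    accent: Optional[str] = None  # optional; only relevant for multi-dialect languages
    style_prompt: str = ""        # optional delivery guidance

    def problems(self) -> list[str]:
        """Return a list of issues to fix before clicking Generate."""
        issues = []
        if not self.voice.strip():
            issues.append("no voice selected")
        if not self.script.strip():
            issues.append("script is empty")
        elif len(self.script) > MAX_CHARS:
            over = len(self.script) - MAX_CHARS
            issues.append(f"script is {over} characters over the {MAX_CHARS}-character limit")
        if self.model not in MODELS:
            issues.append(f"unknown model: {self.model!r}")
        return issues


if __name__ == "__main__":
    settings = SingleSpeakerSettings(
        voice="Example Voice",  # placeholder, not a real voice name
        script="Doors open at nine. Please have your tickets ready.",
        style_prompt="Calm and slow-paced narration",
    )
    print(settings.problems() or "Ready to generate")
```

Leaving the accent and style prompt optional reflects how this page describes them: as refinements layered on top of the voice, script, and model choices.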