Voices, Accents & Style
How to select voices, configure language and accent, and write style instructions in Qolaba's Speech Generation workspace.
Before generating audio, configure how your output should sound — which voice delivers it, what accent and dialect it uses, and what tone and style it carries. These three settings work together to define the personality, clarity, and emotional quality of your generated audio.
Voice Library
Qolaba provides a library of 30+ voice profiles, each with distinct characteristics in tone, pitch, energy, and speaking style. Selecting the right voice is the single most impactful decision in your configuration — it defines how your audience experiences the content.
Browsing and Previewing Voices
Click any voice in the library to hear a preview before selecting it. This lets you evaluate tone and style before committing to a generation.
Voice Categories
Voices are organized by characteristic style to help you find the right fit quickly:
Bright: Clear, positive, and energetic
Upbeat: Enthusiastic and engaging
Informative: Measured, authoritative, and clear
Firm: Confident and direct
Excitable: High energy and expressive
Youthful: Fresh, casual, and approachable
Clear: Neutral and precise
Smooth: Warm and fluid delivery
Soft: Gentle and calm
Gravelly: Deep and textured
Filtering Voices
Use the search and filter options to narrow down the library by:
Gender — male or female voices
Tone category — filter by characteristic style (Bright, Smooth, Firm, etc.)
Match the voice category to the content type — an Informative voice works well for product walkthroughs and instructional content, while an Upbeat or Excitable voice suits marketing and promotional audio.
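For teams that keep a written checklist of voice choices, the pairing guidance above can be captured as a small lookup. This is a minimal sketch, not part of any Qolaba API: the names VOICE_CATEGORIES, SUGGESTED_CATEGORIES, and suggest_categories are hypothetical, and the only pairings included are the ones stated in this section.

```python
from typing import Dict, List

# Illustrative only: these names are hypothetical and not a Qolaba API.
# Category descriptions are copied from the Voice Categories list.
VOICE_CATEGORIES: Dict[str, str] = {
    "Bright": "Clear, positive, and energetic",
    "Upbeat": "Enthusiastic and engaging",
    "Informative": "Measured, authoritative, and clear",
    "Firm": "Confident and direct",
    "Excitable": "High energy and expressive",
    "Youthful": "Fresh, casual, and approachable",
    "Clear": "Neutral and precise",
    "Smooth": "Warm and fluid delivery",
    "Soft": "Gentle and calm",
    "Gravelly": "Deep and textured",
}

# Content-type pairings stated in the guidance above; extend for your own content.
SUGGESTED_CATEGORIES: Dict[str, List[str]] = {
    "product walkthrough": ["Informative"],
    "instructional": ["Informative"],
    "marketing": ["Upbeat", "Excitable"],
    "promotional": ["Upbeat", "Excitable"],
}


def suggest_categories(content_type: str) -> List[str]:
    """Return suggested voice categories, or an empty list if the type is unknown."""
    key = content_type.strip().lower()
    return list(SUGGESTED_CATEGORIES.get(key, []))


if __name__ == "__main__":
    for name in suggest_categories("Marketing"):
        print(name, "->", VOICE_CATEGORIES[name])
```

An unknown content type returns an empty list rather than a guess, which leaves the final choice to a preview in the voice library.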
Language & Accent
Multi-Language Support
The output language of your generated audio is determined entirely by the language of your script. Write your script in any supported language, such as English, Hindi, French, Arabic, or Mandarin, and the audio is generated in that language automatically. There is no separate language setting to configure.
This makes Speech Generation natively multilingual — switch languages simply by changing the language of your input text.
Accent & Dialect Selection
Many languages have multiple regional dialects with distinct pronunciation patterns. Accent selection lets you specify which regional variant the voice should follow — improving clarity, naturalness, and audience relatability for region-specific content.
Examples of available accents:
English: United States, United Kingdom, India, Australia
French: France, Canada
Arabic: Egypt, Global
Mandarin: China, Taiwan
Hindi: India
Spanish: Spain, Latin America
If your script is in a language with multiple regional dialects and your audience is in a specific region, selecting the matching accent improves pronunciation accuracy and makes the audio feel more natural to that audience.
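The accent guidance can be expressed as a simple lookup from script language to the example accents listed above. This is a sketch for planning purposes only: EXAMPLE_ACCENTS and pick_accent are hypothetical names, the data covers only the examples in this section (the workspace may offer more), and nothing here calls Qolaba.

```python
from typing import Dict, List, Optional

# Example accents copied from this section; the actual list in the
# workspace may be longer. Names here are hypothetical, not a Qolaba API.
EXAMPLE_ACCENTS: Dict[str, List[str]] = {
    "English": ["United States", "United Kingdom", "India", "Australia"],
    "French": ["France", "Canada"],
    "Arabic": ["Egypt", "Global"],
    "Mandarin": ["China", "Taiwan"],
    "Hindi": ["India"],
    "Spanish": ["Spain", "Latin America"],
}


def pick_accent(language: str, audience_region: str) -> Optional[str]:
    """Return the accent matching the audience region, or None if it is not listed."""
    accents = EXAMPLE_ACCENTS.get(language, [])
    return audience_region if audience_region in accents else None


if __name__ == "__main__":
    print(pick_accent("French", "Canada"))  # Canada
    print(pick_accent("Hindi", "Canada"))   # None
```

A result of None means the region is not among the listed examples for that language, so the accent should be chosen by checking the options shown in the workspace.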
Style Instructions
Style instructions let you guide the emotional tone and delivery manner of the generated audio — going beyond voice selection to define how the voice speaks, not just which voice speaks.
How to Write Style Instructions
Enter a brief, plain-language description of the desired delivery in the Style Prompt field. The model interprets this and adjusts its delivery accordingly.
Examples:
Warm and personable: "Speak warmly and conversationally, like talking to a friend"
Professional narration: "Clear, professional, and authoritative tone"
Energetic marketing: "Enthusiastic and high-energy delivery"
Calm instructional: "Calm, slow-paced, and easy to follow"
Storytelling: "Engaging narrative style, with natural pauses and expression"
Combining Voice and Style
Voice selection and style instructions work best in combination. A Smooth voice with a "warm and conversational" style instruction produces a noticeably different output than the same voice with a "professional and authoritative" instruction.
If your first generation doesn't match the intended tone, refine the style instruction before switching voices. Often a more specific style prompt produces better results than changing the voice entirely.
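One way to keep track of the refine-the-style-first workflow is to record each attempt as a small settings record and change only the style prompt between tries. The sketch below is a note-keeping aid under stated assumptions: SpeechSettings and its field names are hypothetical, and it does not represent how Qolaba stores or accepts these settings.

```python
from dataclasses import dataclass, replace


# Hypothetical record of the three settings described on this page.
# Field names are illustrative and not taken from any Qolaba API.
@dataclass(frozen=True)
class SpeechSettings:
    voice_category: str
    accent: str
    style_prompt: str


# First attempt: a Smooth voice with a short style instruction.
first_try = SpeechSettings(
    voice_category="Smooth",
    accent="United Kingdom",
    style_prompt="Warm and conversational",
)

# Second attempt: keep the voice and accent, make the style prompt more specific.
second_try = replace(
    first_try,
    style_prompt="Speak warmly and conversationally, like talking to a friend",
)

if __name__ == "__main__":
    print(first_try)
    print(second_try)
```

Because the record is frozen, each attempt stays unchanged once written, which makes it easy to compare which style prompt produced which result before deciding to switch voices.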