Introduction
An introduction to Speech Generation in Qolaba — what it does, how to access it, an overview of the interface, and the available models.
Qolaba's Speech Generation tool converts written text into high-quality, natural-sounding audio using Gemini-powered text-to-speech models. Whether you need a single narrator for a voiceover or a two-speaker dialogue for a podcast, Speech Generation gives you full control over voice, accent, language, tone, and delivery style — all from a single interface.
What You Can Do
Generate natural-sounding audio from any written script
Choose from a library of 30+ voice profiles with distinct tones and styles
Write your script in any language — audio is generated in that language automatically
Refine pronunciation with accent and dialect selection
Guide delivery style with custom style instructions
Produce single-speaker narration or two-speaker dialogue audio
Download, share, and manage all generated audio from one place
Core Use Cases
Podcast narration and introductions
YouTube and video voiceovers
Product walkthroughs and demos
Marketing advertisements and announcements
Audiobook-style storytelling
Training scripts and instructional audio
Conversational simulations and interview-format dialogue
How to Access
Go to the left navigation panel
Click Audio & Video
Select Text-to-Speech
This opens the dedicated Speech Generation workspace.
Interface Overview
The Speech Generation workspace is organized into two areas:
Audio History Panel: A persistent panel displaying all previously generated audio files. Each entry includes playback controls and a three-dot menu for download, share, and delete actions. All generated audio is saved here automatically.
Configuration & Generation Area: The primary workspace where you configure and generate audio. This is where you select your mode, choose voices, set accent and style, write your script, select a model, and generate output.
Available Models
Speech Generation is powered by Gemini and offers two models:
Flash TTS
Speed: Faster
Quality: Good
Credit cost: Lower
Best for: Script drafting, quick iterations, testing
Pro TTS
Speed: Slightly slower
Quality: Higher (more expressive and natural)
Credit cost: Higher
Best for: Final production output, client-ready audio
Test your script with Flash TTS first to validate voice, accent, and style choices. Switch to Pro TTS for the final generation. This approach saves credits while ensuring production-quality output.
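To make the credit reasoning in the tip above concrete, here is a minimal Python sketch of the arithmetic. It is an illustration only: the `TtsModel` class, the `total_credits` helper, and the credit values are hypothetical placeholders chosen so that Flash TTS costs less than Pro TTS. They are not Qolaba's actual pricing and not part of any Qolaba API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TtsModel:
    name: str
    # Hypothetical placeholder cost, NOT a real Qolaba price.
    credits_per_generation: int


# Placeholder values that only encode "Flash costs less than Pro".
FLASH = TtsModel("Flash TTS", credits_per_generation=1)
PRO = TtsModel("Pro TTS", credits_per_generation=4)


def total_credits(draft_runs: int, draft_model: TtsModel, final_model: TtsModel) -> int:
    """Credits spent on `draft_runs` test generations plus one final generation."""
    if draft_runs < 0:
        raise ValueError("draft_runs must be non-negative")
    return (
        draft_runs * draft_model.credits_per_generation
        + final_model.credits_per_generation
    )


if __name__ == "__main__":
    drafts = 5  # number of test generations while tuning voice, accent, and style
    flash_first = total_credits(drafts, FLASH, PRO)
    pro_only = total_credits(drafts, PRO, PRO)
    print(f"Draft on {FLASH.name}, finish on {PRO.name}: {flash_first} credits")
    print(f"Everything on {PRO.name}: {pro_only} credits")
```

Under these assumed costs, the more draft iterations a script needs, the larger the gap between the two approaches, which is the reason to validate choices on Flash TTS before the final Pro TTS run.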
What's in This Section
Speech Generation Modes →