# Introduction

Qolaba's Speech Generation tool converts written text into high-quality, natural-sounding audio using Gemini-powered text-to-speech models. Whether you need a single narrator for a voiceover or a two-speaker dialogue for a podcast, Speech Generation gives you full control over voice, accent, language, tone, and delivery style — all from a single interface.

***

#### What You Can Do

* Generate natural-sounding audio from any written script
* Choose from a library of 30+ voice profiles with distinct tones and styles
* Write your script in any language — audio is generated in that language automatically
* Refine pronunciation with accent and dialect selection
* Guide delivery style with custom style instructions
* Produce single-speaker narration or two-speaker dialogue audio
* Download, share, and manage all generated audio from one place

***

#### Core Use Cases

* Podcast narration and introductions
* YouTube and video voiceovers
* Product walkthroughs and demos
* Marketing advertisements and announcements
* Audiobook-style storytelling
* Training scripts and instructional audio
* Conversational simulations and interview-format dialogue

***

#### How to Access

1. Go to the **left navigation panel**
2. Click **Audio & Video**
3. Select [**Text-to-Speech**](https://www.qolaba.ai/ai-speech-generator/text-to-speech)

This opens the dedicated Speech Generation workspace.

***

#### Interface Overview

The Speech Generation workspace is organized into two areas:

1. **Audio History Panel:** A persistent panel displaying all previously generated audio files. Each entry includes playback controls and a three-dot menu for download, share, and delete actions. All generated audio is saved here automatically.
2. **Configuration & Generation Area:** The primary workspace where you configure and generate audio. This is where you select your mode, choose voices, set accent and style, write your script, select a model, and generate output.

***

#### Available Models

Speech Generation is powered by Gemini and offers two models:

| Model         | Speed           | Quality                              | Credit Cost | Best For                                    |
| ------------- | --------------- | ------------------------------------ | ----------- | ------------------------------------------- |
| **Flash TTS** | Faster          | Good                                 | Lower       | Script drafting, quick iterations, testing  |
| **Pro TTS**   | Slightly slower | Higher — more expressive and natural | Higher      | Final production output, client-ready audio |

Test your script with Flash TTS first to validate voice, accent, and style choices, then switch to Pro TTS for the final generation. Iterating on the lower-cost model saves credits, and the higher-quality model is used only for the output you intend to publish.
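
The draft-then-final workflow above can be pictured as one set of settings in which only the model changes between runs. The sketch below is purely illustrative: Speech Generation is configured through the workspace UI, and every name here (`SpeechRequest`, `finalize`, the `"flash-tts"` and `"pro-tts"` labels, the sample voice and accent values) is a hypothetical placeholder, not a Qolaba API or identifier.

```python
from dataclasses import dataclass, replace

# Hypothetical model of the settings described on this page.
# None of these names come from Qolaba; they exist only for illustration.


@dataclass(frozen=True)
class SpeechRequest:
    script: str
    voice: str   # placeholder voice label
    accent: str  # placeholder accent label
    style: str   # free-text style instruction
    model: str   # "flash-tts" for drafts, "pro-tts" for final output


def finalize(draft: SpeechRequest) -> SpeechRequest:
    """Return a copy of the validated draft with only the model changed."""
    return replace(draft, model="pro-tts")


# Iterate on voice, accent, and style with the lower-cost model first.
draft = SpeechRequest(
    script="Welcome to the show.",
    voice="narrator-1",
    accent="en-GB",
    style="warm, unhurried",
    model="flash-tts",
)

# Once the draft sounds right, keep every setting and switch the model.
final = finalize(draft)
```

The point of the sketch is that the script, voice, accent, and style carry over unchanged, so whatever you validated in the draft is what the final generation uses.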

***

#### What's in This Section

1. [Voices, Accents & Style →](/speech-generation/voices-accents-and-style.md)
2. Speech Generation Modes →
   * [Single Speaker Mode →](/speech-generation/speech-generation-modes/single-speaker-mode.md)
   * [Multi-Speaker Mode →](/speech-generation/speech-generation-modes/multi-speaker-mode.md)
3. [Managing Generated Audio →](/speech-generation/managing-generated-audios.md)