# Multi-Speaker Mode

Multi-Speaker Mode generates dialogue-based audio using two distinct voice profiles. Instead of a single continuous narration, the output alternates between two speakers — producing a realistic conversation, interview, or dialogue from a structured script.

***

#### When to Use It

Multi-Speaker Mode is best for any content that involves a conversation or exchange between two voices:

* Podcast interviews and co-hosted episodes
* Q&A and FAQ-format audio
* Story-based scripts with two characters
* Training simulations and role-play scenarios
* Conversational product demos
* Interview-style marketing and promotional content

***

#### How the Dialogue Card System Works

Multi-Speaker Mode uses a **dialogue card system** to structure the conversation. Each card represents one turn in the dialogue and is assigned to either Speaker 1 or Speaker 2. Cards alternate between speakers automatically — the first card is Speaker 1, the second is Speaker 2, the third is Speaker 1 again, and so on.

This structure ensures the voices alternate naturally throughout the conversation, producing a single combined audio file with both speakers in sequence.
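The alternation rule can be sketched in a few lines of Python. This only illustrates the card order described above and is not Qolaba code: the `speaker_for_card` function and its 1-based card numbering are assumptions made for the example.

```python
def speaker_for_card(card_number: int) -> str:
    """Return the speaker label for a 1-based card position.

    Hypothetical helper: odd cards map to Speaker 1 and even cards to
    Speaker 2, mirroring the alternation described above.
    """
    if card_number < 1:
        raise ValueError("card_number must be 1 or greater")
    return "Speaker 1" if card_number % 2 == 1 else "Speaker 2"


# Label the first four cards of a conversation.
labels = [speaker_for_card(n) for n in range(1, 5)]
print(labels)
```

Thinking of the cards this way makes it easier to plan a script: any turn in an odd position is voiced by Speaker 1, and any turn in an even position by Speaker 2.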

***

#### Step-by-Step Configuration

**Step 1 — Select Multi-Speaker Mode**

Open the Speech Generation workspace and select **Multi-Speaker** from the mode selector.

**Step 2 — Assign Voices**

Select one voice for **Speaker 1** and a different voice for **Speaker 2**. Choose voices that contrast in tone, energy, or style so listeners can tell the speakers apart and the conversation is easy to follow.

**Step 3 — Set Accent and Style Instructions**

Accent and style instructions apply to the whole conversation, not to each speaker individually. Select the accent that matches the language and region of your script, and add a style instruction that sets the overall tone of the dialogue.

**Examples:**

```
Casual and conversational podcast tone — relaxed and natural.
```

```
Professional interview style — measured and articulate.
```

```
Energetic and engaging — fast-paced and enthusiastic.
```

**Step 4 — Write Your Dialogue**

Enter the dialogue for each speaker using the card system:

1. The first card is pre-assigned to **Speaker 1** — enter their opening line or turn
2. Click **Add Dialog** — a new card appears assigned to **Speaker 2**
3. Enter Speaker 2's response or turn
4. Click **Add Dialog** again to add Speaker 1's next turn
5. Continue until the full dialogue is complete
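If you draft the dialogue in a text editor before entering it into cards, a short script can check that the turns alternate in the order the cards expect. The sketch below is an assumption-laden illustration, not part of Qolaba: the `Speaker 1:` / `Speaker 2:` line prefixes and the `parse_draft` function are a hypothetical drafting convention.

```python
import re

# Hypothetical draft format: one turn per line, prefixed "Speaker 1:" or "Speaker 2:".
TURN_PATTERN = re.compile(r"^Speaker ([12]):\s*(.+)$")


def parse_draft(draft: str) -> list[tuple[int, str]]:
    """Parse a labelled draft into (speaker, text) turns, checking strict alternation."""
    turns: list[tuple[int, str]] = []
    for line_number, line in enumerate(draft.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # Blank lines are allowed between turns.
        match = TURN_PATTERN.match(line)
        if match is None:
            raise ValueError(f"line {line_number}: missing 'Speaker 1:' or 'Speaker 2:' prefix")
        speaker = int(match.group(1))
        # Cards start with Speaker 1 and then alternate.
        expected = 1 if len(turns) % 2 == 0 else 2
        if speaker != expected:
            raise ValueError(f"line {line_number}: expected Speaker {expected}, got Speaker {speaker}")
        turns.append((speaker, match.group(2)))
    return turns


draft = """
Speaker 1: Welcome back to the show.
Speaker 2: Thanks, it's great to be here.
Speaker 1: Let's get started.
"""
turns = parse_draft(draft)
for speaker, text in turns:
    print(f"Card for Speaker {speaker}: {text}")
```

Each printed line corresponds to one card, in the order you would create them with **Add Dialog**.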

**Multi-language support:** Each dialogue card can be written in any language. The audio for that card will be generated in the language of the text entered — allowing multilingual dialogue if needed.

**Script writing tips for dialogue:**

* Write the way people actually speak — contractions, natural pauses, and conversational phrasing produce more realistic audio
* Use punctuation deliberately — commas for brief pauses, full stops for complete breaks, question marks for rising intonation
* Avoid overly long turns per card — break lengthy speaker contributions across multiple cards if needed
* Add emotional cues through the style prompt rather than trying to describe them in the script text itself
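To spot overly long turns in a draft, you could count the words in each card before entering it. This is a minimal sketch under stated assumptions: the `long_cards` function is hypothetical, and the word limit is an arbitrary value chosen for illustration, not a documented Qolaba limit.

```python
MAX_WORDS_PER_CARD = 60  # Assumed threshold for illustration; not a documented limit.


def long_cards(cards: list[str], max_words: int = MAX_WORDS_PER_CARD) -> list[int]:
    """Return the 1-based positions of cards whose text exceeds max_words."""
    return [
        position
        for position, text in enumerate(cards, start=1)
        if len(text.split()) > max_words
    ]


cards = [
    "So, what made you start the project?",
    "Honestly, it began as a weekend experiment, and then a few friends "
    "asked if they could use it too.",
]
# Use a deliberately low limit so the second card is flagged in this example.
flagged = long_cards(cards, max_words=12)
print(flagged)
```

Cards reported by the check are candidates for trimming or for splitting into shorter turns.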

**Step 5 — Select a Model**

| Model         | Best For                                                  |
| ------------- | --------------------------------------------------------- |
| **Flash TTS** | Draft generation, testing dialogue flow and voice pairing |
| **Pro TTS**   | Final production output, client delivery, publishing      |

**Step 6 — Generate**

Click **Generate**. The system processes all dialogue cards in sequence and produces a **single combined audio file** with both voices alternating as configured.

***

#### Reviewing and Using Your Output

Once generated, you can:

* **Play** the combined audio to review the full dialogue
* **Download** the file to your device
* **Share** via a shareable link
* **Regenerate** after adjusting individual dialogue cards, voice assignments, or style instructions if the output needs refinement

All generated audio is automatically saved to the **Audio History panel** and to **Dashboard → History** for future access. See Managing Generated Audio for full details.

***

#### Resetting

The **Reset** option clears all dialogue cards, voice assignments, style instructions, and model selection. Use this when starting a new dialogue from scratch.

