# Image Generation

The `image_generation` tool is a **built-in tool** that is automatically available on all `/v1/chat/completions` requests. It uses **Google Gemini** (`gemini-3.1-flash-image-preview`) under the hood. No configuration is needed: describe what you want in natural language, and the model decides when to invoke the tool and which settings (resolution, aspect ratio, count) to use.

### How It Works

1. You send a normal chat request with an image-related prompt
2. The LLM detects it needs to generate an image and calls the `image_generation` tool internally
3. The tool generates the image and uploads it to the GCP CDN
4. The response comes back as a standard chat completion with a markdown image link embedded in the `content`
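
From the client side, the flow above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a reference client: `BASE_URL` and `API_KEY` are placeholders, Bearer-token authentication is assumed, and the reply is assumed to follow the usual chat-completion shape (`choices[0].message.content`). None of these details are confirmed by this page.

```python
import json
import urllib.request

# Assumptions: placeholder host and key; replace with your real values.
BASE_URL = "https://api.example.com"
API_KEY = "YOUR_API_KEY"


def build_request(prompt: str, model: str = "google/gemini-2.5-flash") -> urllib.request.Request:
    """Build a POST request for /v1/chat/completions with a single user message."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: Bearer-token authentication.
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


def read_content(response_body: dict) -> str:
    """Return the assistant text, assuming the standard chat-completion shape."""
    return response_body["choices"][0]["message"]["content"]


def generate(prompt: str) -> str:
    """Send the prompt and return the reply content, which embeds the markdown image link."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return read_content(json.load(resp))
```

Calling `generate("Generate an image of a futuristic city at sunset")` performs the network call. `build_request` and `read_content` are kept separate so they can be checked offline.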

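#### Resolutions

The model infers the resolution from your prompt (for example, "high quality" maps to `2K`). Supported values:

| Value  | Description                        |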
| ------ | ---------------------------------- |
| `0.5K` | 512px — quick previews, thumbnails |
| `1K`   | Default — standard quality         |
| `2K`   | High quality                       |
| `4K`   | Maximum detail, upscaling          |

#### Aspect Ratios

| Value         | Best For                                          |
| ------------- | ------------------------------------------------- |
| `1:1`         | Social media posts, profile pictures              |
| `16:9`        | Landscape, YouTube thumbnails, desktop wallpapers |
| `9:16`        | Stories, Reels, TikTok                            |
| `4:3`         | Presentations, traditional photos                 |
| `3:2`         | Photography, prints                               |
| `21:9`        | Cinematic, ultrawide banners                      |
| `4:5` / `5:4` | Instagram portrait/landscape                      |
| `1:4` / `4:1` | Tall/wide banners                                 |
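
Since the ratio is passed as a hint in the prompt, it can help to check it on the client before sending. The sketch below assumes the tables above list every supported value, which this page does not confirm. `SUPPORTED_RATIOS`, `SUPPORTED_RESOLUTIONS`, and `orientation` are hypothetical helper names, not part of the API.

```python
# Values copied from the tables above (assumed to be the complete lists).
SUPPORTED_RESOLUTIONS = {"0.5K", "1K", "2K", "4K"}
SUPPORTED_RATIOS = {"1:1", "16:9", "9:16", "4:3", "3:2", "21:9", "4:5", "5:4", "1:4", "4:1"}


def orientation(ratio: str) -> str:
    """Classify a documented 'W:H' ratio as landscape, portrait, or square."""
    if ratio not in SUPPORTED_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {ratio!r}")
    width, height = (int(part) for part in ratio.split(":"))
    if width == height:
        return "square"
    return "landscape" if width > height else "portrait"
```

For example, `orientation("9:16")` returns `"portrait"`, while an unlisted value such as `"7:3"` raises `ValueError`.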

### Use Cases & Examples

#### 1. Simple Image Generation

```json
{
  "model": "google/gemini-2.5-flash",
  "messages": [
    { "role": "user", "content": "Generate an image of a futuristic city at sunset with flying cars" }
  ]
}
```

#### 2. Specific Aspect Ratio + Resolution

```json
{
  "model": "google/gemini-2.5-flash",
  "messages": [
    {
      "role": "user",
      "content": "Generate a 16:9 high quality landscape image of snow-capped mountains with a lake reflection"
    }
  ]
}
```

The model infers `aspect_ratio: 16:9` from "16:9" and `resolution: 2K` from "high quality" in the prompt.

#### 3. Multiple Image Variations

The response contains one `![Generated Image](url)` link per image in `content`, and cost is calculated per image.

```json
{
  "model": "google/gemini-2.5-flash",
  "messages": [
    { "role": "user", "content": "Generate 3 variations of a minimalist logo for a coffee shop" }
  ]
}
```
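
Because the images come back as markdown links inside `content`, a client usually needs to pull the URLs out of the text. The following is a minimal sketch using a regular expression. `extract_image_urls` is a hypothetical helper name, and the sample URLs are made up for illustration.

```python
import re

# Matches markdown images such as ![Generated Image](https://...).
IMAGE_LINK = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def extract_image_urls(content: str) -> list[str]:
    """Return every markdown image URL in the reply, in order of appearance."""
    return IMAGE_LINK.findall(content)


# Made-up sample reply with two image links.
sample = (
    "Here are your logos:\n"
    "![Generated Image](https://cdn.qolaba.app/a.png)\n"
    "![Generated Image](https://cdn.qolaba.app/b.png)"
)
print(extract_image_urls(sample))
# ['https://cdn.qolaba.app/a.png', 'https://cdn.qolaba.app/b.png']
```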

#### 4. Portrait / Story Format

```json
{
  "model": "google/gemini-2.5-flash",
  "messages": [
    { "role": "user", "content": "Create a 9:16 portrait illustration of a woman reading a book in a cozy library" }
  ]
}
```

#### 5. High-Detail / Infographic (Thinking Mode)

Thinking mode is best for text-heavy images such as menus and charts, and for complex multi-element compositions. The model enables it automatically:

```json
{
  "model": "google/gemini-2.5-flash",
  "messages": [
    { "role": "user", "content": "Create a detailed restaurant menu layout with sections for appetizers, mains, and desserts with prices" }
  ]
}
```

#### 6. Branded / Product Marketing

```json
{
  "model": "google/gemini-2.5-flash",
  "messages": [
    {
      "role": "user",
      "content": "Create a 4:5 product photo of a luxury perfume bottle on a marble surface with soft lighting, suitable for Instagram"
    }
  ]
}
```

### Tips & Best Practices

| Tip                     | Details                                                                            |
| ----------------------- | ---------------------------------------------------------------------------------- |
| **Be specific**         | Include style (photorealistic, cartoon, watercolor), lighting, mood, color palette |
| **Mention ratio**       | Say "16:9 landscape" or "9:16 vertical" — the model picks up the hint              |
| **Multiple variations** | Say "3 variations" or "4 different versions" to get `count > 1`                    |
| **Text in images**      | Mention "clear readable text" — model enables thinking mode automatically          |
| **Cost awareness**      | 4K images and multiple images multiply cost; 0.5K is cheapest for previews         |
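
The tips above can be combined into a small prompt builder. This is only an illustrative sketch: the tool takes free-form natural language, so the exact wording here is an assumption, and `build_prompt` with its parameters is a hypothetical helper rather than part of the API.

```python
def build_prompt(subject, style=None, lighting=None, aspect_ratio=None, count=1, readable_text=False):
    """Assemble a natural-language prompt that carries the hints from the tips table."""
    # Asking for "N variations" is the documented way to get more than one image.
    lead = f"Generate {count} variations of" if count > 1 else "Generate an image of"
    parts = [f"{lead} {subject}"]
    if style:
        parts.append(f"in a {style} style")
    if lighting:
        parts.append(f"with {lighting} lighting")
    if aspect_ratio:
        parts.append(f"in {aspect_ratio} aspect ratio")
    if readable_text:
        parts.append("with clear readable text")
    return ", ".join(parts)


print(build_prompt("a minimalist coffee shop logo", style="watercolor", aspect_ratio="1:1", count=3))
# Generate 3 variations of a minimalist coffee shop logo, in a watercolor style, in 1:1 aspect ratio
```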

***

### Limitations

* `count` has no hard cap per request, but cost scales linearly with the number of images
* `use_search_grounding` adds latency and an extra cost per query
* Generated images are hosted on `cdn.qolaba.app`, and the returned URLs are permanent CDN links
