Image Generation

The image_generation tool is a built-in tool automatically available on all /v1/chat/completions requests. It uses Google Gemini (gemini-3.1-flash-image-preview) under the hood. You don't configure it — just describe what you want in natural language and the model decides when to invoke it.

How It Works

  1. You send a normal chat request with an image-related prompt

  2. The LLM detects it needs to generate an image and calls the image_generation tool internally

  3. The tool generates the image and uploads it to GCP CDN

  4. The response comes back as a standard chat completion with a markdown image link embedded in the content

Resolutions

Value
Description

0.5K

512px — quick previews, thumbnails

1K

Default — standard quality

2K

High quality

4K

Maximum detail, upscaling

Aspect Ratios

Value
Best For

1:1

Social media posts, profile pictures

16:9

Landscape, YouTube thumbnails, desktop wallpapers

9:16

Stories, Reels, TikTok

4:3

Presentations, traditional photos

3:2

Photography, prints

21:9

Cinematic, ultrawide banners

4:5 / 5:4

Instagram portrait/landscape

1:4 / 4:1

Tall/wide banners

Use Cases & Examples

1. Simple Image Generation

2. Specific Aspect Ratio + Resolution

The model automatically infers resolution: 2K and aspect_ratio: 16:9 from the prompt.

3. Multiple Image Variations

Returns multiple ![Generated Image](url) links in the content, cost split per image.

4. Portrait / Story Format

5. High-Detail / Infographic (Thinking Mode)

Best for text-heavy images, menus, charts, complex multi-element compositions:

6. Branded / Product Marketing

Tips & Best Practices

Tip
Details

Be specific

Include style (photorealistic, cartoon, watercolor), lighting, mood, color palette

Mention ratio

Say "16:9 landscape" or "9:16 vertical" — the model picks up the hint

Multiple variations

Say "3 variations" or "4 different versions" to get count > 1

Text in images

Mention "clear readable text" — model enables thinking mode automatically

Cost awareness

4K images and multiple images multiply cost; 0.5K is cheapest for previews


Limitations

  • Max count per request: no hard cap but cost scales linearly

  • use_search_grounding adds latency and extra cost per query

  • Generated images are hosted on cdn.qolaba.app — URLs are permanent CDN links

Last updated