Model Settings

How to use Thinking Depth and Temperature in Qolaba to control reasoning effort, thinking tokens, and response creativity.

Model Settings let you control how a model thinks and responds. Two settings are available — Thinking Depth and Temperature — each affecting a different dimension of the output.


1. Thinking Depth

Thinking Depth controls how much internal reasoning the model applies before generating a response. When enabled, the model works through a series of reasoning steps — called thinking tokens — before producing its final answer.

What Are Thinking Tokens?

Thinking tokens are the text the model generates while it reasons: interpreting your prompt, evaluating different approaches, and settling on a response before it replies. They are visible in the response, so you can follow how the model reasoned through your request.

Thinking tokens are counted as output tokens and consume credits accordingly. The deeper the thinking level, the more reasoning steps the model takes, and the more credits are used.
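As a rough illustration of the billing rule above, the sketch below adds thinking tokens to answer tokens to get the billable output count. The function name `billable_output_tokens` and all token counts are hypothetical. Qolaba's credit rates are not stated here, so the example stops at token counts rather than credits.

```python
def billable_output_tokens(thinking_tokens: int, answer_tokens: int) -> int:
    """Return the number of output tokens that count toward usage.

    Thinking tokens are counted as output tokens, so the two are summed.
    """
    if thinking_tokens < 0 or answer_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return thinking_tokens + answer_tokens


# Hypothetical numbers: the same prompt at two Thinking Depth levels.
no_thinking = billable_output_tokens(thinking_tokens=0, answer_tokens=400)
deep_thinking = billable_output_tokens(thinking_tokens=2500, answer_tokens=400)

print(no_thinking)    # 400
print(deep_thinking)  # 2900
```

With these made-up numbers the final answer is the same length in both cases, yet the deeper run counts more than seven times as many output tokens.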

Thinking tokens are most valuable for:

  • Complex research and analysis

  • Multi-step problem solving

  • Strategy planning and decision-making

  • Advanced coding and debugging

  • Tasks where understanding how the model reasoned matters as much as the answer itself

Thinking Depth Levels

| Level      | Reasoning Effort           | Credit Usage | Best For                                                                |
|------------|----------------------------|--------------|-------------------------------------------------------------------------|
| None       | No thinking tokens         | Lowest       | Simple Q&A, formatting, short rewrites                                  |
| Low        | Minimal reasoning          | Low          | Basic content writing, casual prompts                                   |
| Medium     | Balanced reasoning         | Moderate     | Blog writing, coding assistance, structured business tasks              |
| High       | Deep, multi-step reasoning | High         | Complex coding, strategy planning, analytical writing                   |
| Extra High | Maximum reasoning effort   | Highest      | Advanced research, long-form reasoning chains, complex problem solving  |
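The levels in the table form an ordered scale, which can be captured in a small lookup. This is a minimal sketch, not a Qolaba API: the `ThinkingDepth` enum, the `SUGGESTED_DEPTH` mapping, and the task labels (one picked from each "Best For" entry) are assumptions made for illustration.

```python
from enum import IntEnum


class ThinkingDepth(IntEnum):
    """Thinking Depth levels, ordered by reasoning effort and credit usage."""
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    EXTRA_HIGH = 4


# Illustrative task labels, one per level, taken from the "Best For" column.
SUGGESTED_DEPTH = {
    "simple q&a": ThinkingDepth.NONE,
    "basic content writing": ThinkingDepth.LOW,
    "blog writing": ThinkingDepth.MEDIUM,
    "complex coding": ThinkingDepth.HIGH,
    "advanced research": ThinkingDepth.EXTRA_HIGH,
}


def suggest_depth(task: str) -> ThinkingDepth:
    """Look up a suggested level; unknown task labels raise KeyError."""
    return SUGGESTED_DEPTH[task.strip().lower()]


depth = suggest_depth("Complex coding")
print(depth.name)                    # HIGH
print(depth > ThinkingDepth.MEDIUM)  # True
```

Using an ordered enum makes it easy to compare levels, for example to warn before a request goes above Medium.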


2. Temperature

Temperature controls how creative or predictable the model's responses are — specifically, the randomness applied when the model selects words and constructs its response. In Qolaba, Temperature is set on a scale of 0 to 100.

  • 0: Effectively deterministic. The model picks the most probable next word at every step, so responses are consistent, precise, and highly repeatable.

  • 100: Maximum randomness. The model samples less probable words more often, producing more varied, creative, and sometimes unexpected outputs.
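To make the two endpoints concrete, the sketch below shows the standard temperature-scaled softmax that language-model samplers generally use. It is not Qolaba's implementation: the linear mapping from the 0 to 100 setting onto a 0.0 to 1.0 sampler temperature is an assumption, and the candidate scores are invented.

```python
import math


def token_probabilities(scores, temperature_0_100):
    """Convert raw candidate scores into probabilities at a 0-100 setting."""
    if not 0 <= temperature_0_100 <= 100:
        raise ValueError("temperature must be between 0 and 100")

    # Assumption: the 0-100 setting maps linearly onto a 0.0-1.0 temperature.
    t = temperature_0_100 / 100.0

    if t == 0:
        # Greedy choice: all probability goes to the highest-scoring candidate.
        best = max(range(len(scores)), key=lambda i: scores[i])
        return [1.0 if i == best else 0.0 for i in range(len(scores))]

    # Divide scores by the temperature, then apply a numerically stable softmax.
    scaled = [s / t for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.5]  # invented scores for three candidate words
for setting in (0, 30, 100):
    probs = token_probabilities(scores, setting)
    print(setting, [round(p, 3) for p in probs])
```

At 0 the top candidate receives all of the probability. As the setting rises, probability spreads toward the lower-scoring candidates, which is why outputs become more varied.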

| Range  | Output Style                       | Best For                                                               |
|--------|------------------------------------|------------------------------------------------------------------------|
| 0–30   | Focused, deterministic, factual    | Coding, legal drafts, data analysis, structured outputs (JSON, tables) |
| 40–60  | Balanced, natural, controlled      | Blog posts, marketing copy, email drafts, general writing              |
| 70–100 | Creative, varied, less predictable | Storytelling, brainstorming, ad copy, brand naming, ideation           |
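The ranges above can be expressed as a small lookup. This is a sketch under stated assumptions: `TEMPERATURE_BANDS` and `output_style` are hypothetical names, and values the table does not list (31 to 39 and 61 to 69) return `None` rather than being assigned to a neighbouring band.

```python
from typing import Optional

# (low, high, output style), copied from the table; bounds are inclusive.
TEMPERATURE_BANDS = [
    (0, 30, "Focused, deterministic, factual"),
    (40, 60, "Balanced, natural, controlled"),
    (70, 100, "Creative, varied, less predictable"),
]


def output_style(temperature: int) -> Optional[str]:
    """Return the documented output style for a 0-100 Temperature value."""
    if not 0 <= temperature <= 100:
        raise ValueError("temperature must be between 0 and 100")
    for low, high, style in TEMPERATURE_BANDS:
        if low <= temperature <= high:
            return style
    # The table does not describe values between the listed ranges.
    return None


print(output_style(15))  # Focused, deterministic, factual
print(output_style(85))  # Creative, varied, less predictable
print(output_style(35))  # None
```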


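Thinking Depth and Temperature can be combined. Typical pairings:

| Thinking Depth | Temperature | Output Type                                                                           |
|----------------|-------------|---------------------------------------------------------------------------------------|
| High           | 0–30        | Precise and structured: technical reports, competitive analysis, complex coding       |
| Medium         | 40–60       | Balanced and reliable: business writing, content drafts, professional communication   |
| Low            | 70–100      | Fast and creative: brainstorming, ideation, headline generation, ad variations        |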

Examples:

  • Writing a detailed strategy document → High Thinking Depth + Temperature 10–20

  • Generating 10 creative campaign name ideas → Low Thinking Depth + Temperature 80–90
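The two examples above can be written down as reusable presets. This is a minimal sketch: the `ModelSettings` class and the preset names are hypothetical, not part of Qolaba, and the exact Temperature values (15 and 85) are arbitrary picks from inside the suggested ranges.

```python
from dataclasses import dataclass

# Level names as listed in the Thinking Depth table.
VALID_DEPTHS = ("None", "Low", "Medium", "High", "Extra High")


@dataclass(frozen=True)
class ModelSettings:
    """A validated pair of Thinking Depth and Temperature (0-100)."""
    thinking_depth: str
    temperature: int

    def __post_init__(self):
        if self.thinking_depth not in VALID_DEPTHS:
            raise ValueError(f"unknown Thinking Depth: {self.thinking_depth!r}")
        if not 0 <= self.temperature <= 100:
            raise ValueError("temperature must be between 0 and 100")


# Hypothetical presets; each temperature is chosen from the suggested range.
PRESETS = {
    "strategy_document": ModelSettings("High", 15),  # suggested range: 10-20
    "campaign_names": ModelSettings("Low", 85),      # suggested range: 80-90
}

print(PRESETS["strategy_document"])
print(PRESETS["campaign_names"])
```

Validating both fields at construction time catches a mistyped level name or an out-of-range Temperature before the settings are used.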
