Model Settings
How to use Thinking Depth and Temperature in Qolaba to control reasoning effort, thinking tokens, and response creativity.
Model Settings let you control how a model thinks and responds. Two settings are available — Thinking Depth and Temperature — each affecting a different dimension of the output.
1. Thinking Depth
Thinking Depth controls how much internal reasoning the model applies before generating a response. At any level other than None, the model first works through a series of reasoning steps, called thinking tokens, and then produces its final answer.
What Are Thinking Tokens?
Thinking tokens are the model's internal reasoning steps — the process it goes through to interpret your prompt, evaluate different approaches, and arrive at a well-considered response before replying. They are visible in the response so you can follow how the model reasoned through your request.
Thinking tokens are counted as output tokens and consume credits accordingly. The deeper the thinking level, the more reasoning steps the model takes, and the more credits are used.
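As a rough illustration of the billing note above, the sketch below adds thinking tokens to answer tokens to get the output-token total that credits are based on. The function name and the token counts are made up for the example, and because this page does not state a credit rate per token, the sketch stops at token counts rather than credits.

```python
def billable_output_tokens(thinking_tokens: int, answer_tokens: int) -> int:
    """Thinking tokens count as output tokens, so the two totals are summed."""
    if thinking_tokens < 0 or answer_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return thinking_tokens + answer_tokens


# Hypothetical counts for the same prompt at two Thinking Depth levels.
without_thinking = billable_output_tokens(thinking_tokens=0, answer_tokens=400)
with_deep_thinking = billable_output_tokens(thinking_tokens=2500, answer_tokens=400)

print(without_thinking)    # 400
print(with_deep_thinking)  # 2900
```

The answer length is the same in both cases; only the reasoning overhead changes the total.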
Thinking tokens are most valuable for:
Complex research and analysis
Multi-step problem solving
Strategy planning and decision-making
Advanced coding and debugging
Tasks where understanding how the model reasoned matters as much as the answer itself
Thinking Depth Levels
| Level | Reasoning | Credit usage | Best for |
| --- | --- | --- | --- |
| None | No thinking tokens | Lowest | Simple Q&A, formatting, short rewrites |
| Low | Minimal reasoning | Low | Basic content writing, casual prompts |
| Medium | Balanced reasoning | Moderate | Blog writing, coding assistance, structured business tasks |
| High | Deep, multi-step reasoning | High | Complex coding, strategy planning, analytical writing |
| Extra High | Maximum reasoning effort | Highest | Advanced research, long-form reasoning chains, complex problem solving |
2. Temperature
Temperature controls how creative or predictable the model's responses are — specifically, the randomness applied when the model selects words and constructs its response. In Qolaba, Temperature is set on a scale of 0 to 100.
0 — Fully deterministic. The model picks the most probable word at every step. Responses are consistent, precise, and repeatable.
100 — Maximum randomness. The model explores less probable word choices, producing more varied, creative, and sometimes unexpected outputs.
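To make the two endpoints above concrete, here is a minimal sketch of how temperature typically reshapes a model's word probabilities. It is not Qolaba's implementation: the linear mapping from the 0 to 100 slider onto a sampling temperature, the `max_temperature` value of 2.0, and the example scores are all assumptions made for illustration.

```python
import math


def token_probabilities(logits, slider, max_temperature=2.0):
    """Softmax over candidate-word scores at a temperature derived from a 0-100 slider.

    The slider-to-temperature mapping is an assumed linear scale, not a
    documented Qolaba behavior.
    """
    if not logits:
        raise ValueError("logits must not be empty")
    if not 0 <= slider <= 100:
        raise ValueError("slider must be between 0 and 100")

    temperature = slider / 100 * max_temperature
    if temperature == 0:
        # Greedy choice: all probability goes to the highest-scoring word.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]

    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the peak for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate words
for slider in (0, 50, 100):
    probs = token_probabilities(logits, slider)
    print(slider, [round(p, 2) for p in probs])
```

At a slider value of 0 the top-scoring word receives all of the probability, and as the value rises the distribution flattens, so less likely words are chosen more often.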
| Temperature | Output style | Best for |
| --- | --- | --- |
| 0–30 | Focused, deterministic, factual | Coding, legal drafts, data analysis, structured outputs (JSON, tables) |
| 40–60 | Balanced, natural, controlled | Blog posts, marketing copy, email drafts, general writing |
| 70–100 | Creative, varied, less predictable | Storytelling, brainstorming, ad copy, brand naming, ideation |
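The ranges above can be expressed as a small lookup, sketched below. The function and constant names are hypothetical. Values that fall between the listed ranges (31 to 39 and 61 to 69) are not described on this page, so the sketch returns `None` for them rather than guessing which band applies.

```python
from typing import Optional

# (low, high, output style) for each range listed in this section.
TEMPERATURE_BANDS = [
    (0, 30, "Focused, deterministic, factual"),
    (40, 60, "Balanced, natural, controlled"),
    (70, 100, "Creative, varied, less predictable"),
]


def describe_temperature(value: int) -> Optional[str]:
    """Return the output style for a temperature, or None if no listed range covers it."""
    if not 0 <= value <= 100:
        raise ValueError("temperature must be between 0 and 100")
    for low, high, style in TEMPERATURE_BANDS:
        if low <= value <= high:
            return style
    return None


print(describe_temperature(20))  # Focused, deterministic, factual
print(describe_temperature(35))  # None
```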
Recommended Combinations
| Thinking Depth | Temperature | Result and typical uses |
| --- | --- | --- |
| High | 0–30 | Precise and structured: technical reports, competitive analysis, complex coding |
| Medium | 40–60 | Balanced and reliable: business writing, content drafts, professional communication |
| Low | 70–100 | Fast and creative: brainstorming, ideation, headline generation, ad variations |
Examples:
Writing a detailed strategy document → High Thinking Depth + Temperature 10–20
Generating 10 creative campaign name ideas → Low Thinking Depth + Temperature 80–90
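One way to keep these combinations handy is a small set of named presets, sketched below. The preset names (`precise`, `balanced`, `creative`), the dictionary keys, and the choice of the range midpoint as the starting temperature are all assumptions for illustration. They do not describe a Qolaba API or request format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Preset:
    thinking_depth: str
    temperature_range: Tuple[int, int]


# The three recommended combinations from this section, under hypothetical names.
PRESETS = {
    "precise": Preset("High", (0, 30)),
    "balanced": Preset("Medium", (40, 60)),
    "creative": Preset("Low", (70, 100)),
}


def settings_for(preset_name: str) -> dict:
    """Return a Thinking Depth and a starting temperature for a named preset."""
    preset = PRESETS[preset_name]
    low, high = preset.temperature_range
    # The midpoint is an arbitrary starting point; any value in the range fits.
    return {
        "thinking_depth": preset.thinking_depth,
        "temperature": (low + high) // 2,
    }


print(settings_for("precise"))   # {'thinking_depth': 'High', 'temperature': 15}
print(settings_for("creative"))  # {'thinking_depth': 'Low', 'temperature': 85}
```

The two printed presets fall inside the temperature ranges used in the examples above: 15 for a strategy document and 85 for campaign name ideas.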