Thinking

Type @thinking in any message to trigger explicit reasoning for that turn, or toggle thinking on at the conversation level to keep it active for every response.

Two Activation Methods

Per-message trigger: Type @thinking before your prompt. The agent performs chain-of-thought reasoning for that single response, then returns to normal mode.

Conversation-level toggle: Enable the thinking_enabled toggle in the conversation settings panel. Every response in that conversation will include a thinking phase until you disable it.

Both methods produce identical output — expandable thinking blocks that show the agent’s reasoning process.
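The two activation paths can be sketched as a single per-turn decision. This is an illustration only, not QARK's actual implementation: the function name resolve_thinking is hypothetical, and both the prefix-only matching of @thinking and the removal of the trigger from the prompt are assumptions.

```python
THINKING_TRIGGER = "@thinking"  # trigger text from the docs; prefix-only matching is an assumption


def resolve_thinking(message: str, thinking_enabled: bool) -> tuple[bool, str]:
    """Return (use_thinking, prompt) for a single turn.

    Hypothetical helper: thinking is active when the conversation-level
    toggle is on, or when the first word of the message is the trigger.
    """
    parts = message.split(None, 1)  # split off the first whitespace-delimited word
    has_trigger = bool(parts) and parts[0] == THINKING_TRIGGER
    if has_trigger:
        # Drop the trigger so only the actual prompt remains.
        prompt = parts[1] if len(parts) > 1 else ""
    else:
        prompt = message
    return (thinking_enabled or has_trigger), prompt


if __name__ == "__main__":
    print(resolve_thinking("@thinking Why does this test fail?", False))
    print(resolve_thinking("Format this list", True))
```

In this sketch the trigger affects only the message that carries it, while the toggle applies to every message passed in with thinking_enabled set to True.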

How Thinking Works Under the Hood

QARK adapts its thinking implementation based on the model you’re using:

Native thinking (Claude, etc.): When the active model supports built-in thinking, such as Claude’s extended thinking, QARK uses the provider’s native implementation. This is the most efficient path, because the model reasons internally using its own optimized mechanism.

ThinkTool fallback: For models without native thinking support, QARK provides a tool-based thinking mechanism. The agent calls a dedicated ThinkTool that structures reasoning into explicit steps: breaking down the problem, evaluating options, checking constraints, and synthesizing a conclusion. This gives every model access to chain-of-thought reasoning, regardless of provider support.
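The choice between the two paths might look roughly like the sketch below. It is a simplified model under stated assumptions, not QARK's code: ModelInfo, supports_native_thinking, and plan_thinking are hypothetical names, and the real capability check and tool interface are not described in this document. Only the four reasoning steps are taken from the ThinkTool description above.

```python
from dataclasses import dataclass


@dataclass
class ModelInfo:
    name: str
    supports_native_thinking: bool  # hypothetical capability flag


# Step names taken from the ThinkTool description in the text.
THINK_TOOL_STEPS = (
    "break down the problem",
    "evaluate options",
    "check constraints",
    "synthesize a conclusion",
)


def plan_thinking(model: ModelInfo) -> dict:
    """Choose a thinking path for the active model (illustrative only)."""
    if model.supports_native_thinking:
        # Native path: defer to the provider's built-in thinking.
        return {"mode": "native", "steps": []}
    # Fallback path: structure the reasoning through a dedicated tool call.
    return {"mode": "think_tool", "steps": list(THINK_TOOL_STEPS)}


if __name__ == "__main__":
    print(plan_thinking(ModelInfo("model-with-native-thinking", True)))
    print(plan_thinking(ModelInfo("model-without-native-thinking", False)))
```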

Read and Expand Thinking Blocks

Thinking output renders as collapsible blocks in the conversation. Click a block to expand it and see the full reasoning chain. Collapse it to keep the conversation focused on final answers.

Each thinking block shows:

The reasoning chain broken into logical steps
Intermediate conclusions and course corrections
The final synthesized answer that feeds into the visible response

Track Thinking Token Usage

Thinking tokens are tracked separately from response tokens in the cost ledger. This matters because:

Thinking tokens may have different pricing depending on the provider
You can monitor how much of your budget goes to reasoning vs. output
Token counts for thinking appear as a distinct line item in conversation stats
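To make the separate accounting concrete, here is a minimal sketch of a ledger summary that keeps thinking and response tokens apart. The TurnUsage type, the summarize function, and the per-1,000-token prices are all hypothetical placeholders chosen for the arithmetic. They do not reflect QARK's ledger format or any provider's real pricing.

```python
from dataclasses import dataclass


@dataclass
class TurnUsage:
    thinking_tokens: int
    response_tokens: int


def summarize(turns, thinking_price_per_1k, response_price_per_1k):
    """Total thinking and response usage separately (illustrative only)."""
    thinking = sum(t.thinking_tokens for t in turns)
    response = sum(t.response_tokens for t in turns)
    thinking_cost = thinking / 1000 * thinking_price_per_1k
    response_cost = response / 1000 * response_price_per_1k
    total_cost = thinking_cost + response_cost
    return {
        "thinking_tokens": thinking,
        "response_tokens": response,
        "thinking_cost": thinking_cost,
        "response_cost": response_cost,
        # Fraction of spend that went to reasoning rather than output.
        "thinking_cost_share": thinking_cost / total_cost if total_cost else 0.0,
    }


if __name__ == "__main__":
    # Placeholder prices: 2.0 per 1k thinking tokens, 1.0 per 1k response tokens.
    turns = [TurnUsage(1500, 500), TurnUsage(500, 1500)]
    print(summarize(turns, thinking_price_per_1k=2.0, response_price_per_1k=1.0))
```

With these placeholder numbers, both categories total 2,000 tokens, yet thinking accounts for two thirds of the cost (4.0 out of 6.0) because of the higher assumed rate. Differences like this are what a separate line item makes visible.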

When to Enable Thinking

Thinking delivers the most value for tasks that benefit from deliberate reasoning:

Multi-step problems — tasks requiring planning, sequencing, or dependency tracking
Complex math — calculations, proofs, or numerical analysis where showing work prevents errors
Code analysis — debugging, architecture review, or understanding unfamiliar codebases
Ambiguous prompts — when the request could be interpreted multiple ways, thinking surfaces the tradeoffs before committing to an approach
Research synthesis — combining information from multiple sources into a coherent conclusion

For straightforward tasks like formatting text or answering factual questions, thinking adds latency without meaningful benefit.

Use Thinking in Sparks

Thinking works in the Sparks overlay the same way it works in conversations. Enable the thinking toggle before running a Spark, or include @thinking in your message. The thinking output appears in the same expandable block format.

This is useful for Sparks that handle complex workflows — code review, analysis pipelines, or multi-step research tasks — where visible reasoning improves trust in the output.

Sync with Conversation Settings

The @thinking trigger and the thinking_enabled conversation toggle stay in sync. Enabling the toggle is equivalent to prefixing every message with @thinking. Disabling the toggle stops thinking on subsequent messages but preserves thinking blocks already in the conversation history.