Thinking

Type @thinking in any message to trigger explicit reasoning for that turn, or toggle thinking on at the conversation level to keep it active for every response.

Per-message trigger: Type @thinking before your prompt. The agent performs chain-of-thought reasoning for that single response, then returns to normal mode.

Conversation-level toggle: Enable the thinking_enabled toggle in the conversation settings panel. Every response in that conversation will include a thinking phase until you disable it.

Both methods produce identical output — expandable thinking blocks that show the agent’s reasoning process.
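The two activation paths can be sketched as a single decision per turn. This is an illustration only, not QARK's actual code: the names plan_turn and TurnPlan are hypothetical, and the sketch assumes the trigger is recognized only as the first word of the message, as the per-message description suggests.

```python
from dataclasses import dataclass

TRIGGER = "@thinking"


@dataclass(frozen=True)
class TurnPlan:
    thinking: bool  # whether this turn gets a thinking phase
    prompt: str     # the prompt with the trigger removed


def plan_turn(message: str, thinking_enabled: bool = False) -> TurnPlan:
    """Decide whether a single turn should include thinking (hypothetical helper)."""
    text = message.strip()
    parts = text.split(maxsplit=1)
    # Assumption: the trigger counts only when it is the first whole word.
    if parts and parts[0] == TRIGGER:
        prompt = parts[1] if len(parts) > 1 else ""
        return TurnPlan(thinking=True, prompt=prompt)
    # Without the trigger, the conversation-level toggle decides.
    return TurnPlan(thinking=thinking_enabled, prompt=text)
```

Either path yields the same kind of plan, which mirrors the statement that both methods produce identical output.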

QARK adapts its thinking implementation based on the model you’re using:

Native thinking (Claude, etc.): When the active model supports built-in thinking, such as Claude’s extended thinking, QARK uses the provider’s native implementation. This is the most efficient path because the model reasons internally using its own optimized mechanism.

ThinkTool fallback: For models without native thinking support, QARK provides a tool-based thinking mechanism. The agent calls a dedicated ThinkTool that structures reasoning into explicit steps: breaking down the problem, evaluating options, checking constraints, and synthesizing a conclusion. This gives every model access to chain-of-thought reasoning, regardless of provider support.
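A rough sketch of the selection logic and the structured fallback follows. Everything here is assumed for illustration: the Model type, the native_thinking flag, and the think_tool function are hypothetical stand-ins, and the four stage names simply paraphrase the steps listed above rather than reflect the real ThinkTool schema.

```python
from dataclasses import dataclass
from enum import Enum


class ThinkingPath(Enum):
    NATIVE = "native"
    THINK_TOOL = "think_tool"


@dataclass(frozen=True)
class Model:
    name: str
    native_thinking: bool  # hypothetical capability flag


def choose_path(model: Model) -> ThinkingPath:
    """Prefer the provider's native thinking; otherwise fall back to the tool."""
    return ThinkingPath.NATIVE if model.native_thinking else ThinkingPath.THINK_TOOL


# Stage names paraphrase the documented steps; the real schema may differ.
STAGES = ("breakdown", "options", "constraints", "conclusion")


@dataclass(frozen=True)
class ThinkStep:
    stage: str
    note: str


def think_tool(notes: dict[str, str]) -> list[ThinkStep]:
    """Arrange free-form notes into the four explicit stages, in order."""
    missing = [stage for stage in STAGES if stage not in notes]
    if missing:
        raise ValueError(f"missing stages: {', '.join(missing)}")
    return [ThinkStep(stage, notes[stage]) for stage in STAGES]
```

Requiring every stage is one possible design choice: it keeps the fallback's output uniform across models, at the cost of rejecting partial reasoning.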

Thinking output renders as expandable/collapsible blocks in the conversation. Click to expand and see the full reasoning chain. Collapse to keep the conversation focused on final answers.

Each thinking block shows:

  • The reasoning chain broken into logical steps
  • Intermediate conclusions and course corrections
  • The final synthesized answer that feeds into the visible response

Thinking tokens are tracked separately from response tokens in the cost ledger. This matters because:

  • Thinking tokens may have different pricing depending on the provider
  • You can monitor how much of your budget goes to reasoning vs. output
  • Token counts for thinking appear as a distinct line item in conversation stats
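To make the separate tracking concrete, here is a minimal ledger sketch. The CostLedger class, its field names, and the per-1,000-token prices are all hypothetical; real prices depend on the provider, and QARK's ledger format is not shown in this document.

```python
from dataclasses import dataclass


@dataclass
class CostLedger:
    """Keeps thinking and response usage as separate line items (hypothetical)."""
    thinking_tokens: int = 0
    response_tokens: int = 0
    thinking_cost: float = 0.0
    response_cost: float = 0.0

    def record(self, thinking_tokens: int, response_tokens: int,
               thinking_price_per_1k: float, response_price_per_1k: float) -> None:
        # Each category accumulates with its own price, since providers
        # may bill thinking tokens differently from response tokens.
        self.thinking_tokens += thinking_tokens
        self.response_tokens += response_tokens
        self.thinking_cost += thinking_tokens / 1000 * thinking_price_per_1k
        self.response_cost += response_tokens / 1000 * response_price_per_1k

    def thinking_share(self) -> float:
        """Fraction of total spend that went to reasoning (0.0 if nothing spent)."""
        total = self.thinking_cost + self.response_cost
        return self.thinking_cost / total if total else 0.0


ledger = CostLedger()
# Made-up prices, for illustration only.
ledger.record(2000, 1000, thinking_price_per_1k=0.01, response_price_per_1k=0.02)
```

With separate counters, the share of budget spent on reasoning is a single division rather than something reconstructed from a combined total.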

Thinking delivers the most value for tasks that benefit from deliberate reasoning:

  • Multi-step problems — tasks requiring planning, sequencing, or dependency tracking
  • Complex math — calculations, proofs, or numerical analysis where showing work prevents errors
  • Code analysis — debugging, architecture review, or understanding unfamiliar codebases
  • Ambiguous prompts — when the request could be interpreted multiple ways, thinking surfaces the tradeoffs before committing to an approach
  • Research synthesis — combining information from multiple sources into a coherent conclusion

For straightforward tasks like formatting text or answering factual questions, thinking adds latency without meaningful benefit.

Thinking works in the overlay the same way it works in conversations. Enable the thinking toggle before running a Spark, or include @thinking in your message. The thinking output appears in the same expandable block format.

This is useful for Sparks that handle complex workflows — code review, analysis pipelines, or multi-step research tasks — where visible reasoning improves trust in the output.

The @thinking trigger and the thinking_enabled conversation toggle control the same behavior. Enabling the toggle is equivalent to prefixing every message with @thinking. Disabling the toggle stops thinking on subsequent messages but preserves thinking blocks already in the conversation history.
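The history-preserving behavior can be modeled with a small sketch. The Conversation and Turn classes are hypothetical, the thinking block is a placeholder string, and the sketch assumes that a per-message @thinking prefix still works while the toggle is off.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PREFIX = "@thinking "


@dataclass
class Turn:
    prompt: str
    thinking_block: Optional[str]  # None when the turn had no thinking phase


@dataclass
class Conversation:
    thinking_enabled: bool = False
    history: List[Turn] = field(default_factory=list)

    def send(self, message: str) -> Turn:
        triggered = message.startswith(PREFIX)
        prompt = message[len(PREFIX):] if triggered else message
        # The toggle behaves as if every message carried the prefix.
        thinks = self.thinking_enabled or triggered
        block = f"(reasoning about: {prompt})" if thinks else None
        turn = Turn(prompt, block)
        self.history.append(turn)
        return turn


chat = Conversation(thinking_enabled=True)
chat.send("Plan the migration")         # thinking via the toggle
chat.thinking_enabled = False           # earlier blocks stay in history
chat.send("Format this list")           # no thinking
chat.send("@thinking Review the plan")  # thinking via the per-message trigger
```

Because the toggle is read only when a message is sent, switching it off never touches turns that are already recorded.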