Multi-Agent Workflows

Single agents solve single problems. Multi-agent workflows decompose complex problems into specialized subtasks — each handled by an agent optimized for that job. An orchestrator dispatches work, specialists execute, and the execution trace gives you full visibility into every step.

QARK ships with built-in multi-agent pipelines that demonstrate what this architecture can do: among them, a Review Article Writer that coordinates 8 sub-agents to research, outline, write, and format a full academic paper as a PDF. This page covers how the system works, how the built-in pipelines are structured, and how to design your own.


Any agent with agent_type: "tool" can be invoked by other agents as if it were a tool. The parent agent sees it in its tool list with a name, description, and input schema — it calls it like any other tool. Under the hood, QARK:

  1. Spawns a sub-agent with its own system prompt, model, temperature, and tool access
  2. Streams the sub-agent’s response back to the parent, with events visible in the UI as nested execution blocks
  3. Saves the result to a file at ~/.qark/agent_tool_results/{conversation_id}/{tool_call_id}.md
  4. Returns a compact JSON reference to the parent — not the full text

That last point is critical. A sub-agent might produce 5,000 words of research. Instead of dumping all of that into the parent’s context window, the parent receives:

{
  "agent": "Web Researcher",
  "result_file": "/path/to/result.md",
  "word_count": 5234,
  "preview": "First 300 chars of output..."
}

The parent can then pass that file path to the next sub-agent, which reads it via unix commands. This file-based handoff is what makes deep multi-agent pipelines practical — without it, the orchestrator’s context window would overflow after two or three sub-agent calls.
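The handoff mechanism can be sketched in a few lines of Python. This is a minimal illustration: `handoff_result` and the directory handling are assumptions for the sketch, not QARK's internal API; only the JSON fields come from the example above.

```python
import json
import tempfile
from pathlib import Path

def handoff_result(agent_name: str, text: str, results_dir: Path, tool_call_id: str) -> str:
    """Save a sub-agent's full output to disk and return only a compact
    JSON reference for the parent's context window (hypothetical helper)."""
    results_dir.mkdir(parents=True, exist_ok=True)
    result_file = results_dir / f"{tool_call_id}.md"
    result_file.write_text(text, encoding="utf-8")
    return json.dumps({
        "agent": agent_name,
        "result_file": str(result_file),
        "word_count": len(text.split()),
        "preview": text[:300],  # first 300 chars only
    })

# Usage: a 5,000-word research dump collapses to a few hundred bytes.
results_dir = Path(tempfile.mkdtemp()) / "conv-123"
ref = json.loads(handoff_result("Web Researcher", "finding " * 5000, results_dir, "call-1"))
```

The parent holds `ref["result_file"]` and forwards the path; the full text never enters its context.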

Each agent-tool has an availability setting:

  • agents_only — Only other agents can invoke it. It does not appear in the @mention popover. Use this for specialized sub-agents that make no sense as standalone tools (e.g., Citation Formatter, Paper Outliner).
  • everywhere — Appears in @mention and is callable by other agents. Use this for agents that are useful both standalone and as building blocks (e.g., Web Researcher, Fact Checker).
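The behavioral difference between the two settings can be sketched with hypothetical agent definitions. Field names here mirror the concepts above, not QARK's actual config schema.

```python
# Hypothetical agent definitions (illustrative, not QARK's real schema).
AGENTS = [
    {"name": "Citation Formatter", "agent_type": "tool", "availability": "agents_only"},
    {"name": "Web Researcher",     "agent_type": "tool", "availability": "everywhere"},
]

def mentionable(agents):
    """Only 'everywhere' agents appear in the @mention popover."""
    return [a["name"] for a in agents if a["availability"] == "everywhere"]

def callable_by_agents(agents):
    """Both settings are visible to other agents as tools."""
    return [a["name"] for a in agents if a["agent_type"] == "tool"]
```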

When a sub-agent spawns, it inherits shared managers from its parent: the web provider manager, content compressor, MCP connections, vector store, embedding config, and reranker config. These are cloned, not re-initialized — so a 5-level deep agent chain does not spin up 5 separate web browsers or vector databases.


Agent-tools calling other agent-tools creates a depth tree:

Depth 0: Your conversation with the root agent
Depth 1: Root agent calls Specialist A
Depth 2: Specialist A calls Sub-specialist X
Depth 3: Sub-specialist X calls Leaf Agent Y

Each agent has a max_recursion_depth setting that caps how deep its sub-agents can nest. The built-in agents use these limits:

| Agent | Max Depth | Reasoning |
|---|---|---|
| Review Article Writer | 5 | Orchestrates 8 sub-agents, some with their own tools |
| Codebase Paper Writer | 5 | Same architecture as Review Article Writer |
| Research Director | 8 | Research Analyst sub-agent itself calls Fact Checker |
| Deep Researcher | 5 | Coordinates 6 specialist sub-agents |
| Leaf specialists (Web Researcher, Section Writer, etc.) | 1 | Perform direct work, never delegate further |
| Research Analyst | 5 | Calls Fact Checker as a sub-agent |

Set depth per agent in the agent editor. Leaf agents that should never delegate: set to 1. Orchestrators managing sub-orchestrators: raise to 4–5.
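The depth budget can be sketched as a recursive spawn with a decrementing budget. The exact off-by-one convention is an assumption inferred from the table; the point is that each delegation consumes one level, and a budget of 1 forbids further delegation.

```python
# Sketch of the depth bookkeeping (interpretation assumed, not QARK internals).
def spawn(agent: str, budget: int, trace: list, depth: int = 0):
    """Record this agent's turn; delegate one level deeper while budget allows."""
    trace.append((depth, agent))
    if budget <= 1:
        return  # leaf: performs direct work, never delegates
    # delegate to one hypothetical sub-agent with the remaining budget
    spawn(f"{agent}.sub", budget - 1, trace, depth + 1)

trace = []
spawn("Research Director", 3, trace)   # yields depths 0, 1, 2
```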

QARK elevates the tool turn limit to 50 when agent-tools are active (default is 10 for regular tools, 20 with unix commands). This is necessary because orchestrators make many sequential tool calls — dispatching 8 sub-agents in a paper pipeline consumes 8+ tool turns just for the sub-agent calls, before counting any direct tool use.
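The limit selection reduces to a simple precedence rule. The function name is illustrative; the numbers are the ones stated above.

```python
def tool_turn_limit(has_agent_tools: bool, has_unix_commands: bool) -> int:
    """Tool-turn cap per the limits stated above: 50 once agent-tools are
    active, 20 with unix commands, 10 for regular tools."""
    if has_agent_tools:
        return 50
    if has_unix_commands:
        return 20
    return 10
```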


QARK ships with 18 built-in agents organized into complete pipelines. Four are orchestrator agents that coordinate the other 14 specialist tool-agents.

This is the most complex built-in pipeline. Given a topic and source URLs, it produces a fully formatted academic review article as PDF — complete with themed sections, inline citations, and a bibliography.

Orchestrator: Review Article Writer
Sub-agents: 8 (Source Deep Diver, Web Researcher, Academic Researcher, Paper Outliner, Section Writer, Paper Editor, Citation Formatter, plus the built-in Combine Files and Save Paper PDF tools)

The orchestrator’s system prompt defines a strict 7-step execution plan:

Step 1: For each URL the user provides →
call Source Deep Diver (fetch and extract key content)
Step 2: Call Web Researcher 3-5× with different search angles
Call Academic Researcher to find citable papers
→ all results saved as files
Step 3: Call Paper Outliner with topic, page target,
and all research file paths
→ produces structured outline
Step 4: For EVERY section in the outline →
call Section Writer with section title, word target,
outline file, 2-3 relevant source files,
and the academic sources file
Step 5: Call Combine Files to merge all section files
Step 6: Call Citation Formatter to standardize references
Step 7: Call Save Paper PDF → final PDF output
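The plan above can be sketched as plain Python, with `call_tool` standing in for an LLM tool invocation. Everything here is illustrative: the function names, parameter names, and fake file paths are assumptions, not QARK's API.

```python
# Hypothetical stand-in: each "tool call" just returns a fake result-file path.
def call_tool(name: str, **params) -> str:
    return f"/tmp/results/{name.replace(' ', '_').lower()}_{len(params)}.md"

def review_article_pipeline(topic: str, urls: list, sections: list) -> str:
    sources = [call_tool("Source Deep Diver", url=u) for u in urls]            # step 1
    research = [call_tool("Web Researcher", search_topic=f"{topic} angle {i}")
                for i in range(3)]                                             # step 2
    academic = call_tool("Academic Researcher", topic=topic)
    outline = call_tool("Paper Outliner", topic=topic,
                        research_files=sources + research + [academic])        # step 3
    section_files = [call_tool("Section Writer", section_title=s,
                               outline_file=outline,
                               source_files=(sources + research)[:3],
                               academic_sources_file=academic)
                     for s in sections]                                        # step 4
    combined = call_tool("Combine Files", files=section_files)                 # step 5
    formatted = call_tool("Citation Formatter", file=combined)                 # step 6
    return call_tool("Save Paper PDF", file=formatted)                         # step 7, exactly once
```

Note how only file paths flow between steps; no sub-agent output is ever inlined.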

Here is the full agent hierarchy:

Review Article Writer (orchestrator)
├── Source Deep Diver - fetch URL, extract facts/arguments
│   Tools: web_fetch, thinking
│   Availability: agents_only
├── Web Researcher - search web for diverse sources
│   Tools: web_search, web_fetch, thinking
│   Availability: everywhere
├── Academic Researcher - find citable academic papers
│   Tools: web_search, web_fetch, thinking
│   Availability: everywhere
├── Paper Outliner - create structured outline
│   Tools: thinking, unix_command, document_search
│   Availability: agents_only
├── Section Writer - write individual sections
│   Tools: thinking, unix_command, document_search
│   Availability: agents_only
├── Paper Editor - edit and refine sections
│   Tools: thinking, unix_command
│   Availability: agents_only
├── Citation Formatter - standardize references (IEEE/APA/Chicago)
│   Tools: thinking, unix_command
│   Availability: agents_only
├── Combine Files - merge markdown files (built-in tool)
└── Save Paper PDF - render to PDF (built-in tool)

What makes this work:

  • File-based handoff. Each sub-agent writes its output to a file. The orchestrator collects file paths and passes them to the next stage. The Section Writer receives the outline file path and source file paths — it reads them via unix commands (head -c 50000), writes its section, and returns a new file path.
  • Academic sources propagated everywhere. The Academic Researcher produces a file of citation-ready references. That file path is passed to every Section Writer call, so each section can cite real papers.
  • Strict execution order. The orchestrator’s system prompt enforces: “save_paper_pdf is called EXACTLY ONCE, as the very last step. Never call it early or multiple times.”
  • No word count obsession. The system prompt explicitly says: “Do not obsess over word counts or page estimates. Write good content, the length will follow from the outline.”

Same architecture as the Review Article Writer, but swaps Source Deep Diver and Web Researcher for the Codebase Analyzer — a tool-agent that uses unix commands and document search to analyze source code:

Codebase Paper Writer (orchestrator)
├── Codebase Analyzer - analyze code (4 passes)
│   Tools: unix_command, document_search, thinking
│   Focus modes: overview, architecture, data_model, algorithms
├── Academic Researcher - find related academic papers
├── Paper Outliner - structure the analysis
├── Section Writer - write each section
├── Citation Formatter - format references
├── Combine Files - merge sections
└── Save Paper PDF - render to PDF

The orchestrator calls Codebase Analyzer 4 times with different focus areas (overview, architecture, data_model, algorithms), then follows the same outline → write → combine → format → PDF pipeline.
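The four analysis passes amount to one call per focus mode. The focus-mode names come from the text above; `run_analyzer` is a hypothetical stand-in for invoking the Codebase Analyzer.

```python
# The four focus modes named above; the loop is an illustrative sketch.
FOCUS_MODES = ["overview", "architecture", "data_model", "algorithms"]

def analyze_codebase(run_analyzer, repo_path: str) -> dict:
    """One analyzer call per focus mode, collecting result-file paths."""
    return {mode: run_analyzer(repo_path=repo_path, focus=mode) for mode in FOCUS_MODES}

# Usage with a fake analyzer that just echoes a path per mode.
files = analyze_codebase(lambda repo_path, focus: f"/tmp/{focus}.md", ".")
```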

A different architecture — the Deep Researcher coordinates 6 specialist sub-agents through a phased research methodology:

Deep Researcher (orchestrator)
├── Phase 1: Query Decomposer - break topic into sub-questions
│   Tools: thinking
│   Availability: agents_only
├── Phase 2: Web Researcher - search for each sub-question (4-6 calls)
│   Tools: web_search, web_fetch, thinking
├── Phase 3: Source Deep Diver - deep-dive 2-4 key sources
│   Tools: web_fetch, thinking
├── Phase 4: Fact Verifier - verify 2-3 critical claims
│   Tools: web_search, web_fetch, thinking
│   Availability: agents_only
├── Phase 5: Comparative Analyst - structured comparison (if applicable)
│   Tools: web_search, web_fetch, thinking
│   Availability: agents_only
└── Phase 6: Report Synthesizer - write final report in sections
    Tools: thinking
    Availability: agents_only

The output is a rich markdown document with executive summary, findings organized by theme, inline citations, LaTeX for quantitative expressions, and a numbered bibliography.

The simplest orchestrator — delegates to a single but powerful sub-agent:

Research Director (orchestrator)
└── Research Analyst (sub-agent, depth 5)
    Tools: web_search, web_fetch, document_search
    └── Fact Checker (sub-agent, depth 3)
        Tools: web_search, web_fetch

The Research Director breaks a topic into 2–3 research angles, dispatches the Research Analyst for each angle, then synthesizes an executive briefing. The Research Analyst itself can call the Fact Checker for claim verification — a 3-level deep chain.


Each agent-tool declares typed parameters. The parent agent sees these as the tool’s input schema and uses them to construct calls.

Input schema fields:

| Field | Type | Purpose |
|---|---|---|
| name | string | Parameter name (e.g., query, section_title) |
| type | string \| number \| boolean | Data type |
| description | string | Tells the parent agent what to pass |
| required | boolean | Whether the parent must provide this parameter |

Design principles:

  • Name parameters specifically. query is vague; search_topic with a description like “The subject to search for, including geographic and temporal scope” gives the parent agent clear guidance.
  • Use string for most data. File paths, JSON arrays as strings, research content — all strings. Reserve number for true numeric values (word targets, page counts).
  • Mark required only when the agent cannot function without the parameter. Optional parameters with sensible defaults keep the interface flexible.

Output schema defines what the parent expects back. Include metadata — word counts, source counts, file paths — so the parent can make informed decisions about next steps.
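Putting the principles together, here is a hypothetical input schema for a Section Writer-style agent-tool. The field names mirror the table above; the parameter names and descriptions are invented for illustration.

```python
# Hypothetical schema following the design principles above.
SECTION_WRITER_SCHEMA = [
    {"name": "section_title", "type": "string", "required": True,
     "description": "Exact section heading from the outline"},
    {"name": "word_target", "type": "number", "required": False,
     "description": "Approximate length; defaults to the outline's estimate"},
    {"name": "outline_file", "type": "string", "required": True,
     "description": "Path to the outline markdown file"},
    {"name": "academic_sources_file", "type": "string", "required": True,
     "description": "Path to citation-ready references; cite from this file"},
]

def required_params(schema):
    """The parameters a parent agent must always supply."""
    return [p["name"] for p in schema if p["required"]]
```

Note that word_target stays optional with a sensible default, per the third principle.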

Orchestrator pattern — One root agent coordinates specialists. The orchestrator’s system prompt defines the full execution plan. This is what the Review Article Writer uses.

Chain pattern — Agents execute in sequence, each passing output to the next. No central coordinator. Works when the workflow is linear.

Hub-and-spoke pattern — The Research Director uses this: dispatch the same sub-agent multiple times with different inputs, then synthesize.

Nested delegation — The Research Director → Research Analyst → Fact Checker chain. Each level focuses on a narrower task.

For orchestrators, the system prompt is the execution plan. The built-in agents demonstrate key techniques:

  • Numbered steps — “Execute these steps IN ORDER: 1… 2… 3…” Forces sequential execution.
  • File path tracking — “Save each result_file path” and “Save result_file as OUTLINE_FILE” creates named references the LLM tracks across steps.
  • Critical rules — “save_paper_pdf is called EXACTLY ONCE, as the very last step” prevents common failure modes.
  • Minimal commentary — “Never output more than 1 sentence between tool calls” keeps the orchestrator focused on dispatching rather than narrating.

Match model capability to task complexity:

  • Orchestrators — Use capable models (Gemini Pro, Claude Sonnet/Opus). They need to follow complex multi-step instructions reliably.
  • Research specialists (Web Researcher, Source Deep Diver) — Mid-tier models work. The task is search + summarize, not deep reasoning.
  • Writing specialists (Section Writer, Report Synthesizer) — Capable models. Writing quality depends directly on model capability.
  • Mechanical specialists (Citation Formatter, Codebase Analyzer) — Can use faster models. The task is structured transformation.
  • Reasoning specialists (Fact Verifier, Comparative Analyst) — Enable thinking. These agents need to evaluate evidence carefully.

Every multi-agent run produces a nested execution trace visible in the conversation UI:

  • Depth-indented blocks — orchestrator calls are flush left, depth-1 specialists indented one level, depth-2 sub-agents indented further
  • Tool calls within each agent — which tools were invoked, parameters, and results
  • Status indicators — running, completed, or error for each agent
  • Agent name and depth label on each block

Every agent-tool invocation records tokens and cost in the append-only cost ledger. The conversation’s total cost reflects the sum across all agents in the hierarchy.

To identify which agent drives costs:

  • Expand the execution trace and check token counts per agent
  • Orchestrators typically have low output tokens but accumulate input tokens from sub-agent results
  • Writing agents (Section Writer) tend to be the most expensive — they produce the most output
  • Research agents (Web Researcher) drive costs through web fetch content compression
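Rolling the ledger up per agent is a straightforward aggregation. The record fields below are assumptions for the sketch; only the "sum across the hierarchy" behavior comes from the text above.

```python
from collections import defaultdict

# Hypothetical ledger entries; field names and numbers are invented.
LEDGER = [
    {"agent": "Review Article Writer", "input_tokens": 40_000, "output_tokens": 900,   "cost": 0.14},
    {"agent": "Section Writer",        "input_tokens": 12_000, "output_tokens": 6_000, "cost": 0.21},
    {"agent": "Section Writer",        "input_tokens": 11_000, "output_tokens": 5_500, "cost": 0.19},
]

def cost_by_agent(ledger):
    """Group cost by agent to find the cost driver."""
    totals = defaultdict(float)
    for entry in ledger:
        totals[entry["agent"]] += entry["cost"]
    return dict(totals)

def conversation_total(ledger):
    """Conversation cost is the sum across all agents in the hierarchy."""
    return round(sum(e["cost"] for e in ledger), 2)
```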

| Symptom | Cause | Fix |
|---|---|---|
| Orchestrator skips steps | System prompt instructions unclear | Use numbered steps with explicit “IN ORDER” |
| Sub-agent output too large for parent | Result not file-based | Ensure sub-agents save to file and return paths |
| Workflow hits recursion limit | Too many nesting levels | Flatten hierarchy or raise max_recursion_depth |
| High cost, poor results | Wrong model on a specialist | Cheap models for mechanical work, capable models for reasoning |
| Orchestrator calls save_paper_pdf early | Missing constraint in system prompt | Add explicit “EXACTLY ONCE, as the very last step” rule |
| Section Writer ignores academic sources | File path not passed | Pass academic_sources_file to every Section Writer call |
| Timeout on complex workflows | Too many sequential calls | Restructure so independent sub-agents can run earlier in the sequence |