Document Search

Type @document-search followed by your query to search through documents attached to the conversation. This tool is the front-end to QARK’s RAG pipeline — it takes your query, runs it against indexed document chunks, and returns the most relevant passages with citations.

Figure: Document search results with relevance scores

Before you can search, attach documents to the conversation. QARK provides several ways to do this:

  • Type @/ or @. in the composer — opens a native file picker directly from the keyboard. The @/ and @. characters are removed from your input automatically.
  • Attach menu (+) — click the + button in the composer to access:
    • Add Files — opens a file picker for individual documents
    • Add Folder — opens a folder browser, recursively scans for supported files
    • Add Selected in Finder — grabs the current Finder selection (macOS) without opening a dialog
    • Clipboard History — browse and attach from your recent clipboard entries
  • Drag and drop — drag files or folders directly onto the composer. Folder structure is preserved.
  • Paste — paste images directly from your clipboard into the composer.

Supported file types include PDF, DOCX, XLSX, PPTX, Markdown, HTML, plain text, EPUB, and most source code formats. See File Attachments & Auto-Routing for the full routing logic that determines whether files go to vision or RAG.

Type @document-search in any message. The tool activates and displays the current search strategy in the UI so you know exactly how your query will be processed.

QARK supports three query strategies that determine how your query is matched against document content:

| Strategy | How It Works | Best For |
| --- | --- | --- |
| Semantic | Embeds your query and finds chunks with the closest vector similarity | Direct questions, specific lookups |
| HyDE (Hypothetical Document Embedding) | Generates a hypothetical answer first, then searches for chunks similar to that answer | Exploratory questions where you’re not sure of the exact terminology |
| Step-back | Reformulates your query into a broader, more abstract version before searching | Narrow questions that need broader context to answer well |

The default mode is auto, which lets the AI agent select the best strategy based on your query. You can override this in the tool settings.
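
The three strategies in the table above differ only in which text gets embedded before the similarity search. The following is a minimal sketch of that idea, not QARK's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, and `generate_answer` and `broaden` are hypothetical callables standing in for the LLM calls that HyDE and step-back would make.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(text_to_embed: str, chunks: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Rank chunks by similarity to whatever text the strategy chose to embed."""
    q = embed(text_to_embed)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


def semantic(query, chunks):
    # Semantic: embed the query itself.
    return search(query, chunks)


def hyde(query, chunks, generate_answer):
    # HyDE: embed a hypothetical answer instead of the question.
    return search(generate_answer(query), chunks)


def step_back(query, chunks, broaden):
    # Step-back: embed a broader reformulation of the question.
    return search(broaden(query), chunks)


# Canned data and canned "LLM" outputs, for illustration only.
chunks = [
    "Refunds are issued within 14 days of a return request.",
    "Our office is closed on public holidays.",
    "Shipping takes three to five business days.",
]
query = "How long until I get my money back?"
semantic_hits = semantic(query, chunks)
hyde_hits = hyde(query, chunks, lambda q: "Refunds are issued within a few days of the return.")
step_back_hits = step_back(query, chunks, lambda q: "What are the rules for refunds and returns?")
```

In this toy setup the query shares no words with any chunk, so the plain semantic search scores everything at zero, while the hypothetical answer and the broader question both surface the refund chunk. A real embedding model captures meaning rather than word overlap, so the gap between strategies is usually smaller than this example suggests.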

Each search result includes:

  • Relevance score — a numerical confidence rating showing how closely the chunk matches your query
  • Result count — the total number of matching chunks returned
  • Source citation — which document and section the passage came from
  • Inline citation badges — clickable references that appear in the agent’s response, linking back to the source passage

When the agent uses information from document search results, it inserts citation badges directly in the response text. Each badge references a specific document chunk. Click the badge to jump to the source passage and verify the information.
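
To make the result fields above concrete, here is one way such a result could be modeled and turned into numbered citation badges. This is a sketch under assumptions: the `SearchResult` class, its field names, and the badge format are hypothetical and are not taken from QARK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchResult:
    score: float    # relevance score for the chunk
    document: str   # source document name
    section: str    # section the passage came from
    text: str       # the retrieved passage


def summarize(results: list[SearchResult]) -> list[str]:
    """Return a result-count line followed by one badge line per result,
    ordered from most to least relevant."""
    ordered = sorted(results, key=lambda r: r.score, reverse=True)
    lines = [f"{len(ordered)} results"]
    for badge, r in enumerate(ordered, start=1):
        lines.append(f"[{badge}] {r.document}, {r.section} (score {r.score:.2f})")
    return lines


results = [
    SearchResult(0.42, "b.pdf", "Intro", "Background on the project."),
    SearchResult(0.91, "a.pdf", "Section 2", "Revenue grew in Q3."),
]
summary = summarize(results)
```

Each badge number maps back to exactly one chunk, which is what lets a click on the badge open the source passage.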

When you attach new documents, the RAG pipeline processes them in four stages. The UI displays a progress indicator for each stage:

  1. Parsing — extracting text from the document format (PDF, DOCX, etc.)
  2. Chunking — splitting content into overlapping segments
  3. Embedding — generating vector representations of each chunk
  4. Indexing — storing embeddings for retrieval

This progress display is visible during the initial indexing and whenever new documents are added mid-conversation.
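
The chunking stage above splits content into overlapping segments so that a sentence falling on a boundary still appears whole in at least one chunk. A minimal sketch of that step follows. The character-based sizes and the `chunk_text` name are assumptions for illustration; the source does not say how QARK measures chunk size or overlap.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most `size` characters, where each chunk
    repeats the last `overlap` characters of the previous one."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


small = chunk_text("abcdefghij", size=4, overlap=2)
```

With `size=4` and `overlap=2`, every chunk starts two characters after the previous one, so adjacent chunks share two characters. Production pipelines often chunk by tokens or sentences rather than raw characters.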

QARK offers two modes for how attached documents are searched. The mode determines the underlying tool and search strategy used when an agent needs information from your documents.

RAG is the standard mode. Documents are parsed, chunked, embedded, and stored in a vector index. The agent uses the document_search tool with the semantic, HyDE, or step-back query strategy to find relevant passages.

  • Search method: Vector similarity (semantic search)
  • Preparation: Documents are chunked and embedded — requires an embedding model
  • Tool: document_search
  • Best for: Conceptual questions, when semantic understanding matters, small-to-medium documents

File Tools is the alternative mode. Documents are extracted to plain text files in a workspace directory (~/.qark/workspace/{conversation_id}/), and the agent searches them with Unix command-line tools (grep, head, tail, wc, and more) through the unix_command tool.

  • Search method: Keyword/pattern search via grep and Unix tools
  • Preparation: Documents extracted to .txt files — no embedding model needed
  • Tool: unix_command (replaces document_search)
  • Best for: Large documents, keyword-focused search, structured content (code, logs, configs), regex patterns

When File Tools mode activates, the system:

  1. Removes document_search from the enabled tools
  2. Adds unix_command if not already present
  3. Creates the workspace directory with extracted text files
  4. Injects search strategy guidance into the system prompt (start with grep -l to find files, then search within)
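
The four activation steps above can be sketched as follows. This is an illustration, not QARK's code: the function names, the `.txt` naming scheme, and the guidance wording are assumptions, and the usage example writes to a temporary directory instead of ~/.qark/workspace/. `files_matching` is a rough Python analogue of `grep -l`, included so the example runs without a shell.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical wording for the guidance injected into the system prompt.
GUIDANCE = "Start with grep -l to find files that mention a term, then search within those files."


def activate_file_tools(enabled_tools, workspace_root, conversation_id, extracted_texts):
    # 1. Remove document_search from the enabled tools.
    tools = [t for t in enabled_tools if t != "document_search"]
    # 2. Add unix_command if it is not already present.
    if "unix_command" not in tools:
        tools.append("unix_command")
    # 3. Create the workspace directory and write one text file per document.
    workspace = Path(workspace_root) / conversation_id
    workspace.mkdir(parents=True, exist_ok=True)
    for name, text in extracted_texts.items():
        (workspace / f"{name}.txt").write_text(text, encoding="utf-8")
    # 4. Return the guidance that would be added to the system prompt.
    return tools, workspace, GUIDANCE


def files_matching(workspace, pattern):
    """Names of .txt files whose content matches the pattern (like `grep -l`)."""
    rx = re.compile(pattern)
    return sorted(
        p.name for p in Path(workspace).glob("*.txt")
        if rx.search(p.read_text(encoding="utf-8"))
    )


# Usage against a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    tools, workspace, guidance = activate_file_tools(
        ["web_search", "document_search"],
        root,
        "conv-123",
        {"report.pdf": "Q3 revenue rose.\nCosts fell.", "notes.md": "Call the vendor."},
    )
    created = sorted(p.name for p in workspace.iterdir())
    revenue_files = files_matching(workspace, r"revenue")
```

Listing matching files first keeps the follow-up search narrow, which matters most for the large documents this mode is meant for.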

| Aspect | RAG | File Tools |
| --- | --- | --- |
| Search method | Semantic similarity | Keyword/pattern (grep, regex) |
| Document prep | Chunked + embedded | Extracted to plain text |
| Embedding model | Required | Not needed |
| Best for | Conceptual queries | Keyword search, large files |
| Tool used | document_search | unix_command |
| Cost | Embedding cost per document | Zero additional cost |

Set the document mode globally or per-conversation:

  • Global: Settings → RAG → Document Mode → select RAG or File Tools
  • Per-conversation: Info panel → RAG Overrides → Document Mode

The per-conversation override takes precedence over the global setting.

Skills that declare skip-doc-search: true in their metadata bypass both RAG and File Tools modes entirely. The skill receives raw document paths instead and handles document interaction itself. See Skills and Documents for details.
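
The precedence rules above (skill bypass first, then the per-conversation override, then the global setting) can be expressed as a small resolver. This is a sketch only: the function name, the mode strings `"rag"` and `"file_tools"`, and the `"raw_paths"` return value are hypothetical. Only the `skip-doc-search` key comes from the text.

```python
VALID_MODES = {"rag", "file_tools"}


def resolve_document_handling(global_mode, conversation_override=None, skill_metadata=None):
    """Decide how attached documents are exposed to the agent."""
    # A skill that declares skip-doc-search: true bypasses both modes
    # and receives raw document paths instead.
    if skill_metadata and skill_metadata.get("skip-doc-search") is True:
        return "raw_paths"
    # The per-conversation override takes precedence over the global setting.
    mode = conversation_override or global_mode
    if mode not in VALID_MODES:
        raise ValueError(f"unknown document mode: {mode!r}")
    return mode


default_mode = resolve_document_handling("rag")
overridden = resolve_document_handling("rag", conversation_override="file_tools")
bypassed = resolve_document_handling("rag", "file_tools", {"skip-doc-search": True})
```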

Document search works alongside other tools in the same conversation. Common patterns:

  • Document search + web search — verify claims in your documents against live web sources
  • Document search + thinking — enable thinking to reason over multiple retrieved passages before synthesizing an answer
  • Document search + web fetch — pull in a URL to supplement your attached documents with additional context