File Attachments & Auto-Routing

QARK supports five ways to attach content to a message. Each attachment is automatically routed to the most effective processing path based on the file type, model capabilities, and context window utilization.

The attachment menu in the composer provides five options:

| Method | What it does |
| --- | --- |
| Add Images | Native file picker filtered to image types (JPEG, PNG, GIF, WebP, BMP, SVG, ICO, TIFF) |
| Add Files | Native file picker for documents and code: PDF, DOCX, XLSX, PPTX, Markdown, HTML, plain text, and 80+ programming languages |
| Add Folder | Directory picker that recursively scans for all supported files, auto-excluding build artifacts (node_modules, .git, __pycache__, dist, target, venv, etc.) |
| Add Selected in Finder | Grabs whatever files or folders are currently selected in macOS Finder, with no file picker dialog |
| Clipboard History | Browse the last 50 clipboard entries (text, images, and file paths) and attach any of them |
[Figure: Attachment menu showing all five options]

When you attach images, QARK decides whether to send them as visual content (vision path) or extract their text and process them as documents (RAG path). This decision is automatic and depends on three factors: model capability, image count, and total payload size.

Vision path: the image is sent directly to the model as a visual attachment. The model sees the actual image (pixels, layout, charts, handwriting, screenshots) and can describe, analyze, or extract information from it.

The vision path is used only when all three conditions are true:

  1. The conversation’s model supports vision (Claude, GPT-4o, Gemini, etc.)
  2. Total images in the message (existing + new) ≤ 3
  3. Total image payload ≤ 50 MB
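The three conditions above can be sketched as a single check. This is a minimal illustration, not QARK's actual `routeImages()` implementation: the function name, its parameters, and the constant names are assumptions, and the sketch assumes the 50 MB limit covers every image in the message.

```python
from typing import Sequence

# Limits taken from the conditions above; the constant names are hypothetical.
MAX_VISION_IMAGES = 3
MAX_VISION_PAYLOAD_BYTES = 50 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes


def choose_image_path(model_supports_vision: bool,
                      image_sizes_bytes: Sequence[int]) -> str:
    """Return "vision" only when all three conditions hold, else "rag".

    image_sizes_bytes holds the size of every image in the message
    (existing + new), so its length is the total image count.
    """
    if not model_supports_vision:
        return "rag"
    if len(image_sizes_bytes) > MAX_VISION_IMAGES:
        return "rag"
    if sum(image_sizes_bytes) > MAX_VISION_PAYLOAD_BYTES:
        return "rag"
    return "vision"
```

Because the check runs over the whole batch, a fourth image sends every image in the message down the document path rather than only the extra one.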

RAG path: if any of these conditions fails, the image is processed as a document instead. QARK extracts text content from the image (using a configurable image extraction model) and routes the extracted text through the standard document pipeline, which uses direct injection or RAG depending on size.

In the main conversation (ChatInput):

| Scenario | What happens |
| --- | --- |
| 1 image, vision model | Vision: image sent directly to model |
| 3 images, vision model, under 50 MB | Vision: all 3 sent as visual attachments |
| 4 images, vision model | RAG: exceeds 3-image limit, all 4 extracted as text documents |
| 2 images, non-vision model (e.g., older text-only model) | RAG: model cannot process images, text extracted |
| 1 image + 2 PDFs, vision model | Split: image goes vision, PDFs go document path |
| 5 images + 3 code files, vision model | RAG: exceeds image limit, all files (images + code) routed as documents |
| 2 large screenshots totaling 60 MB, vision model | RAG: exceeds 50 MB payload limit |

In the overlay (OverlayInput):

The same routing logic applies. The overlay uses the same routeImages() function as the main conversation.

| Scenario | What happens |
| --- | --- |
| Drag 1 screenshot into overlay, vision model | Vision: sent as visual attachment |
| Paste image from clipboard, vision model | Vision: clipboard image routed through vision |
| Attach 4 images via clipboard history, vision model | RAG: exceeds 3-image limit |
| Attach image, non-vision Spark model | RAG: text extraction fallback |

From Finder selection (Add Selected in Finder):

| Scenario | What happens |
| --- | --- |
| Select 2 PNGs in Finder | Vision: pure image selection, under limit |
| Select 4 PNGs in Finder | RAG: exceeds 3-image limit, all treated as documents |
| Select 1 PNG + 1 PDF in Finder | All documents: mixed content, everything goes to document path (including the PNG) |
| Select a folder containing images and code | All documents: folder contents always go to document path |

From clipboard history:

| Scenario | What happens |
| --- | --- |
| Attach 1 copied image from history, vision model | Vision: single image, capable model |
| Attach 2 images + 1 text entry from history | Split: images to vision (if model supports it and ≤ 3), text inserted into message |
| Attach copied file paths from history | Documents: files read from disk, routed through document pipeline |

Key rules:

  • Vision is all-or-nothing per batch: if you attach 4 images at once through Add Images, all 4 go to RAG. You cannot split 3 to vision and 1 to RAG within a single attachment action.
  • Mixed file types force document path for images: attaching 1 image + 1 PDF through Finder selection sends the image through the document path too — not vision.
  • Switching models changes routing: if you switch from a vision model to a text-only model mid-conversation, images that were already routed through vision stay stored as they are. Only new image attachments route to RAG.
  • The overlay and main app use identical logic: there is no difference in routing behavior between the overlay and the main conversation composer.

Documents (including images that fell back from the vision path) go through text extraction, then either direct injection or the full RAG pipeline depending on size. See auto-routing below.

Images: JPEG, PNG, GIF, WebP, BMP, SVG, ICO, TIFF. Up to 20 MB per image, 3 per message.

Structured documents: PDF, DOCX, XLSX, PPTX, EPUB

Markup and text: Markdown, HTML, reStructuredText, LaTeX, AsciiDoc, RTF, plain text, CSV, TSV, JSON, YAML, TOML, XML, INI, .env, .log

Programming languages: Python, JavaScript, TypeScript, Rust, Go, Java, C/C++, Ruby, PHP, Swift, Kotlin, Shell (sh/bash/zsh/fish/PowerShell), SQL, R, Lua, Perl, Scala, Zig, Nim, Dart, Elixir, Erlang, Clojure, Haskell, OCaml, F#, Julia, Objective-C, D, Pascal, Groovy, Terraform — and framework files like Vue, Svelte, Astro, GraphQL, Protocol Buffers.

Notebooks: Jupyter (.ipynb)

Limits: 100 MB per file, 10 GB total across all attached documents.

When you attach a folder, QARK skips directories and files that hold build output, dependencies, caches, or editor and OS metadata:

node_modules, .git, __pycache__, .next, .nuxt, .svelte-kit, dist, build, out, .output, target, .cache, .turbo, vendor, venv, .venv, env, .tox, coverage, .nyc_output, .pytest_cache, .mypy_cache, .DS_Store, Thumbs.db, .idea, .vscode

Relative paths within the folder structure are preserved in the UI so you can see which subfolder each file came from.
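A recursive scan with these exclusions can be sketched with the Python standard library. This is an illustration under stated assumptions, not QARK's scanner: `scan_folder`, `EXCLUDED_NAMES`, and the suffix set passed in are hypothetical names, and matching on exact directory or file names is an assumption about how the exclusion list is applied.

```python
import os
import tempfile
from pathlib import Path

# Names copied from the exclusion list above.
EXCLUDED_NAMES = {
    "node_modules", ".git", "__pycache__", ".next", ".nuxt", ".svelte-kit",
    "dist", "build", "out", ".output", "target", ".cache", ".turbo", "vendor",
    "venv", ".venv", "env", ".tox", "coverage", ".nyc_output", ".pytest_cache",
    ".mypy_cache", ".DS_Store", "Thumbs.db", ".idea", ".vscode",
}


def scan_folder(root, supported_suffixes):
    """Return supported files under root as sorted paths relative to root."""
    root = Path(root)
    found = []
    for current_dir, dir_names, file_names in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dir_names[:] = [d for d in dir_names if d not in EXCLUDED_NAMES]
        for name in file_names:
            if name in EXCLUDED_NAMES:
                continue
            if Path(name).suffix.lower() in supported_suffixes:
                found.append((Path(current_dir) / name).relative_to(root))
    return sorted(found)


# Usage: build a small throwaway tree and scan it.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("README.md", "src/app.py", "node_modules/pkg/index.py", "dist/app.py"):
        target = Path(tmp) / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("placeholder")
    relative_paths = [p.as_posix() for p in scan_folder(tmp, {".py", ".md"})]

print(relative_paths)  # ['README.md', 'src/app.py']
```

Returning paths relative to the scanned root mirrors how the UI shows which subfolder each file came from.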

When documents are attached, QARK decides how to deliver their content to the model. This happens automatically based on the RAG threshold — a percentage of the model’s context window (default: 30%).

  1. QARK estimates the total token count of all attached documents (~4 characters = 1 token)
  2. It calculates the threshold: context_window × threshold_pct
  3. It calculates the available budget: context_window − system_prompt − history − max_output
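The three calculations above can be written out as small helpers. This is a sketch only: the function names are hypothetical, rounding the token estimate up is an assumption, and the threshold is passed as a whole-number percentage to avoid floating-point rounding.

```python
def estimate_tokens(text: str) -> int:
    """Estimate tokens at roughly 4 characters per token (rounded up)."""
    return -(-len(text) // 4)  # ceiling division


def rag_threshold(context_window: int, threshold_pct: int = 30) -> int:
    """Threshold = context_window x threshold_pct, with pct given as 0-100."""
    return context_window * threshold_pct // 100


def available_budget(context_window: int, system_prompt_tokens: int,
                     history_tokens: int, max_output_tokens: int) -> int:
    """Tokens left once the prompt, history, and reserved output are counted."""
    return context_window - system_prompt_tokens - history_tokens - max_output_tokens
```

For a 200,000-token context window, `rag_threshold(200_000)` gives 60,000 tokens at the default 30%.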

Three outcomes:

| Mode | Condition | Behavior |
| --- | --- | --- |
| Direct injection | Total doc tokens ≤ threshold AND ≤ available budget | All document text is injected into the system prompt. The model sees the full content on every turn without needing the @document-search tool. |
| RAG | Total doc tokens > threshold | Documents are chunked, embedded as vectors, and indexed. The @document-search tool is auto-enabled. The model retrieves relevant chunks per query. |
| Mixed | Some documents fit under the threshold, others don’t | Small documents are injected directly. Large documents go through the RAG pipeline. @document-search is enabled for the RAG’d documents. |

For example, with Claude Sonnet 4.6 (200K context window) and the default 30% threshold:

  • Threshold: 200,000 × 30% = 60,000 tokens
  • 5-page PDF (~3,000 tokens): Direct injection — the full text appears in the system prompt
  • 200-page research paper (~120,000 tokens): RAG — chunked, embedded, searchable via @document-search
  • Both attached together: Mixed — the small PDF is injected, the large paper goes through RAG
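The example above can be reproduced with a short planner. This is a sketch under assumptions: the source does not say how QARK decides which documents "fit" in Mixed mode, so the smallest-first fill used here is a guess, and `plan_delivery` and its parameters are hypothetical names.

```python
def plan_delivery(doc_tokens, context_window, available_budget, threshold_pct=30):
    """Split documents into directly injected and RAG-indexed groups.

    doc_tokens maps a document name to its estimated token count.
    Assumption: documents are taken smallest first and injected while the
    running total stays within both the threshold and the available budget.
    """
    threshold = context_window * threshold_pct // 100
    limit = min(threshold, available_budget)

    injected, indexed = [], []
    running_total = 0
    for name, tokens in sorted(doc_tokens.items(), key=lambda item: item[1]):
        if running_total + tokens <= limit:
            injected.append(name)
            running_total += tokens
        else:
            indexed.append(name)

    if not indexed:
        mode = "direct"
    elif not injected:
        mode = "rag"
    else:
        mode = "mixed"
    return mode, injected, indexed


# The 200K-context example, with an assumed available budget of 150,000 tokens.
print(plan_delivery({"brief.pdf": 3_000, "paper.pdf": 120_000}, 200_000, 150_000))
# ('mixed', ['brief.pdf'], ['paper.pdf'])
```

With only `brief.pdf` attached the planner returns direct injection, and with only `paper.pdf` it returns RAG, matching the three cases in the list.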

The RAG threshold is configurable at two levels:

  • Per-conversation — In the Config tab under Advanced → RAG Overrides → Threshold %
  • Globally — In Settings → Tools & MCP → RAG settings

Set it lower (e.g., 10%) to push more documents through RAG. Set it higher (e.g., 50%) to inject more content directly. Direct injection gives the model full visibility but consumes context window space. RAG preserves context budget but requires the model to search for relevant sections.

For more on the RAG pipeline itself — search strategies, reranking, citations — see RAG Pipeline.

QARK continuously monitors your system clipboard and maintains a searchable history of the last 50 entries. Anything you copy — text, images, files — is captured and available to attach to any conversation or Spark.

  • Starts automatically on app launch (macOS, Windows, and Linux)
  • Polls the system clipboard every 2 seconds
  • Deduplicates entries by content hash — copying the same text twice does not create a duplicate
  • Stores entries in memory only — history clears on app restart
  • Detection priority: file paths first, then images, then text

| Content type | Captured data | Picker display |
| --- | --- | --- |
| Text | First 200 characters as preview | Text snippet + relative timestamp (“2m ago”) |
| Images | JPEG thumbnail (200px) + full-resolution PNG | Thumbnail (40×40px) + dimensions (“1920×1080”) + timestamp |
| File paths | Absolute paths from Finder/Explorer copies, folders expanded recursively | Filename list (up to 3 shown, “+N more” for the rest) + file count + timestamp |
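The capture behavior described above (a bounded in-memory history, deduplication by content hash, and a fixed detection priority) can be sketched as follows. The class and function names are hypothetical, SHA-256 is an assumed hash choice, and ignoring a duplicate rather than moving it to the top is also an assumption.

```python
import hashlib
from collections import deque

MAX_ENTRIES = 50  # history size stated above


def detect_kind(file_paths, image_bytes, text):
    """Apply the detection priority: file paths first, then images, then text."""
    if file_paths:
        return "files"
    if image_bytes:
        return "image"
    if text:
        return "text"
    return None


class ClipboardHistory:
    """In-memory history that drops the oldest entry once it is full."""

    def __init__(self, max_entries=MAX_ENTRIES):
        self._entries = deque(maxlen=max_entries)

    def __len__(self):
        return len(self._entries)

    def add(self, kind, content: bytes) -> bool:
        """Store an entry unless identical content is already in the history."""
        digest = hashlib.sha256(content).hexdigest()
        if any(entry["hash"] == digest for entry in self._entries):
            return False  # duplicate content: nothing new is stored
        # Newest entries go on the left; a full deque evicts from the right.
        self._entries.appendleft({"kind": kind, "hash": digest, "content": content})
        return True
```

A polling loop would call `detect_kind` on each clipboard read and pass the result to `add`. Because the history lives only in this object, it disappears when the process exits.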

Open the clipboard picker from:

  • Attachment menu → Clipboard History (in both main composer and the overlay)
  • Command palette (Cmd+K) → search “clipboard”

The picker is a modal dialog with:

  • Search field (auto-focused) — filters entries by preview text
  • Entry list (max 350px, scrollable) — each entry shows its type icon, preview, and timestamp
  • Multi-select — click entries to toggle selection, selected entries highlight with a primary color border
  • Attach button — appears when one or more entries are selected, shows count (“Attach 3 items”)
[Figure: Clipboard history picker with text, image, and file entries]

When you attach entries from clipboard history, each entry routes independently:

  • Text entries → inserted directly into the message composer as text
  • Image entries → vision path (if model supports vision, ≤ 3 images, ≤ 50 MB) or RAG path (otherwise)
  • File path entries → document path (direct injection or RAG based on size threshold)
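The per-entry dispatch above might look like this in outline. The entry shape (`kind`, `size_bytes`) and the route labels are assumptions made for illustration. The sketch also assumes the image limits are checked once across all selected image entries.

```python
MAX_VISION_IMAGES = 3
MAX_VISION_PAYLOAD_BYTES = 50 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes


def route_clipboard_entries(entries, model_supports_vision):
    """Return one route label per entry, in the same order as entries.

    Each entry is a dict with a "kind" of "text", "image", or "files";
    image entries also carry "size_bytes".
    """
    images = [e for e in entries if e["kind"] == "image"]
    vision_ok = (
        model_supports_vision
        and len(images) <= MAX_VISION_IMAGES
        and sum(e["size_bytes"] for e in images) <= MAX_VISION_PAYLOAD_BYTES
    )

    routes = []
    for entry in entries:
        if entry["kind"] == "text":
            routes.append("composer")  # inserted into the message as text
        elif entry["kind"] == "image":
            routes.append("vision" if vision_ok else "rag")
        elif entry["kind"] == "files":
            routes.append("document")  # direct injection or RAG, decided later
        else:
            raise ValueError("unknown entry kind: %r" % entry["kind"])
    return routes
```

Text and file entries never affect where the image entries go in this sketch. Only the image count, image payload, and model capability do.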

Overlay auto-capture and clipboard history are two separate features:

  • Overlay auto-capture: When you trigger the overlay (Cmd+Option+Space / Ctrl+Alt+Space), it simulates a copy command to grab whatever text is currently selected in your active app. This captured text appears in the overlay’s clipboard preview area. This is a one-time capture of the current selection.
  • Clipboard history: A browsable archive of your last 50 clipboard operations. You open it manually via the attachment menu. You can select multiple entries and attach them to the current message.

Both features are available in the overlay — the auto-capture happens on trigger, and clipboard history is accessible from the overlay’s attachment menu.

On macOS, Add Selected in Finder reads the current Finder selection via native integration. No dialog appears; it immediately grabs whatever is selected.

Routing logic for Finder selections:

  • If the selection contains only images (≤ 3): routed to the vision path
  • If the selection contains any non-image files, or more than 3 images: everything goes to the document path

Selected folders are recursively scanned with the same exclusion rules as Add Folder.
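The Finder rule reduces to a check on the selection as a whole. In this sketch the function name and the extension-to-image mapping are assumptions, and the model-capability and payload checks that apply to any vision routing are left out to keep the Finder-specific rule visible.

```python
from pathlib import Path

# Assumed file extensions for the supported image types.
IMAGE_SUFFIXES = {
    ".jpg", ".jpeg", ".png", ".gif", ".webp",
    ".bmp", ".svg", ".ico", ".tif", ".tiff",
}
MAX_VISION_IMAGES = 3


def route_finder_selection(paths):
    """Return "vision" for an image-only selection of up to 3 items."""
    if not paths:
        return "document"
    only_images = all(Path(p).suffix.lower() in IMAGE_SUFFIXES for p in paths)
    if only_images and len(paths) <= MAX_VISION_IMAGES:
        return "vision"
    # Any non-image item (including a folder) or a fourth image
    # sends the whole selection to the document path.
    return "document"
```

A folder has no image suffix, so selecting one always lands on the document path, consistent with the scenarios listed for Finder selections.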

| Data | Location |
| --- | --- |
| Image files | {dataDir}/attachments/{conversationId}/{messageId}/{filename} on disk |
| Image metadata | messages.metadata JSON field (filename, media type, size, path) |
| Document text + chunks | documents table in SQLite |
| Vector embeddings | Local vector store |
| Clipboard history | In-memory (50 entries, cleared on app restart) |
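To make the image path template and metadata fields concrete, here is a small sketch. The helper names and the JSON key names (`filename`, `media_type`, `size`, `path`) are assumptions based on the fields listed above, not QARK's actual schema.

```python
import json
import mimetypes
from pathlib import Path


def image_attachment_path(data_dir, conversation_id, message_id, filename):
    """Build {dataDir}/attachments/{conversationId}/{messageId}/{filename}."""
    # Path(filename).name keeps only the final component (a defensive assumption).
    return Path(data_dir) / "attachments" / conversation_id / message_id / Path(filename).name


def image_metadata(path, size_bytes):
    """Assemble the metadata fields listed above as a JSON-ready dict."""
    media_type, _ = mimetypes.guess_type(path.name)
    return {
        "filename": path.name,
        "media_type": media_type,
        "size": size_bytes,
        "path": path.as_posix(),
    }


stored = image_attachment_path("/data", "conv-1", "msg-7", "shot.png")
metadata = image_metadata(stored, 204_800)
print(json.dumps(metadata, indent=2))
```

Keying the directory by conversation and message means one message's images can be located or removed without scanning the rest of the attachments folder.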

All attachment data stays on your machine. Image and document files are never sent to QARK infrastructure — only to the AI provider when you send a message. See Privacy Model for the full data residency breakdown.