Skills and Documents
When a conversation has attached documents, QARK normally handles them through its document search pipeline — either via RAG (semantic search with embeddings) or File Tools mode (plain text files searchable with Unix commands). Skills can interact with this pipeline in two ways: they can work alongside it, or they can take over document handling entirely.
Default Behavior — Skills Alongside Document Search
By default, skills and document search coexist. When a skill is active and documents are attached:
- The document search pipeline runs normally (RAG or File Tools, depending on configuration)
- The agent can use both skill instructions and document search results
- The `activate_skill` tool output includes a `<document_paths>` section listing all attached documents with their absolute paths
This works well for most skills — the skill provides procedural knowledge while the document pipeline provides content retrieval.
Taking Over — The skip-doc-search Flag
Some skills need direct control over how documents are processed. A PDF form-filling skill might need raw file access. A data analysis skill might want to parse spreadsheets with its own scripts rather than relying on chunked text. For these cases, skills can declare:

    metadata:
      skip-doc-search: true

When this flag is set and the skill is active, the entire document search pipeline is bypassed:
- RAG mode: the `document_search` tool is removed from the enabled tools. No embeddings are generated, no chunks are created.
- File Tools mode: the `unix_command` tool is not auto-enabled for document workspace access. No workspace directory is created.
- Direct injection: documents are not injected into the system prompt.
Instead, the skill receives raw document information through the activate_skill tool:

    <document_paths>
      <document id="abc123" filename="report.pdf" chars="45000" type="application/pdf">
        <path>/absolute/path/to/report.pdf</path>
        <extracted>/absolute/path/to/report.extracted.txt</extracted>
      </document>
      <document id="def456" filename="data.xlsx" chars="12000" type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet">
        <path>/absolute/path/to/data.xlsx</path>
        <extracted>/absolute/path/to/data.extracted.txt</extracted>
      </document>
    </document_paths>

Each document entry includes:
| Field | Description |
|---|---|
| `id` | Unique document identifier |
| `filename` | Original filename |
| `chars` | Character count of extracted text |
| `type` | MIME type |
| `<path>` | Absolute path to the original file |
| `<extracted>` | Absolute path to the extracted text sidecar (`.extracted.txt`) |
The skill’s instructions can then direct the agent to process these files using its own scripts or tools — reading the extracted text, parsing the original binary file, or passing paths to custom processing scripts via skill_execute.
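As a minimal sketch of how a skill script might consume this block, the following Python parses the `<document_paths>` XML with the standard library. The function name `parse_document_paths` and the shape of the returned dicts are illustrative assumptions, not part of QARK. How the block actually reaches a script is not specified here, so the sample is inlined as a string.

```python
import xml.etree.ElementTree as ET

# Sample copied from the structure shown above, trimmed to one document.
SAMPLE = """<document_paths>
  <document id="abc123" filename="report.pdf" chars="45000" type="application/pdf">
    <path>/absolute/path/to/report.pdf</path>
    <extracted>/absolute/path/to/report.extracted.txt</extracted>
  </document>
</document_paths>"""


def parse_document_paths(xml_text):
    """Return one dict per <document> element in a <document_paths> block."""
    root = ET.fromstring(xml_text)
    docs = []
    for node in root.findall("document"):
        docs.append({
            "id": node.get("id"),
            "filename": node.get("filename"),
            "chars": int(node.get("chars", "0")),
            "type": node.get("type"),
            "path": (node.findtext("path") or "").strip(),
            # Assumption: <extracted> may be missing or empty when no sidecar exists.
            "extracted": (node.findtext("extracted") or "").strip() or None,
        })
    return docs


docs = parse_document_paths(SAMPLE)
```

Each entry can then be handed to whatever processing the skill's instructions describe, using `path` for the original file and `extracted` for the text sidecar.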
Document Processing Metadata Flags
Three metadata flags control how QARK handles documents when a skill is active. All three are set under the metadata: key in the skill’s YAML frontmatter.
skip-doc-search
| Property | Value |
|---|---|
| Type | boolean |
| Default | `false` |
| Effect | Bypasses the entire document search pipeline: RAG, File Tools, and direct injection are all disabled. The skill receives raw document paths instead. |
When true, this flag changes two things:
- At upload time: document chunking and embedding are skipped entirely. The raw file is saved to disk but no vector index entries are created — saving processing time and embedding costs.
- At message time: the `document_search` tool is removed from enabled tools. No search results are injected into context. Instead, the `activate_skill` tool output includes a `<document_paths>` block with absolute paths to the original files and their extracted text sidecars.
extract-text
| Property | Value |
|---|---|
| Type | boolean |
| Default | `true` |
| Effect | Controls whether `.extracted.txt` sidecar files are created during document upload. |
When skip-doc-search is true, QARK still needs to decide whether to extract text from binary formats (PDF, DOCX, XLSX, PPTX, EPUB). The extract-text flag controls this:
- `true` (default): QARK parses the document and writes a plain-text sidecar file alongside the original. The sidecar path appears in the `<extracted>` element of the document paths XML. This gives the skill's scripts easy access to the text content without needing to parse the binary format themselves.
- `false`: No sidecar is created. The skill receives only the path to the original binary file. Use this when the skill has its own parser or only needs the raw binary (e.g., a PDF form-filling skill that writes directly to the PDF).
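A skill script that should work under either setting can check for the sidecar before falling back to the original file. The helper below is a hedged sketch: `text_source` is a hypothetical name, and because the text above does not say whether the `<extracted>` element is omitted or left empty when no sidecar exists, the helper treats both cases the same way.

```python
from pathlib import Path


def text_source(path, extracted=None):
    """Pick the file a script should read: the sidecar if it exists, else the original."""
    # Assumption: a missing, empty, or dangling sidecar path means extract-text was off.
    if extracted and Path(extracted).is_file():
        return Path(extracted), "sidecar"
    return Path(path), "original"
```

The second element of the returned tuple tells the caller whether it is holding plain text or a binary file that still needs the skill's own parser.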
extract-images
| Property | Value |
|---|---|
| Type | boolean |
| Default | `false` |
| Effect | Controls whether a vision model is used to extract text from image files in the document set. |
When true, image files (PNG, JPG, etc.) attached to the conversation are sent to a vision-capable model to extract any text content. The extracted text is written to a .extracted.txt sidecar. This is useful for skills that need to process screenshots, scanned documents, or images containing text — but it incurs additional API cost for the vision model call.
When false (default), images are stored as-is. The skill receives the image path but no text extraction is performed.
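The documented defaults for the three flags can be summarized as a small lookup. This is only an illustrative model of the defaults stated above, assuming the frontmatter's `metadata:` mapping has already been loaded into a Python dict; `DEFAULTS` and `effective_flags` are hypothetical names, not QARK APIs.

```python
# Defaults as documented in the three flag sections above.
DEFAULTS = {
    "skip-doc-search": False,
    "extract-text": True,
    "extract-images": False,
}


def effective_flags(metadata=None):
    """Overlay a skill's metadata mapping on the documented defaults."""
    flags = dict(DEFAULTS)
    for key in DEFAULTS:
        if metadata and key in metadata:
            flags[key] = bool(metadata[key])
    return flags


# A skill that only sets skip-doc-search still gets text sidecars by default.
flags = effective_flags({"skip-doc-search": True})
```

Keys outside the three documented flags are ignored here, since their handling is not described.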
Example

    ---
    name: contract-analyzer
    description: Analyze legal contracts for key clauses, obligations, and risks.
    metadata:
      skip-doc-search: true   # Skill handles documents with its own scripts
      extract-text: true      # Create text sidecars for easy access
      extract-images: true    # Extract text from scanned contract pages
    ---

With this configuration:
- The user attaches a scanned contract PDF and several page images
- QARK extracts text from the PDF into a `.extracted.txt` sidecar
- QARK sends each image to a vision model to extract text into sidecars
- No RAG chunking or embedding occurs
- The skill's `activate_skill` output includes paths to all originals and their sidecars
- The skill's scripts can process the extracted text with full control
Interaction with File Tools Mode
The File Tools document search mode and skill-owned document handling are independent systems:
- File Tools mode active, no skill override: documents are extracted to the workspace directory and searchable via `unix_command`. This is the standard File Tools behavior.
- Skill with `skip-doc-search: true` active: File Tools mode is bypassed. The skill receives raw document paths and handles everything itself.
- File Tools mode active, skill without `skip-doc-search`: both coexist. The workspace is created, and the skill also receives document paths in its `activate_skill` output.
The skip-doc-search flag always takes precedence. When a skill declares it, the skill is fully responsible for document interaction — neither RAG nor File Tools will interfere.
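The precedence rule can be written down as a small truth table. The function below is a sketch of the behavior described in this section, not QARK's implementation: the mode labels `"rag"` and `"file_tools"`, the function name, and the result keys are all hypothetical, and direct injection is left out for brevity.

```python
def document_handling(search_mode, skill_active, skip_doc_search=False):
    """Model which document mechanisms are in play for one conversation."""
    if search_mode not in ("rag", "file_tools"):
        raise ValueError(f"unknown search mode: {search_mode!r}")
    # The flag only matters while the skill that declares it is active.
    bypassed = skill_active and skip_doc_search
    return {
        "document_search_tool": search_mode == "rag" and not bypassed,
        "file_tools_workspace": search_mode == "file_tools" and not bypassed,
        # Active skills receive <document_paths> whether or not the flag is set.
        "document_paths_in_activate_skill": skill_active,
    }


coexist = document_handling("file_tools", skill_active=True)
takeover = document_handling("file_tools", skill_active=True, skip_doc_search=True)
```

In `coexist` the workspace and the document paths are both available, while in `takeover` only the document paths remain.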
When to Use Each Approach
| Scenario | Approach |
|---|---|
| Skill adds expertise but documents are searched normally | Default (no flag) |
| Skill needs raw file access (form filling, binary parsing) | skip-doc-search: true |
| Skill processes documents with its own scripts | skip-doc-search: true + scripts in scripts/ |
| Skill works alongside document context | Default — use both skill instructions and search results |