Message Rendering & Actions

Every message passes through a rendering pipeline that handles markdown, mathematics, syntax-highlighted code, citations, thinking blocks, and tool call results. Each message also carries an action bar and a live token display that updates during streaming.

GitHub Flavored Markdown

QARK supports the full GFM (GitHub Flavored Markdown) specification:

Headings (h1–h6) with proper hierarchy
Bold, italic, ~~strikethrough~~, and inline code
Ordered and unordered lists with nesting
Block quotes with multi-level nesting
Tables with alignment (left, center, right)
Task lists with checkboxes
Horizontal rules
Links and autolinks
Images with alt text

Line breaks, paragraph spacing, and nested structures render identically to how they appear on GitHub.

Math Rendering with KaTeX

QARK renders mathematical notation using KaTeX, supporting both inline and block expressions:

Inline math — Wrap expressions in single dollar signs: $E = mc^2$ renders inline within the surrounding text
Block math — Wrap expressions in double dollar signs for centered, full-width display:

$$
\int_{-\infty}^{\infty} e^{-x^2} dx = \sqrt{\pi}
$$

KaTeX covers a broad subset of LaTeX math commands including matrices, fractions, summations, integrals, Greek letters, and custom operators. Rendering happens client-side with no external service calls.

Code Blocks with Shiki Syntax Highlighting

Code blocks receive syntax highlighting through Shiki, the same engine used by VS Code:

Dual-theme support — github-light in light mode, github-dark in dark mode. The theme switches automatically with your system or application preference.
Language badge — A label in the top-right corner of each code block shows the detected or specified language (Python, TypeScript, Rust, SQL, and 100+ others).
Copy button — One click copies the entire code block contents to your clipboard.
Deferred highlighting — Syntax highlighting runs via requestAnimationFrame to avoid blocking the main thread. Code appears immediately as plain text, then highlights apply in the next animation frame. For long code blocks, this prevents interface freezes.

def fibonacci(n: int) -> list[int]:
    """Generate first n Fibonacci numbers."""
    sequence = [0, 1]
    for i in range(2, n):
        sequence.append(sequence[i - 1] + sequence[i - 2])
    return sequence[:n]

Citations

When RAG (retrieval-augmented generation) provides context from your documents, the model’s response includes inline citation badges:

Citations appear as numeric superscript badges (e.g., [1], [2], [3]) positioned inline with the text they support
Hover over a badge to see a tooltip with:
- Source document name
- Page number or chunk identifier
- Relevance score (percentage match from the retrieval step)
Citations link the model’s claims back to your source material, making it possible to verify accuracy without searching manually

Thinking Blocks

When chain-of-thought thinking is enabled, the model’s internal reasoning appears in dedicated thinking blocks:

Visually separated from the main response with distinct styling (muted background, different text treatment)
Expandable and collapsible — Click to toggle visibility. Collapsed by default to keep the focus on the final answer.
Thinking content renders with the same markdown and code highlighting as regular messages
Thinking tokens are tracked separately in the token/cost badge

Tool Call Blocks

When the model invokes tools, each call renders as a collapsible block with structured information:

Icon — Visual identifier for the tool type
Label — The tool name and a brief summary of what was called
Status indicator — Shows whether the call is pending, succeeded, or failed
Color coding by tool type:

Color	Tool Category	Examples
Purple	Thinking / reasoning	Chain-of-thought, planning
Blue	Web operations	Web search, browse
Cyan	Fetch / retrieval	URL fetch, API calls
Emerald	Document operations	File read, RAG query

Expand a tool call block to see the full input parameters and output returned by the tool. Failed tool calls display the error message with enough context to diagnose the issue.

Nested Agent-Tool Blocks

When an agent tool invokes a sub-agent, the resulting tool calls render with a depth indicator — a visual indent and connector line showing the nesting hierarchy. You can trace the full execution path: top-level agent → agent tool → sub-agent’s tool calls → results bubbling back up.

Image and Video Inline Rendering

Images and videos generated by AI models or returned by tools render inline within the conversation:

Images display at a responsive width with aspect ratio preserved
Click any image to open a lightbox overlay with full-resolution viewing, zoom, and pan
Video content embeds with native playback controls
Generated images include metadata (model used, prompt, dimensions) in a caption area

Rich message showing a syntax-highlighted Python code block, a rendered LaTeX equation, an inline citation badge with hover tooltip, and a collapsed tool call block with status indicator

Message Actions

Hover over any message to reveal the action bar. The available actions differ by role:

Assistant messages (5 actions)

Icon	Action	Behavior
Copy	Copy	Copies the raw markdown text to your clipboard. Icon changes to a checkmark for 2 seconds as confirmation.
Regenerate	Regenerate	Re-sends the conversation to the model and generates a new response. The original response is preserved as a sibling — use the branch navigator (X/Y indicator + arrows) to switch between versions. See Conversations for branch navigation.
Fork	Fork	Creates a new conversation containing all messages up to and including this response. The new conversation opens in a new tab. Useful for exploring a different direction without affecting the original thread.
Export	Export	Opens a submenu with three options: Markdown, HTML, PDF. Exports this single message with its attachments, tool calls, and citations. See Export.
Delete	Delete	Removes the message after a confirmation dialog. Hidden while the message is actively streaming.

User messages (4 actions)

Icon	Action	Behavior
Copy	Copy	Copies the user message text
Fork	Fork	Creates a new conversation containing all messages before this one — the user message itself is not copied, so you can rewrite it in the new conversation
Export	Export	Same three-format submenu
Delete	Delete	Removes the message with confirmation

User messages do not have a Regenerate action. Messages are immutable — there is no inline edit action. To change what you said, fork from the message and rewrite it.

Token Badge

After a response completes, the bottom of each assistant message displays a token count:

1,022 in · 767 out

in — input tokens sent to the model for this turn (conversation history + system prompt + tools)
out — output tokens the model generated

The badge appears on hover alongside the message actions. Cost in USD is not shown per-message — cumulative cost is displayed in the conversation’s Info Panel.

Live Streaming Display

During response generation, the message header shows real-time metrics that update as tokens arrive:

1,022 in · ~234 out · 45%

Three values, updated via requestAnimationFrame batching:

Value	Source	Updates
Input tokens	Actual context tokens calculated at stream start	Fixed for the duration of the stream
~Output tokens	Estimated from streamed content length (`content.length / 4`). Includes thinking tokens if chain-of-thought is active. Prefixed with `~` because it is an estimate until the stream completes.	Real-time as tokens arrive
Context usage %	`(input tokens / model context window) × 100`, rounded	Fixed for the duration of the stream

Context usage color coding

The percentage value changes color as context fills up:

Usage	Color	Meaning
Below 60%	Normal (muted)	Plenty of headroom
60% – 79%	Yellow	Approaching capacity
80%+	Red	Near or past compaction threshold

This gives you a live signal of how much of the model’s context window the current conversation is consuming — before the response even finishes.

Streaming status indicators

While waiting for content to appear, the message shows contextual status text:

Thinking… — default status while the model processes
Compacting context… — context strategy is summarizing history (auto_compact triggered)
Fetching {hostname}… — URL prefetch is in progress

Once tokens start arriving, these status messages are replaced by the streaming content with an animated cursor.

RAG progress

When documents are being processed for the response, a progress indicator shows:

Current processing stage (parsing → chunking → embedding → indexing → searching)
Progress bar with percentage (for embedding and indexing stages)
Current file being processed
Completed / total document count

Streaming Optimizations

Rendering must keep pace with token streaming without consuming excessive CPU or dropping frames. QARK uses two key techniques:

Throttled Rendering at 80ms Intervals

During streaming, incoming tokens arrive faster than the DOM should update. QARK batches token arrivals and re-renders the message content at most every 80 milliseconds (approximately 12 renders per second). This interval balances perceived responsiveness against rendering cost — the content appears to flow smoothly without saturating the browser’s layout engine.

requestAnimationFrame Batching for Syntax Highlighting

Syntax highlighting is computationally expensive for large code blocks. Rather than highlighting on every render cycle, QARK defers Shiki processing to requestAnimationFrame callbacks. This means:

Raw code text appears immediately during streaming
Highlighting applies during the browser’s next idle paint frame
If the code block is still growing (more tokens arriving), highlighting re-applies only after the stream settles

The result: code blocks never cause visible jank during streaming, even for 500+ line outputs.