Skip to content
Download for Mac

Cloud Providers

QARK supports 9 cloud providers. Each connects through the same workflow: paste your API key in Settings → Providers, QARK validates it and fetches the model list. Every provider supports a custom base URL override for enterprise endpoints or proxies.

Cloud providers

API key: console.anthropic.com → API Keys Base URL: https://api.anthropic.com Capabilities: Chat

ModelContextOutputThinkingVisionTools
Claude Opus 4.6200K128KAdaptiveYesYes
Claude Sonnet 4.6200KAdaptiveYesYes
Claude Opus 4.5200K64KYesYesYes
Claude Sonnet 4.5200KYesYesYes
Claude Haiku 4.5200KYesYesYes

All Claude models support extended thinking — the model reasons step-by-step before responding. Opus 4.6 and Sonnet 4.6 support adaptive thinking mode, which lets the model decide when and how deeply to reason. Vision works with images, diagrams, screenshots, and documents.


API key: platform.openai.com → API Keys Base URL: https://api.openai.com/v1 Capabilities: Chat, Embedding, Image Generation

ModelContextOutputThinkingVisionTools
GPT-5.41.05M128KAdaptiveYesYes
GPT-5.4 Pro1.05M128KYesYesYes
GPT-5 Pro400K128KYesYesYes
GPT-5400K128KYesYesYes
GPT-4.11.04M32KNoYesYes
GPT-4.1 Mini1.04M32KNoYesYes
GPT-4.1 Nano1.04M32KNoYesYes

GPT-5.4 is the current flagship with a 1.05M token context window and adaptive thinking. The GPT-4.1 series offers million-token context without thinking — good for large document processing at lower cost.

Embedding models: text-embedding-3-large (3072 dimensions), text-embedding-3-small (1536 dimensions).

Image generation: gpt-image-1, gpt-image-1.5, gpt-image-1-mini. See Image Generation.

Custom base URL covers Azure OpenAI deployments and compatible endpoints.


API key: aistudio.google.com → Get API Key Base URL: https://generativelanguage.googleapis.com Capabilities: Chat, Embedding, Image Generation, Video Generation

ModelContextOutputThinkingVisionTools
Gemini 3.1 Flash1.04M65KYesYesYes
Gemini 3.1 Flash Lite1.04M65KYesYesYes
Gemini 3 Pro1.04MYesYesYes
Gemini 31.04MAdaptiveYesYes
Gemini 2.5 Pro1.04M65KYesYesYes
Gemini 2.5 Flash1.04M65KYesYesYes

All Gemini models accept up to 1 million tokens of context — the largest window among cloud providers. The 3.x series is the current generation. Gemini 3 supports adaptive thinking mode.

Embedding models: gemini-embedding-001, gemini-embedding-2-preview (multimodal, free tier).

Image generation: Imagen 4 (standard, Ultra, Fast), Gemini Flash/Pro native image generation. See Image Generation.

Video generation: Veo 2, Veo 3, Veo 3.1 (standard and fast variants). See Video Generation.

Free tier available with rate limits. Paid tier unlocks higher throughput.


API key: console.groq.com → API Keys Base URL: https://api.groq.com/openai/v1 Capabilities: Chat

ModelContextThinkingVisionTools
Llama 3.3 70B128KNoNoYes
Llama 3.1 8B128KNoNoYes
Mixtral 8x7B32KNoNoYes
Gemma 2 9B8KNoNoNo

Groq runs models on custom LPU hardware — expect hundreds of tokens per second. All models are open-source (Meta Llama, Mistral Mixtral, Google Gemma). Free tier available with rate limits.


API key: api.together.xyz → Settings → API Keys Base URL: https://api.together.xyz/v1 Capabilities: Chat, Embedding

ModelContextThinkingVisionTools
Llama 4 Scout512KNoYesYes
Kimi K2 Instruct128KNoNoYes
Llama 3.3 70B Turbo128KNoNoYes
DeepSeek R164KYesNoYes
Qwen3 32B128KNoNoYes

Hundreds of open-source models from Meta, DeepSeek, Qwen, Mistral, and others. Lower per-token cost than proprietary alternatives for comparable models.

Embedding models: BAAI/bge-large-en-v1.5, intfloat/multilingual-e5-large-instruct.


API key: console.x.ai → API Keys Base URL: https://api.x.ai/v1 Capabilities: Chat, Image Generation, Video Generation

ModelContextOutputThinkingVisionTools
Grok 4.1 Fast Reasoning2M128KAlwaysYesYes
Grok 4.1 Fast2M128KNoYesYes
Grok 4256K128KAlwaysYesYes
Grok 4 Fast Reasoning2M128KAlwaysYesYes
Grok 4 Fast2M128KNoYesYes
Grok Code Fast256K128KAlwaysNoYes
Grok 3128K128KNoNoYes
Grok 3 Mini128K128KYesNoYes

Grok 4.1 Fast Reasoning is the current flagship — 2M token context, vision, reasoning, and tool use. The “Fast” variants without reasoning skip the thinking step for faster responses. Grok Code Fast is optimized for code tasks. Grok 3 Mini supports configurable reasoning effort (low/medium/high).

Image generation: Grok Imagine Image, Grok Imagine Image Pro. See Image Generation.

Video generation: Grok Imagine Video. See Video Generation.


API key: openrouter.ai → Keys Base URL: https://openrouter.ai/api/v1 Capabilities: Chat, Embedding, Image Generation

OpenRouter aggregates models from dozens of providers through a single API key. Access Claude, GPT, Gemini, Llama, DeepSeek, Qwen, Mistral, and many more without signing up for each provider individually.

Model (example)SourceThinkingVisionTools
Claude Sonnet 4.6AnthropicAdaptiveYesYes
GPT-5.4OpenAIAdaptiveYesYes
Gemini 3.1 FlashGoogleYesYesYes
Llama 4 ScoutMetaNoYesYes

Per-token pricing based on the underlying provider’s rate plus a margin. Integrated Exa search plugin for web search augmentation. Check openrouter.ai/models for current model list and pricing.

Embedding models: 20+ models including OpenAI, Mistral, Qwen, and Sentence Transformer variants.

Image generation: Gemini Flash/Pro image, FLUX.2 variants, Seedream, Riverflow. See Image Generation.


API key: perplexity.ai → API Settings Base URL: https://api.perplexity.ai Capabilities: Chat

ModelContextKey feature
Sonar Pro200KDeep research with citations
Sonar128KFast search-augmented responses

Sonar models search the web before responding — answers are grounded in current information with source URLs. No separate search tool setup needed.


API key: platform.deepseek.com → API Keys Base URL: https://api.deepseek.com Capabilities: Chat

ModelContextThinkingVisionTools
DeepSeek V364KNoNoYes
DeepSeek R164KYesNoYes

DeepSeek R1 performs explicit chain-of-thought reasoning, showing its work step-by-step. Strong on coding, math, and logic tasks. Significantly lower per-token cost compared to proprietary models of similar capability.