Model tiers
Free models (!)
No credit cost. Great for everyday coding tasks, quick questions, and exploring the platform. Available on all plans including the free tier.
Beta models (*)
Experimental and cutting-edge models. Some are free, others consume credits. Unlocked on the Starter plan and above.
Premium models ($)
The most powerful models for complex tasks. Higher credit cost per request. Unlocked on the Starter plan (2 models) and fully available on the Premium plan.
Available models
Free models (!)
These models consume zero credits.

| Model | Notes |
|---|---|
| Gemini 2.5 Flash | Fast and capable for most coding tasks |
| Gemini 3 Flash Preview | Next-generation Flash, preview access |
| Qwen 3 Coder Plus | Optimized for code generation |
| Claude Haiku 4.5 | Anthropic’s fastest model |
| OpenAI o4 Mini High | Reasoning-focused, efficient |
| Qwen 3 Max | High-capacity general-purpose model |
| Cogito V2 Llama 405B | Large open-weight model |
Beta models (*)
| Model | Credit cost (avg per request) | Notes |
|---|---|---|
| MiMo-V2-Flash | Free | No credit cost |
| GPT-OSS-120B | Free | No credit cost |
| GPT-OSS-120B (paid) | ~0.0015 credits | Higher throughput tier |
| Grok 4.1 Fast | ~0.004 credits | Fast inference |
| GPT 5.1 Codex Mini | ~0.015 credits | Code-focused, compact |
| Nex AGI Deepseek V3.1 | ~0.008 credits | Strong reasoning |
| MiniMax M2.1 | ~0.009 credits | Multimodal capable |
| GPT 5.1 Codex | ~0.076 credits | Full Codex model |
Premium models ($)
| Model | Credit cost (avg per request) | Best for |
|---|---|---|
| GPT 5.1 Codex | ~0.076 credits | Complex code generation |
| Gemini 2.5 Pro | ~0.076 credits | Long context, multi-file projects |
| Gemini 3 Pro Preview | ~0.091 credits | Latest Google flagship |
| Claude Sonnet 4.5 | ~0.114 credits | Balanced power and speed |
| Grok 3 | ~0.114 credits | Strong reasoning and coding |
| Claude Opus 4.5 | ~0.19 credits | Most capable for hardest tasks |
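Since the credit costs above are per-request averages, you can roughly estimate what a session will cost by multiplying each model's average cost by the number of requests you send it. A minimal sketch (the model names and costs mirror the tables above; the request mix itself is a made-up example):

```python
# Average per-request credit costs, copied from the tables above.
AVG_COST = {
    "Gemini 2.5 Flash": 0.0,   # free tier, no credit cost
    "GPT 5.1 Codex": 0.076,
    "Claude Sonnet 4.5": 0.114,
    "Claude Opus 4.5": 0.19,
}

def estimate_credits(requests: dict[str, int]) -> float:
    """Total estimated credits for a {model: request_count} mix."""
    return sum(AVG_COST[model] * count for model, count in requests.items())

# Example session: 20 free requests, 10 on Codex, 2 on Opus.
total = estimate_credits({
    "Gemini 2.5 Flash": 20,
    "GPT 5.1 Codex": 10,
    "Claude Opus 4.5": 2,
})
print(round(total, 3))  # 1.14
```

Because these are averages, treat the result as a ballpark figure rather than an exact bill.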
Switching models
Open the model selector by tapping the model name badge at the bottom of the screen. A menu slides up with three tabs — Beta, Free, and Premium — plus a BYOK tab if you want to use your own API keys.

1. Open the model selector: tap the model name displayed near the bottom center of the screen. The selector panel opens above the input bar.
2. Choose a tier tab: select Beta, Free, or Premium to filter models by tier. Models your current plan does not include are indicated in the list.
Smart routing: lite vs. full prompts
Voxcode automatically decides how much context to send with each request based on query complexity.

- Simple queries — short questions, quick edits, single-line changes — use a lite system prompt. This reduces the token count and makes responses significantly faster.
- Complex queries — multi-file generation, architecture questions, component scaffolding — use the full system prompt, which includes complete project context and detailed instructions.
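Voxcode's actual routing logic is not documented, but a heuristic along these lines would capture the behavior described above. The length threshold and keyword list here are illustrative assumptions, not the real implementation:

```python
# Hypothetical sketch of lite-vs-full prompt routing. The threshold
# and keywords are made-up assumptions for illustration only.
COMPLEX_HINTS = ("architecture", "refactor", "scaffold", "multi-file")

def choose_prompt(query: str) -> str:
    """Return 'lite' for simple queries, 'full' for complex ones."""
    lowered = query.lower()
    if len(lowered.split()) > 40 or any(hint in lowered for hint in COMPLEX_HINTS):
        return "full"  # complete project context + detailed instructions
    return "lite"      # fewer tokens, faster response

print(choose_prompt("fix this typo"))                      # lite
print(choose_prompt("scaffold a new settings component"))  # full
```

The point of a scheme like this is that most requests take the cheap path, so the full project context is only paid for when the query plausibly needs it.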
Prompt caching
When you continue an existing conversation, Voxcode uses prompt caching to avoid re-processing the same context on every message. Repeat conversations and iterative sessions are 30–50% faster than a fresh start, and they consume fewer credits because the cached portion of the prompt is not re-billed at full rate.

Free models are available on all plans. Beta models (excluding the two free Beta models) require a Starter plan or higher. Premium models require a Starter plan for limited access (2 models) or a Premium plan for full access. See Plans & Pricing for a full breakdown.
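As a back-of-envelope illustration of the 30–50% caching speedup described under Prompt caching above (the exact cache behavior is internal to Voxcode, and the 8-second cold-start time is an invented example):

```python
# Latency estimate for a cached follow-up message, using the
# documented 30-50% speedup range. Numbers are illustrative.
def cached_latency(fresh_seconds: float, speedup: float) -> float:
    """Latency of a follow-up message given a fractional speedup."""
    assert 0.30 <= speedup <= 0.50, "documented range is 30-50%"
    return fresh_seconds * (1 - speedup)

# A request that takes 8 s from a cold start:
print(cached_latency(8.0, 0.30))  # 5.6 s at the low end
print(cached_latency(8.0, 0.50))  # 4.0 s at the high end
```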

