Voxcode gives you access to 15+ AI models from Anthropic, Google, OpenAI, xAI, and others — all through a single interface. Models are organized into three tiers based on cost and capability, and you can switch between them at any time from the model selector.

Model tiers

Free models (!)

No credit cost. Great for everyday coding tasks, quick questions, and exploring the platform. Available on all plans including the free tier.

Beta models (*)

Experimental and cutting-edge models. Some are free, others consume credits. Unlocked on the Starter plan and above.

Premium models ($)

The most powerful models for complex tasks. Higher credit cost per request. Unlocked on the Starter plan (2 models) and fully available on the Premium plan.

Available models

Free models (!)

These models consume zero credits.
| Model | Notes |
| --- | --- |
| Gemini 2.5 Flash | Fast and capable for most coding tasks |
| Gemini 3 Flash Preview | Next-generation Flash, preview access |
| Qwen 3 Coder Plus | Optimized for code generation |
| Claude Haiku 4.5 | Anthropic's fastest model |
| OpenAI o4 Mini High | Reasoning-focused, efficient |
| Qwen 3 Max | High-capacity general-purpose model |
| Cogito V2 Llama 405B | Large open-weight model |

Beta models (*)

| Model | Credit cost (avg per request) | Notes |
| --- | --- | --- |
| MiMo-V2-Flash | Free | No credit cost |
| GPT-OSS-120B | Free | No credit cost |
| GPT-OSS-120B (paid) | ~0.0015 credits | Higher throughput tier |
| Grok 4.1 Fast | ~0.004 credits | Fast inference |
| GPT 5.1 Codex Mini | ~0.015 credits | Code-focused, compact |
| Nex AGI Deepseek V3.1 | ~0.008 credits | Strong reasoning |
| MiniMax M2.1 | ~0.009 credits | Multimodal capable |
| GPT 5.1 Codex | ~0.076 credits | Full Codex model |

Premium models ($)

| Model | Credit cost (avg per request) | Best for |
| --- | --- | --- |
| GPT 5.1 Codex | ~0.076 credits | Complex code generation |
| Gemini 2.5 Pro | ~0.076 credits | Long context, multi-file projects |
| Gemini 3 Pro Preview | ~0.091 credits | Latest Google flagship |
| Claude Sonnet 4.5 | ~0.114 credits | Balanced power and speed |
| Grok 3 | ~0.114 credits | Strong reasoning and coding |
| Claude Opus 4.5 | ~0.19 credits | Most capable for hardest tasks |
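To see how these per-request averages translate into session costs, here is a back-of-the-envelope sketch. The model names and cost figures come from the tables above; actual consumption varies with prompt and response size, so treat this as an estimate, not a billing calculation:

```python
# Rough session-cost estimate from the average per-request credit costs
# listed in the tables above. Real costs vary with request size.
AVG_COST = {
    "Gemini 2.5 Flash": 0.0,   # free tier, no credit cost
    "GPT 5.1 Codex": 0.076,
    "Claude Sonnet 4.5": 0.114,
    "Claude Opus 4.5": 0.19,
}

def estimate_session_cost(requests_per_model: dict[str, int]) -> float:
    """Sum average credit cost across a mixed-model session."""
    return sum(AVG_COST[model] * n for model, n in requests_per_model.items())

# e.g. 20 free-model requests, then 5 premium requests for the hard parts
cost = estimate_session_cost({"Gemini 2.5 Flash": 20, "Claude Opus 4.5": 5})
print(round(cost, 2))  # 0.95 credits
```

This is the pattern the tip at the end of this page suggests: do the bulk of the iteration on a free model and reserve premium requests for the final, complex work.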

Switching models

Open the model selector by tapping the model name badge at the bottom of the screen. A menu slides up with three tabs — Beta, Free, and Premium — plus a BYOK tab if you want to use your own API keys.
1. Open the model selector. Tap the model name displayed near the bottom center of the screen. The selector panel opens above the input bar.
2. Choose a tier tab. Select Beta, Free, or Premium to filter models by tier. Models you do not have access to on your current plan are indicated.
3. Select a model. Tap any model in the list. The selector closes and your next message will use the selected model. Your choice persists across the session.

Smart routing: lite vs. full prompts

Voxcode automatically decides how much context to send with each request based on query complexity.
  • Simple queries — short questions, quick edits, single-line changes — use a lite system prompt. This reduces the token count and makes responses significantly faster.
  • Complex queries — multi-file generation, architecture questions, component scaffolding — use the full system prompt, which includes complete project context and detailed instructions.
You do not need to configure this. Voxcode evaluates each request and routes it accordingly.
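Although the routing is automatic and Voxcode's actual classifier is internal, the idea can be sketched with a simple heuristic. Everything below is hypothetical: the prompt strings, the word-count threshold, and the keyword list are illustrative stand-ins, not Voxcode's real logic:

```python
# Illustrative sketch of complexity-based prompt routing. The heuristics
# (word-count threshold, keyword hints) are hypothetical examples only.
LITE_PROMPT = "You are a concise coding assistant."          # small, fast
FULL_PROMPT = "You are a coding assistant. <project context>"  # complete context

COMPLEX_HINTS = ("architecture", "scaffold", "refactor", "multi-file")

def pick_system_prompt(query: str) -> str:
    """Route short, simple queries to the lite prompt; everything else
    gets the full prompt with project context."""
    q = query.lower()
    if len(q.split()) > 40 or any(hint in q for hint in COMPLEX_HINTS):
        return FULL_PROMPT
    return LITE_PROMPT

assert pick_system_prompt("rename this variable") == LITE_PROMPT
assert pick_system_prompt("scaffold a React component") == FULL_PROMPT
```

The payoff of this kind of routing is the one described above: simple queries carry far fewer tokens, so they come back faster and cost less.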

Prompt caching

When you continue an existing conversation, Voxcode uses prompt caching to avoid re-processing the same context on every message. Continued conversations and iterative sessions are 30–50% faster than a fresh start, and they consume fewer credits because the cached portion of the prompt is not re-billed at the full rate.
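Conceptually, prompt caching keys on the conversation prefix: if the earlier turns have already been processed, only the newest message needs fresh work. The sketch below models just that bookkeeping; real providers cache tokenized state server-side, and the function names here are illustrative, not Voxcode's API:

```python
# Minimal model of prefix-based prompt caching: a turn is a cache hit when
# the conversation up to (but not including) the newest message was seen
# before. This only illustrates the bookkeeping, not real server behavior.
import hashlib

_cache: set[str] = set()

def _key(messages: list[str]) -> str:
    return hashlib.sha256("\n".join(messages).encode()).hexdigest()

def send(messages: list[str]) -> bool:
    """Process one turn; return True if the prior context was cached
    (cheaper and faster than reprocessing it)."""
    hit = _key(messages[:-1]) in _cache
    # Remember the full conversation so the *next* turn's prefix hits.
    _cache.add(_key(messages))
    return hit

print(send(["fix this bug"]))                   # False: fresh start, cache miss
print(send(["fix this bug", "now add tests"]))  # True: prefix already cached
```

This is why iterating inside one conversation is cheaper than repeatedly starting new ones: each follow-up reuses the cached prefix instead of paying to reprocess it.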
Free models are available on all plans. Beta models (excluding the two free Beta models) require a Starter plan or higher. Premium models require a Starter plan for limited access (2 models) or a Premium plan for full access. See Plans & Pricing for a full breakdown.
Start with a free model like Gemini 2.5 Flash or Claude Haiku 4.5 to test your prompts and workflow. Once you have a prompt style that produces good results, you can switch to a premium model for the final, more complex work — and spend credits only where it matters.