# AI Models

> Choose the right AI model for your coding tasks in Voxcode

Voxcode gives you access to 15+ AI models from Anthropic, Google, OpenAI, xAI, and others — all through a single interface. Models are organized into three tiers based on cost and capability, and you can switch between them at any time from the model selector.

## Model tiers

<CardGroup cols={3}>
  <Card title="Free models (!)" icon="bolt">
    No credit cost. Great for everyday coding tasks, quick questions, and exploring the platform. Available on all plans, including the free tier.
  </Card>

  <Card title="Beta models (*)" icon="flask">
    Experimental and cutting-edge models. Some are free; others consume credits. The credit-consuming Beta models require the Starter plan or above.
  </Card>

  <Card title="Premium models ($)" icon="star">
    The most powerful models for complex tasks, with a higher credit cost per request. The Starter plan unlocks 2 of these models; the Premium plan unlocks all of them.
  </Card>
</CardGroup>
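
The plan rules in the cards above, combined with the plan note at the end of this page, can be sketched as a small lookup. This is an illustration only, not Voxcode's implementation: the plan identifiers (`free`, `starter`, `premium`), the `Access` enum, and `tier_access` are hypothetical names, and the sketch assumes these are the only three plans. The page does not say which 2 premium models the Starter plan includes, so the sketch reports that case as `LIMITED` rather than guessing.

```python
from enum import Enum


class Access(Enum):
    FULL = "full"
    LIMITED = "limited"  # Starter plan on the premium tier: 2 models, not named on this page
    NONE = "none"


# Hypothetical plan identifiers, ordered from lowest to highest.
PLAN_RANK = {"free": 0, "starter": 1, "premium": 2}


def tier_access(plan: str, tier: str, credit_free: bool = False) -> Access:
    """Return the access level a plan has to a model tier.

    credit_free marks a Beta model that costs no credits; per the plan note,
    those do not require the Starter plan.
    """
    rank = PLAN_RANK[plan]
    if tier == "free":
        return Access.FULL
    if tier == "beta":
        return Access.FULL if (credit_free or rank >= 1) else Access.NONE
    if tier == "premium":
        if rank >= 2:
            return Access.FULL
        return Access.LIMITED if rank == 1 else Access.NONE
    raise ValueError(f"unknown tier: {tier}")
```

For example, `tier_access("starter", "premium")` returns `Access.LIMITED`, while `tier_access("free", "beta", credit_free=True)` returns `Access.FULL`.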

## Available models

### Free models (!)

These models consume zero credits.

| Model                  | Notes                                  |
| ---------------------- | -------------------------------------- |
| Gemini 2.5 Flash       | Fast and capable for most coding tasks |
| Gemini 3 Flash Preview | Next-generation Flash, preview access  |
| Qwen 3 Coder Plus      | Optimized for code generation          |
| Claude Haiku 4.5       | Anthropic's fastest model              |
| OpenAI o4 Mini High    | Reasoning-focused, efficient           |
| Qwen 3 Max             | High-capacity general-purpose model    |
| Cogito V2 Llama 405B   | Large open-weight model                |

### Beta models (\*)

| Model                 | Credit cost (avg per request) | Notes                  |
| --------------------- | ----------------------------- | ---------------------- |
| MiMo-V2-Flash         | Free                          | No credit cost         |
| GPT-OSS-120B          | Free                          | No credit cost         |
| GPT-OSS-120B (paid)   | \~0.0015 credits              | Higher throughput tier |
| Grok 4.1 Fast         | \~0.004 credits               | Fast inference         |
| Nex AGI Deepseek V3.1 | \~0.008 credits               | Strong reasoning       |
| MiniMax M2.1          | \~0.009 credits               | Multimodal capable     |
| GPT 5.1 Codex Mini    | \~0.015 credits               | Code-focused, compact  |
| GPT 5.1 Codex         | \~0.076 credits               | Full Codex model       |

### Premium models (\$)

| Model                | Credit cost (avg per request) | Best for                          |
| -------------------- | ----------------------------- | --------------------------------- |
| GPT 5.1 Codex        | \~0.076 credits               | Complex code generation           |
| Gemini 2.5 Pro       | \~0.076 credits               | Long context, multi-file projects |
| Gemini 3 Pro Preview | \~0.091 credits               | Latest Google flagship            |
| Claude Sonnet 4.5    | \~0.114 credits               | Balanced power and speed          |
| Grok 3               | \~0.114 credits               | Strong reasoning and coding       |
| Claude Opus 4.5      | \~0.19 credits                | Most capable for hardest tasks    |
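
Because the costs above are averages per request, a rough budget is just the average multiplied by the number of requests. The sketch below copies a few values from the tables on this page; `AVG_CREDITS_PER_REQUEST` and `estimate_credits` are hypothetical helper names, and actual usage will vary around the average.

```python
# Average credits per request, copied from the tables on this page.
# Free models are recorded as 0.0 credits.
AVG_CREDITS_PER_REQUEST = {
    "Gemini 2.5 Flash": 0.0,
    "Claude Haiku 4.5": 0.0,
    "Grok 4.1 Fast": 0.004,
    "GPT 5.1 Codex": 0.076,
    "Gemini 2.5 Pro": 0.076,
    "Gemini 3 Pro Preview": 0.091,
    "Claude Sonnet 4.5": 0.114,
    "Grok 3": 0.114,
    "Claude Opus 4.5": 0.19,
}


def estimate_credits(model: str, requests: int) -> float:
    """Rough estimate: average cost per request times the number of requests."""
    if requests < 0:
        raise ValueError("requests must be non-negative")
    return round(AVG_CREDITS_PER_REQUEST[model] * requests, 4)


if __name__ == "__main__":
    for name in ("Gemini 2.5 Flash", "Claude Sonnet 4.5", "Claude Opus 4.5"):
        print(f"{name}: ~{estimate_credits(name, 100)} credits per 100 requests")
```

Under these averages, 100 requests to Claude Opus 4.5 come to roughly 19 credits, compared with roughly 11.4 for Claude Sonnet 4.5 and none for a free model.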

## Switching models

Open the model selector by tapping the model name badge at the bottom of the screen. A menu slides up with three tier tabs (**Beta**, **Free**, and **Premium**) plus a **BYOK** (bring your own key) tab for using your own API keys.

<Steps>
  <Step title="Open the model selector">
    Tap the model name displayed near the bottom center of the screen. The selector panel opens above the input bar.
  </Step>

  <Step title="Choose a tier tab">
    Select **Beta**, **Free**, or **Premium** to filter models by tier. Models that your current plan does not include are marked in the list.
  </Step>

  <Step title="Select a model">
    Tap any model in the list. The selector closes, and your next message uses the selected model. Your choice persists for the rest of the session.
  </Step>
</Steps>

## Smart routing: lite vs. full prompts

Voxcode automatically decides how much context to send with each request based on query complexity.

* **Simple queries** — short questions, quick edits, single-line changes — use a **lite system prompt**. This reduces the token count and makes responses significantly faster.
* **Complex queries** — multi-file generation, architecture questions, component scaffolding — use the **full system prompt**, which includes complete project context and detailed instructions.

You do not need to configure this. Voxcode evaluates each request and routes it accordingly.
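
To make the lite/full distinction concrete, here is a toy heuristic that sorts queries the way the two bullets above describe. It is not Voxcode's routing logic, which this page does not document: the keyword list, the word-count threshold, and the `choose_prompt` function are all assumptions made up for illustration.

```python
# Hypothetical signals of a complex request, taken from the examples in the list above.
COMPLEX_HINTS = ("multi-file", "architecture", "scaffold", "component")


def choose_prompt(query: str, max_lite_words: int = 25) -> str:
    """Return "full" for queries that look complex, otherwise "lite".

    The 25-word threshold is an arbitrary illustrative value.
    """
    text = query.lower()
    if any(hint in text for hint in COMPLEX_HINTS):
        return "full"
    if len(text.split()) > max_lite_words:
        return "full"
    return "lite"
```

With this sketch, `choose_prompt("Rename this variable to count")` returns `"lite"`, and `choose_prompt("Scaffold a settings component with tabs")` returns `"full"`.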

## Prompt caching

When you continue an existing conversation, Voxcode uses prompt caching to avoid re-processing the same context on every message. Follow-up messages in an ongoing conversation are **30–50% faster** than messages in a fresh one, and they consume fewer credits because the cached portion of the prompt is not re-billed at full rate.
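
As a worked example of the speed figure, the sketch below turns a fresh-conversation response time into the range you might expect with caching. It assumes "30–50% faster" means the response time drops by 30% to 50%, which is one reading of the figure; `cached_latency_range` is a hypothetical helper, and the 8-second input is only an example value.

```python
from typing import Tuple


def cached_latency_range(fresh_seconds: float) -> Tuple[float, float]:
    """Return (best_case, worst_case) response time with prompt caching.

    Assumes caching removes between 50% (best case) and 30% (worst case)
    of the response time of a fresh conversation.
    """
    best_case = round(fresh_seconds * (1 - 0.50), 2)
    worst_case = round(fresh_seconds * (1 - 0.30), 2)
    return best_case, worst_case


if __name__ == "__main__":
    low, high = cached_latency_range(8.0)
    print(f"An 8.0 s response would take roughly {low} to {high} s with caching")
```

Under that assumption, a response that takes 8 seconds in a fresh conversation would take roughly 4 to 5.6 seconds in a continued one.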

<Note>
  Free models are available on all plans. Beta models (excluding the two free Beta models) require a **Starter** plan or higher. Premium models require a **Starter** plan for limited access (2 models) or a **Premium** plan for full access. See [Plans & Pricing](/plans/overview) for a full breakdown.
</Note>

<Tip>
  Start with a free model like **Gemini 2.5 Flash** or **Claude Haiku 4.5** to test your prompts and workflow. Once you have a prompt style that produces good results, you can switch to a premium model for the final, more complex work — and spend credits only where it matters.
</Tip>
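
The saving from the tip above can be estimated with the average costs listed on this page. The sketch compares a 20-request session run entirely on Claude Opus 4.5 with one that drafts on a free model and sends only the final request to Opus. The session size and the `session_cost` helper are illustrative assumptions.

```python
OPUS_AVG_CREDITS = 0.19  # Claude Opus 4.5, average per request (premium table)
FREE_AVG_CREDITS = 0.0   # any free-tier model


def session_cost(draft_requests: int, final_requests: int,
                 draft_cost: float, final_cost: float) -> float:
    """Total credits for a session split into drafting and final requests."""
    return round(draft_requests * draft_cost + final_requests * final_cost, 4)


# 19 drafting requests plus 1 final request, 20 in total.
all_premium = session_cost(19, 1, OPUS_AVG_CREDITS, OPUS_AVG_CREDITS)
free_then_premium = session_cost(19, 1, FREE_AVG_CREDITS, OPUS_AVG_CREDITS)

if __name__ == "__main__":
    print(f"All on Opus: ~{all_premium} credits")
    print(f"Free drafts, Opus final: ~{free_then_premium} credits")
```

With these averages, the all-premium session costs about 3.8 credits and the mixed session about 0.19 credits.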
