# Gemma 4 31B API — pricing & specs

Gemma 4 31B is an instruction-tuned, open-weight model (~31B parameters) in Google DeepMind's lightweight Gemma family, built for text generation and reasoning. On KeepRouter, it costs $0.12 per 1M input tokens and $0.35 per 1M output tokens, billed at cost with no markup and no monthly fee. For example, a 1,000-token prompt with a 500-token reply costs about $0.0003. Call it through the OpenAI- or Anthropic-compatible API with the model id `gemma-4-31B-it`.

| Spec | Value |
|---|---|
| Maker | Google DeepMind |
| Modality | Text |
| Input price | $0.12 per 1M tokens |
| Output price | $0.35 per 1M tokens |
| Capabilities | Streaming, Tool / function calling |
| APIs | OpenAI-compatible, Anthropic-compatible |
| Model id | `gemma-4-31B-it` |

## Call it via the OpenAI API (cURL)

```bash
curl https://keeprouter.com/v1/chat/completions \
  -H "Authorization: Bearer $KEEPROUTER_KEY" -H "Content-Type: application/json" \
  -d '{"model":"gemma-4-31B-it","messages":[{"role":"user","content":"Hello"}]}'
```

## Python

```python
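import os
from openai import OpenAI
# Read the API key from the KEEPROUTER_KEY environment variable.
client = OpenAI(base_url="https://keeprouter.com/v1", api_key=os.environ["KEEPROUTER_KEY"])
r = client.chat.completions.create(model="gemma-4-31B-it", messages=[{"role": "user", "content": "Hello"}])
print(r.choices[0].message.content)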
```

## JavaScript

```js
import OpenAI from "openai";
const client = new OpenAI({ baseURL: "https://keeprouter.com/v1", apiKey: process.env.KEEPROUTER_KEY });
const r = await client.chat.completions.create({ model: "gemma-4-31B-it", messages: [{ role: "user", content: "Hello" }] });
console.log(r.choices[0].message.content); // print the model's reply
```

## Frequently asked questions

### How much does Gemma 4 31B cost on KeepRouter?

Gemma 4 31B is $0.12 per 1M input tokens and $0.35 per 1M output tokens, billed at cost with no markup and no monthly fee.

### How much does a typical Gemma 4 31B request cost?

On KeepRouter, a 1,000-token prompt with a 500-token reply costs about $0.0003: $0.00012 for the input (1,000 × $0.12 / 1M) plus $0.000175 for the output (500 × $0.35 / 1M), billed at cost with no markup.
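The same arithmetic can be applied to any request size. The sketch below assumes only the two listed prices; the constant and function names are illustrative and not part of any KeepRouter SDK.

```python
# Prices from the spec table, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.12
OUTPUT_PRICE_PER_M = 0.35


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed per-token prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# 1,000 input tokens and 500 output tokens, as in the example above.
print(f"${request_cost(1_000, 500):.6f}")  # prints $0.000295
```

Actual billing depends on the token counts reported for each response, so treat this as an estimate.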

### Who makes Gemma 4 31B?

Gemma 4 31B is made by Google DeepMind. KeepRouter provides it through one OpenAI- and Anthropic-compatible API.

### How do I call Gemma 4 31B via API?

Point your OpenAI- or Anthropic-compatible client at KeepRouter and set the model to `gemma-4-31B-it`. It works with the OpenAI SDK, the Anthropic SDK, and Claude Code, so no provider-specific SDK is needed.

### Does Gemma 4 31B support streaming?

Yes. KeepRouter streams Gemma 4 31B responses over both the OpenAI and Anthropic APIs.
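A minimal streaming sketch with the OpenAI Python SDK, assuming the base URL from the Python section and a `KEEPROUTER_KEY` environment variable. The helper names `make_client` and `stream_text` are illustrative, and the sketch assumes KeepRouter emits chunks in the standard OpenAI streaming shape.

```python
import os
from typing import Iterator

MODEL_ID = "gemma-4-31B-it"


def make_client():
    """Build an OpenAI SDK client pointed at KeepRouter."""
    from openai import OpenAI  # imported here so stream_text can be used with any compatible client

    return OpenAI(base_url="https://keeprouter.com/v1", api_key=os.environ["KEEPROUTER_KEY"])


def stream_text(client, prompt: str) -> Iterator[str]:
    """Yield text fragments from a streamed chat completion as they arrive."""
    stream = client.chat.completions.create(
        model=MODEL_ID,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # ask for incremental chunks instead of one final message
    )
    for chunk in stream:
        # Some chunks carry no choices or no text (for example, role-only or final chunks).
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content
```

To print a reply as it is generated, iterate over the generator: `for piece in stream_text(make_client(), "Hello"): print(piece, end="", flush=True)`.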

### Does Gemma 4 31B support tool (function) calling?

Yes. KeepRouter passes tool calls through to Gemma 4 31B and translates them between the OpenAI and Anthropic formats.
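As a rough illustration, a tool-calling request in the OpenAI format might be built and read back as follows. The `get_weather` tool and both helper functions are hypothetical, made up for this sketch; only the `tools` request field and the `tool_calls` response field come from the OpenAI chat completions format.

```python
import json

MODEL_ID = "gemma-4-31B-it"

# Hypothetical tool for illustration only; not part of KeepRouter or Gemma.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_tool_request(prompt: str) -> dict:
    """Keyword arguments for client.chat.completions.create(**kwargs)."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [GET_WEATHER_TOOL],
    }


def parse_tool_calls(message) -> list[tuple[str, dict]]:
    """Extract (tool name, decoded arguments) pairs from an assistant message."""
    calls = []
    for call in message.tool_calls or []:
        # The OpenAI format delivers arguments as a JSON-encoded string.
        calls.append((call.function.name, json.loads(call.function.arguments)))
    return calls
```

With a client configured as in the Python section, `r = client.chat.completions.create(**build_tool_request("What's the weather in Paris?"))` sends the request, and `parse_tool_calls(r.choices[0].message)` returns any calls the model chose to make. The list is empty when the model answers in plain text instead.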

## Guides

- [Call Gemma 4 31B with the OpenAI SDK](https://keeprouter.com/use-cases/openai-sdk.md)
- [Use Gemma 4 31B in Claude Code](https://keeprouter.com/use-cases/claude-code.md)

## Related models from Google DeepMind

- [Gemini 3.5 Flash](https://keeprouter.com/models/gemini-3.5-flash.md) — $1.50 per 1M input tokens and $9 per 1M output tokens

## More

- [All models & pricing](https://keeprouter.com/models.md)
- [Quickstart](https://keeprouter.com/docs/quickstart.md)
- [Get an API key](https://keeprouter.com/login)

_Prices as of 2026-07-05._
