Gemma 4 31B API — pricing & specs

Gemma 4 31B is an instruction-tuned, open-weight model (~31B parameters) in Google DeepMind's family of lightweight open models, built for text generation and reasoning. On KeepRouter, it costs $0.12 per 1M input tokens and $0.35 per 1M output tokens, billed at cost with no markup and no monthly fee. Call it through the OpenAI- or Anthropic-compatible API with the model id gemma-4-31B-it.

Maker	Google DeepMind
Modality	Text
Input price	$0.12 per 1M tokens
Output price	$0.35 per 1M tokens
Capabilities	Streaming, Tool / function calling
APIs	OpenAI-compatible & Anthropic-compatible
Model id	`gemma-4-31B-it`

How pricing works for Gemma 4 31B

Gemma 4 31B is billed per token: $0.12 per 1M input tokens and $0.35 per 1M output tokens. As a worked example, a 1,000-token prompt with a 500-token reply costs about $0.0003 ($0.00012 for input plus $0.000175 for output). All prices are at cost (0% markup) with no monthly fee.
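
The arithmetic behind that estimate can be sketched in a few lines of Python. The helper name `estimate_cost` and its constants are illustrative and not part of any KeepRouter SDK; the rates are the ones listed in the table above.

```python
# Rates from the pricing table above, in USD per 1M tokens.
INPUT_USD_PER_M = 0.12
OUTPUT_USD_PER_M = 0.35


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# The worked example: a 1,000-token prompt with a 500-token reply.
print(f"${estimate_cost(1_000, 500):.6f}")  # prints $0.000295
```

Actual charges depend on the token counts reported for each request, so treat this as a planning estimate rather than a billing calculation.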

Calling Gemma 4 31B on KeepRouter

Point your OpenAI- or Anthropic-compatible client at KeepRouter, set the model to gemma-4-31B-it, and keep the rest of your code unchanged. Streaming and tool calling work over both APIs.

cURL

curl https://keeprouter.com/v1/chat/completions \
  -H "Authorization: Bearer $KEEPROUTER_KEY" -H "Content-Type: application/json" \
  -d '{"model":"gemma-4-31B-it","messages":[{"role":"user","content":"Hello"}]}'

Python

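import os
from openai import OpenAI
# Read the key from the environment (Python does not expand "$KEEPROUTER_KEY" inside a string).
client = OpenAI(base_url="https://keeprouter.com/v1", api_key=os.environ["KEEPROUTER_KEY"])
r = client.chat.completions.create(model="gemma-4-31B-it", messages=[{"role":"user","content":"Hello"}])
print(r.choices[0].message.content)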

JavaScript

import OpenAI from "openai";
const client = new OpenAI({ baseURL: "https://keeprouter.com/v1", apiKey: process.env.KEEPROUTER_KEY });
const r = await client.chat.completions.create({ model: "gemma-4-31B-it", messages: [{ role: "user", content: "Hello" }] });

Frequently asked questions

How much does Gemma 4 31B cost on KeepRouter?

Gemma 4 31B is $0.12 per 1M input tokens and $0.35 per 1M output tokens, billed at cost with no markup and no monthly fee.

How much does a typical Gemma 4 31B request cost?

On KeepRouter, a 1,000-token prompt with a 500-token reply costs about $0.0003 — billed at cost with no markup.

Who makes Gemma 4 31B?

Gemma 4 31B is made by Google DeepMind. KeepRouter provides it through one OpenAI- and Anthropic-compatible API.

How do I call Gemma 4 31B via API?

Point your OpenAI- or Anthropic-compatible client at KeepRouter and set the model to "gemma-4-31B-it". It works with the OpenAI SDK, the Anthropic SDK, and Claude Code — no provider-specific SDK needed.
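
For the Anthropic SDK route, a minimal sketch might look like the following. The base URL `https://keeprouter.com` is an assumption (the Anthropic SDK appends `/v1/messages` itself), as is the idea that the key is accepted in the SDK's default auth header; check the Quickstart for the exact values. `build_messages_request` and `ask` are hypothetical helper names.

```python
import os

MODEL_ID = "gemma-4-31B-it"


def build_messages_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build keyword arguments for a Messages API call (hypothetical helper)."""
    return {
        "model": MODEL_ID,
        "max_tokens": max_tokens,  # the Messages API requires an explicit limit
        "messages": [{"role": "user", "content": prompt}],
    }


def ask(prompt: str) -> str:
    """Send one prompt through the Anthropic SDK and return the reply text."""
    # Imported lazily so build_messages_request works without the SDK installed.
    from anthropic import Anthropic

    client = Anthropic(
        base_url="https://keeprouter.com",  # ASSUMPTION: not confirmed by this page
        api_key=os.environ["KEEPROUTER_KEY"],
    )
    response = client.messages.create(**build_messages_request(prompt))
    return response.content[0].text
```

Calling `ask("Hello")` with `KEEPROUTER_KEY` set would send a single user message. Unlike the OpenAI-style call, `max_tokens` must be supplied on every request.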

Does Gemma 4 31B support streaming?

Yes. KeepRouter streams Gemma 4 31B responses over both the OpenAI and Anthropic APIs.
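
A minimal streaming sketch with the OpenAI Python SDK, assuming the same base URL and `KEEPROUTER_KEY` environment variable used in the examples above. `join_stream` and `stream_reply` are hypothetical helper names; `stream=True` is the standard OpenAI SDK option.

```python
import os


def join_stream(chunks) -> str:
    """Concatenate the text deltas from an OpenAI-style chat completion stream."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks carry no choices (e.g. usage-only)
            continue
        text = chunk.choices[0].delta.content
        if text:  # role-only and final chunks have no text
            parts.append(text)
    return "".join(parts)


def stream_reply(prompt: str) -> str:
    """Request a streamed completion and return the assembled reply."""
    # Imported lazily so join_stream can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://keeprouter.com/v1",
        api_key=os.environ["KEEPROUTER_KEY"],
    )
    stream = client.chat.completions.create(
        model="gemma-4-31B-it",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return join_stream(stream)
```

In an interactive application you would typically print each delta as it arrives instead of joining them at the end; the loop body is the same either way.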

Does Gemma 4 31B support tool (function) calling?

Yes. KeepRouter passes tool calls through to Gemma 4 31B and translates them between the OpenAI and Anthropic formats.
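
To illustrate what translating between the two formats involves, here is a sketch that maps an OpenAI-style function tool definition onto Anthropic's tool shape. It reflects the two public request formats only and is not KeepRouter's implementation; `openai_tool_to_anthropic` and the `get_weather` tool are hypothetical.

```python
def openai_tool_to_anthropic(tool: dict) -> dict:
    """Convert an OpenAI-style function tool definition to Anthropic's tool shape."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        # OpenAI calls the JSON Schema "parameters"; Anthropic calls it "input_schema".
        "input_schema": fn.get("parameters", {"type": "object", "properties": {}}),
    }


# Hypothetical tool definition, used only for illustration.
get_weather = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

print(openai_tool_to_anthropic(get_weather))
```

With the OpenAI SDK you would pass `tools=[get_weather]` to `client.chat.completions.create`; with the Anthropic SDK you would pass the converted dictionary in `tools` instead.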

Related models from Google DeepMind

Gemini 3.5 Flash — $1.50 per 1M input tokens and $9 per 1M output tokens

Prices as of 2026-07-05.