Перейти к основному содержанию

Documentation Index

Fetch the complete documentation index at: https://docs.crazyrouter.com/llms.txt

Use this file to discover all available pages before exploring further.

Official Model Pricing Methods

This page explains upstream vendor pricing rules only. It does not describe Crazyrouter resale prices.
  • Upstream pricing: the vendor’s own input, output, cached input, tool call, search grounding, prompt caching, and similar billing rules
  • Crazyrouter pricing: Crazyrouter’s own sell price, multiplier, discount, channel variation, and settlement logic
This page reflects vendor documentation checked on 2026-04-27. Vendors can change prices, preview-model policies, long-context rules, search-tool fees, and caching behavior. Recheck the official source links before relying on any number in production.

Official Example Price Table

The examples below intentionally use the most common pricing tier that is easiest to map to budget planning:
  • Anthropic: standard API price, excluding Fast mode, US-only inference, and Batch discounts
  • OpenAI: Standard price, excluding Batch, Priority, and Regional Processing surcharges
  • Gemini: Standard price, usually the text/image/video default path, and for Gemini 3* models the <= 200K prompt tier
  • xAI: if the public static docs do not list a standalone price for the exact routed name, this page says so explicitly
  • Z.AI: the official Text Models price
  • MiniMax: the main table uses Pay as You Go; subscription-style pricing is listed separately under Token Plan
Unless otherwise noted, prices below are USD / 1M tokens.

Anthropic Example Prices

Crazyrouter NameOfficial ModelBase InputCache 5m / 1h / HitOutputNotes
claude-sonnet-4-6claude-sonnet-4-6$3$3.75 / $6 / $0.30$151M context uses the standard rate; Batch is roughly 50% off
claude-opus-4-6claude-opus-4-6$5$6.25 / $10 / $0.50$25Fast mode is billed at 6x the standard price
claude-opus-4-7claude-opus-4-7$5$6.25 / $10 / $0.50$25US-only inference has an extra regional multiplier in the official docs
claude-sonnet-4-5-20250929claude-sonnet-4-5 snapshot$3$3.75 / $6 / $0.30$15Snapshot IDs keep the Sonnet 4.5 price; they do not define a separate pricing table

OpenAI Example Prices

Crazyrouter NameOfficial ModelInputCached inputOutputNotes
gpt-54gpt-5.4$2.50$0.25$15.00Standard price applies below 270K context; long-context multipliers apply above that
gpt-4ogpt-4o$2.50$1.25$10.00Classic three-part token pricing
gpt-5gpt-5$1.25$0.125$10.00Tool charges are separate
gpt-51-codex-maxgpt-5.1-codex-max$1.25$0.125$10.00Commonly used in Codex and agentic coding workflows, but still billed with the standard three-part structure
gpt-5-minigpt-5-mini$0.25$0.025$2.00Lower-cost GPT-5 variant
gpt-5-nanogpt-5-nano$0.05$0.005$0.40Lowest-cost GPT-5 route
gpt-5.2gpt-5.2$1.75$0.175$14.00Earlier flagship tier, still standard three-part token pricing
gpt-5.5gpt-5.5$5.00$0.50$30.00Current flagship tier; the official standard-price note applies below 270K context

Gemini Example Prices

Crazyrouter NameOfficial ModelStandard InputStandard CacheStandard OutputNotes
gemini-3-progemini-3-pro$2.00$0.20 + $4.50 / 1M tok / hr storage$12.00<= 200K prompt tier; > 200K becomes $4 / $0.40 / $18; Google also lists Search and Maps free quotas plus $14 / 1,000 search queries
gemini-2.5-flash-litegemini-2.5-flash-lite$0.10 for text / image / video$0.01 for text / image / video + $1.00 / 1M tok / hr storage$0.40Audio input is $0.30, audio cache is $0.03; Search is 1,500 RPD free then $35 / 1,000 grounded prompts

xAI Example Prices

Crazyrouter NameOfficial MappingPrompt / Cached / OutputTool FeesNotes
grok-4.1-thinkingInternal reasoning routeThe public static docs do not list a standalone grok-4.1-thinking token priceweb_search $5 / 1k, x_search $5 / 1k, code_execution $5 / 1k, attachment_search $10 / 1k, collections_search $2.50 / 1kreasoning tokens are billed at the completion token price
grok-4.1Internal Grok 4.1 routeThe public static docs do not list a standalone grok-4.1 token pricesame as aboveThe Crazyrouter route name should not be treated as a public xAI static SKU
What the current xAI static docs reliably expose is the token-category structure, tool-invocation pricing, Batch 50% off, and the instruction to check model details or the console for model-specific token prices. Since grok-4.1 and grok-4.1-thinking are not individually priced in the static public docs, this page does not assign another SKU’s numbers to them.

Z.AI / GLM Example Prices

Crazyrouter NameOfficial ModelInputCached inputCached input storageOutputNotes
glm-5GLM-5$1.0$0.2Limited-time Free$3.2Web Search is billed separately at $0.01 / use

MiniMax Example Prices

Pay as You Go

Crazyrouter NameOfficial ModelInputPrompt caching readPrompt caching writeOutputNotes
MiniMax-M27MiniMax-M2.7$0.3$0.06$0.375$1.2This is the clearest tier to use for direct API budgeting

Token Plan

PlanMonthly PriceM2.7 Quota
Starter$10 / month1,500 requests / 5hrs
Plus$20 / month4,500 requests / 5hrs
Max$50 / month15,000 requests / 5hrs

Anthropic

Anthropic pricing is not just a simple input / output split. The official structure breaks out:
  • Base Input Tokens
  • 5m Cache Writes
  • 1h Cache Writes
  • Cache Hits & Refreshes
  • Output Tokens
Two common rules also matter:
  • Batch API usually bills input and output at roughly 50% off
  • Prompt caching follows the public multiplier rules: 5-minute cache write = 1.25x input, 1-hour cache write = 2x input, and cache hit = 0.1x input
Crazyrouter NameOfficial MappingOfficial Billing MethodNotes
claude-sonnet-4-6Anthropic claude-sonnet-4-6billed as base input / 5m cache write / 1h cache write / cache hit / outputAnthropic explicitly documents Sonnet 4.6 with 1M context, still using standard pricing
claude-opus-4-6Anthropic claude-opus-4-6billed as base input / 5m cache write / 1h cache write / cache hit / outputOpus 4.6 also supports 1M context; Fast mode uses a separate multiplier
claude-opus-4-7Anthropic claude-opus-4-7billed as base input / 5m cache write / 1h cache write / cache hit / outputOpus 4.7 is a distinct model on the official Anthropic pricing surface
claude-sonnet-4-5-20250929Anthropic snapshot ID claude-sonnet-4-5-20250929, usually displayed publicly as claude-sonnet-4-5still billed under the Sonnet 4.5 base input / cache write / cache hit / output structureSnapshot IDs are dated API names, not separate product prices
Official links:

OpenAI

OpenAI pricing is the most uniform of the set. The public structure is usually:
  • Input
  • Cached input
  • Output
Common extra rules:
  • Responses API itself does not add a separate model fee; billing still follows the selected model’s token rates
  • Web search, containers/code execution, Computer Use, and other tools are billed separately
  • Batch API usually discounts token pricing by about 50%
Crazyrouter NameOfficial MappingOfficial Billing MethodNotes
gpt-54internal alias, usually meaning OpenAI gpt-5.4billed as input / cached input / outputthe public standard price applies below 270K context; regional processing endpoints add another surcharge
gpt-4oOpenAI gpt-4obilled as input / cached input / outputclassic unified OpenAI pricing structure
gpt-5OpenAI gpt-5billed as input / cached input / outputtool usage is billed separately; there is no separate public “thinking token price” line item
gpt-51-codex-maxshould map to OpenAI gpt-5.1-codex-maxbilled as input / cached input / outputCodex-oriented naming, but the public price structure still matches the regular GPT-5.1 token format
gpt-5-miniOpenAI gpt-5-minibilled as input / cached input / outputsmaller lower-cost tier
gpt-5-nanoOpenAI gpt-5-nanobilled as input / cached input / outputnano tier, same billing structure
gpt-5.2OpenAI gpt-5.2billed as input / cached input / outputstill part of the GPT-5 pricing pattern
gpt-5.5OpenAI gpt-5.5billed as input / cached input / outputcurrent flagship tier; use the latest long-context notes from OpenAI before production rollout
Official links:

Google Gemini

Gemini pricing differs the most from OpenAI and Anthropic because Google often publishes multiple processing modes:
  • Standard
  • Batch
  • Flex
  • Priority
And it often breaks out:
  • Input
  • Output (including thinking tokens)
  • Context caching price
  • Context caching storage
  • Grounding with Google Search
  • Grounding with Google Maps
Crazyrouter NameOfficial MappingOfficial Billing MethodNotes
gemini-3-proGoogle gemini-3-pro, often displayed internally as gemini-3-proofficially split by Standard / Batch / Flex / Priority, and then by input / output / caching / search grounding / maps groundingpreview model, with separate <= 200K and > 200K prompt pricing tiers
gemini-2.5-flash-liteGoogle gemini-2.5-flash-litepriced by Standard / Batch / Flex / Priority, with separate input / output / context caching / storage / Google Search / Google Mapsstable model rather than preview; official output pricing explicitly includes thinking tokens
Official links:

xAI Grok

xAI pricing is best understood as token categories plus tool billing, rather than a separate pricing table for every visible route name:
  • Prompt tokens
  • Cached prompt tokens
  • Completion tokens
  • Reasoning tokens
xAI also explicitly states:
  • Reasoning tokens are billed at the completion token price
  • Web Search, X Search, Code Execution, and similar server-side tools are billed separately per 1,000 calls
  • Batch API usually applies a 50% token discount
Crazyrouter NameOfficial MappingOfficial Billing MethodNotes
grok-4.1-thinkinginternal reasoning route; conceptually closest to a Grok 4.1 reasoning configurationbilled using the xAI token-category model: prompt / cached prompt / completion / reasoning, plus tool invocation fees when relevantthe public xAI pricing docs emphasize SKU-style names such as grok-4-1-fast-reasoning; this route name should not be treated as an official standalone public SKU
grok-4.1internal Grok 4.1 route; conceptually closer to a non-thinking Grok 4.1 configurationstill follows the same xAI token-category billing logic, with tools billed separatelyfor production budgeting, rely on the current xAI public token and tool rules, not on the internal route name alone
Official links:

Z.AI / GLM

Z.AI GLM-5 pricing is close to the OpenAI pattern, but it adds cache storage as its own line item:
  • Input
  • Cached Input
  • Cached Input Storage
  • Output
Built-in tools such as Web Search are priced separately per use.
Crazyrouter NameOfficial MappingOfficial Billing MethodNotes
glm-5Z.AI GLM-5billed as input / cached input / cached input storage / outputofficial Web Search is charged separately, so token price alone is not the full cost picture
Official links:

MiniMax

MiniMax is the one vendor here where two official commercial modes matter:
  • Pay as You Go
  • Token Plan
That means:
  • Pay as You Go is the normal token-priced API path
  • Token Plan is subscription-style and, for M2.7, often described as a requests per 5-hour rolling window capacity product rather than plain token billing
Crazyrouter NameOfficial MappingOfficial Billing MethodNotes
MiniMax-M27should map to MiniMax-M2.7under Pay as You Go, billed as input / output / prompt caching read / prompt caching write; under Token Plan, billed as requests / 5hrs capacityfor production API budgeting, start with Pay as You Go; evaluate Token Plan separately for heavy coding or interactive usage
Official links:

Summary

If you only need a quick mental model of how each vendor bills, the shortest version is:
  • Anthropic: base input + cache write/read + output
  • OpenAI: input + cached input + output, with tools billed separately
  • Gemini: multiple processing modes + input/output + caching + search/maps grounding
  • xAI: prompt/cached/completion/reasoning, with server-side tools billed separately
  • Z.AI: input + cached input + cached storage + output
  • MiniMax: token pay-as-you-go or subscription-style capacity plan
This page explains upstream official pricing rules, not Crazyrouter selling prices. For actual recharge, billing, multiplier, and discount behavior, use Crazyrouter pricing pages, the console, and /api/pricing.