AI Models

Choose from our range of AI models optimized for different tasks. All models are available across all subscription tiers.

Understanding USD credits

USD credits are our billing unit for both AI model usage and environment/container runtime. LLM prices include the managed 10% margin. The USD Cost column shows each model's cost relative to Claude Haiku 4.5 (our 1x baseline).

1x: baseline (Claude Haiku 4.5)

0.05x: 20x cheaper

2.0x: 2x more expensive
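As a rough illustration of how to read the multipliers and the margin, here is a minimal Python sketch. The function names are hypothetical, and it assumes the 10% margin is applied multiplicatively to a pre-margin price and rounded to the nearest cent; the page does not state either detail.

```python
from decimal import Decimal, ROUND_HALF_UP

MARGIN = Decimal("1.10")  # the 10% managed margin mentioned above


def describe_multiplier(multiplier: float) -> str:
    """Turn a USD Cost multiplier into the wording used in the legend."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    if multiplier == 1:
        return "baseline"
    if multiplier < 1:
        # 0.05x means 1 / 0.05 = 20 times cheaper than the baseline.
        return f"{1 / multiplier:.3g}x cheaper"
    return f"{multiplier:.3g}x more expensive"


def with_margin(pre_margin_price: Decimal) -> Decimal:
    """Assumption: listed price = pre-margin price * 1.10, rounded to cents."""
    return (pre_margin_price * MARGIN).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )


print(describe_multiplier(0.05))     # 20x cheaper
print(describe_multiplier(2.0))      # 2x more expensive
print(with_margin(Decimal("3.00")))  # 3.30 (3.00 is a hypothetical input)
```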

Model Comparison

Model	Description	Intelligence	Speed	Context	Input / 1M	Cached / 1M	Output / 1M	USD Cost
Claude Opus 4.8	Anthropic’s latest flagship for complex reasoning, long-horizon coding, and high-autonomy agent work.	Highest	Medium	1M	$5.50	$0.55	$27.50	5.0x
Claude Opus 4.7	Anthropic’s latest flagship for the most complex reasoning and agentic coding work.	Highest	Medium	1M	$5.50	$0.55	$27.50	5.0x
Claude Opus 4.6	Previous-generation flagship Claude model for complex reasoning and coding.	Highest	Medium	1M	$5.50	$0.55	$27.50	5.0x
Claude Sonnet 4.5	Best balance of intelligence and speed. Excellent for coding and daily tasks.	High	Fast	200k	$3.30	$0.33	$16.50	3.0x
Claude Haiku 4.5 (Recommended)	Fast and cost-effective for quick tasks and high-volume operations.	Good	Very Fast	200k	$1.10	$0.11	$5.50	1x (baseline)
GPT-5.5 Pro	OpenAI highest-accuracy model for the hardest professional and agentic work.	Highest	Medium	1M	$33.00	n/a	$198.00	35.1x
GPT-5.5	OpenAI frontier model for coding, professional work, and long-context agents.	Highest	Fast	1M	$5.50	$0.55	$33.00	5.6x
GPT-5.4	OpenAI flagship model for advanced coding, planning, and computer use.	Highest	Fast	1M	$2.75	$0.28	$16.50	2.8x
GPT-5.4 mini	OpenAI mini model tuned for fast coding, subagents, and computer-use tasks.	High	Fast	400k	$0.83	$0.08	$4.95	0.84x
GPT-5.4 nano	Lightweight OpenAI model for simple instructions, routing, and high-volume tasks.	Good	Very Fast	400k	$0.22	$0.02	$1.38	0.23x
Grok 4.5	xAI frontier model for coding, agentic tasks, and knowledge work.	Highest	Fast	500k	$2.20	n/a	$6.60	1.6x
Gemini 3.1 Pro	Higher-reasoning Gemini model for complex planning and analysis.	High	Fast	1M	$3.30	$0.33	$16.50	3.0x
Gemini 3 Flash	Fast default model for everyday agent execution.	Good	Very Fast	1M	$1.10	$0.11	$5.50	1x (baseline)
Gemini 3.1 Flash	Fast Gemini tier that currently routes through the same runtime as Gemini 3 Flash.	Good	Very Fast	1M	$1.10	$0.11	$5.50	1x (baseline)
DeepSeek V4 Pro	DeepSeek flagship model running through Clawcode for agentic coding and long-context reasoning.	High	Fast	1M	$0.48	$0.04	$0.96	0.28x
DeepSeek V4 Flash	Fast DeepSeek V4 model running through Clawcode for efficient agent execution.	High	Very Fast	1M	$0.15	$0.03	$0.31	0.09x
MiniMax M3	MiniMax long-context model served via Cloudflare Workers AI for efficient agentic coding, tool use, and digital work.	High	Fast	1M	$0.66	$0.13	$2.64	0.53x
Kimi K2.6	Moonshot flagship served via Cloudflare Workers AI for long-horizon coding and agentic execution.	High	Fast	262k	$1.04	$0.18	$4.40	0.87x
Kimi K2.7 Code	Moonshot coding model served via Cloudflare Workers AI for long-horizon software engineering and agentic execution.	High	Fast	262k	$1.04	$0.21	$4.40	0.87x
ZAI GLM 5.2	Z.ai flagship agentic coding model served via Cloudflare Workers AI for software engineering, planning, and tool use.	Highest	Very Fast	262k	$1.54	$0.29	$4.84	1.1x
Qwen 3.5 397B A17B	Alibaba Qwen mixture-of-experts model served via Cloudflare AI for reasoning, coding, and multimodal agent work.	High	Fast	262k	$0.66	$0.66	$3.96	0.70x
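One way to use the comparison table is to filter by the context window a task needs and then rank what remains by price. The sketch below is only an illustration: the `Model` class and `cheapest_for` helper are hypothetical names, only a handful of rows are copied from the table, and it assumes "200k" means 200,000 tokens, "400k" means 400,000, and "1M" means 1,000,000.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context_tokens: int  # assumed expansion of the table's "200k" / "1M" labels
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# A subset of rows copied from the Model Comparison table.
MODELS = [
    Model("Claude Sonnet 4.5", 200_000, 3.30, 16.50),
    Model("Claude Haiku 4.5", 200_000, 1.10, 5.50),
    Model("GPT-5.4 mini", 400_000, 0.83, 4.95),
    Model("Gemini 3 Flash", 1_000_000, 1.10, 5.50),
    Model("DeepSeek V4 Flash", 1_000_000, 0.15, 0.31),
]


def cheapest_for(min_context: int, models=MODELS) -> list[Model]:
    """Return models whose context window fits, cheapest output price first."""
    candidates = [m for m in models if m.context_tokens >= min_context]
    return sorted(candidates, key=lambda m: m.output_per_m)


for model in cheapest_for(300_000):
    print(model.name, model.output_per_m)
```

Ranking by output price is an arbitrary choice for the example; a workload dominated by long prompts would more sensibly rank by input or cached price, and price says nothing about the Intelligence or Speed columns.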

Research Models

Specialized models for comprehensive web research, multi-source analysis, and detailed report generation.

Model	Description	Intelligence	Speed	Context	Input / 1M	Output / 1M	Best For
Gemini 3 Pro (Recommended)	Most capable research model with Deep Think reasoning and 1M token context.	Highest	Medium	1M	$2.20	$13.20	Deep research, Complex reasoning, Comprehensive reports
Gemini 3 Flash	Fast research model, 3x faster than Pro with excellent cost efficiency.	Highest	Fast	1M	$0.55	$3.30	Quick research, Summaries, High-volume tasks

Image Generation Models

Create stunning visuals with our AI image generation models, from rapid prototypes to photorealistic artwork.

Model	Description	Quality	Speed	Max Resolution	Price / Image	Best For
Gemini 3 Pro Image (Recommended)	Premium image generation with up to 4K resolution and exceptional detail.	Highest	Medium	Up to 4096x4096	$0.134 (2K), $0.24 (4K)	Professional graphics, Marketing assets, High-res artwork
Gemini 3 Flash Image	Fast image generation for rapid prototyping and high-volume workflows.	High	Very Fast	Up to 1024x1024	$0.039	Quick mockups, Social media, Batch generation
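Because image models are priced per image rather than per token, a batch estimate is a single multiplication. The following sketch assumes the per-image prices in the table apply uniformly with no volume discount; `batch_cost` and the "1K" resolution label for the Flash model are hypothetical names introduced only for this example.

```python
from decimal import Decimal

# Per-image prices copied from the table above (USD).
IMAGE_PRICES = {
    ("Gemini 3 Pro Image", "2K"): Decimal("0.134"),
    ("Gemini 3 Pro Image", "4K"): Decimal("0.24"),
    # The table lists one price for Flash; "1K" is a label made up here.
    ("Gemini 3 Flash Image", "1K"): Decimal("0.039"),
}


def batch_cost(model: str, resolution: str, count: int) -> Decimal:
    """Estimated USD cost of generating `count` images."""
    if count < 0:
        raise ValueError("count must not be negative")
    return IMAGE_PRICES[(model, resolution)] * count


print(batch_cost("Gemini 3 Pro Image", "4K", 50))      # 12.00
print(batch_cost("Gemini 3 Flash Image", "1K", 1000))  # 39.000
```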

Flagship Models

Our most capable models with the highest intelligence and reasoning capabilities.

Claude Opus 4.8

Anthropic’s latest flagship for complex reasoning, long-horizon coding, and high-autonomy agent work.

Relative cost: 5.0x
Intelligence: Highest
Speed: Medium
Context: 1M

Pricing per 1M tokens
Input: $5.50
Cached: $0.55
Output: $27.50

Best for: Complex reasoning, Agentic coding, Research, Long-context analysis

Claude Opus 4.7

Anthropic’s latest flagship for the most complex reasoning and agentic coding work.

Relative cost: 5.0x
Intelligence: Highest
Speed: Medium
Context: 1M

Pricing per 1M tokens
Input: $5.50
Cached: $0.55
Output: $27.50

Best for: Complex reasoning, Agentic coding, Research, Long-context analysis

GPT-5.5 Pro

OpenAI highest-accuracy model for the hardest professional and agentic work.

Relative cost: 35.1x
Intelligence: Highest
Speed: Medium
Context: 1M

Pricing per 1M tokens
Input: $33.00
Cached: n/a
Output: $198.00

Best for: Hard professional work, High-accuracy reasoning, Tool use, Deep analysis

GPT-5.5

OpenAI frontier model for coding, professional work, and long-context agents.

Relative cost: 5.6x
Intelligence: Highest
Speed: Fast
Context: 1M

Pricing per 1M tokens
Input: $5.50
Cached: $0.55
Output: $33.00

Best for: Complex coding, Agentic workflows, Professional work, Computer use

GPT-5.4

OpenAI flagship model for advanced coding, planning, and computer use.

Relative cost: 2.8x
Intelligence: Highest
Speed: Fast
Context: 1M

Pricing per 1M tokens
Input: $2.75
Cached: $0.28
Output: $16.50

Best for: Complex coding, Agentic workflows, Computer use, Deep planning

Grok 4.5

xAI frontier model for coding, agentic tasks, and knowledge work.

Relative cost: 1.6x
Intelligence: Highest
Speed: Fast
Context: 500k

Pricing per 1M tokens
Input: $2.20
Cached: n/a
Output: $6.60

Best for: Agentic coding, Tool calling, Knowledge work, Complex reasoning

Gemini 3.1 Pro

Higher-reasoning Gemini model for complex planning and analysis.

Relative cost: 3.0x
Intelligence: High
Speed: Fast
Context: 1M

Pricing per 1M tokens
Input: $3.30
Cached: $0.33
Output: $16.50

Best for: Complex reasoning, Long-context analysis, Deep planning

MiniMax M3

MiniMax long-context model served via Cloudflare Workers AI for efficient agentic coding, tool use, and digital work.

Relative cost: 0.53x
Intelligence: High
Speed: Fast
Context: 1M

Pricing per 1M tokens
Input: $0.66
Cached: $0.13
Output: $2.64

Best for: Agentic coding, Tool use, Long-context work, Digital workflows

Kimi K2.6

Moonshot flagship served via Cloudflare Workers AI for long-horizon coding and agentic execution.

Relative cost: 0.87x
Intelligence: High
Speed: Fast
Context: 262k

Pricing per 1M tokens
Input: $1.04
Cached: $0.18
Output: $4.40

Best for: Long-horizon coding, Agentic execution, Tool calling, Complex UI generation

Kimi K2.7 Code

Moonshot coding model served via Cloudflare Workers AI for long-horizon software engineering and agentic execution.

Relative cost: 0.87x
Intelligence: High
Speed: Fast
Context: 262k

Pricing per 1M tokens
Input: $1.04
Cached: $0.21
Output: $4.40

Best for: Agentic coding, Long-horizon software work, Tool calling, Code review

ZAI GLM 5.2

Z.ai flagship agentic coding model served via Cloudflare Workers AI for software engineering, planning, and tool use.

Relative cost: 1.1x
Intelligence: Highest
Speed: Very Fast
Context: 262k

Pricing per 1M tokens
Input: $1.54
Cached: $0.29
Output: $4.84

Best for: Agentic coding, Planning, Tool use, Software engineering

Qwen 3.5 397B A17B

Alibaba Qwen mixture-of-experts model served via Cloudflare AI for reasoning, coding, and multimodal agent work.

Relative cost: 0.70x
Intelligence: High
Speed: Fast
Context: 262k

Pricing per 1M tokens
Input: $0.66
Cached: $0.66
Output: $3.96

Best for: Reasoning, Coding, Multimodal agents, Structured tool use

DeepSeek V4 Pro

DeepSeek flagship model running through Clawcode for agentic coding and long-context reasoning.

Relative cost: 0.28x
Intelligence: High
Speed: Fast
Context: 1M

Pricing per 1M tokens
Input: $0.48
Cached: $0.04
Output: $0.96

Best for: Agentic coding, Long-context reasoning, Tool calling, Complex planning

Claude Opus 4.6

Previous-generation flagship Claude model for complex reasoning and coding.

Relative cost: 5.0x
Intelligence: Highest
Speed: Medium
Context: 1M

Pricing per 1M tokens
Input: $5.50
Cached: $0.55
Output: $27.50

Best for: Complex reasoning, Research, Multi-step analysis, Coding

Claude Sonnet 4.5

Best balance of intelligence and speed. Excellent for coding and daily tasks.

Relative cost: 3.0x
Intelligence: High
Speed: Fast
Context: 200k

Pricing per 1M tokens
Input: $3.30
Cached: $0.33
Output: $16.50

Best for: Coding, Analysis, General tasks, Content creation

Efficient Models

Fast and cost-effective for high-volume operations and simpler tasks.

GPT-5.4 mini

OpenAI mini model tuned for fast coding, subagents, and computer-use tasks.

Relative cost: 0.84x
Intelligence: High
Speed: Fast
Context: 400k

Pricing per 1M tokens
Input: $0.83
Cached: $0.08
Output: $4.95

Best for: Coding, Subagents, Computer use, Fast iteration

GPT-5.4 nano

Lightweight OpenAI model for simple instructions, routing, and high-volume tasks.

Relative cost: 0.23x
Intelligence: Good
Speed: Very Fast
Context: 400k

Pricing per 1M tokens
Input: $0.22
Cached: $0.02
Output: $1.38

Best for: Classification, Routing, Simple tasks, High volume

Gemini 3 Flash

Fast default model for everyday agent execution.

Relative cost: 1x (baseline)
Intelligence: Good
Speed: Very Fast
Context: 1M

Pricing per 1M tokens
Input: $1.10
Cached: $0.11
Output: $5.50

Best for: General tasks, Fast iteration, Daily workflows

Gemini 3.1 Flash

Fast Gemini tier that currently routes through the same runtime as Gemini 3 Flash.

Relative cost: 1x (baseline)
Intelligence: Good
Speed: Very Fast
Context: 1M

Pricing per 1M tokens
Input: $1.10
Cached: $0.11
Output: $5.50

Best for: General tasks, Fast iteration, Daily workflows

DeepSeek V4 Flash

Fast DeepSeek V4 model running through Clawcode for efficient agent execution.

Relative cost: 0.09x
Intelligence: High
Speed: Very Fast
Context: 1M

Pricing per 1M tokens
Input: $0.15
Cached: $0.03
Output: $0.31

Best for: Fast agent execution, High-volume workflows, Tool calling, Daily automation

Claude Haiku 4.5 (Recommended)

Fast and cost-effective for quick tasks and high-volume operations.

Relative cost: 1x (baseline)
Intelligence: Good
Speed: Very Fast
Context: 200k

Pricing per 1M tokens
Input: $1.10
Cached: $0.11
Output: $5.50

Best for: Quick tasks, Classification, Simple queries, High volume

About Pricing

All prices are in USD per 1 million tokens. Input tokens are charged when sending prompts to the model. Cached tokens are charged when using prompt caching (if supported). Output tokens are charged for model responses. Your actual cost depends on the model used and the complexity of your tasks.
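The per-token arithmetic described above can be sketched as follows. This is an illustrative estimate, not the billing implementation: `estimate_cost` is a hypothetical helper, and it assumes that cached tokens are billed at the cached rate instead of the input rate, so `input_tokens` counts only the uncached part of the prompt. The prices are the Claude Haiku 4.5 figures from this page.

```python
from decimal import Decimal

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens

# Claude Haiku 4.5 prices from this page (USD per 1M tokens).
HAIKU_4_5 = {
    "input": Decimal("1.10"),
    "cached": Decimal("0.11"),
    "output": Decimal("5.50"),
}


def estimate_cost(prices, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request.

    Assumption: input_tokens excludes tokens served from the prompt cache,
    which are billed separately at the cached rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must not be negative")
    total = (
        prices["input"] * input_tokens
        + prices["cached"] * cached_tokens
        + prices["output"] * output_tokens
    )
    return total / TOKENS_PER_PRICE_UNIT


# 200k uncached input, 800k cached input, 50k output:
# 0.22 + 0.088 + 0.275 = 0.583 USD
print(estimate_cost(HAIKU_4_5, 200_000, 50_000, cached_tokens=800_000))
```

For models whose Cached price is not listed, this sketch has no defined behavior; how such requests are billed is not stated on this page.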