ChatGPTPricing.com is an independent pricing guide. We are not affiliated with, endorsed by, or connected to OpenAI, ChatGPT, or any AI vendor. All pricing data is sourced from publicly available information and may change without notice.

Last verified April 2026

OpenAI API Pricing: Every Model, Every Cost

The OpenAI API offers a range of models at different price points, from the budget GPT-4o-mini at $0.15 per million input tokens to the frontier O3-Pro at $150.00 per million tokens. This page covers every available model with input, output, cached, and batch pricing so you can choose the right model for your workload and budget.

GPT-5 Family

ModelInput / 1MOutput / 1MCached InputBatch (In/Out)Notes
GPT-5$1.25$10.00$0.625$0.625 / $5.00Flagship model
GPT-5-mini$0.25$2.00$0.125$0.125 / $1.00Budget GPT-5
GPT-5.2 Instant$0.50$3.00$0.25$0.25 / $1.50Fast variant (Go plan model)

Reasoning Models

ModelInput / 1MOutput / 1MCached InputBatch (In/Out)Notes
O3$2.00$8.00$1.00$1.00 / $4.00Advanced reasoning
O3-mini$0.50$2.00$0.25$0.25 / $1.00Budget reasoning
O3-Pro$150.00$150.00N/AN/AMaximum capability

GPT-4o Family (Legacy)

ModelInput / 1MOutput / 1MCached InputBatch (In/Out)Notes
GPT-4o$2.50$10.00$1.25$1.25 / $5.00Previous flagship
GPT-4o-mini$0.15$0.60$0.075$0.075 / $0.30Cheapest model

Embedding Models

ModelInput / 1MOutput / 1MCached InputBatch (In/Out)Notes
text-embedding-3-large$0.13--$0.065Best quality embeddings
text-embedding-3-small$0.02--$0.01Budget embeddings

All prices in USD per million tokens unless noted. Source: developers.openai.com/api/docs/pricing. Calculate your monthly cost

How Tokens Work

What Is a Token?

Tokens are the fundamental units that language models process. In English, one token roughly equals 4 characters or about 0.75 words. A typical sentence contains 15-20 tokens. A full page of text (approximately 500 words) contains about 670 tokens. Common words like "the", "is", and "a" are single tokens, while longer or rarer words may be split into multiple tokens. Punctuation marks are usually separate tokens.

Quick reference:

  • 1 sentence~15-20 tokens
  • 1 paragraph (100 words)~130 tokens
  • 1 page (500 words)~670 tokens
  • 1 book (80,000 words)~107K tokens

Input vs Output Pricing

You pay separately for input tokens (what you send to the model, including your prompt and any context) and output tokens (what the model generates in response). Output tokens are always more expensive, typically 2-8x the input cost, because generating new text requires more computation than reading existing text.

This pricing structure means you can optimise costs by sending concise prompts (reducing input tokens) and requesting concise responses with the max_tokens parameter (reducing output tokens). For GPT-5, the output-to-input price ratio is 8x ($10.00 vs $1.25), making output optimisation particularly impactful.

Cost Reduction Features

Batch API (50% Discount)

The Batch API allows you to submit large sets of requests for asynchronous processing within a 24-hour window. In exchange for the flexibility on timing, OpenAI charges exactly half the standard rate for both input and output tokens. The results are identical to real-time API calls - same models, same quality, same format.

Best for: content generation pipelines, data classification, email processing, document summarisation, and any workload where you do not need instant results. Not suitable for chatbots, real-time applications, or interactive user experiences.

Prompt Caching (50% Input Discount)

When you send the same prompt prefix repeatedly (common in chatbots and RAG applications), OpenAI caches the processed tokens and charges half the standard input rate for cached tokens. This happens automatically - you do not need to opt in. The cache is maintained per-model and typically expires after minutes of inactivity.

Best for: chatbots with system prompts, RAG applications with consistent context, any application where the beginning of each prompt is the same. Can save 30-50% on input costs for qualifying workloads.

Rate Limit Tiers

TierQualificationRPM (GPT-5)TPM (GPT-5)
Free$0 spent340,000
Tier 1$5+ spent500200,000
Tier 2$50+ spent, 7+ days5,0002,000,000
Tier 3$100+ spent, 7+ days5,00010,000,000
Tier 4$250+ spent, 14+ days10,00050,000,000
Tier 5$1,000+ spent, 30+ days10,000150,000,000

RPM = Requests per minute. TPM = Tokens per minute. Limits vary by model.

Real-World Cost Examples

Customer Support Chatbot

Model: GPT-5-mini | 1,000 conversations/day

Each conversation averages 500 input tokens (system prompt + user message) and 300 output tokens. Using GPT-5-mini keeps costs under $1/day for moderate volume.

Daily cost

$0.73

Monthly cost

$21.90

Content Generation Pipeline

Model: GPT-5 (Batch) | 100 articles/day

Generating 100 blog articles daily with detailed prompts. Using Batch API cuts costs 50%. Output-heavy workload means output tokens dominate the cost.

Daily cost

$10.12

Monthly cost

$303.75

RAG Application

Model: GPT-5-mini | 5,000 queries/day

Retrieval-augmented generation with cached system prompt and context. High input volume offset by prompt caching (50% input discount).

Daily cost

$2.63

Monthly cost

$78.75

Code Review Tool

Model: GPT-5 | 200 reviews/day

Reviewing code diffs with detailed analysis. Using the full GPT-5 model for quality. Each review averages 5K input tokens (code + prompt) and 2.5K output tokens.

Daily cost

$6.25

Monthly cost

$187.50

When to Use the API vs a Subscription

Choose the API When

  • You need programmatic access to integrate into your application
  • You process large volumes of data automatically
  • You need fine-grained control over model parameters
  • Your monthly usage would exceed the Plus message limits
  • You want to use Batch API for 50% cost savings
  • You need to use multiple models for different tasks

Choose a Subscription When

  • You primarily interact with ChatGPT through the web interface
  • You need Deep Research, image generation, or Sora
  • You want a predictable monthly cost with no surprises
  • Your usage is moderate (under 80 messages per 3 hours)
  • You need team management and admin features
  • You prefer a no-code approach without API integration