AI Token Cost Calculator

Template

Article generation

Long-form writing with moderately sized prompts and substantial output.

4k input · 2.5k output · 1.5k requests/day · 10% cache · Batch off

Template

Customer support

Short multi-turn support replies with high daily traffic and reusable system prompts.

1.8k input · 600 output · 120k requests/day · 35% cache · Batch off

Template

RAG Q&A

Retrieval-heavy prompts where context dominates cost and cache can matter.

6k input · 800 output · 30k requests/day · 45% cache · Batch off

Template

Code completion

Interactive coding assistance with medium prompts and short outputs.

2.5k input · 350 output · 50k requests/day · 20% cache · Batch off

Template

Batch summarization

Offline bulk summarization where batch mode and cache reuse are both realistic.

12k input · 1.6k output · 8k requests/day · 50% cache · Batch on

Workload assumptions

Display currency (default: USD)

Live FX rates come from Frankfurter. If conversion is unavailable, the calculator falls back to the model source currency and marks that row.
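
The fallback rule above can be sketched as follows. This is a minimal illustration, not the calculator's actual code: the `DisplayAmount` type, the `to_display_currency` function, and the shape of the `rates` mapping are all hypothetical, and the rate lookup itself (for example, a call to Frankfurter) is left to the caller.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class DisplayAmount:
    amount: float
    currency: str
    fell_back: bool  # True when the row stays in the model's source currency


def to_display_currency(
    amount: float,
    source_currency: str,
    display_currency: str,
    rates: Optional[Mapping[str, float]],
) -> DisplayAmount:
    """Convert into the display currency, or fall back and mark the row.

    `rates` maps a target currency code to units of that currency per one
    unit of `source_currency` (assumed shape). None models a failed lookup.
    """
    if source_currency == display_currency:
        return DisplayAmount(amount, source_currency, fell_back=False)
    rate = rates.get(display_currency) if rates else None
    if rate is None:
        # Conversion unavailable: keep the source currency and flag the row.
        return DisplayAmount(amount, source_currency, fell_back=True)
    return DisplayAmount(round(amount * rate, 2), display_currency, fell_back=False)
```

Passing `rates=None` (or a mapping without the requested currency) returns the original amount in the source currency with `fell_back=True`, which a UI could use to mark the row.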

Input tokens / request · Output tokens / request · Daily requests · Active days / month · Cache hit ratio · Monthly budget (optional) · Apply batch discount where supported

Quick token estimate

Optional helper for rough sizing before you set request token numbers.

English words · Chinese characters · Pages
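
A rough sizing helper of this kind might look like the sketch below. The page does not state its conversion ratios, so the three constants are assumptions chosen only for illustration. Real token counts depend on the model's tokenizer.

```python
# All three ratios are assumptions for illustration; the page does not publish its own.
TOKENS_PER_ENGLISH_WORD = 1.3   # assumed average for English prose
TOKENS_PER_CHINESE_CHAR = 1.0   # assumed; varies noticeably by tokenizer
WORDS_PER_PAGE = 500            # assumed English words on one page


def estimate_tokens(english_words: int = 0, chinese_chars: int = 0, pages: int = 0) -> int:
    """Return a rough token estimate from words, characters, and pages."""
    total_words = english_words + pages * WORDS_PER_PAGE
    estimate = (
        total_words * TOKENS_PER_ENGLISH_WORD
        + chinese_chars * TOKENS_PER_CHINESE_CHAR
    )
    return round(estimate)
```

The result is meant only as a starting point for the input and output token fields, not as a billing-accurate count.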

Current workload summary

Scenario template: Custom workload (directly edited inputs with no template preset)

Selected models: 1

Display currency: USD

Monthly request volume: 300,000

Request shape: 2,000 in · 1,000 out

Cache and batch: 20% cache · Batch off

Budget: Not set

FX mode: Source currency

Request details

{
  "modelCodes": ["gpt-4.1"],
  "inputTokens": 2000,
  "outputTokens": 1000,
  "dailyRequests": 10000,
  "activeDays": 30,
  "cacheHitRatio": 0.20,
  "useBatch": false,
  "monthlyBudget": null,
  "displayCurrencyCode": "USD"
}
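
The summary figures shown earlier follow directly from this payload: 10,000 daily requests over 30 active days gives the 300,000 monthly request volume. A small sketch of that derivation is below. The `summarize` function and its output keys are hypothetical names, not part of the calculator.

```python
import json

PAYLOAD = """
{
  "modelCodes": ["gpt-4.1"],
  "inputTokens": 2000,
  "outputTokens": 1000,
  "dailyRequests": 10000,
  "activeDays": 30,
  "cacheHitRatio": 0.20,
  "useBatch": false,
  "monthlyBudget": null,
  "displayCurrencyCode": "USD"
}
"""


def summarize(payload: dict) -> dict:
    """Derive the workload summary fields from a request payload."""
    budget = payload["monthlyBudget"]
    return {
        "selected_models": len(payload["modelCodes"]),
        "monthly_requests": payload["dailyRequests"] * payload["activeDays"],
        "request_shape": f'{payload["inputTokens"]:,} in · {payload["outputTokens"]:,} out',
        "cache_and_batch": (
            f'{payload["cacheHitRatio"]:.0%} cache · '
            f'Batch {"on" if payload["useBatch"] else "off"}'
        ),
        "budget": "Not set" if budget is None else f"{budget:,.2f}",
    }


summary = summarize(json.loads(PAYLOAD))
```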

How the calculator interprets inputs

`Cache hit ratio` discounts only the cached share of input tokens.

`Batch discount` applies only when the stored snapshot lists a batch ratio for that model.

`Budget fit` means the maximum monthly requests you can afford with the exact request shape above.
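
The three rules above can be sketched in code. This is an illustrative reading, not the calculator's implementation. The `Pricing` type and `estimate` function are hypothetical, and the per-million prices used in the usage note are assumptions that do not appear on the page. Applying the batch ratio to the whole request cost after the cache discount is likewise an assumption.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pricing:
    input_per_million: float
    cached_input_per_million: float
    output_per_million: float
    batch_ratio: Optional[float] = None  # None: snapshot lists no batch ratio


def estimate(pricing: Pricing, input_tokens: int, output_tokens: int,
             daily_requests: int, active_days: int, cache_hit_ratio: float,
             use_batch: bool = False, monthly_budget: Optional[float] = None) -> dict:
    requests = daily_requests * active_days
    # Cache hit ratio discounts only the cached share of input tokens.
    cached_in = input_tokens * cache_hit_ratio
    fresh_in = input_tokens - cached_in
    output_cost = output_tokens * pricing.output_per_million
    uncached = (input_tokens * pricing.input_per_million + output_cost) / 1e6
    per_request = (fresh_in * pricing.input_per_million
                   + cached_in * pricing.cached_input_per_million
                   + output_cost) / 1e6
    cache_savings = (uncached - per_request) * requests
    # Batch discount applies only when a batch ratio is listed for the model.
    batch_savings = 0.0
    if use_batch and pricing.batch_ratio is not None:
        batch_savings = per_request * (1 - pricing.batch_ratio) * requests
        per_request *= pricing.batch_ratio
    # Budget fit: maximum monthly requests affordable at this request shape.
    budget_fit = None
    if monthly_budget is not None and per_request > 0:
        budget_fit = math.floor(monthly_budget / per_request)
    return {
        "monthly_cost": round(per_request * requests, 2),
        "per_1k_requests": round(per_request * 1000, 2),
        "cache_savings": round(cache_savings, 2),
        "batch_savings": round(batch_savings, 2),
        "budget_fit": budget_fit,
    }
```

With assumed prices of $2.00 input, $0.50 cached input, and $8.00 output per 1M tokens, `estimate(Pricing(2.0, 0.5, 8.0), 2000, 1000, 10000, 30, 0.20)` gives a monthly cost of 3420.0, 11.4 per 1k requests, and cache savings of 180.0, consistent with the GPT-4.1 row in the results.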

Model	Monthly cost	1k requests	Blend / 1M	Cache savings / month	Batch savings / month	Budget fit
GPT-4.1 (gpt-4.1, OpenAI)	$3420.00	$11.40	$3.50	$180.00	$0.0000	Set budget to see fit

GPT-4.1 pricing details: Updated Mar 31, 09:56. Fallback source: PricePerToken OpenAI. Currency: USD. Fallback snapshot from PricePerToken because OpenAI official pricing pages currently return an anti-bot challenge to server-side crawlers. Source updated at 2026-03-28T08:30:28.409699Z. Cache saves $180.00; no batch savings.
