What Does a Million AI Tokens Cost in Energy? (Free Calculator)

Every AI answer you read was, physically, a brief surge of electricity through a GPU. As intelligence becomes a utility, its price floor is set the way any utility's is — by physics and depreciation: joules per token and silicon per token. The free Energy Cost of Intelligence calculator turns a deployment's specs into that floor, and shows what happens to it as electricity gets cheaper.

The two-line model

Serving a million tokens takes 1,000,000 ÷ (tokens-per-second × 3,600) GPU-hours. Multiply by the GPU's power draw in kilowatts and the facility's PUE (the cooling-and-overhead multiplier, typically 1.1–1.5) and you have the energy in kWh; price it at your electricity rate. Hardware is the second line: the GPU's installed cost ÷ (lifetime hours × utilization) gives the cost of each serving hour, which is then multiplied by the same GPU-hours figure. Utilization sits in the denominator because depreciation ticks around the clock whether or not tokens are flowing.
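The two lines above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the calculator's actual code: the `Deployment` class, its field names and the 8,760-hour year are hypothetical choices made for this example.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8760  # assumption: 365-day year, leap days ignored


@dataclass
class Deployment:
    """Hypothetical container for the inputs named in the text."""
    tokens_per_second: float  # sustained throughput per GPU
    gpu_power_kw: float       # GPU power draw, in kW
    pue: float                # facility overhead multiplier
    price_per_kwh: float      # electricity rate, in $/kWh
    gpu_cost: float           # installed cost, in $
    lifetime_years: float     # amortization period
    utilization: float        # fraction of hours spent serving, 0 < u <= 1


def gpu_hours_per_million(d: Deployment) -> float:
    """GPU-hours needed to serve one million tokens."""
    return 1_000_000 / (d.tokens_per_second * 3600)


def energy_cost_per_million(d: Deployment) -> float:
    """Line one: GPU-hours x kW x PUE gives kWh, priced at the electricity rate."""
    kwh = gpu_hours_per_million(d) * d.gpu_power_kw * d.pue
    return kwh * d.price_per_kwh


def hardware_cost_per_million(d: Deployment) -> float:
    """Line two: installed cost spread over the hours actually spent serving."""
    serving_hours = d.lifetime_years * HOURS_PER_YEAR * d.utilization
    return gpu_hours_per_million(d) * d.gpu_cost / serving_hours
```

Note that this sketch charges energy only for serving hours, so idle power draw is ignored, while utilization inflates the hardware line alone. That matches the description above but is a simplification.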

A worked example

A 1.2 kW GPU serving 60 tokens/second at PUE 1.25 consumes about 6.9 Wh per thousand tokens, roughly half a smartphone charge (assuming a battery of about 13–15 Wh) for a few pages of output. At $0.06/kWh that's just $0.42 of electricity per million tokens. But a $30,000 GPU amortized over 5 years at 60% utilization adds about $5.28, so hardware outweighs energy more than tenfold. The physical floor for this deployment is roughly $5.70 per million tokens; everything above that in a commercial API price is networking, staff, R&D and margin.
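For readers who want the intermediate figures, the same arithmetic can be written out step by step. The inputs come from the example above; the variable names and the 8,760-hour year are assumptions of this sketch.

```python
# Inputs taken from the worked example.
tokens_per_second = 60
gpu_power_kw = 1.2
pue = 1.25
price_per_kwh = 0.06
gpu_cost = 30_000
lifetime_hours = 5 * 8760  # assumption: 8,760 hours per year
utilization = 0.60

# Line one: energy.
gpu_hours = 1_000_000 / (tokens_per_second * 3600)  # about 4.63 GPU-hours
energy_kwh = gpu_hours * gpu_power_kw * pue         # about 6.94 kWh per million tokens
energy_cost = energy_kwh * price_per_kwh            # about $0.42

# Line two: hardware.
cost_per_serving_hour = gpu_cost / (lifetime_hours * utilization)  # about $1.14
hardware_cost = gpu_hours * cost_per_serving_hour                  # about $5.28

floor = energy_cost + hardware_cost   # about $5.70
energy_share = energy_cost / floor    # about 0.073

print(f"energy   ${energy_cost:.2f} per million tokens")
print(f"hardware ${hardware_cost:.2f} per million tokens")
print(f"floor    ${floor:.2f} per million tokens ({energy_share:.1%} energy)")
```

Since 1 kWh per million tokens equals 1 Wh per thousand tokens, the 6.94 kWh figure is the same quantity as the 6.9 Wh per thousand tokens quoted in the text.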

Why cheap energy still wins

If electricity is only 7% of the floor, why do AI companies chase cheap power? Three reasons. First, scale: at gigawatt fleets, even a single-digit share is hundreds of millions of dollars; a 1 GW fleet running year-round at $0.06/kWh spends roughly $526 million a year on electricity. Second, the hardware term is falling fast: as GPUs get cheaper per token, the energy share grows toward dominance. Third, and most important: power availability now gates buildouts more than capital does. The datacenters get built where the megawatts are, which is the whole thesis of the Power Abundance suite: cheap, abundant energy is cheap, abundant thought.
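The first two reasons are easy to put numbers on. The sketch below is illustrative only: the hardware scaling factors are hypothetical, not forecasts, and the fleet bill assumes the full 1 GW is drawn continuously for 8,760 hours.

```python
ENERGY = 0.4167    # $ per million tokens, from the worked example
HARDWARE = 5.285   # $ per million tokens, from the worked example


def energy_share(energy: float, hardware: float) -> float:
    """Fraction of the physical floor that is electricity."""
    return energy / (energy + hardware)


def annual_fleet_bill(fleet_gw: float, price_per_kwh: float, hours: int = 8760) -> float:
    """Yearly electricity spend for a fleet drawing fleet_gw continuously."""
    fleet_kw = fleet_gw * 1_000_000  # 1 GW = 1,000,000 kW
    return fleet_kw * hours * price_per_kwh


# Hypothetical scenarios: hardware cost per token falls, energy cost held fixed.
for factor in (1.0, 0.5, 0.25, 0.1):
    share = energy_share(ENERGY, HARDWARE * factor)
    print(f"hardware x{factor:<4}: energy share {share:.0%}")

print(f"1 GW at $0.06/kWh: ${annual_fleet_bill(1, 0.06):,.0f} per year")
```

Holding the energy term fixed while the hardware term shrinks is the simplest way to see the share shift; in practice both terms move as new GPUs change tokens per second and power draw together.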

Play with the floor

The calculator charts the energy/hardware split live and runs an electricity-price sensitivity from $0.02 to $0.25/kWh. Watch utilization: dropping from 90% to 30% triples the hardware cost per token — idle silicon is the most expensive kind. Then send your GPU specs straight into the Hyper-Scale Data Center Power Planner to size the solar array that would run the fleet.
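A rough version of that sweep can be reproduced offline. This is a sketch rather than the calculator's implementation: `floor_per_million` is a hypothetical helper whose defaults are the worked-example values, and the price points are arbitrary samples from the $0.02 to $0.25/kWh range.

```python
def floor_per_million(price_per_kwh, utilization, tokens_per_second=60,
                      gpu_power_kw=1.2, pue=1.25, gpu_cost=30_000,
                      lifetime_years=5):
    """Return (energy, hardware) cost in $ per million tokens."""
    gpu_hours = 1_000_000 / (tokens_per_second * 3600)
    energy = gpu_hours * gpu_power_kw * pue * price_per_kwh
    # assumption: 8,760 hours per year
    hardware = gpu_hours * gpu_cost / (lifetime_years * 8760 * utilization)
    return energy, hardware


# Electricity-price sensitivity at 60% utilization.
for price in (0.02, 0.06, 0.10, 0.15, 0.25):
    energy, hardware = floor_per_million(price, 0.60)
    total = energy + hardware
    print(f"${price:.2f}/kWh: floor ${total:.2f}, energy share {energy / total:.0%}")

# Utilization effect on the hardware line (independent of electricity price).
_, hw_90 = floor_per_million(0.06, 0.90)
_, hw_30 = floor_per_million(0.06, 0.30)
print(f"hardware at 30% vs 90% utilization: {hw_30 / hw_90:.1f}x")
```

The ratio in the last line depends only on the two utilization values, 0.90 ÷ 0.30, which is why the tripling holds for any GPU price or throughput.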

FAQ

Does this include training costs?

No — it models inference (serving). Training is a separate, one-time energy bill amortized across every future token the model serves.

What about batching and networking?

Tokens-per-second-per-GPU already absorbs batching efficiency; networking, storage and staff sit above the physical floor this tool isolates.

Estimate your own deployment: Energy Cost of Intelligence. Related: Solar + Battery Sizing and Solar + Data Center Optimizer.