Thinking tokens are the hidden reasoning steps a reasoning model generates internally before its visible reply. On most providers they are billed as output tokens even though the user never sees them directly.

Thinking tokens — AI pricing glossary

Reasoning models (OpenAI's o-series / GPT-5.5 Thinking, Anthropic's extended thinking, Gemini Thinking, DeepSeek R1) break their work into two phases: internal chain-of-thought and the final user-facing response. The internal chain-of-thought is paid for but hidden.

This can drastically change the economics of "reasoning" workloads. A 200-word visible answer might cost the equivalent of a 1,500-word generation if the model produced roughly 1,300 words of hidden reasoning first. Always check whether the provider bills thinking as output (most do) and account for it in your cost model.
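The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tokens-per-word ratio is a rough rule of thumb for English text, the $10 per million output tokens price is a made-up placeholder rather than any provider's real rate, and the names `OutputCost` and `words_to_tokens` are hypothetical helpers invented for this sketch.

```python
from dataclasses import dataclass

# Assumption: roughly 4 tokens for every 3 English words (rule of thumb only).
TOKENS_PER_WORD = 4 / 3


def words_to_tokens(words: int) -> int:
    """Convert a word count to an approximate token count."""
    return round(words * TOKENS_PER_WORD)


@dataclass
class OutputCost:
    """Output-side cost when thinking tokens are billed as output tokens."""

    visible_tokens: int           # tokens in the reply the user sees
    thinking_tokens: int          # hidden reasoning tokens
    price_per_million_usd: float  # hypothetical output-token price

    @property
    def billed_tokens(self) -> int:
        # Both visible and hidden tokens count toward the bill.
        return self.visible_tokens + self.thinking_tokens

    @property
    def cost_usd(self) -> float:
        return self.billed_tokens * self.price_per_million_usd / 1_000_000

    @property
    def multiplier(self) -> float:
        # How many times more you pay than for the visible text alone.
        return self.billed_tokens / self.visible_tokens


# 200 visible words plus an assumed 1,300 words of hidden reasoning.
example = OutputCost(
    visible_tokens=words_to_tokens(200),
    thinking_tokens=words_to_tokens(1300),
    price_per_million_usd=10.0,  # placeholder price, not a real quote
)

print(f"billed tokens: {example.billed_tokens}")
print(f"cost: ${example.cost_usd:.4f}")
print(f"multiplier vs visible-only: {example.multiplier:.1f}x")
```

To use this for a real estimate, replace the word counts with the token counts your provider reports in its usage data (where it reports thinking tokens separately) and plug in the actual output price.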


Related terms

Input tokens vs output tokens

Context window