The Physical Stack of AI · The inference economy
You can explain the specific Opus price move, compare it with other public API menus, and separate source-backed drivers from forecast guesses.
Anthropic's public pricing table now shows the Opus ladder at a much lower price point than in the older Opus 4/4.1 era: current Opus 4.7 is listed at \$5 per million input tokens and \$25 per million output tokens, while the deprecated Opus 4/4.1 rows sit at \$15 input and \$75 output. Anthropic's Opus 4.7 announcement also says Opus 4.7 keeps Opus 4.6 pricing while improving capability. The key fact is not just "a cheaper model." It is more capability in the same flagship slot at one-third of the older listed price, on both input (\$5 versus \$15) and output (\$25 versus \$75).
The broader API menu makes the same pressure visible. OpenAI lists GPT-5.4 at \$2.50 input and \$15 output, and GPT-5.5 at \$5 input and \$30 output. Google's Gemini API lists paid Gemini 3.5 Flash at \$1.50 input and \$9 output, with batch/flex options below that. DeepSeek lists V4-Flash at \$0.14 cache-miss input and \$0.28 output. These are live price sheets, so the lesson is the comparison workflow: check the current menu, normalize input/output/cache/batch terms, and avoid treating last quarter's rate as a law.
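The comparison workflow can be sketched in a few lines of Python. This is a minimal illustration, not a billing tool: the `Rate` class, the `workload_cost` and `compare` helpers, and the sample workload of 2 million input tokens and 500,000 output tokens are all hypothetical. It also assumes every rate quoted above is in USD per million tokens, and it ignores cache-hit, batch, and flex discounts, which a real comparison would need to normalize as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """List price in USD per million tokens (assumed unit for every row)."""
    input_per_m: float
    output_per_m: float


# Snapshot of the rates quoted in the text. Live price sheets change,
# so re-check each provider's menu before relying on these numbers.
MENU = {
    "Opus 4/4.1 (deprecated)": Rate(15.00, 75.00),
    "Opus 4.7": Rate(5.00, 25.00),
    "GPT-5.4": Rate(2.50, 15.00),
    "GPT-5.5": Rate(5.00, 30.00),
    "Gemini 3.5 Flash": Rate(1.50, 9.00),
    "DeepSeek V4-Flash (cache miss)": Rate(0.14, 0.28),
}


def workload_cost(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one workload at list price, with no discounts applied."""
    total = input_tokens * rate.input_per_m + output_tokens * rate.output_per_m
    return total / 1_000_000


def compare(menu: dict, input_tokens: int, output_tokens: int) -> list:
    """Return (name, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        name: workload_cost(rate, input_tokens, output_tokens)
        for name, rate in menu.items()
    }
    return sorted(costs.items(), key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens, 500k output tokens.
    for name, cost in compare(MENU, 2_000_000, 500_000):
        print(f"{name:32s} {cost:8.2f} USD")
```

On this sample workload the deprecated Opus 4/4.1 rates come to \$67.50 and the Opus 4.7 rates to \$22.50, which is the same one-third ratio as the list prices. The ranking across providers shifts with the input-to-output mix, so it is worth rerunning the comparison with token counts that match the actual workload.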
Three forces explain why price menus can move this fast: