The Physical Stack of AI · Chips and accelerators
You can name three specialist inference chip makers in 2026 (Cerebras, Groq, Etched), explain the architectural bet each is making, and place them against general-purpose GPUs.
There is a category of accelerator that doesn't try to be a GPU at all. These are specialist inference chips, built on the bet that a chip tuned for one shape of workload can beat a general-purpose GPU on cost, speed, or both, even though GPUs have the more mature software ecosystem and a much larger set of supporting tools.
Three companies illustrate three different bets. Cerebras builds a chip the size of a dinner plate, a single wafer-scale processor, to remove the cost of chip-to-chip interconnect. The company went public in May 2026 at $5.55 billion, after signing a $20 billion contract with OpenAI for inference capacity. Groq builds a deterministic chip with no HBM at all, betting that output-token speed (latency per token, not throughput per dollar) is best served by on-chip SRAM and a different memory architecture entirely; Nvidia bought Groq for $20 billion in December 2025. Etched hardcodes the transformer architecture itself into silicon, giving up flexibility in exchange for enormous speed on one model family.
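To make the Groq bet concrete, here is a rough back-of-envelope sketch in Python. It assumes single-stream decoding is limited by memory bandwidth, so every generated token requires streaming all model weights once. The model size and both bandwidth figures are hypothetical round numbers chosen for illustration, not specifications of any real chip, and the function names are invented for this sketch.

```python
def decode_tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Rough upper bound on single-stream decode speed.

    Assumes decoding is memory-bandwidth bound: each new token needs
    one full read of the model weights, and nothing else limits speed.
    """
    if model_bytes <= 0 or bandwidth_bytes_per_s <= 0:
        raise ValueError("model_bytes and bandwidth_bytes_per_s must be positive")
    return bandwidth_bytes_per_s / model_bytes


def ms_per_token(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Per-token latency in milliseconds under the same assumption."""
    return 1000.0 / decode_tokens_per_second(model_bytes, bandwidth_bytes_per_s)


# Hypothetical model: 50 billion parameters stored at 2 bytes each.
MODEL_BYTES = 50e9 * 2

# Hypothetical memory tiers; the bandwidths are illustrative assumptions only.
MEMORY_TIERS = {
    "off-chip HBM-style memory (assumed 2 TB/s)": 2e12,
    "on-chip SRAM-style memory (assumed 50 TB/s aggregate)": 50e12,
}

for name, bandwidth in MEMORY_TIERS.items():
    tps = decode_tokens_per_second(MODEL_BYTES, bandwidth)
    latency = ms_per_token(MODEL_BYTES, bandwidth)
    print(f"{name}: about {tps:.0f} tokens/s, {latency:.1f} ms per token")
```

Under these assumed numbers the faster memory tier cuts per-token latency by the same factor as the bandwidth gap. The estimate ignores batching, compute limits, and the limited capacity of on-chip SRAM, which is where throughput per dollar comes back into the comparison.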
Chapter contains 3 lessons.