Jan. 26, 2026 — Microsoft today introduced Maia 200, a new inference accelerator engineered to improve the economics of AI token generation. Built on TSMC's 3nm process, Maia 200 pairs native FP8/FP4 tensor cores with a redesigned memory system: 216 GB of HBM3e at 7 TB/s and 272 MB of on-chip SRAM, [...]
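To see why headline memory bandwidth matters for "the economics of AI token generation," a rough back-of-envelope sketch helps. This is not from the announcement: the model size and batch size below are illustrative assumptions; only the 7 TB/s figure comes from the article. During autoregressive decoding at small batch sizes, every generated token must stream the full set of model weights from HBM, so bandwidth, not compute, typically sets the throughput ceiling.

```python
# Illustrative roofline-style estimate (assumptions, not article specs):
# a hypothetical 70B-parameter model served in FP8 (1 byte per weight).
HBM_BANDWIDTH_BYTES_S = 7e12   # 7 TB/s HBM3e, per the article
MODEL_PARAMS = 70e9            # hypothetical 70B-parameter model
BYTES_PER_PARAM = 1            # FP8: one byte per weight

weight_bytes = MODEL_PARAMS * BYTES_PER_PARAM
# Bandwidth-bound ceiling at batch size 1: one full weight read per token.
tokens_per_s = HBM_BANDWIDTH_BYTES_S / weight_bytes
print(f"~{tokens_per_s:.0f} tokens/s per replica (bandwidth-bound ceiling)")
# → ~100 tokens/s per replica (bandwidth-bound ceiling)
```

The same arithmetic also shows why lower-precision formats like FP4 pay off: halving bytes per weight doubles this ceiling without any change in raw bandwidth.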
Microsoft Unveils Maia 200 Inference Chip for Large-Scale AI Deployment - HPCwire