Google has introduced Ironwood, its seventh-generation AI chip, aimed at handling demanding AI inference workloads at a large scale.
During Google Cloud Next 25 on April 9, Google revealed the new Ironwood tensor processing unit (TPU) as a significant advancement in AI development and infrastructure.
Amin Vahdat, vice president and general manager of machine learning at Google’s Systems and Cloud AI division, described Ironwood as the company’s most powerful and energy-efficient TPU to date, designed specifically to run inference-focused AI models at scale. Over the past decade, TPUs have powered Google’s most demanding AI training and serving workloads, and have enabled cloud customers to do the same.
Ironwood signifies a shift in generative AI: a move beyond responsive models that simply retrieve and present real-time information, toward proactive models that generate their own insights and interpretations. The chip is tailored to the computational and communication demands of this new phase of AI development.
As part of the Google Cloud AI Hypercomputer architecture, Ironwood scales up to 9,216 liquid-cooled chips linked by Inter-Chip Interconnect (ICI) networking, with the full configuration drawing nearly 10 MW of power. Each chip delivers a peak performance of 4,614 teraflops. At full scale, a 9,216-chip pod reaches 42.5 exaflops, which Google says is more than 24 times the compute power of El Capitan, the world’s largest supercomputer.
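The pod-level figure follows directly from the per-chip number. A quick back-of-the-envelope check, using only the values quoted above:

```python
# Sanity-check Ironwood's quoted pod-level compute from the per-chip figure.
chips_per_pod = 9_216       # liquid-cooled chips per pod (as quoted)
tflops_per_chip = 4_614     # peak teraflops per chip (as quoted)

pod_tflops = chips_per_pod * tflops_per_chip
pod_exaflops = pod_tflops / 1e6  # 1 exaflop = 1,000,000 teraflops

print(f"Pod peak: {pod_exaflops:.1f} exaflops")  # → Pod peak: 42.5 exaflops
```

The product comes out to roughly 42.5 exaflops, matching the quoted pod figure.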
Key features of Google Ironwood include:
1. Significant performance gains with a focus on efficiency. Ironwood’s performance per watt is twice that of the previous generation TPU, Trillium.
2. Increased high-bandwidth memory (HBM) capacity. Ironwood offers 192 GB per chip, six times that of Trillium.
3. Improved HBM bandwidth, reaching 7.2 TBps per chip. This high bandwidth ensures rapid data access for memory-intensive AI workloads.
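The article gives Ironwood’s memory specs only as a capacity and a multiplier over Trillium; the previous generation’s capacity is implied rather than stated. A small sketch of the implied figure, assuming the stated 6x ratio applies per chip:

```python
# Derive the implied Trillium HBM capacity from the ratio quoted in the article.
ironwood_hbm_gb = 192   # HBM capacity per Ironwood chip (as quoted)
hbm_ratio = 6           # "six times that of Trillium" (as quoted)

trillium_hbm_gb = ironwood_hbm_gb / hbm_ratio
print(f"Implied Trillium HBM per chip: {trillium_hbm_gb:.0f} GB")  # → 32 GB
```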
Vahdat highlighted that Ironwood represents a breakthrough in the age of inference, offering increased compute power, greater memory capacity, ICI networking advancements, and improved reliability. These gains, combined with a nearly twofold improvement in power efficiency, let customers tackle training and serving workloads with high performance and low latency while meeting the growing demand for compute.
Google’s Ironwood announcement is part of a series of next-gen chip launches aimed at powering large-scale AI workloads. Nvidia, Intel, and Microsoft have also introduced new chips and processors designed to support AI, virtualization, and quantum computing, marking significant advancements in the field of artificial intelligence.