(Image credit: Google)

- Google's Ironwood TPU scales to 9,216 chips with a record 1.77PB of shared memory
- Dual-die architecture delivers 4,614 TFLOPs of FP8 and 192GB of HBM3e per chip
- Enhanced reliability, cooling, and AI-assisted design features enable efficient inference workloads at scale

Google closed out the machine learning sessions at the recent Hot Chips 2025 event with a detailed look at its newest tensor processing unit, Ironwood.

The chip, which was first revealed at Google Cloud Next 25 back in April 2025, is the company's first TPU designed primarily for large-scale inference workloads rather than training, and arrives as its seventh generation of TPU hardware.

Each Ironwood chip integrates two compute dies, delivering 4,614 TFLOPs of FP8 performance, while eight stacks of HBM3e provide 192GB of memory capacity per chip, paired with 7.3TB/s of bandwidth.

1.77PB of HBM

Google has built in 1.2TB/s of IO bandwidth, allowing a system to scale up to 9,216 chips per pod without glue logic. That configuration reaches a whopping 42.5 exaflops of performance.

Memory capacity also scales impressively.
Across a pod, Ironwood offers 1.77PB of directly addressable HBM. That level sets a new record for shared-memory supercomputers, and is enabled by optical circuit switches linking racks together. The hardware can reconfigure around failed nodes, restoring workloads from checkpoints.

The chip integrates multiple features aimed at stability and resilience.
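The pod-level figures follow directly from the per-chip specs quoted above. A quick sanity check, assuming the article's per-chip numbers and decimal (SI) unit prefixes:

```python
# Sanity-check Ironwood pod-level figures from the per-chip specs in the article.
chips_per_pod = 9216
hbm_per_chip_gb = 192        # GB of HBM3e per chip
fp8_per_chip_tflops = 4614   # FP8 TFLOPs per chip

# GB -> PB and TFLOPs -> exaflops, using decimal (1 PB = 1,000,000 GB) prefixes
pod_hbm_pb = chips_per_pod * hbm_per_chip_gb / 1_000_000
pod_fp8_eflops = chips_per_pod * fp8_per_chip_tflops / 1_000_000

print(f"Pod HBM: {pod_hbm_pb:.2f} PB")          # ≈ 1.77 PB
print(f"Pod FP8: {pod_fp8_eflops:.1f} EFLOPs")  # ≈ 42.5 exaflops
```

Both headline numbers check out: 9,216 × 192GB ≈ 1.77PB of HBM and 9,216 × 4,614 TFLOPs ≈ 42.5 exaflops.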
These include an on-chip root of trust, built-in self-test functions, and measures to mitigate silent data corruption. Logic repair functions are included to improve manufacturing yield. An emphasis on RAS (reliability, availability, and serviceability) is visible throughout the architecture.

Cooling is handled by a cold-plate solution supported by Google's third generation of liquid-cooling infrastructure. Google claims a twofold improvement in performance per watt compared with Trillium.
Dynamic voltage and frequency scaling further improves efficiency across varied workloads.

Ironwood also incorporates AI techniques within its own design: machine learning was used to help optimize the ALU circuits and floor plan. A fourth-generation SparseCore has been added to accelerate embeddings and collective operations, supporting workloads such as recommendation engines.

Deployment is already underway at hyperscale in Google Cloud data centers, although the TPU remains an internal platform not available directly to customers.

Commenting on the session at Hot Chips 2025, ServeTheHome's Ryan Smith said, "This was an awesome presentation."
*Reporting by Techradar.*