Thinking Machines Lab Expands Use of Google Cloud AI Hypercomputer

Google Cloud said it signed a new agreement with Thinking Machines Lab to provide expanded AI infrastructure capabilities and capacity on its AI Hypercomputer. The agreement will expand Thinking Machines’ use of Google Cloud as it develops its platform and trains frontier models.

Thinking Machines will use A4X Max virtual machines with NVIDIA Blackwell architecture through Google Cloud, including early access to NVIDIA GB300 NVL72. The company said early testing showed training and serving speed increases of 2X with A4X Max VMs compared with prior-generation GPUs, supported by Google Cloud’s Jupiter network, which provides fast weight transfers for its reinforcement learning workloads.

Beyond compute and networking, Thinking Machines said it uses services including Google Kubernetes Engine for large-scale orchestration, Spanner, Cluster Director, Cloud Storage, and Anywhere Cache. It said it combined Cloud Storage, Spanner for transactional metadata, and a custom node-level caching solution to support continuous training while serving production workloads at global scale.

“By leveraging A4X Max and the AI Hypercomputer integrated stack, Google Cloud got us running at record speed with the reliability we demand,” said Myle Ott, Founding Researcher, TML. “This seamless integration of high-performance compute, fast storage, GKE orchestration, and automated remediation via Cluster Director has allowed us to focus on the unique aspects of the stack like Tinker and reinforcement learning.”

Thinking Machines began working with Google Cloud in 2025.

“The team at Thinking Machines Lab is generating very exciting research and product offerings that will help organizations more effectively utilize AI,” said Mark Lohmeyer, VP & GM, AI and Computing Infrastructure at Google Cloud. “Through this new agreement, and our deep partnership with NVIDIA, we’ll help Thinking Machines accelerate even further, using Google Cloud’s AI Hypercomputer which brings together purpose-built hardware, open software and flexible consumption models in an optimized architecture.”

“As model sizes grow and reinforcement learning workflows become more complex, system-level optimization becomes critical,” said Ian Buck, Vice President and General Manager of Hyperscale and HPC at NVIDIA. “NVIDIA GB300 NVL72 provides the performance leap and interconnect bandwidth needed to reduce bottlenecks and improve goodput. Running on Google Cloud’s integrated AI stack, these advancements strengthen the platform — making it faster and smarter — so TML can extend and build on what the world’s researchers are creating with NVIDIA.”