Nvidia’s third-quarter earnings call brought new disclosures that shift the competitive balance between its GPU platform and Google’s TPU effort.
The most significant development was Nvidia’s confirmation that Anthropic, one of Google Cloud’s largest TPU customers, will adopt Nvidia systems for the first time. Nvidia also framed its architecture as the only platform capable of running every category of frontier model and detailed new performance and efficiency results that strengthen its position in large-scale training and inference.
“We’re now the only architecture in the world that runs every AI model,” Jensen Huang said on the call.
Pressure on Google’s TPU Strategy
Anthropic has been one of Google Cloud’s most visible TPU partners. In October, the company announced it would expand its use of Google TPUs to as many as one million chips and more than one gigawatt of compute capacity by 2026. That deal positioned TPUs as the primary training platform for Anthropic’s Claude models.
According to Nvidia, that dynamic is changing. Anthropic is adopting Nvidia systems for the first time, backed by an initial compute commitment of up to one gigawatt built on Grace Blackwell and Rubin systems. The commitment marks a material shift in a relationship that had centered on Google Cloud’s TPU infrastructure.
“For the first time, Anthropic is adopting Nvidia,” Huang said.
CUDA as a Broad Moat Against TPU Specialization
Nvidia used the call to emphasize the breadth of its software and model support. The company said its architecture can run every major frontier, scientific, and robotics model across pre-training, post-training, and inference.
The company contrasted this with the narrower specialization of fixed-function accelerators. Huang argued that the diversity of model architectures, the rapid pace at which they change, and the need for backward compatibility all work in Nvidia’s favor.
CUDA remains central to this argument. Nvidia said that its installed base continues to run at full utilization, including A100 GPUs shipped six years ago, because of the company’s software stack.
“CUDA’s compatibility and our massive installed base extend the life of Nvidia systems well beyond their original estimated useful life,” the company said.
Performance and Power Constraints Reinforce Nvidia’s Position
Nvidia detailed new performance results from the Blackwell architecture. In MLPerf training benchmarks, Blackwell Ultra delivered a five-fold speedup over Hopper. For inference, Nvidia reported ten times higher performance per watt and ten times lower cost per token compared with the previous generation.
The company tied these results to real-world power constraints. Huang said that large data centers are limited by available megawatts and that performance per watt directly determines how much revenue a customer can generate from a fixed-power facility.
“You still only have one gigawatt of power,” Huang said. “Performance per watt translates directly to your revenues.”
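Huang’s arithmetic is easy to sketch. The back-of-envelope model below is a minimal illustration of the power-limited revenue argument; the facility size, efficiency figures, and token price are all assumed for illustration, not disclosed values.

```python
# Back-of-envelope: why performance per watt caps revenue in a
# power-limited data center. All figures are illustrative assumptions.

FACILITY_POWER_W = 1e9          # fixed budget: one gigawatt
TOKENS_PER_JOULE_OLD = 50       # assumed efficiency of the prior generation
TOKENS_PER_JOULE_NEW = 500      # ten times the performance per watt
PRICE_PER_MILLION_TOKENS = 1.0  # assumed price, in dollars

def annual_revenue(tokens_per_joule: float) -> float:
    """Yearly token revenue from a facility with a fixed power budget."""
    tokens_per_second = tokens_per_joule * FACILITY_POWER_W  # tokens/J x J/s
    tokens_per_year = tokens_per_second * 365 * 24 * 3600
    return tokens_per_year / 1e6 * PRICE_PER_MILLION_TOKENS

old = annual_revenue(TOKENS_PER_JOULE_OLD)
new = annual_revenue(TOKENS_PER_JOULE_NEW)
print(f"prior generation: ${old:,.0f} per year")
print(f"new generation:   ${new:,.0f} per year")
print(f"ratio:            {new / old:.0f}x")
```

Because power is the binding constraint, the revenue ratio tracks the performance-per-watt ratio exactly, which is the substance of Huang’s claim.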
He also pushed back on the idea that inference is a simple workload, noting that chain-of-thought and reasoning models require large amounts of computation.
“Thinking is quite hard,” he said. “Inference is the hardest of all.”
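The scale involved can be approximated with the standard rule of thumb that a dense transformer spends roughly two FLOPs per parameter per generated token. The sketch below is a hypothetical illustration; the model size and token counts are assumptions, not figures from the call.

```python
# Rough FLOP count for answering one query, using the common
# ~2 * parameter-count estimate of FLOPs per generated token.
# Model size and token counts are illustrative assumptions.

PARAMS = 70e9                 # assumed 70B-parameter dense model
FLOPS_PER_TOKEN = 2 * PARAMS  # standard forward-pass estimate

direct_answer_tokens = 200    # short reply with no visible reasoning
reasoning_tokens = 10_000     # chain-of-thought emitted before the reply

direct = direct_answer_tokens * FLOPS_PER_TOKEN
with_reasoning = (reasoning_tokens + direct_answer_tokens) * FLOPS_PER_TOKEN

print(f"direct answer:  {direct:.2e} FLOPs")
print(f"with reasoning: {with_reasoning:.2e} FLOPs")
print(f"multiplier:     {with_reasoning / direct:.0f}x")
```

Every hidden reasoning token costs a full forward pass, so a model that thinks for thousands of tokens before answering multiplies the compute bill accordingly, and longer contexts add attention costs on top of this estimate.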