The GPU Era Is Fading and Qualcomm Smells Opportunity

Its new AI chips aim to power the world’s data centers more efficiently and to end the GPU’s long reign over artificial intelligence

Every time ChatGPT drafts an email or Midjourney renders a portrait, the heavy lifting is not training a new brain, but running one that already exists. This phase of computation, known as inference, is fast becoming the most expensive and competitive part of the artificial intelligence ecosystem.

For years, AI progress was measured by who could train the biggest model or deploy the most GPUs. Now, as those models are rolled out across billions of devices and enterprise applications, the focus has shifted to how efficiently they can run. Training built the brains of AI, but inference is what keeps them alive.

That shift has triggered what might be called an architecture war, a contest to design chips that make inference faster, cheaper, and less power-hungry. The latest challenger is Qualcomm, a company better known for powering smartphones than data centers.

Back in 2023, Qualcomm was grappling with a steep downturn in its handset business. Global smartphone demand had fallen, dragging revenue and forcing job cuts across divisions. That slump marked a turning point. Qualcomm began looking beyond mobile chips toward cars, PCs, and, crucially, the data centers that would soon power the world’s AI systems. The pivot into AI infrastructure was not just opportunistic; it was existential.

The San Diego-based company just announced the AI200 and AI250, its first data-center processors designed for artificial intelligence workloads. The move triggered a 20 percent surge in Qualcomm’s stock, its biggest one-day jump in years, pushing its valuation above $200 billion. It also signaled that the GPU’s long dominance of AI computing might finally be loosening.

A New Kind of Chip War

For more than a decade, Nvidia’s GPUs have defined AI computing. Their parallel architecture made them the default engine for training large language models such as OpenAI’s GPT series and Google’s Gemini. But as those models stabilize, the economics of AI are shifting.

Training is a one-time investment. Inference, by contrast, is a recurring cost: billions of daily queries, chat responses, and recommendations. These workloads demand speed, efficiency, and low latency. GPUs, built for throughput, are not always ideal for that purpose.

Qualcomm’s answer is the neural processing unit (NPU), a specialized chip optimized for running pre-trained neural networks efficiently. Its AI200 and AI250 accelerators are purpose-built for inference: liquid-cooled, power-efficient racks that draw roughly 160 kilowatts, with 768 gigabytes of LPDDR memory per accelerator card to reduce data bottlenecks. The upcoming AI250 adds a “near-memory compute” design that moves data closer to the processors, addressing one of AI’s biggest performance constraints.

“With Qualcomm AI200 and AI250, we’re redefining what’s possible for rack-scale AI inference,” said Durga Malladi, senior vice president and general manager for data center and edge products, in the official launch announcement.

The company’s first announced customer is Humain, a Saudi AI startup backed by the kingdom’s Public Investment Fund. It plans to deploy 200 megawatts of Qualcomm-powered compute beginning in 2026 as part of its effort to build a regional AI hub.

“By establishing advanced AI data centers powered by Qualcomm’s industry-leading inference solutions, we are helping the kingdom create a technology ecosystem that will accelerate its AI ambitions,” said Cristiano Amon, Qualcomm’s chief executive, in the same announcement.

That partnership shows how AI hardware is now inseparable from geopolitics; chips have become instruments of national strategy.

For now, Qualcomm’s market share is negligible next to that of Nvidia, whose data-center business is expected to exceed $180 billion in revenue this year. But the reaction on Wall Street suggests that investors see more than a single product launch. As Bloomberg’s Ed Ludlow observed on air, “It’s a small but real entry into a market Nvidia still dominates, Qualcomm’s shot at a monopoly that’s been unchallenged.”

Fragmentation, Efficiency, and the Long Game

If training defined the GPU era, inference may define what comes next. McKinsey estimates that global data-center capital expenditure will approach $6.7 trillion by 2030, with the majority spent on AI infrastructure. A growing portion of that investment will go toward running models, not training them.

That is where Qualcomm hopes to carve its niche. “Training really is Nvidia,” said Stacy Rasgon, managing director and senior semiconductor analyst at Bernstein, in an interview with CNBC. “Inference should be more fragmented. The workloads are more specialized, power matters more, and a small slice of a big pie could be big enough for players like Qualcomm.”

The inference market’s openness has already attracted new entrants. AMD has expanded its Instinct line, Intel is developing the Crescent Island AI chip, and startups like Groq and Cerebras are promoting radically different architectures. Hyperscalers such as Google, Amazon, and Microsoft are designing their own accelerators, a shift that hints at a broader structural diversification in AI hardware.

Even the software ecosystem is adjusting. Tushar Katarki, head of AI products at Red Hat, said in a previous interview with AIM Media House that the company is “moving from AI experimentation to AI deployment, building for inference, not training.” The mindset shift is industry-wide: inference is no longer a downstream task but the central design constraint across both hardware and software.

Qualcomm has yet to publish performance benchmarks or MLPerf results for the AI200 and AI250, and analysts such as Bernstein’s Stacy Rasgon caution that its cost and efficiency claims will have to be proven once systems are deployed. Still, the company’s long-term cadence, with annual AI chip launches planned, mirrors Nvidia’s and shows a seriousness absent from its earlier attempts to enter the data-center market.

Amon appears aware of both the hype and the uncertainty. Speaking at the Fortune Global Forum in Riyadh, he compared the current AI boom to the early days of the internet. “It’s hard right now to declare who the winners are,” he said. “The opportunity is probably bigger than people think.”

The parallel fits. In 1999, investors thought the story of the web was who could build the biggest portal. The real winners were the companies that built the infrastructure: the routers, servers, and chips that kept the internet running. AI is replaying that cycle.
