AI’s Infrastructure Buckles as a Startup Tests a Distributed Alternative

Hub.xyz wants to rebuild AI’s data layer by turning unused internet bandwidth into live data streams

AI’s infrastructure is starting to strain under its own weight. Models keep growing, but the returns are shrinking. Each new generation costs more to train, deploy, and feed with fresh data.

A recent MIT study found that even frontier models may soon hit diminishing returns, as smaller systems trained more efficiently begin to close the gap. “In the next five to ten years, things are very likely to start narrowing,” said Neil Thompson, one of the study’s authors.

The economics back that up. McKinsey estimates that AI-ready data centers will require about $5.2 trillion in new investment by 2030, and Deloitte reports that two-thirds of executives already see funding gaps as their biggest constraint.

Meanwhile, data (the raw material that keeps models useful) is proving harder to scale than compute. A Gartner survey found that 63 percent of organizations lack the right data-management practices for AI, and APQC links most failed AI projects to weak data foundations.

“AI’s biggest bottleneck is no longer compute, it’s data readiness.”

That gap has created space for companies rethinking how data flows. Scale AI and Appen depend on large human-labeling operations; Bright Data uses proxy networks to scrape the web; Acceldata helps enterprises monitor and optimize pipelines. All of them work within centralized systems.

Hub.xyz, a Palo Alto startup founded in 2022, takes a different approach. Describing its system as a decentralized, Web3-style network of nodes, it repurposes idle bandwidth across its network into usable infrastructure for AI systems. The company calls this a “programmable-bandwidth network”: a distributed system that turns everyday internet connections into live data streams for AI companies.

“We turn a wasted resource – bandwidth – into something valuable for AI companies around data,” said Tim Sprecher, co-founder and COO of Hub.xyz, in a conversation with AIM Media House.

Where incumbents rely on scale, Hub is betting on distribution. It believes the future of AI infrastructure won’t be built in data centers, but across the millions of connections already online.

“We believe the solution isn’t more infrastructure, it’s using what’s already there more intelligently.”

Some observers have expressed caution about large infrastructure builds. “It feels like there’s been a real shift… in terms of the public just becoming aware of what data centres are and becoming increasingly skeptical,” said Ben Green, assistant professor at the University of Michigan’s School of Information.

Hub’s Distributed Model

Hub’s system operates on the premise of using what already exists. Users install a browser extension or desktop application and share a fraction of their internet connection with the network, which operates through Web3-style peer nodes rather than centralized servers. Hub uses this distributed bandwidth to collect public data at scale for clients in the AI sector.

“Through our pipeline we can process, refine, and make it ready for companies,” Sprecher said.

The company began its business-to-business sales cycle about six weeks before the interview, Sprecher said. Early customers include a voice-AI firm seeking rare-language audio data and a text-to-video startup that needs specialized content. Hub is also in talks with a top-ten technology company, he said. The system is designed for multimodal data (video, image, and audio), he added, reflecting growing demand from AI companies for richer, more varied training material.

“We can transform something unusable into AI-ready, gold-standard data,” he said.

Sprecher said Hub also plans to expand into crowdsourced data collection, allowing users to contribute specific content, such as short videos or anonymized documents, to meet training needs that can’t be filled through public data alone.

Hub manages the full data pipeline internally, from acquisition to refinement and annotation. Its process includes automated cleaning, transcription, and, for some customers, targeted human review of five to ten percent of a dataset. “For some customers, five to ten percent of the datasets will be QA’d by humans to make sure the data we deliver is AI-ready,” he said.
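
The human-review step Sprecher describes amounts to drawing a fixed share of a dataset for manual checks. A minimal sketch of that idea follows. It is an illustration only, not Hub's implementation: the function name, the 5 percent default, and the use of uniform random sampling are all assumptions.

```python
import random


def select_for_human_review(item_ids, fraction=0.05, seed=0):
    """Pick a random share of items for human QA (hypothetical helper)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    items = list(item_ids)
    if not items:
        return []
    # Review at least one item, even for very small datasets.
    k = max(1, round(len(items) * fraction))
    # A seeded generator keeps the selection reproducible for audits.
    rng = random.Random(seed)
    return sorted(rng.sample(items, k))


if __name__ == "__main__":
    sample = select_for_human_review(range(1000), fraction=0.05)
    print(len(sample))  # 50 items out of 1,000
```

Raising `fraction` to 0.10 would cover the upper end of the range Sprecher cites. A production system might instead weight the sample toward items that automated checks flagged as uncertain.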

Hub’s structure also allows for active and passive participation. While most users contribute automatically by sharing bandwidth, a subset will take short “quests” for annotation or quality checks. “They will be able to say, ‘I have ten minutes today for some annotations,’ and earn extra points,” Sprecher said.

Participants earn IQ points, which Hub says it intends to convert into a Hub token in future. Hub’s public materials describe the token as a utility and governance mechanism within the network, used to reward contributors and enable community participation; the company has not published final tokenomics or regulatory guidance on trading or market status.

Hub’s network is being seeded through its partnership with SwissBorg, Europe’s largest community-driven wealth platform. SwissBorg led Hub’s pre-seed and joined its seed round, bringing total fundraising to $1.7 million. The collaboration gives Hub access to more than one million KYC-verified users who can become node operators once the network goes live. “We see with SwissBorg a longer partnership that goes beyond funding, to build something valuable alongside their community,” Sprecher said.

SwissBorg’s verified community also provides a compliance buffer that most decentralized networks lack. Each user is pre-vetted under European KYC standards, giving Hub a path to scalability that aligns with enterprise and regulatory expectations.

For enterprise clients, Hub positions its model as complementary to existing infrastructure rather than a replacement.

“Enterprises don’t really care if the network is distributed: they care about results,” Sprecher said. “If you can provide large-scale, high-quality data at lower cost, that’s what moves the deal.”

The Efficiency Rebellion

Hub’s timing aligns with a broader shift across AI. Researchers and executives alike are beginning to question whether ever-larger models and data centers represent progress or inertia. McKinsey’s projection of $6.7 trillion in total data-center spending by 2030, of which about $5.2 trillion is for AI-ready capacity, illustrates both the scale of the commitment and the potential inefficiency. If the next generation of AI depends on efficiency rather than size, companies like Hub represent a structural experiment.

“We’re proving that the future of AI infrastructure is distributed, powered by people across the globe.”

The company’s architecture departs from the logic of hyperscale computing. Rather than concentrate resources in a few massive facilities, Hub distributes them across thousands of nodes. Each user becomes a contributor to a shared data pipeline that, in theory, grows in coverage and resilience as participation expands.

A 2023 paper in Frontiers in Integrative Neuroscience proposed that intelligence in the human body is not centralized in the brain but distributed across neural and immune systems. The comparison is imperfect, but the parallel holds: a system can become more capable by decentralizing its work.

Hub’s next test is operational. The company employs about ten people today and is hiring additional machine-learning engineers and solutions architects to meet demand, Sprecher said. “We’re getting a lot of demand with many use cases… For that, we need more workforce: more ML engineers, more architects.”

The enterprise case for distributed data collection is still largely untested. Pressure on centralized AI infrastructure suggests a market for alternatives, but Hub still has to demonstrate that its model can meet quality and compliance standards while staying cost-efficient.

Mukundan Sivaraj
Mukundan covers the AI startup ecosystem for AIM Media House. Reach out to him at mukundan.sivaraj@aimmediahouse.com.