Databricks Says Clean AI Data Doesn’t Exist and Acquires Fennel

Machine learning models are only as good as the data they learn from.
By all accounts, Databricks has a data problem, and so does everyone else. “Everybody has some data, and has an idea of what they want to do,” said Jonathan Frankle, chief AI scientist at Databricks’ Mosaic AI. “Nobody shows up with nice, clean fine-tuning data that you can stick into a prompt or an [application programming interface].” That gap between ambition and usable input has become the defining bottleneck in enterprise AI. Dirty data, not model size or GPU availability, is what is holding most organizations back from realizing the promise of generative AI. And Databricks, a company that has built its reputation on infrastructure for training large models, now faces the same challenge its customers do: the data is too messy. To close that gap, Databr

Anshika Mathews
Anshika is the Senior Content Strategist for AIM Research. She holds a keen interest in technology and related policy-making and its impact on society. She can be reached at anshika.mathews@aimresearch.co