Merck's New AI Partnership Reveals Drug Discovery's Biggest Bottleneck

Protillion is tackling a problem many biotech companies now face: generating the biological data needed to improve AI systems.
Merck just signed a multi-target drug discovery collaboration and license agreement with Protillion Biosciences, a California-based biotechnology company specializing in lab-in-the-loop AI drug design. Under the terms of the agreement, Protillion will receive an undisclosed upfront payment and is eligible to receive research, development, and commercial milestone payments of up to $510 million for the successful development of multiple therapies.
The collaboration combines Merck's therapeutic development expertise with Protillion's Prot-MaP protein-engineering platform to discover new biologic medicines. The two companies said their first two programs will focus on inflammatory diseases.
Prot-MaP, short for Protein Display on a Massively Parallel Array, is a megascale data generation platform designed to deliver large training sets to protein design AI systems. It works by generating tens of millions of clusters of immobilized proteins directly on an Illumina DNA sequencing flow cell, enabling the quantitative analysis of protein libraries and the characterization of millions of variants per run.
The platform can test up to 1 million protein variants simultaneously in a single experiment and generate results in as little as 48 hours. Running multiple platforms in parallel, Protillion says it can scale that output to generate large volumes of experimental data on demand.
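As a rough illustration of how parallel platforms scale output, the best-case figures above can be turned into a back-of-the-envelope estimate. This is only a sketch: the function name, the number of platforms, and the assumption of back-to-back runs with no downtime are hypothetical and not from Protillion. The 1 million variants and 48 hours are the article's "up to" and "as little as" figures, so the result is an upper bound.

```python
def variants_per_week(platforms: int,
                      variants_per_run: int = 1_000_000,
                      hours_per_run: float = 48.0) -> int:
    """Upper-bound estimate of variants tested per week.

    Assumes (hypothetically) that every platform runs back to back
    with no setup time or downtime, and counts only whole runs.
    """
    hours_per_week = 7 * 24
    runs_per_week = int(hours_per_week // hours_per_run)  # whole runs only
    return platforms * variants_per_run * runs_per_week


# Example: four platforms (an illustrative number, not a reported one)
print(variants_per_week(4))  # 12000000
```

Under these assumptions each platform completes three full runs a week, so output grows linearly with the number of platforms.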
Curtis Layton, PhD, CEO and Co-founder of Protillion Biosciences, developed Prot-MaP as a postdoctoral fellow at Stanford University School of Medicine before founding Protillion in 2019 to commercialize the technology.
"Prot-MaP is a technology platform that allows us to test millions of protein interactions simultaneously, generating an unprecedented amount of data in a matter of days rather than months," Robert Hollingsworth, PhD, Chief Scientific Officer at Protillion, told Genetic Engineering and Biotechnology News.
The platform's design reflects a deliberate inversion of the industry's standard approach to AI drug discovery. "Many companies start with AI and then look for data. We took the opposite approach," Hollingsworth said.
Protillion combines that experimental output with proprietary machine learning tools to identify the best-performing protein candidates and design improved versions. "Prot-MaP combines high-throughput experimentation with AI. The platform allows us to rapidly generate the data, and the AI helps us learn from it, creating a cycle that accelerates the discovery of next-generation biologic medicines," Hollingsworth said.
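The cycle Hollingsworth describes (generate data, learn from it, design improved candidates) can be sketched as a toy generate-measure-select loop. Everything here is an illustrative assumption rather than Protillion's method: the `assay` function is a made-up stand-in for a lab measurement, `TARGET` is a hypothetical optimum, and the "learning" step simply ranks measured sequences where a real system would fit a machine learning model.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TARGET = "MKTAYIAK"  # hypothetical optimum, unknown to the design step


def assay(seq: str) -> float:
    """Stand-in for a lab measurement: fraction of positions matching TARGET."""
    return sum(a == b for a, b in zip(seq, TARGET)) / len(TARGET)


def mutate(seq: str, rng: random.Random) -> str:
    """Return a copy of seq with one randomly chosen position replaced."""
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(AMINO_ACIDS) + seq[i + 1:]


def lab_in_the_loop(rounds: int = 5, library_size: int = 200,
                    keep: int = 10, seed: int = 0) -> list[float]:
    """Run the toy loop and return the best measured score after each round."""
    rng = random.Random(seed)
    library = ["".join(rng.choice(AMINO_ACIDS) for _ in TARGET)
               for _ in range(library_size)]
    measured: dict[str, float] = {}
    best_per_round = []
    for _ in range(rounds):
        # Lab step: measure every variant in the current library.
        for seq in library:
            measured[seq] = assay(seq)
        # Learn step (simplified): rank everything measured so far.
        top = sorted(measured, key=measured.get, reverse=True)[:keep]
        best_per_round.append(measured[top[0]])
        # Design step: propose the next library by mutating top performers.
        library = [mutate(rng.choice(top), rng) for _ in range(library_size)]
    return best_per_round


print(lab_in_the_loop())
```

Because each round ranks all measurements collected so far, the best score can only stay level or improve, which is the basic appeal of pairing fast experiments with a selection or learning step.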
The first wave of AI drug discovery focused on prediction. Companies built algorithms and computational platforms designed to identify promising drug candidates faster and reduce laboratory work. Those tools delivered advances, but they also exposed a structural problem: drug-discovery AI systems depend on experimental biological data that is expensive and time-consuming to generate, unlike large language models that can draw on publicly available internet text.
Protillion built its business around that constraint. Its "Megascale Data + AI" approach treats data generation as a core capability, and Merck's investment reflects an industry-wide shift in where the competitive advantage in AI drug discovery now sits.
"Powerful emerging technologies offer the potential to transform the speed and precision with which we characterize protein landscapes and identify novel therapeutic candidates," Juan Alvarez, PhD, Vice President Discovery Biologics at Merck Research Laboratories, said in a statement. "Protillion's platform offers a compelling opportunity, and we look forward to working with the team to advance these programs."
The Protillion deal is one of several platform-focused collaborations Merck has signed in recent months, driven in part by the impending 2028 loss of patent protection for Keytruda (pembrolizumab), currently the world's best-selling drug.
In March, Merck entered an up-to-$2.2 billion collaboration with Quotient Therapeutics, a somatic genomics company, to discover novel drug targets in inflammatory bowel disease (IBD). That deal also centered on proprietary biological data: Quotient's platform interrogates patient tissue for naturally occurring somatic genetic mutations that cause or protect against disease.
The pattern across these agreements is consistent. Merck is acquiring access to systems that generate unique biological information, not just platforms that analyze data Merck already holds.
Many pharmaceutical companies now have access to powerful AI tools and sophisticated biological foundation models. What remains difficult to replicate is the infrastructure to produce proprietary experimental datasets at scale. Companies that can generate that data efficiently are becoming more valuable to large pharmaceutical firms looking to replenish their pipelines.
Protillion currently employs 30 people and plans to add six more full-time employees by year-end. The company says it is expanding its team and facilities to support both its internal pipeline and its partnerships. Layton said the Prot-MaP platform's capabilities extend well beyond inflammation. "As we continue to advance the platform, we expect to expand into additional disease areas where its unique capabilities can have the greatest impact," he said.
Key Takeaways
- Merck partners with Protillion to enhance AI drug discovery through advanced biological data generation.
- Protillion's Prot-MaP platform enables testing of millions of protein variants in a single experiment.
- The collaboration aims to develop new therapies, initially focusing on inflammatory diseases.
- Prot-MaP can generate large datasets quickly, addressing critical bottlenecks in drug discovery.
- Merck's agreement includes potential milestone payments of up to $510 million for successful therapy development.