Tahoe Therapeutics has secured $30 million in funding to construct what the company describes as the world’s most comprehensive dataset for training artificial intelligence models of human cells. The biotech startup plans to generate one billion single-cell datapoints that will map interactions between tens of thousands of drug molecules and human biology.
Building on Previous Success
The funding round, spearheaded by Amplify Partners, attracted participation from Databricks Ventures, Wing Venture Capital, General Catalyst, Civilization Ventures, Conviction, Mubadala Capital Ventures, Overlap Holdings, and AIX Ventures. This investment follows the company’s earlier release of Tahoe-100M, which the firm positioned as the first gigascale perturbative single-cell dataset.
Since its open-source release several months ago, Tahoe-100M has been downloaded close to 100,000 times and has become a reference point for organizations developing virtual cell models, from major AI laboratories to specialized research institutions. The dataset has already contributed to identifying potential therapeutic candidates for significant cancer subtypes and novel targets across multiple treatment approaches.
Scaling Data Generation
The new initiative represents a substantial expansion of the company’s data generation capabilities. While Tahoe-100M required the development of novel methods for producing single-cell data, the upcoming dataset targets one billion single-cell datapoints, ten times the size of its predecessor.
“Building Tahoe-100M required us to invent new ways to generate single-cell data. Now, we’re applying that superpower to go 10x further.” ~ Nima Alidoust, CEO and co-founder
The expanded dataset aims to map one million drug-patient interactions, a scale that the company says was previously unachievable. This comprehensive mapping effort is intended to support the development of precision medicines for cancer and other therapeutic areas.
Strategic Partnership Model
Rather than making the new dataset broadly available, Tahoe Therapeutics plans to select a single strategic partner to share access to the data. This partner, whether a pharmaceutical company or AI firm, would bring complementary capabilities in clinical development or modeling expertise. The collaboration aims to develop the first medicines powered by virtual cell models.
This approach marks a departure from the open-source model used for Tahoe-100M and reflects the company’s focus on translating data insights into clinical applications. The partnership structure is designed to accelerate the path from biological discovery to therapeutic development.
Addressing Clinical Translation Challenges
The investment thesis behind Tahoe Therapeutics addresses persistent challenges in drug development, particularly the gap between molecular design and clinical success.
“While structural models have accelerated molecular design, they rarely translate to clinical success — a problem that remains one of the biggest challenges in drug development” ~ Sunil Dhaliwal, General Partner at Amplify Partners
The company’s approach centers on generating large-scale drug-patient datasets to train high-dimensional, cell-based AI models. This methodology aims to reduce clinical trial failure rates by providing more accurate predictions of how therapeutic candidates will perform in human patients.
Scientific Foundation and Team
Tahoe Therapeutics was founded by Nima Alidoust, Johnny Yu, Hani Goodarzi, and Kevan Shokat, bringing together expertise in single-cell genomics, machine learning, and drug discovery. The company’s technological platform builds on scientific breakthroughs developed at the University of California, San Francisco.
The platform enables large-scale, single-cell drug screening across diverse patient contexts, making previously impossible experiments both feasible and scalable. This capability forms the foundation for training disease-relevant foundation models of human cells.
Future Applications
Beyond dataset construction, Tahoe Therapeutics is advancing its own therapeutic programs toward clinical development. The company views the expanded dataset as supporting what CEO Nima Alidoust described as “the GPT moment for AI models of human cells,” drawing a parallel to breakthrough advances in language modeling.
The focus on precision medicine reflects the company’s belief that understanding drug-patient interactions at the cellular level can improve therapeutic outcomes. By mapping these interactions at unprecedented scale, Tahoe aims to enable the development of treatments tailored to specific patient populations and disease contexts.
