92 Went

AI x Single-cell Biology Reading Group

Hosted by Kenny Workman & 5 others
San Francisco, CA
About Event

Technical talks on engineering challenges and interesting problems at the intersection of AI and single-cell data.

Talks cover emerging analysis methods for new kits, benchmarks + evaluations for frontier models and practical ML for drug screening.

We'll hear from the following:

Valentine Svensson - Principal Computational Biology Scientist @ Tahoe Therapeutics | Effective drug screen extrapolation

At Tahoe Therapeutics, we generate drug screen data with transcriptomic readout at massive scale for therapeutic discovery. To extend the utility of these screens, we have developed state-of-the-art predictive models that extrapolate to unseen experimental conditions and feed into the discovery process. This gives researchers access to hypothetical experimental results that can guide hypotheses and prioritize future experiments.

Mikaela Koutrouli - Core Developer @ scverse | Making scverse AI-Ready: From Optimized Data Infrastructure to Agentic Single-Cell Workflows

The scverse ecosystem (scverse.org) is used by over 50,000 researchers and downloaded 8 million times annually, making it one of the most widely adopted open-source stacks for single-cell and spatial omics. But as AI foundation models and LLM-orchestrated agents enter biology, a critical gap has emerged: the data infrastructure these tools depend on was never designed for AI-native workflows. We aim to address this across three layers. First, we are making scverse data structures AI-ready—optimizing streaming throughput so I/O no longer bottlenecks model training, and integrating machine-readable metadata via Croissant descriptors and biomedical ontologies. Second, with the help of Anthropic and in partnership with BioContext.ai, we are building MCP-enabled skills and LLM-orchestrated agents that can perform single-cell analysis through natural language—from preprocessing and annotation to perturbation modeling and cell-cell interaction inference. Third, we are benchmarking existing foundation models as pluggable components within these agent workflows. This talk will explore what is already working, what remains challenging, and what infrastructure the field will need to fully support AI-native biology.

Harihara Muralidharan - Technical Staff @ LatchBio
Zhen Yang - Technical Staff @ LatchBio

Recent advances in language models have enabled increasingly capable agents for computational biology, yet evaluating these systems on realistic single-cell workflows remains challenging. Across benchmarks spanning single-cell RNA-seq, single-nucleus RNA-seq, and single-cell DNA sequencing platforms, model accuracy varies substantially across assay technologies, with top-model performance ranging from approximately 40% to 80% depending on the kit and modality, highlighting the importance of kit-specific biological and technical reasoning. While modern models often succeed at routine analytical workflows and code generation, they continue to struggle with technical confounders, experimental design assumptions, and biologically grounded interpretation.

This presentation discusses the design of large-scale benchmark suites for single-cell analysis, with emphasis on scientifically grounded evaluation principles rather than procedural task completion. We will cover how benchmark design differs across assay types, common pitfalls of overly procedural evaluations, and practical approaches for constructing robust ground truth and grading tolerances that distinguish genuine scientific reasoning from workflow memorization.

Across 195 benchmark tasks, the strongest frontier models achieve overall pass rates of approximately 55–58%, showing measurable improvements in reasoning and long-horizon task execution over earlier generations. However, persistent failure modes remain across modalities, particularly in handling ambiguous analyses and dataset-specific biological edge cases. Collectively, these challenges motivate the next generation of benchmark design: long-horizon evaluation tasks that assess whether agents can independently execute end-to-end biological analyses and derive scientifically sound conclusions with minimal procedural guidance.

Hosted at the LatchBio office. Food + drink provided.

Agenda

  • 5:00 - 6:00 Meet others. Eat + drink.

  • 6:00 - 8:00 Talks + Q&A

  • 8:00 - TBD Socialize

