
GenBio AI at the International Conference on Machine Learning (ICML) 2025

TL;DR

GenBio AI will present five papers at the 42nd International Conference on Machine Learning (ICML) 2025 across the ICML 2025 Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences and the ICML 2025 Workshop on Generative AI and Biology. These papers showcase progress in building biological foundation models, benchmarking cellular perturbation predictions, and improving protein design.


Each year, ICML brings together the global machine learning community to exchange ideas and advance the field. This year, GenBio AI will participate in the conference with a focus on applying foundation models to solve fundamental problems in biology.

Our team will present research spanning five accepted papers across two leading workshops at ICML 2025. From new tools for rapidly building multimodal biological foundation models, to benchmarks for cellular perturbation prediction, to spatially-informed tissue modeling, retrieval-augmented structure prediction, and uncertainty-aware protein design, these works reflect GenBio AI’s broader mission to build powerful AI models for biology that deliver practical impact.

Papers

Rapid and Reproducible Multimodal Biological Foundation Model Development with AIDO.ModelGenerator

Venues: ICML 2025 Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences, ICML 2025 Workshop on Generative AI and Biology (Spotlight)

Authors: Caleb N. Ellington*, Dian Li*, Shuxian Zou, Elijah Cole, Ning Sun, Sohan Addagudi, Le Song, Eric P. Xing

Abstract: Foundation models (FMs) for DNA, RNA, proteins, cells, and tissues have begun to close long-standing performance gaps in biological prediction tasks, yet each modality is usually studied in isolation. Bridging them requires software that can ingest heterogeneous data, apply large pretrained backbones from various sources, and perform multimodal benchmarking studies at scale. 

We present AIDO.ModelGenerator, an open-source toolkit that turns these needs into declarative experiment recipes through a structured experimental framework. AIDO.ModelGenerator provides (i) 300+ datasets covering DNA, RNA, protein, cell, spatial, and multimodal data types; (ii) 30+ pretrained FMs ranging from 3M to 16B parameters; (iii) 10+ plug-and-play use-cases covering inference, adaptation, prediction, generation, and zero-shot evaluation; and (iv) YAML-driven experiment recipes that enable exact reproducibility. 
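To make this concrete, a minimal sketch of what a declarative recipe of this kind might look like is shown below. The field names and values are illustrative assumptions chosen for exposition, not AIDO.ModelGenerator’s actual configuration schema.

```yaml
# Hypothetical experiment recipe -- field names are illustrative
# assumptions, not the actual AIDO.ModelGenerator schema.
model:
  backbone: aido_dna_7b            # assumed ID for a pretrained DNA FM
  adapter: linear_head             # task head attached to the backbone
data:
  dataset: sequence_to_expression  # assumed dataset name
  batch_size: 32
trainer:
  max_epochs: 10
  precision: bf16
  seed: 42                         # pinned seed for exact reproducibility
```

Because every knob lives in one versioned file, rerunning the file reproduces the experiment exactly, which is the point of a recipe-driven design.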

On a sequence-to-expression prediction task, AIDO.ModelGenerator systematically builds and tests unimodal and multimodal models, achieving a new state of the art by combining DNA and RNA FMs and outperforming unimodal baselines by over 10%. In a Crohn’s disease case study, the framework’s simulated knockout protocol ranks the clinically implicated target SOX4 6,000 positions higher than differential-expression baselines, illustrating its utility for therapeutic target discovery. We release code, tutorials, checkpoints, datasets, and an API reference to accelerate multimodal FM research in the life sciences.
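As a rough illustration of how a simulated knockout protocol can surface targets like SOX4, the Python sketch below zeroes out one gene at a time and ranks genes by how strongly the model’s predicted profile shifts along a disease signature. The `model.predict` interface and the scoring rule are our assumptions for exposition, not the paper’s implementation.

```python
import numpy as np

def rank_knockout_targets(model, expression, gene_names, disease_signature):
    """Rank genes by how strongly their simulated knockout shifts the
    model's predicted expression relative to a disease signature.
    `model.predict` is a stand-in for a foundation model that maps an
    expression profile to a predicted profile -- an assumed interface,
    not AIDO.ModelGenerator's actual API."""
    baseline = model.predict(expression)
    scores = []
    for i, gene in enumerate(gene_names):
        knocked = expression.copy()
        knocked[i] = 0.0                     # in-silico knockout of one gene
        perturbed = model.predict(knocked)
        # Larger shift along the disease axis => stronger candidate target
        shift = float(np.dot(baseline - perturbed, disease_signature))
        scores.append((gene, shift))
    return sorted(scores, key=lambda t: t[1], reverse=True)
```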

Multimodal Benchmarking of Foundation Model Representations for Cellular Perturbation Response Prediction

Venues: ICML 2025 Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences, ICML 2025 Workshop on Generative AI and Biology

Authors: Euxhen Hasanaj*, Elijah Cole*, Shahin Mohammadi, Sohan Addagudi, Xingyi Zhang, Le Song, Eric P. Xing

Abstract: The decreasing cost of single-cell RNA sequencing (scRNA-seq) has enabled the collection of massive scRNA-seq datasets, which are now being used to train transformer-based cell foundation models (FMs). One of the most promising applications of these FMs is perturbation response modeling. This task aims to forecast how cells will respond to drugs or genetic interventions. Accurate perturbation response models could drastically accelerate drug discovery by reducing the space of interventions that need to be tested in the wet lab. However, recent studies have shown that FM-based models often struggle to outperform simpler baselines for perturbation response prediction. A key obstacle is the lack of understanding of the components driving performance in FM-based perturbation response models. In this work, we conduct the first systematic pan-modal study of perturbation embeddings, with an emphasis on those derived from biological FMs. We benchmark their predictive accuracy, analyze patterns in their predictions, and identify the most successful representation learning strategies. Our findings offer insights into what FMs are learning and provide practical guidance for improving perturbation response modeling.
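To make the benchmarking setup concrete, one standard pattern is to freeze each candidate embedding space and fit a simple regressor from perturbation embedding to mean expression response, then compare held-out accuracy across embedding sources. The sketch below follows that pattern with scikit-learn; it is our illustration of the general recipe, not the paper’s exact protocol.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

def score_embedding_space(pert_embeddings, mean_responses, seed=0):
    """Score one perturbation-embedding space by how well a simple
    ridge regressor predicts mean expression responses from it.
    pert_embeddings: (n_perturbations, d) array from some FM.
    mean_responses:  (n_perturbations, n_genes) expression changes."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        pert_embeddings, mean_responses, test_size=0.2, random_state=seed
    )
    reg = Ridge(alpha=1.0).fit(X_tr, y_tr)
    preds = reg.predict(X_te)
    # Mean per-perturbation Pearson correlation on held-out perturbations
    corrs = [np.corrcoef(p, y)[0, 1] for p, y in zip(preds, y_te)]
    return float(np.mean(corrs))
```

Running the same scorer over embeddings from different FMs (and simple baselines) gives a like-for-like comparison of representation quality.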

AIDO.Tissue: Spatial Cell-Guided Pretraining for Scalable Spatial Transcriptomics Foundation Model

Venues: ICML 2025 Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences, ICML 2025 Workshop on Generative AI and Biology (Poster)

Authors: Jing Gong, Yixuan Wang, Nicholas Ho, Xingyi Cheng, Le Song, Eric Xing

Abstract: Single-cell spatial transcriptomics enables high-resolution insights into tissue organization and cell-cell interactions, yet poses significant computational and modeling challenges due to its scale and complexity. Here we introduce AIDO.Tissue, a spatially-informed pretraining framework. The design takes multiple cells as input and uses an asymmetric encoder-decoder architecture, enabling it to effectively encode cross-cell dependencies while scaling to large datasets. Systematic evaluation shows that our method scales with neighborhood size and achieves state-of-the-art performance across diverse downstream tasks, including spatial cell type classification, cell niche type prediction, and cell density estimation. These results highlight the importance of spatial context in building general-purpose foundation models for tissue-level understanding.
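For intuition about the architecture described above, the PyTorch sketch below pairs a deeper encoder that attends across a center cell and its spatial neighbors with a shallow decoder that reconstructs the center cell. Layer counts, sizes, and the reconstruction objective are illustrative assumptions, not AIDO.Tissue’s actual design.

```python
import torch
import torch.nn as nn

class SpatialCellAutoencoder(nn.Module):
    """Minimal sketch of an asymmetric encoder-decoder over a cell
    neighborhood. All hyperparameters here are illustrative assumptions,
    not AIDO.Tissue's actual architecture."""
    def __init__(self, n_genes, d_model=256, n_enc_layers=6, n_dec_layers=1):
        super().__init__()
        self.embed = nn.Linear(n_genes, d_model)   # one token per cell
        enc = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        dec = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=n_enc_layers)
        self.decoder = nn.TransformerEncoder(dec, num_layers=n_dec_layers)
        self.head = nn.Linear(d_model, n_genes)

    def forward(self, neighborhood):
        # neighborhood: (batch, n_cells, n_genes); index 0 is the center cell
        tokens = self.embed(neighborhood)
        encoded = self.encoder(tokens)             # cross-cell dependencies
        decoded = self.decoder(encoded)
        return self.head(decoded[:, 0])            # reconstruct center cell
```

Concentrating capacity in the encoder is the usual motivation for asymmetric designs of this kind, since it keeps decoding cheap as the neighborhood grows.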

Retrieval Augmented Protein Language Models for Protein Structure Prediction

Venues: ICML 2025 Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences, ICML 2025 Workshop on Generative AI and Biology (Spotlight)

Authors: Pan Li, Xingyi Cheng, Le Song, Eric Xing

Abstract: The advent of advanced artificial intelligence technology has significantly accelerated progress in protein structure prediction, with AlphaFold2 setting a new benchmark for prediction accuracy by leveraging the Evoformer module to automatically extract co-evolutionary information from multiple sequence alignments (MSA). To address AlphaFold2’s dependence on MSA depth and quality, we propose two novel models: AIDO.RAGPLM and AIDO.RAGFold, pretrained modules for Retrieval-AuGmented protein language model and structure prediction in an AI-driven Digital Organism (Song et al., 2024). AIDO.RAGPLM integrates pretrained protein language models with retrieved MSA, surpassing single-sequence protein language models in perplexity, contact prediction, and fitness prediction. When sufficient MSA is available, AIDO.RAGFold achieves TM-scores comparable to AlphaFold2 while operating up to eight times faster, and significantly outperforms AlphaFold2 when MSA is insufficient (∆TM-score = 0.379, 0.116, and 0.059 for 0, 5, and 10 MSA sequences as input). Additionally, we developed an MSA retriever using hierarchical ID generation that is 45 to 90 times faster than traditional methods, expanding the MSA training set for AIDO.RAGPLM by 32%. Our findings suggest that AIDO.RAGPLM provides an efficient and accurate solution for protein structure prediction, particularly in scenarios with limited MSA data. The AIDO.RAGPLM model has been open-sourced and is available at https://huggingface.co/genbio-ai/AIDO.Protein-RAG-3B.
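For readers curious what "retrieval-augmented" means operationally here, the sketch below shows one common way to build such an input: retrieve homologous sequences and concatenate them with the query so the language model can read off co-evolutionary signal directly. The `retriever.search` interface and separator token are assumptions for illustration, not the released AIDO.RAGPLM API.

```python
def build_rag_input(query_seq, retriever, max_homologs=5, sep="<sep>"):
    """Sketch of retrieval-augmented input construction for a protein
    language model. `retriever.search` is an assumed interface (e.g., an
    MSA retriever returning homologous sequences), not the actual
    AIDO.RAGPLM API."""
    homologs = retriever.search(query_seq, top_k=max_homologs)
    # Query first, then retrieved homologs, joined by a separator token
    return sep.join([query_seq] + list(homologs))
```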

Uncertainty-Aware Discrete Diffusion Improves Protein Design

Venue: ICML 2025 Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences

Authors: Sazan Mahbub, Christoph Feinauer, Caleb N. Ellington, Le Song, Eric Xing

Abstract: Protein inverse folding involves generating amino acid sequences that adopt a specified 3D structure—a key challenge in structural biology and molecular engineering. While discrete diffusion models have demonstrated strong performance, existing methods often apply uniform denoising across residues, overlooking position-specific uncertainty. We propose an uncertainty-aware discrete denoising diffusion model that employs a prior-posterior signaling mechanism to dynamically guide the denoising process. Our approach further integrates learned priors from a pretrained protein language model and a structure encoder within a modular framework, jointly optimized through multi-objective training. Across multiple benchmarks, our method achieves substantial improvements over state-of-the-art baselines, offering a principled framework for structure-conditioned sequence generation in proteins and beyond.
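To illustrate what uncertainty-aware denoising can look like in practice, the sketch below commits, at each step, only the masked positions where the model’s predictive entropy is lowest, leaving uncertain residues masked for later steps. This is a simplified stand-in for the general idea, not the paper’s prior-posterior signaling mechanism, and the `model` interface is assumed.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def uncertainty_ordered_denoise(model, seq_len, steps, mask_id):
    """Sketch of uncertainty-aware discrete denoising for inverse folding:
    at each step, fill in the positions the model is most certain about
    (lowest predictive entropy) and keep uncertain positions masked.
    `model` is assumed to map a (1, seq_len) token tensor to
    (1, seq_len, vocab) logits; this is not the paper's exact scheme."""
    seq = torch.full((seq_len,), mask_id)            # start fully masked
    per_step = max(1, seq_len // steps)
    for _ in range(steps):
        masked = (seq == mask_id).nonzero(as_tuple=True)[0]
        if masked.numel() == 0:
            break
        probs = F.softmax(model(seq.unsqueeze(0))[0], dim=-1)
        entropy = -(probs * probs.clamp_min(1e-9).log()).sum(-1)
        # Commit the masked positions with the lowest uncertainty
        confident = masked[entropy[masked].argsort()[:per_step]]
        seq[confident] = probs[confident].argmax(-1)
    return seq
```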

Workshops

Our team will be attending the following workshops:

ICML 2025 Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences

Date: July 18, 2025
Location: Vancouver Convention Center, Ex Hall A
Keynote Talk: Eric Xing, timing TBA

This workshop brings together research on multi-modal and large language models for life sciences. Topics include learning joint representations of biological data across omics layers, model robustness, and applications in precision medicine and drug discovery.

ICML 2025 Workshop on Generative AI and Biology

Date: July 19, 2025
Location: Vancouver Convention Center, Meeting Rooms 301–305
Keynote Talk: Eric Xing, timing TBA

This workshop focuses on the intersection of generative AI and biology, including applications in protein design, RNA modeling, molecular synthesis, and systems biology. The agenda includes invited speakers from academia and industry, as well as poster sessions and spotlight presentations.


Join us in our mission to push the frontiers of AI-driven biology and make a lasting impact on medicine, biotechnology, and human health. We are hiring across teams; visit our Careers page to learn more and apply. Follow us on X, YouTube, and LinkedIn.
