ISCB Transition Page

Attention Presenters - please review the Speaker Information Page available here

Schedule subject to change
All times listed are in GMT

Tuesday, April 21st

8:45-9:00

Welcome

9:00-9:15

PISTACHIO: Proteomics-Constrained Negative Binomial Matrix Factorisation for Spatial Transcriptomics Deconvolution

Confirmed Presenter: Esra Büşra Işık, University of Manchester, United Kingdom

9:15-9:30

MultiCOCor: identifying multiple clustering structures in high-dimensional data

Confirmed Presenter: Jack Hodgkinson, MRC Biostatistics Unit, University of Cambridge, United Kingdom

9:30-9:45

DeepPathway: Predicting Pathway Expression from Histopathology Images

Confirmed Presenter: Muhammad Ahtazaz Ahsan, University of Manchester, United Kingdom

9:45-10:00

Leveraging open-access data in ChEMBL to explore emerging drug modalities

Confirmed Presenter: Emma Manners, EMBL-EBI, United Kingdom

10:00-10:15

Predicting drug resistance across cancer types using multi-omics transfer learning

Confirmed Presenter: Semih Alpsoy, Department of Molecular Biotechnology, Türkisch-Deutsche Universität, Istanbul, Turkey

10:15-10:30

Cell-specific rewiring of GPCR signalling networks: A systems pharmacology perspective

Confirmed Presenter: Shanlin Rao, University of Cambridge, United Kingdom

10:30-10:45

Integrating protein structure and population genomic data to detect diversifying selection related to immunity

Confirmed Presenter: Leonie J. Lorenz, EMBL-EBI, United Kingdom

10:45-11:00

Detection of recombination in Arabidopsis centromeres

11:30-11:45

Pushing the limits of AlphaFold3: detecting DNA-binding domains at scale

Confirmed Presenter: Francesco Costa, EMBL-EBI, United Kingdom

11:45-12:00

Identification and characterization of Polyurethane-Degrading Enzymes from MGnify Metagenomes

Confirmed Presenter: Joel Roca-Martinez, University College London, United Kingdom

12:00-12:15

Metagenomic analysis identifies co-occurrence of Desulfovibrio and curli genes in Parkinson’s patients

Confirmed Presenter: Fang Chi, University of Helsinki, Finland

Presentation Overview:

Parkinson’s disease (PD) is increasingly linked to gut
microbiome dysbiosis, yet specific microbial drivers remain
inconsistent across heterogeneous populations. While
sulfate-reducing bacteria and bacterial amyloids have been
individually implicated in PD, their ecological
interactions and combined contributions to disease risk and
metabolic dysfunction remain unclear.

We performed a large-scale meta-analysis of 1,609 fecal
metagenomes from 10 PD cohorts across three continents.
Using a unified mixed-effects framework, we integrated
community-level profiling, targeted feature analysis, and
functional annotation to investigate the prevalence,
abundance, and synergistic interactions of Desulfovibrio,
Escherichia coli, and bacterial amyloid (curli) genes.

Global gut microbiome structure showed extensive overlap
between PD patients and controls, with limited
disease-associated variance at the community level.
Desulfovibrio, E. coli, and curli genes exhibited robust
and reproducible enrichment in PD patients at the
prevalence level. We identified a disease-specific
ecological coupling between Desulfovibrio and curli genes
that was absent in healthy controls. Individuals with
concurrent high exposure to both features displayed a
non-linear amplification of PD risk (odds ratio = 2.87),
exceeding the effects of either feature alone. This
synergistic interaction was associated with a distinct
functional reorganization of the gut microbiome,
characterized by the enrichment of virulence
factors—including lipopolysaccharide and aerobactin
biosynthesis—concomitant with the depletion of protective
biosynthetic pathways, specifically L-glutamine and biotin
biosynthesis.

PD-associated gut dysbiosis is predominantly
prevalence-driven and shaped by interaction-based
functional reprogramming rather than global community
shifts. Our findings support a multi-hit model in which
synergistic interactions between sulfate-reducing bacteria
and curli-producing microbes jointly define a
disease-relevant metabolic state.

12:15-12:30

Exploring Feature Representations for Cancer-Associated sORF Prediction in Non-coding RNA

Confirmed Presenter: Fabiana Rodrigues de Goes, Rosalind Franklin Institute, United Kingdom

Presentation Overview:

Advances in cancer bioinformatics have expanded our
understanding of tumor evolution and molecular
heterogeneity. Microproteins encoded by small open reading
frames (sORFs) in non-coding RNAs (ncRNAs) represent a
largely unexplored layer of cancer biology, with potential
roles in oncogenic regulation, biomarker discovery, and
therapeutic targeting. However, systematic identification
of cancer-associated sORFs remains challenging due to
experimental costs and technical limitations, highlighting
the need for computational approaches to enable large-scale
screening. Here, we present a comprehensive evaluation of
machine learning models and sequence feature
representations for predicting cancer-associated sORFs
using the Spencer database, which catalogs 29,526
ncRNA-derived small peptides across 15 cancer types.
Instead of relying on increasingly complex model
architectures, we systematically investigate the impact of
feature extraction strategies. Three classifiers (Random
Forest, Support Vector Machine, and Multilayer Perceptron)
were benchmarked with three feature types: k-mer frequency,
Word2Vec embeddings, and embeddings from pre-trained
genomic language models (gLMs). Our results show that
classical models, when combined with appropriate feature
engineering, consistently outperform the CoraL baseline,
achieving up to 10% higher accuracy. Notably, k-mer–based
representations often provided more stable and accurate
predictions than gLM embeddings without fine-tuning,
indicating that increased model complexity does not
guarantee superior performance. Tokenization choices, such
as k-mer length, also significantly affected outcomes.
Certain datasets, for example skin cancer, exhibited
reduced sensitivity, suggesting intrinsic challenges for
positive-case detection. Overall, our findings emphasize
the critical role of feature representation in
cancer-associated sORF prediction and demonstrate that
well-designed, interpretable models can outperform more
complex deep learning approaches in this domain.
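
As a rough illustration of the k-mer frequency representation mentioned in this overview, the Python sketch below turns a nucleotide sequence into a fixed-length vector of normalised k-mer counts. The function name `kmer_frequencies`, the default `k=3`, and the decision to skip windows containing characters other than A, C, G and T are assumptions made for illustration; they are not details taken from the study.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_frequencies(sequence, k=3):
    """Return normalised k-mer frequencies in lexicographic k-mer order."""
    # Treat RNA input as DNA letters so one alphabet covers both.
    seq = sequence.upper().replace("U", "T")
    # All 4**k possible k-mers, giving every sequence the same vector length.
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set(ALPHABET)  # skip windows with N etc.
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


# Hypothetical toy sequence, not data from the Spencer database.
vec = kmer_frequencies("ATGATG", k=2)
```

A vector like `vec` could then be passed to any standard classifier. Changing `k` changes both the vector length (4 to the power k) and how sparse it is, which is one way a tokenization choice such as k-mer length can affect downstream results.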

12:30-12:40

Closing Remarks and Awards

13:30-13:45

Welcome and Opening Remarks

13:45-14:30

Invited Presentation: Keynote from Dr. Rob Finn

Moderator(s): Mark Wass

14:30-14:45

ImmunoMatch learns and predicts cognate pairing of heavy and light immunoglobulin chains

Confirmed Presenter: Dongjun Guo, University College London, United Kingdom

14:45-15:00

DNA Language Models for Efficient Non-Coding Variant Effect Prediction

Confirmed Presenter: Megha Hegde, Kingston University London, United Kingdom

15:00-15:15

ESMRank: A ranking-based AI framework for interpretable prediction of protein variant effects

Confirmed Presenter: Riccardo Arnese, Università di Napoli Federico II, Italy

16:30-16:45

Leveraging protein language models and a scoring function for indel characterisation and transfer learning

Confirmed Presenter: Oriol Gracia I Carmona, King's College London and University College London, United Kingdom

16:45-17:00

Mapping the space of protein binding sites by integrating sequence-based protein language models with pocket-context

Confirmed Presenter: Jim Horne, Astex Pharmaceuticals, United Kingdom

17:00-17:15

Are We Teaching Computational Biology Backwards? A Call for a Renaissance of Critical Thinking in the GenAI Era

Confirmed Presenter: Eva Caamano Gutierrez, University of Liverpool, United Kingdom

17:15-18:00

Panel: AI Roundtable Discussions

Wednesday, April 22nd

9:00-9:20

Panel: AI Roundtable Discussion Outcomes and Highlights

9:20-9:35

Integrating Predicted and Experimental Structures: The Role of AlphaFold DB in Modern Structural Biology

Confirmed Presenter: Joseph Ellaway, EMBL-EBI, United Kingdom

Presentation Overview:

The AlphaFold Protein Structure Database (AFDB), developed
by EMBL-EBI and Google DeepMind, provides open,
proteome-scale access to high-accuracy protein structure
predictions, offering structural coverage for hundreds of
millions of sequences across UniProt reference proteomes.
AFDB delivers standardised coordinate files, unified
metadata, and detailed confidence metrics, including pLDDT
and predicted aligned error (PAE) plots, ensuring reliable
interpretation and downstream use. Structural coverage has
recently been expanded to include isoforms and the
underlying multiple sequence alignments supporting each
prediction, enabling deeper analysis of conservation,
co-evolution, and model support.

A comprehensive redesign of the entry page enhances
usability and structural interpretation by integrating
annotations directly with an interactive Mol* viewer and
introducing dedicated Domains, Annotations, and Similar
Proteins tabs. The Similar Proteins tab presents
Foldseek-based structural homologues and clustered views of
evolutionarily related proteins, while the Annotations tab
displays AlphaMissense visualisation and a new system for
uploading user-defined annotations, creating a flexible
framework that will accommodate custom data. Programmatic
APIs, FTP and cloud-hosted datasets, and bulk download
options support large-scale computational workflows.
Integration with UniProt, PDBe and PDBe-KB lets researchers
place AFDB predictions within their broader biological and
experimental context.

AFDB continues to expand in partnership with the scientific
community, guided by three principles: filling gaps in
structural coverage; improving model accuracy and utility;
and addressing global challenges such as antimicrobial
resistance and food security. Prioritising model inclusion
and validation, and fostering community-driven annotation,
AFDB endeavours to be a FAIR, knowledge-rich resource that
accelerates discovery and amplifies the impact of protein
structure data.

9:35-9:50

Phyre2.2: Predicting protein structure and protein/ligand interactions prediction in the AlphaFold era

Confirmed Presenter: Michael J E Sternberg, Imperial College London, United Kingdom

9:50-10:05

FAIRDOM-SEEK: Platform for FAIR data and research asset management

Confirmed Presenter: Munazah Andrabi, The University of Manchester, United Kingdom

10:05-10:20

Royal Society journals and open access publishing

Confirmed Presenter: Jessica Miller, Royal Society Publishing, United Kingdom

10:20-10:25

A foundation model to study the molecular principles of codon usage in eukaryotes

Confirmed Presenter: Susanne Bornelöv, Department of Biochemistry, University of Cambridge, United Kingdom

10:25-10:30

A Sneaky Peek at the CRUK Data Hub

Confirmed Presenter: Frances Pearl, University of Sussex, United Kingdom

10:30-10:35

Pandemic-scale phylogenetics

Confirmed Presenter: Nicola De Maio, EMBL-EBI, United Kingdom

11:45-12:00

Federated Learning Approaches to Biomedical Knowledge Discovery

Confirmed Presenter: Gamze Gursoy, University of Cambridge, United Kingdom

12:00-12:15

Mind your own binding: computational prediction of paratope-epitope interfaces

Confirmed Presenter: Montader Ali, University of Cambridge, United Kingdom

12:15-12:30

Chemistry Aware AI Model for Interpretable siRNA Engineering and Activity Prediction

Confirmed Presenter: Aparajita Karmakar, The Rosalind Franklin Institute, United Kingdom

12:30-12:35

Training a force field for proteins and small molecules from scratch

Confirmed Presenter: Joe Greener, MRC Laboratory of Molecular Biology, United Kingdom

12:35-12:40

Allosteric Communication and Kinetic Regulation in Membrane Protein

12:40-12:45

InterProScan 6: a modern large-scale protein function annotation pipeline

Confirmed Presenter: Matthias Blum, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), United Kingdom

12:45-13:10

Invited Presentation: BioFAIR

14:25-14:40

REMAG: recovery of eukaryotic genomes from metagenomes using contrastive learning

Confirmed Presenter: Daniel Gómez Pérez, Earlham Institute, United Kingdom

14:40-14:55

Protal: Ultra-fast metagenomic profiling and strain-resolved analysis

Confirmed Presenter: Joachim Fritscher, M3 Research Center Tuebingen, Quadram Institute Bioscience Norwich, Earlham Institute Norwich, Germany

14:55-15:10

Detecting signatures underlying the composition of biological data

Confirmed Presenter: Anthony Duncan, Earlham Institute, Quadram Institute, United Kingdom

15:10-15:15

COTAN: scRNA-seq comprehensive workflow based on gene correlations

Confirmed Presenter: Silvia Giulia Galfre', University of Pisa, Italy

15:15-15:20

Fine-tuning Oxford Nanopore basecalling models for high-accuracy repeat expansion calling

Confirmed Presenter: Rugare Maruzani, King's College London, United Kingdom

15:20-15:25

ProQuest: A Large Language Model Application on the Uniprot Protein Sequence and Annotation Database

Confirmed Presenter: Melike Akkaya, Hacettepe University, Turkey

Presentation Overview:

Accessing complex biological data through natural language
remains a significant challenge for researchers,
particularly in fields like proteomics where large-scale,
annotated datasets are the norm. In this project, we
present ProQuest (https://proquest.ngrok.app /
https://github.com/HUBioDataLab/PROQUEST), a
Retrieval-Augmented Generation (RAG) system that uses flat
files from UniProtKB (https://www.uniprot.org/) to enable
intuitive, efficient, and semantically rich querying of
protein-related information. The system is
built on a two-stage pipeline: retrieval and generation. In
the retrieval phase, user queries are vectorized using the
nomic-ai/nomic-embed-text-v1
(https://huggingface.co/nomic-ai/nomic-embed-text-v1) model
and matched with semantically similar documents stored in a
ChromaDB (https://www.trychroma.com/) vector database. To
enhance retrieval coverage, we also integrate two
keyword-based search techniques: SQLite FTS5
(https://www.sqlite.org/fts5.html) using trigram-based
inverted indexing for substring-level precision, and BM25
Encoder
(https://pinecone-io.github.io/pinecone-text/pinecone_text.html)
via Pinecone, which enables sparse vector scoring based on
dot product similarity. In the generation stage, the
retrieved documents are synthesized with the user query to
produce natural language responses using a large language
model. This design allows users to explore complex protein
data through simple queries, making the system accessible
to both domain experts and non-specialists. Although
numerical performance evaluations are ongoing, early
semantic testing has shown that the system consistently
provides coherent and relevant results. Future phases of
the project will focus on parameter tuning, deeper analysis
across biological use cases, and integration with
additional data sources. Overall, the system offers a
powerful new interface for biological data
exploration—enhancing search efficiency, reducing cognitive
load, and accelerating insight generation in protein
research.
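
To make the keyword-based side of the retrieval stage more concrete, here is a minimal pure-Python sketch of Okapi BM25 scoring over a handful of toy documents. It is not ProQuest's code and does not use the pinecone-text encoder or ChromaDB; the function name `bm25_scores`, the whitespace tokenisation, the parameter defaults `k1=1.5` and `b=0.75`, and the example documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always positive).
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            # Term-frequency saturation with document-length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical snippets standing in for protein annotation text.
docs = [
    "kinase that phosphorylates serine residues",
    "membrane transporter for glucose",
    "serine protease inhibitor",
]
scores = bm25_scores("serine kinase", docs)
```

In a hybrid setup of the kind the overview describes, sparse scores such as these would be combined with dense embedding similarities before the top documents are handed to the language model. How ProQuest merges the two rankings is not stated in the overview.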

15:45-16:00

NetREm: Network Regression Embeddings reveal cell-type transcription factor coordination for gene regulation

Presentation Overview:

Background: Transcription factor (TF) coordination plays a
key role in gene regulation via direct and/or indirect
protein–protein interactions (PPIs) and co-binding to
regulatory elements on DNA. Single-cell technologies enable
gene expression measurement for individual cells and
identification of distinct cell types, yet the link between
TF-TF coordination and target gene (TG) regulation across
diverse cell types remains poorly understood.

Method: In response, we introduce Network Regression
Embeddings (NetREm), an innovative computational approach
to uncover cell-type-specific TF-TF coordination activities
driving TG regulation. NetREm leverages network-constrained
regularization, integrating prior knowledge of TF-TF PPIs
with single-cell/bulk-level gene expression data. It
identifies transcriptional regulatory modules (TRMs)
composed of antagonistic and/or cooperative TF-TF PPIs and
predicts novel TF-TG regulatory links complementing
state-of-the-art gene regulatory networks (GRNs).
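
The overview does not give NetREm's objective function. As a generic illustration of what network-constrained regularization can look like, the Python sketch below adds a graph-Laplacian style penalty to a least-squares loss, so that TFs joined by a prior PPI edge are encouraged to take similar coefficients. The function names, the edge-list format `(i, j, weight)`, and the weight `lam` are assumptions; NetREm's actual formulation may differ, in particular in how it handles antagonistic TF pairs, which this simple penalty would not capture on its own.

```python
def laplacian_penalty(beta, edges):
    """Sum of weight * (beta_i - beta_j)**2 over weighted edges (i, j, weight).

    For an undirected graph this equals beta^T L beta, where L is the
    unnormalised graph Laplacian.
    """
    return sum(w * (beta[i] - beta[j]) ** 2 for i, j, w in edges)


def penalised_loss(X, y, beta, edges, lam=1.0):
    """Residual sum of squares plus lam times the network penalty."""
    residuals = [
        yi - sum(x * b for x, b in zip(row, beta)) for row, yi in zip(X, y)
    ]
    return sum(r * r for r in residuals) + lam * laplacian_penalty(beta, edges)


# Hypothetical toy example: three TFs, two prior PPI edges.
edges = [(0, 1, 1.0), (1, 2, 0.5)]
beta = [1.0, 1.0, 0.0]
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
y = [1.0, 1.0]
loss = penalised_loss(X, y, beta, edges, lam=2.0)
```

Here the first edge contributes nothing because the two connected coefficients are equal, while the second edge is penalised for the gap between 1.0 and 0.0.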

Results: We validate NetREm’s performance through
simulation studies and benchmark it across multiple
datasets in humans, mice, and yeast. NetREm prioritizes
biologically-meaningful TF-TF coordination networks in 9
peripheral blood mononuclear cell types and 42 immune cell
subtypes. Additionally, we apply NetREm to cell types
(e.g., neurons, glia, Schwann cells) from central and
peripheral nervous systems, and to Alzheimer’s disease
versus control brains. Top predictions are supported by
orthogonal experimental validation data, including:
ChIP-seq, CUT&RUN, scATAC-seq, knockout studies, expression
QTLs, genome-wide association studies, and beyond. We
further link disease-associated single nucleotide
polymorphism variants to our inferred networks.

Conclusion: NetREm provides a powerful and interpretable
framework to predict cutting-edge GRNs and unprecedented
coordination networks in a cell-type-specific manner. Our
tool is on GitHub to help propel functional genomics and
therapeutic discovery.

16:00-16:15

Evolutionary conservation and rewiring of enhancer-promoter connectivity across mammals

Confirmed Presenter: Stephen Rong, Institute of Clinical Sciences, Imperial College London; MRC Laboratory of Medical Sciences, United Kingdom

16:15-16:30

FlowSign – A NextFlow Workflow for Network Orientation and Regulatory Sign Prediction Using Prior Knowledge and Omics Data

Confirmed Presenter: Benjamin Dominik Maier, European Bioinformatics Institute (EMBL-EBI), United Kingdom

Presentation Overview:

Biological networks derived from literature and omics data are widely used to study cellular signaling, yet most lack information on interaction direction and regulatory effect—whether a protein activates or inhibits another—limiting interpretability and mechanistic modeling. We present FlowSign, a framework that predicts edge directionality and regulatory sign in protein networks by integrating prior knowledge with data-driven inference.

FlowSign provides a precomputed, harmonized protein–protein interaction resource with directionality and sign confidence scores for all protein interactions. It integrates diverse prior knowledge, including transcription factor regulons, pathway databases, and kinase–substrate relationships. We then propagate regulatory insights to related proteins and infer inconsistent/missing annotations using a random forest classifier.

Users supply a network together with anchor proteins (e.g., receptors and transcription factors) to guide regulatory flow. FlowSign maps prior knowledge onto the network and iteratively infers edge direction and effect, integrating user-provided omics and annotations alongside perturbation data. Contradictory edges are only retained when supported by evidence. For interpretability, FlowSign can trim non-contributing nodes, compress linear cascades, and extract subnetworks or shortest paths between anchors.

Benchmarking on knock-out and drug response datasets shows that FlowSign consistently outperforms state-of-the-art tools such as SIGNAL and NeKo in predicting regulatory direction and effect.

Implemented in R and distributed as a scalable Nextflow pipeline, FlowSign bridges data-driven network inference and executable mechanistic models, enabling conversion to Boolean or ODE frameworks for context-specific signaling simulations. Future extensions will support time-course data and iterative updating of prior knowledge through predictions. GitHub: https://github.com/benjamindmaier/flowsign-public

16:30-16:35

Advancing Careers and Team Science in Biomedical Data Science

Confirmed Presenter: Daria Sokolova, EMBL-EBI, United Kingdom

16:35-16:40

InterProScan 6: a modern large-scale protein function annotation pipeline

Confirmed Presenter: Matthias Blum, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), United Kingdom

16:40-16:45

A Sneaky Peek at the CRUK Data Hub

Confirmed Presenter: Frances Pearl, University of Sussex, United Kingdom

16:45-17:30

Invited Presentation: Keynote from Dr. Syma Khalid

17:30-17:45

Closing Remarks and Awards
