Imagine you're a biotech researcher trying to use artificial intelligence (AI) to predict how a cell will respond to genetic changes or other interventions. You're not doing this just for fun – your goal is to speed up the discovery of new medicines or to uncover how diseases like Alzheimer’s or cancer develop. That's exactly what the Vienna-based biotech company Myllia Biotechnology is working on. Using the EuroHPC supercomputer Leonardo, they are training an AI model that could transform biological research – and their first results are impressive.
Bettina Benesch
"So far, the only true breakthrough AI has achieved in biology is predicting protein structures," says Adam Krejci, Head of Bioinformatics at Myllia. That happened around 2018 – and since then, not much else has made headlines in the field of biological AI. But that may be about to change. Many experts now see single-cell RNA sequencing as the next big thing. It could massively improve our understanding of diseases and bring personalised medicine closer to reality. The problem? No major breakthroughs have happened yet.
Single-cell RNA sequencing (scRNA-seq) shows which genes are currently active in each individual cell – what scientists call the cell’s transcriptional state. Not all genes in a cell are active all the time: when a particular gene is needed, it is read and copied into a molecule called messenger RNA (mRNA), which is then used to make a protein. The copying step is called transcription. The entire process from transcription to function – usually the production of a protein – is referred to as gene expression.
"With scRNA-seq, we obtain the level of expression for each of approximately 20,000 genes in each cell. Up to millions of cells can be profiled in a single experiment. This is how the technology creates very large datasets, which makes it an attractive data source for AI," explains Adam Krejci. There have already been attempts to use AI to model this kind of cellular behaviour. But there's still a major hurdle.
The focus is on predicting the behaviour of cells that have been altered, for example by genetic changes or by a drug treatment. To predict how such altered cells behave, AI needs to be trained on exactly that kind of data – cells that have been altered. But most of the public datasets used in research today come from unmodified, or “wild-type”, cells. That means the existing AI models have been trained almost entirely on unaltered material.
And that’s a problem: AI can’t recognise or predict what it has never seen or learned from.
This is where Myllia has a clear advantage: Over the years, the team has built a unique dataset of cells modified using CRISPR*, a method for precise gene editing. This dataset is now being used to train a new kind of AI model – one that can deliver far more useful and accurate predictions for research.
To train an AI model of this scale, you need enormous computing power. That’s where EuroCC Austria comes in – offering free access to high-performance computing for proof-of-concept projects. Myllia started out using Austria’s Vienna Scientific Cluster (VSC), and later moved to Leonardo, one of the most powerful supercomputers in the world, based in Italy.
The benefit of Leonardo’s huge capacity is that Myllia can train several models in parallel, add more data, and improve the best ones rapidly. And it’s working: Some of Myllia’s early AI models already outperform all existing tools when it comes to predicting cellular responses to genetic changes. The team is continuing to train, optimise and refine – with the goal of achieving a real breakthrough in how we understand and model cell behaviour.
"EuroCC Austria gave us access to powerful HPC infrastructure. The team was always highly committed and supportive. Thanks to their help, we were able to start our computations almost immediately," says Adam Krejci.
Founded in 2018 and based in Vienna, Myllia Biotechnology combines two groundbreaking technologies: single-cell RNA sequencing (scRNA-seq) and CRISPR, a tool for precise gene editing.
By combining these approaches, Myllia performs functional genomic screens that reveal how thousands of genetic changes affect individual cells. This enables experts in medical research to develop new drugs more quickly and to better understand complex diseases such as cancer or neurological disorders.
*CRISPR (pronounced “krisper”) stands for Clustered Regularly Interspaced Short Palindromic Repeats – short, repeating DNA sequences found in the genomes of bacteria. These sequences are part of a natural defence system that bacteria use to protect themselves against viruses, known as phages.
When a bacterium is infected by a virus, it can insert small fragments of the viral DNA into a specific region of its own genome – the CRISPR region. These stored fragments, called spacers, act like a molecular memory of past infections. If the same virus attacks again, the bacterium produces short RNA sequences (called crRNA) that match the stored viral DNA.
This crRNA combines with a CRISPR-associated protein, known as Cas protein, to form an active defence complex. When this complex recognises a matching sequence in a new viral DNA, the crRNA binds to it – and the Cas protein cuts the virus’s DNA at exactly that spot, stopping the virus from spreading.
Today, scientists use this precise cutting mechanism in biotechnology. By transferring the CRISPR/Cas system from bacteria into cells of interest (e.g. human cell lines), genes can be edited, switched off, or inserted with great accuracy – a technique known as genome editing, or more informally, the gene scissors.
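The match-and-cut idea described above can be pictured with a toy sketch in Python. It is only an illustration under loud simplifications: DNA is treated as a plain string of letters, the guide is written in the same letters as the target rather than as complementary RNA, the cut is placed right after the matched stretch, and the additional sequence requirements of real Cas proteins are ignored. The function name `cut_at_target` and the example sequences are made up for this sketch.

```python
def cut_at_target(dna, guide):
    """Toy model: cut the DNA string right after the first stretch matching the guide.

    Returns the two resulting fragments, or None if the guide finds no match.
    """
    position = dna.find(guide)       # the "recognition" step: search for a match
    if position == -1:
        return None                  # no match: the DNA is left intact
    cut_site = position + len(guide) # simplification: cut directly after the match
    return dna[:cut_site], dna[cut_site:]


# Hypothetical sequences, invented for illustration only.
viral_dna = "TTGACCGTAAGCTTAGGC"
guide = "GTAAGC"

print(cut_at_target(viral_dna, guide))   # ('TTGACCGTAAGC', 'TTAGGC')
print(cut_at_target("AAAA", guide))      # None
```

The point of the sketch is the specificity: the cut happens only where the guide matches, which is why changing the guide redirects the scissors to a different gene.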
The method was first described in 2012 by the scientists Emmanuelle Charpentier and Jennifer Doudna, who were awarded the Nobel Prize in Chemistry in 2020 for developing CRISPR/Cas9 as a method for genome editing.