EuroCC provides access to supercomputers for biotech startups
Bioinformatics and supercomputing go together like Google and search – and everyone knows that computing resources can sometimes be scarce. EuroCC Austria, the national competence centre for supercomputing, big data, and AI, offers free access to the Austrian supercomputer VSC as well as biotech expertise for startups and SMEs. Bioinformatician Malgorzata Goiser explains how this works in practice.
Interview by Bettina Benesch
Maggie, how do you support startups and SMEs in the field of bioinformatics?
I deal with questions in the field of bioinformatics that users at the Vienna Scientific Cluster (VSC) might have. I provide consulting to those who are just getting started with supercomputing or want to set up a new project. This includes advice on the tools they can use. Bioinformatics is a constantly evolving field. Our expertise at EuroCC lies in quickly passing on these changes to our users. Additionally, I’m always on the lookout for partners who either need High-Performance Computing (HPC) resources or have expertise that our clients can benefit from.
“Many people are unaware that we have a supercomputer in Austria that is open to any company. We also facilitate access to HPC systems across all of Europe.”
What fascinates you most about your work?
I love inspiring people to embrace HPC. Many don’t realise that Austria has a supercomputer that is essentially open to any company – not to mention the other European systems to which every company in the region can gain access. I think this is an opportunity that should be taken advantage of because HPC can make a significant contribution and drive innovation across Europe.
Who is the typical customer of EuroCC Austria?
We primarily work with startups and SMEs, but we welcome anyone who doesn’t have their own cluster or access to one. What makes EuroCC special is that the project enables access to high-performance computers across Europe, including pre-exascale systems currently located in Finland, Italy, and Spain.
Which applications in bioinformatics require High-Performance Computing?
Typically, these involve analysing sequenced data in genetics or proteomics. In the past, determining protein structures in the lab, for example, was very tedious – now, thanks to AlphaFold and similar tools, structures can be predicted quickly and with high accuracy.
What’s happening right now in the field of bioinformatics and artificial intelligence (AI)?
“A project is currently in progress that aims to consolidate everything ever researched in the life sciences.”
One project currently in progress, which I find very exciting, is the consolidation of everything ever researched in the life sciences. The goal is to integrate everything ever written in this field into an AI system to identify connections. Of course, this is an extremely demanding project because the texts need to be validated first. It will take a long time before AI can be widely implemented.
What challenges does healthcare face in connection with bioinformatics?
As a bioinformatician, I would love to have access to a large amount of patient data: age, gender, medical histories, diagnostic values, genetic data, and so on. Much of this is protected by privacy laws, and patients must consent to share it. So, data is a scarce resource.
Does that mean bioinformaticians would like to have more data?
Yes. If we had information on lifestyle and living conditions alongside clinical diagnoses, research could take a huge leap forward. We know that our genes only play a partial role in the development of diseases. Everything else is epigenetics – the environment and living conditions. This would involve an enormous number of factors, and working with them would be an exciting and immensely important project for humanity. That would be a dream, and I’d love to analyse it.
Until then – if that day ever comes – does research just make do with the data available?
Thankfully, there are people willing to share their data, and even with that, a lot can already be discovered.
Where do you think bioinformatics is heading with HPC, HPDA, and AI?
AI will certainly have a major impact on healthcare. There will probably always be basic research, but everything will likely move faster with AI tools. Bioinformatics is a very broad field. For example, it’s possible to analyse every stressor in every conceivable configuration. The possibilities are endless.
If you had one wish, what would it be?
“Patient data is protected by privacy laws, and patients must consent to share it. So, data is a scarce resource.”
It’s about data: there are many doctors and scientists who have an incredible amount of data stored away. I’m convinced that we could achieve a great deal if this data were analysed. So, here’s a call to anyone with data who doesn’t know what to do with it: come to us, let us analyse it. Your data could make a significant contribution.
Short bio
Malgorzata Goiser began her career as a bioinformatician at the Medical University of Vienna and subsequently worked for eight years at the Vienna BioCenter. Since 2021, she has been with EuroCC as an expert in HPC and High-Performance Data Analytics (HPDA), responsible for connecting HPC experts with HPC users. She supports entrepreneurs in running their biotech projects on Austria’s supercomputer, the VSC.
About the key concepts
As an interdisciplinary field of research, bioinformatics combines computer science, mathematics, and biology to analyse and interpret biological data. Bioinformaticians use algorithms and software and typically work with large datasets, such as genes or protein structures, to study biological processes. Ultimately, this research is applied in medicine, for instance, in the development of new drugs.
When the datasets exceed a certain size, a standard desktop computer or a small cluster is no longer sufficient for the calculations. In such cases, bioinformaticians use high-performance computers, which can process large datasets very quickly.
Believe it or not, High-Performance Computing (HPC) is actually a relatively old concept: the word "supercomputing" was first used in 1929, and the first mainframe computers appeared in the 1950s. However, they had far less capacity than today's mobile phones. The technology really took off in the 1970s.
HPC systems are used whenever a personal computer's memory is too small, when larger simulations are required than the personal system can run, or when calculations that were previously done locally now need to be run much more frequently.
The performance of supercomputers is measured in FLOPS (Floating Point Operations Per Second). In 1997, a supercomputer achieved 1.06 TeraFLOPS (1 TeraFLOPS = 10^12 FLOPS) for the first time; Austria's currently most powerful supercomputer, the VSC-5, reaches 2.31 PetaFLOPS or 2,310 TeraFLOPS (1 PetaFLOPS = 10^15 FLOPS). The era of exascale computers began in 2022, with performance measured in ExaFLOPS (1 ExaFLOPS = 10^18 FLOPS). An ExaFLOPS equals one quintillion floating-point operations per second.
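To make these orders of magnitude concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration using the figures quoted above; the variable names are made up for this example, and the comparison ignores everything except peak FLOPS.

```python
# FLOPS unit prefixes as powers of ten, as defined in the text.
GIGA = 10**9
TERA = 10**12
PETA = 10**15
EXA = 10**18

# Figures quoted in the text, stored as integers to avoid rounding noise.
first_teraflops_1997 = 1_060 * GIGA   # 1.06 TeraFLOPS, reached in 1997
vsc5 = 2_310 * TERA                   # VSC-5: 2.31 PetaFLOPS

# VSC-5 expressed in TeraFLOPS and PetaFLOPS.
vsc5_in_teraflops = vsc5 / TERA
vsc5_in_petaflops = vsc5 / PETA

# How many times faster VSC-5 is than the 1997 system.
speedup_since_1997 = vsc5 / first_teraflops_1997

# How many VSC-5-sized systems would add up to one ExaFLOPS.
vsc5_per_exaflops = EXA / vsc5

print(f"VSC-5: {vsc5_in_teraflops:,.0f} TeraFLOPS = {vsc5_in_petaflops:.2f} PetaFLOPS")
print(f"Roughly {speedup_since_1997:,.0f} times the 1997 figure")
print(f"About {vsc5_per_exaflops:,.0f} such systems per ExaFLOPS")
```

In other words, the VSC-5 is roughly two thousand times faster than the first TeraFLOPS machine, and an exascale system is in turn several hundred times faster than the VSC-5.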
As of June 2024, there were only two exascale systems in the TOP500 list of the world's most powerful supercomputers: Frontier at Oak Ridge National Laboratory and Aurora at Argonne National Laboratory, both in the USA. In Europe, there are currently three pre-exascale computers, which are precursors to exascale systems, located in Finland, Italy, and Spain. Two European exascale systems are expected to become operational shortly.
VSC (Vienna Scientific Cluster) is Austria's supercomputer, co-financed by several Austrian universities. The computers are located at TU Wien in Vienna. From 2025, the newest supercomputer, MUSICA (Multi-Site Computer Austria), will be in use at locations in Vienna, Linz, and Innsbruck.
Researchers from the participating universities can use the VSC for their simulations, and under the EuroCC programme, companies also have easy and free access to computing time on Austria's supercomputer. Additionally, the VSC team is an important source of know-how: in numerous workshops, future HPC users, regardless of their level, learn everything about supercomputing, AI and big data.
EuroCC is an initiative of the EuroHPC Joint Undertaking. EuroHPC is a public-private partnership of the European Union aimed at building a Europe-wide high-performance computing infrastructure and keeping it internationally competitive.
Each participating country (EU member states plus some associated states) has established a national competence centre for supercomputing, big data and artificial intelligence – EuroCC Austria is one of them. These centres are part of the EuroCC project, which brings the technology closer to future users and facilitates access to supercomputers. The goal of the project is to help industry, academia and the private sector adopt and leverage HPC, AI and High-Performance Data Analytics.