The nUCLeus Hub: Scalable training in computational biomedicine

How can we provide sustainable and affordable HPC training to the next generation of life sciences graduates and future physicians during a global pandemic? This is the unforeseen question that instigated an ambitious collaboration between CompBioMed core partner University College London and associate partner Alces Flight at the start of the year, just as Europe was entering the depths of its initial wave of COVID-19. This proof-of-concept project uses cloud-based HPC to host a high-performance multi-core build of the QIIME 2 (Quantitative Insights Into Microbial Ecology) bioinformatics pipeline, giving life sciences and medical students with no prior HPC experience the resources they need to run high-throughput QIIME 2 training exercises.

The training exercises are designed to demonstrate how HPC can drastically accelerate analyses of large and unwieldy datasets, and to teach students how to tune and execute applications effectively on HPC platforms. They build on material taken from a student-selected component module named “From Skin to Metagenomics: Explore your Microbiome”, organised by Professor Andrea Townsend-Nicholson of UCL as part of the training and engagement activities in the first phase of CompBioMed. The first phase of training was deemed a success, attracting a total of 350 students between 2017 and 2020 and precipitating a steady increase in the rate of HPC use (in terms of core hours consumed each year) amongst life sciences and medical students at UCL over the three-year period. This is a group that has historically been severely underrepresented in the community of HPC users with scientific backgrounds. Sustainable uptake of computational biomedicine in the clinical setting, and hence the full realisation of the virtual human, demands a new generation of computationally fluent and HPC-literate medical researchers and practitioners. It is therefore vital that these students receive adequate exposure to HPC early in their training, before they specialise.

Now that CompBioMed has entered its second phase, UCL has teamed up with cloud HPC experts Alces Flight to build on this success by gradually extending the availability of training resources to students and post-doctoral researchers across Europe over the next three years. They are currently laying the groundwork for a highly scalable, resource-efficient and affordable remote training hub, now officially known as the nUCLeus hub. At present, the nUCLeus hub is connected to a permanent three-node test cluster, on which a team of Alces Flight cloud HPC engineers are working to build and test a scalable remote QIIME 2 training environment for a cohort of students at the University of Sheffield. The test cluster will be scaled up to production size using AWS when teaching commences in late September 2020, and the Alces Flight team will monitor performance data closely during this initial trial to optimise efficiency and resource use for future training courses.