Informatics Support Group
High Performance Computing
Today’s computational challenges in informatics have reached a scale that requires professional high performance compute infrastructure. To provide the scale and flexibility our scientists need to conduct their research, while also maintaining industry-leading uptime, we deploy and link together high performance compute clusters and OpenStack private cloud environments.
We work in close partnership with scientists and informaticians across the Institute to design, develop, deliver and manage new solutions. Depending on the project and the solution required, we employ waterfall or agile techniques, including Scrum, CI/CD, DevOps and similar workflow processes.
The infrastructure and solutions we provide have supported the delivery of the Institute’s COVID-19 Genomic Surveillance and variant analysis for the UK Government and almost 250,000 human genome sequences for the UK Biobank project without service interruptions.
Our goal is to deliver and support platforms that are scalable, resilient, cost-effective and, whenever possible, self-service.
- High performance compute clusters that are rated for the highest possible performance to tackle the largest informatics jobs our scientists can devise.
- Private cloud environments to provide flexibility and enable transition to the public cloud when necessary, so that we can take advantage of developing technologies and enable our researchers to co-develop and collaborate on national and international projects.
Designing, building and providing these types of platforms consistently requires us to develop and use tools that automate our processes at scale whenever possible. In addition, our research colleagues require large quantities of data to be delivered to their compute systems in a timely manner so that they can conduct their analyses. To achieve this, we work in collaboration with the Institute’s scientists to develop an in-depth understanding of their needs so that we can provide infrastructure solutions that can adapt and grow to meet future demands.
We are constantly re-evaluating and reconfiguring our systems to ensure that we can meet future requirements and power the next wave of research discovery. We have built strong working relationships with vendors and third parties so that, when off-the-shelf solutions are not available, we are able to co-develop, or shape the creation of, the next generation of solutions and features. For example, our network bandwidth currently peaks, in some areas, at 1.6 terabits per second.
Our working relationships place us at the vanguard of delivering new technologies. For example, we developed secure Lustre with DDN and were the first to bring iRODS to the informatics world as a data management system. iRODS has been such a success at the Sanger Institute that it is now the standard for our informatics community: it holds more than 45 petabytes of usable data and has a capacity of almost 60 petabytes.
We are also developing a new solution for data management within clusters. The volumes of data our scientists need to analyse mean that it is increasingly important to bring the compute capacity to the data. In collaboration with IBM, we have been able to introduce ‘Data Manager’, a system that sends the compute job to where the data is, thereby moving the computation rather than the data. In this way, we have been able to ensure that the Institute’s research data is available to our scientists in the most efficient way.
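The idea of sending the job to the data can be illustrated with a small sketch. This is not how IBM's product or any Sanger system is implemented; the `Cluster` class, the `pick_cluster` function and the cluster and dataset names are all hypothetical, and the placement rule (prefer the cluster that already holds the most input datasets, break ties by free capacity) is an assumption chosen for clarity.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical compute cluster: what data it holds and how busy it is."""
    name: str
    datasets: set[str]   # dataset identifiers already stored on this cluster
    free_slots: int      # job slots currently available


def pick_cluster(job_inputs: set[str], clusters: list[Cluster]) -> Cluster | None:
    """Choose where to run a job so that as little data as possible must move.

    Only clusters with at least one free slot are considered. Among those,
    the one already holding the most of the job's inputs wins; ties are
    broken in favour of the cluster with more free slots.
    """
    candidates = [c for c in clusters if c.free_slots > 0]
    if not candidates:
        return None  # nothing can run the job right now
    return max(
        candidates,
        key=lambda c: (len(job_inputs & c.datasets), c.free_slots),
    )


if __name__ == "__main__":
    # Illustrative values only.
    clusters = [
        Cluster("farm-a", {"run1", "run2"}, free_slots=2),
        Cluster("farm-b", {"run3"}, free_slots=10),
    ]
    chosen = pick_cluster({"run1", "run2"}, clusters)
    print(chosen.name if chosen else "no capacity")
```

A production scheduler would also weigh dataset sizes, queue priorities and the cost of staging any inputs that are missing; the sketch only shows the core placement decision.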
If you are interested in joining our team – where no two days are the same – please visit the Sanger Institute’s Careers board, which advertises all our vacancies.
Dr Peter Clapham
ISG Team Leader
Peter leads the Informatics Support Group (ISG), which provides the high performance compute (HPC) environments for the Sanger Institute’s scientific research teams. Our team investigates new and upcoming technical solutions that will drive our HPC platforms of tomorrow. In this way, we can continue to keep pace with the research challenges presented.
Cancer Genome Project
Cancer Genetics & Genomics
Throughout life, the genome within cells of the human body is exposed to DNA damage and suffers mistakes in replication. These ...
Cellular Genetics Informatics
Our team provides efficient access to cutting-edge analysis methods, environments and pipelines for the Cellular Genetics programme, which leads and is involved ...
Human Genetics Informatics (HGI)
Human Genetics Informatics (HGI) supports the scientific aims of the Human Genetics programme by developing and operating computational analysis workflows, managing ...
Informatics and Digital Solutions
We support the Sanger Institute’s mission to deliver innovative and ambitious genomics research at a scale to improve human health ...