Sanger’s cloud computing wins award

High performance computing community recognises pioneering work of Sanger’s scientific IT teams

Sanger’s use of cloud computing has been recognised by HPCwire, the journal of high performance and data-intensive computing. The IT team won the Readers’ Choice award for Best Use of High Performance Computing in the Cloud.

The high performance computing community cited the way that the Sanger Institute is using a private OpenStack cloud IT environment to enable the sequencing and assembly of 100 complete human genomes a day. The flexible computing space is a vital part of the pipeline developed at Sanger to deliver the UK Biobank Vanguard project. This ambitious pilot project will sequence 50,000 genomes from volunteers whose DNA samples are stored with UK Biobank, and the techniques and pipelines developed will pave the way for further large-scale projects.

“I’m delighted for the teams, who thoroughly deserve this award. Their creativity and dedication to overcoming the unique challenges posed by storing and analysing genomic data means that we are not only able to cope with, but can also explore at scale, the vast volume of data being generated by our sequencing operation, and to deliver that capacity quickly. Winning the Readers’ Choice award is especially humbling as it means that our peers worldwide acknowledge the groundbreaking work of our scientific computing and informatics teams.”

Tim Cutts, Head of Scientific Computing at the Wellcome Sanger Institute

To enable Sanger’s high-throughput sequencing teams to store, assemble and analyse the data of more than 100 genomes a day at 30X coverage, the High Performance Computing team drew on the flexibility and scalability of cloud computing. Because a person’s genome is one of the most personal and sensitive datasets that can be stored, the team built an onsite instance of the OpenStack cloud computing environment, combining the flexibility of cloud computing with the security of onsite management behind the Institute’s enterprise-grade firewall.

“The high bandwidth and data locality, coupled with large scale-out performance that our on-site OpenStack cloud supplies, allows us to flexibly adapt and dynamically scale our compute to meet emerging scientific challenges. This enabled us to rapidly deploy the infrastructure required to sequence up to 3,000 human genomes per month, with the capability of further growth in capability as needed.”

Peter Clapham, Leader of the High Performance Computing team at the Wellcome Sanger Institute

This implementation will allow high-quality whole genome sequences to be assembled, stored and analysed at scale.

Notes to Editors

About HPCwire

HPCwire is the #1 news and information resource covering the fastest computers in the world and the people who run them. With a legacy dating back to 1986, HPCwire has enjoyed a history of world-class editorial and journalism, making it the news source of choice selected by science, technology and business professionals interested in high performance and data-intensive computing. https://www.hpcwire.com/

Selected Websites
What happens to DNA sequence when it comes off a sequencing machine?
DNA sequencing produces huge amounts of data, essentially comprising lots of short sections of DNA letters. The first step is to check that the sequence is of the highest quality before we start to piece the sections together.

How do you put a genome back together after sequencing?
After DNA sequencing is complete, the fragments of DNA that come out of the machine are all jumbled up. Like a jigsaw puzzle we need to take the pieces of the genome and put them back together.

How are sequenced genomes stored and shared?
After a genome has been sequenced, assembled and annotated it needs to be shared in a format that is easily and freely accessible to all. This can be done via a database called a genome browser.

Contact the Press Office

Dr Samantha Wynne, Media Officer

Tel +44 (0)1223 492 368

Emily Mobley, Media Officer

Tel +44 (0)1223 496 851

Wellcome Sanger Institute,
Hinxton,
Cambridgeshire,
CB10 1SA,
UK

Mobile +44 (0) 7900 607793
