Whole Genome Sequencing data on 200,000 UK Biobank participants are made widely available for research through unique public-private partnership

First release from world’s largest whole genome sequencing project could help researchers to understand the genetic determinants of disease and accelerate innovative drug discovery work

In a major step forward for the advancement of genomics research, today Whole Genome Sequencing (WGS) data[1] for the first 200,000 UK Biobank participants have been made available to researchers through the recently launched Research Analysis Platform.

This dataset represents the world’s largest single release of WGS data[2]. When combined with the extensive lifestyle, biochemical, imaging and health outcome data already held for UK Biobank participants, it will enable researchers to better understand the role of genetics in health outcomes and to advance drug discovery and development.

The whole genome sequencing of all 500,000 UK Biobank participants is the most ambitious project of its kind ever undertaken. It has been funded through a public-private partnership involving Amgen, AstraZeneca, GlaxoSmithKline (GSK) and Johnson & Johnson[3], alongside Wellcome and UK Research and Innovation (UKRI), and sequencing has been carried out by deCODE Genetics and the Wellcome Sanger Institute[4]. The release of these 200,000 whole genomes today will be followed up by the release of the WGS data for the remaining 300,000 participants in early 2023.

Importance of these data for understanding genetics and relationship with human health

Access to these WGS data will allow researchers from across the world to study the 98 per cent of the genetic code that until recently had no clear purpose. Whole genome sequencing on this unprecedented scale will significantly advance research in the following ways:

  • WGS data will enable researchers to identify rare non-coding variants that contribute to disease onset and progression. By combining the WGS data with the rich clinical and lifestyle data of UK Biobank participants, researchers are now uniquely equipped to answer questions about why some individuals develop particular diseases but others do not, and why certain conditions worsen in some individuals over time.
  • The WGS data will help to accelerate drug discovery and development by allowing researchers to identify new drug targets. This is important because pharmaceutical companies have found that potential drug targets supported by clear genetic evidence are twice as likely to result in effective medicines[5].
  • The large-scale nature of the UK Biobank cohort and constellation of health outcome information available also afford an opportunity to assess patient stratification by identifying subgroups of individuals who are more or less likely to respond to treatment, or who are more or less likely to experience side-effects.

A collaborative achievement

This highly anticipated project has been made possible only through collaboration between government, industry and charity, and it showcases the strengths of the UK life sciences industry.

The collaborative effort of all partners has resulted in the £200m project being delivered on schedule and within budget, despite the challenging conditions caused by the Covid-19 pandemic and remote working.

“Sequencing at such a large scale and speed would not have been possible without the long-term vision of UKRI and Wellcome, the support of the industry consortium, and the expertise of the sequencing teams. The WGS project will make UK Biobank the most detailed genomics database in the world and by sharing these data with the global research community our aim is to enable breakthroughs in understanding, diagnosis, prevention and treatment strategies for a range of common and life-threatening diseases.”

Professor Sir Rory Collins, Principal Investigator at UK Biobank

“We are all incredibly proud of contributing to the creation of the largest whole genome sequencing data set in the world. These data, combined with the extensive lifestyle, biochemical and health outcome data already available, make the UK Biobank an increasingly powerful resource for understanding the genetic architecture of diseases and accelerating drug discovery and development. It is a truly pivotal moment for scientific research aimed at improving human health.”

Letizia Goretti, Chair of the Joint Steering Committee, representing Johnson & Johnson Innovation[6] on behalf of the industry consortium parties, Amgen, AstraZeneca, GSK and Johnson & Johnson

“The release of the first 200,000 whole genome sequences is a tremendous achievement, not only for UK Biobank, but also for the sequencing partners, deCODE Genetics and the Wellcome Sanger Institute. The integration of the sequences with the other characteristic data sets from participants will create a powerful resource to enable major discoveries that will benefit health outcomes.”

Dr Michael Dunn, Director of Discovery Research at Wellcome

“This data resource is a true testament to the dedication and expertise of our sequencing teams, who delivered genome sequencing at large scale and speed in unprecedented times during the pandemic. These data will enable new discoveries into the onset and progression of diseases, as well as accelerating drug discovery, and we are privileged to have been a part of making these whole genome sequence data available to the research community.”

Dr Cordelia Langford, Director of Scientific Operations at the Wellcome Sanger Institute

“The UK Biobank programme demonstrates how transformative science can be delivered much more rapidly by working in partnership. Through the Data to Early Diagnosis Challenge, UKRI is proud to have co-invested in this ambitious programme, which has delivered an unprecedented data resource to accelerate the application of genomics to improve health.”

Professor Dame Ottoline Leyser, Chief Executive at UK Research and Innovation

More information

[1] Whole Genome Sequencing analyses the entire human genome, a unique genetic code of 3 billion building blocks that contain the 24,000 genes inside a human cell and which control the biochemical processes that underpin life.

[2] Five petabytes of WGS data have been made available to the research community today.

[3] On behalf of Johnson & Johnson, the WGS contract was entered into by Janssen Biotech, Inc., one of the Janssen Pharmaceutical Companies of Johnson & Johnson, and the collaboration was facilitated by the Johnson & Johnson Innovation Centre in London, UK.

[4] The whole genome sequencing of the first 50,000 UK Biobank participants was conducted by the Wellcome Sanger Institute and funded by the Medical Research Council (MRC). Following this pilot, the WGS consortium began sequencing the whole genomes of the remaining 450,000 UK Biobank participants.

[5] Advancing human genetics research and drug discovery through exome sequencing of the UK Biobank (2021)

[6] Letizia Goretti is an employee of Janssen Pharmaceutica NV.

About UK Biobank

UK Biobank is a large-scale biomedical database and research resource containing anonymised genetic, lifestyle and health information from half a million UK participants. UK Biobank’s database, which includes blood samples, heart and brain scans and genetic data of the volunteer participants, is globally accessible to approved researchers who are undertaking health-related research that’s in the public interest.

UK Biobank recruited 500,000 people aged 40 to 69 from across the UK between 2006 and 2010. With their consent, participants provided detailed information about their lifestyle, underwent physical measurements, and had blood, urine and saliva samples collected and stored for future analysis.

UK Biobank’s research resource is a major contributor to the advancement of modern medicine and treatment, enabling better understanding of the prevention, diagnosis and treatment of a wide range of serious and life-threatening illnesses, including cancer, heart diseases and stroke. Since the UK Biobank resource was opened for research use in April 2012, over 27,000 researchers from more than 90 countries have been approved to use it, and more than 3,000 peer-reviewed papers that used the resource have now been published.

UK Biobank is generously supported by its founding funders, the UK Medical Research Council (MRC) and Wellcome, as well as the British Heart Foundation, Cancer Research UK, the National Institute for Health Research (NIHR) and UK Research and Innovation (UKRI). The organisation has over 150 dedicated members of staff, based in multiple locations across the UK. Find out more here: https://www.ukbiobank.ac.uk/