500,000 whole human genomes will be a game-changer for research into human diseases

Following on from a successful pilot at the Sanger Institute, we are leading a project to sequence the genomes of all UK Biobank volunteers to power the next wave of genetic and health research

In a major advance for public health and for the UK’s global leadership in genomics, a £200m project involving the government, charity, researchers and four leading pharmaceutical companies was announced today (11 September). The Whole Genome Sequencing (WGS) project will become a game-changing resource, accessible to the global scientific community, to understand, diagnose, treat and prevent life-changing diseases such as cancer and dementia.

The genetic code of all 500,000 UK Biobank* volunteer participants will be sequenced by researchers at the Wellcome Sanger Institute in the UK and deCODE genetics in Iceland, using the Illumina sequencing platform.

This project is the single most ambitious sequencing programme in the world undertaken by a public-private partnership. Supported for over 16 years with public funding and charity investment, UK Biobank has already created a uniquely rich data resource that has dramatically increased the understanding of the factors that contribute to the development of disease.

Funding for the project comes from the government’s research and innovation agency, UK Research and Innovation (UKRI), with £50m through the Industrial Strategy Challenge Fund; £50m from Wellcome; and a further £100m in total from Amgen, AstraZeneca, GlaxoSmithKline (GSK) and Johnson & Johnson**.

The total amount of genetic data generated will be vast, roughly equivalent to 5,000 billion pages of text, and will require unique technical expertise to store and analyse. Data will be linked to the other detailed clinical and lifestyle data for each volunteer in the UK Biobank programme. The end result will be an encyclopaedia of genetic information, linked with comprehensive clinical characterisation, appropriately de-identified and protected, that will help to provide a unique insight into why some people develop particular diseases and others do not.

This project follows the successful initiation of a pilot programme at the Sanger Institute, known as the Vanguard Project, which involves sequencing the genomes of 10 per cent – 50,000 individuals – of the UK Biobank participants. Funding for this pilot programme was led by the Medical Research Council (MRC), through the Industrial Strategy Challenge Fund.

Building on the work of the pilot programme, the plan is to complete sequencing of the remaining 450,000 participants in two tranches. After both phases, industry partners will have preferential access to the data for nine months. At the end of this period, requests to access the whole genome sequence data will be managed in the same way as all requests to work with datasets held by UK Biobank, and will be subject to a Material Transfer Agreement (MTA) with the approval of the UK Biobank Access Sub-committee.

The first tranche of data is expected to comprise up to 125,000 whole genome sequences, anticipated to be accessible to all in spring 2021, at which time the 50,000 Vanguard sequences will also be available. Sequence data for the entire cohort of UK Biobank participants is expected to become generally accessible by early 2023.

“We are thrilled to be contributing to the UK Biobank project by sequencing 225,000 whole human genomes. Together with deCODE in Iceland, we will read and assemble the whole genome sequences of 500,000 volunteers, and this data will transform the way we carry out research into human health and disease. A dataset of this magnitude will be incredibly powerful for understanding the genetic architecture that contributes to disease and we are one of only a few institutes in the world with the technical and scientific expertise to undertake a project of this scale.”

Dr Cordelia Langford, Director of Scientific Operations at the Wellcome Sanger Institute

“Genomics is transforming our understanding of human health and disease. The UK Biobank Whole Genome Sequencing project is an exemplar of science at scale and we are proud to be a part of this initiative. The rich encyclopaedia of genomic data that will become available as a result of this ambitious effort combined with the incredibly detailed information already collated in UK Biobank will accelerate discoveries in diagnosing and ultimately treating diseases such as cardiovascular disease and cancer.”

Professor Sir Mike Stratton, Director of the Wellcome Sanger Institute

“This exciting new project will help scientists and doctors develop new ways of preventing, diagnosing and treating a range of life-changing diseases such as cancer and dementia. By sequencing the genomes of the UK Biobank participants, the research community will have an unprecedented resource to gain new insights into human disease. This work would not be possible without the generous support of the 500,000 participants of the UK Biobank who, without any direct benefit to themselves, have allowed their lives to be studied through blood tests, body scans and information from their medical records, all in the hope that it will benefit others.”

Sara Marshall, Head of Clinical Research and Physiological Sciences at Wellcome

More information

* The 500,000 participants in the UK Biobank project were recruited between 2006 and 2010 and have consented to their medical records being linked to a range of physical measurements and biological samples collected at recruitment. Participants were recruited between the ages of 40 and 69, and the focus of the project is to investigate the factors that lead to a range of late-onset conditions. The scale of the project allows the interplay of genetic and environmental factors to be evaluated, and the prospective nature of the study means that it will allow the identification of early indicators of disease prior to clinical diagnosis.

** Contract entered into by Janssen Biotech Inc., one of the Pharmaceutical Companies of Johnson & Johnson; collaboration facilitated by the Johnson & Johnson EMEA Innovation center in London, UK.

Further reading:

World’s largest genetics project to tackle deadly diseases launches – UK Government Department for Business, Energy and Industrial Strategy press release

World-leading genomics project to give insights into health and disease – UKRI media narrative

Selected websites

  • About UK Research and Innovation

    UKRI is a new body which works in partnership with universities, research organisations, businesses, charities, and government to create the best possible environment for research and innovation to flourish. We aim to maximise the contribution of each of our component parts, working individually and collectively. We work with our many partners to benefit everyone through knowledge, talent and ideas. 

    Operating across the whole of the UK with a combined budget of more than £7 billion, UKRI brings together the seven Research Councils, Innovate UK and Research England.


  • About UK Biobank

    UK Biobank was established by the Wellcome Trust medical charity, Medical Research Council, Department of Health, Scottish Government and the Northwest Regional Development Agency. It has also had funding from the Welsh Government, British Heart Foundation, Cancer Research UK and Diabetes UK. UK Biobank is supported by the National Health Service (NHS). UK Biobank is open to bona fide researchers anywhere in the world, including those funded by academia and industry. The medical research project is a non-profit charity which had initial funding of about £62 million.


  • The Wellcome Sanger Institute

    The Wellcome Sanger Institute is a world-leading genomics research centre. We undertake large-scale research that forms the foundations of knowledge in biology and medicine. We are open and collaborative; our data, results, tools and technologies are shared across the globe to advance science. Our ambition is vast – we take on projects that are not possible anywhere else. We use the power of genome sequencing to understand and harness the information in DNA. Funded by Wellcome, we have the freedom and support to push the boundaries of genomics. Our findings are used to improve health and to understand life on Earth. Find out more at www.sanger.ac.uk or follow us on Twitter, Facebook, LinkedIn and on our Blog.

  • About Wellcome

    Wellcome exists to improve health by helping great ideas to thrive. We support researchers, we take on big health challenges, we campaign for better science, and we help everyone get involved with science and health research. We are a politically and financially independent foundation.