Genetic code of 66,000 UK species to be sequenced

The Darwin Tree of Life Project, led by the Wellcome Sanger Institute, plans to read the genomes of all known species of animals, birds, fish and plants in the UK

The genetic codes of 66,000 species in the UK are to be sequenced by the Wellcome Sanger Institute and its collaborators as part of a global effort to sequence the genomes of all 1.5 million known species of animals, plants, protozoa and fungi on Earth.

The UK effort, known as the Darwin Tree of Life Project, officially launches in London today (1 November) alongside the global effort, the Earth BioGenome Project (EBP). The launch is marked by a gathering of key scientific partners and funders from around the globe to discuss progress in organising and funding the project.

The EBP will ultimately create a new foundation for biology to drive solutions for preserving biodiversity and sustaining human societies.

A greater understanding of Earth’s biodiversity and the responsible stewarding of its resources are among the most crucial scientific and social challenges of the new millennium. Overcoming these challenges requires new scientific knowledge of the evolution of, and interactions among, millions of the planet’s organisms.

The Sanger Institute will serve as the genomics hub in the UK and will collaborate with the Natural History Museum in London; the Royal Botanic Gardens, Kew; the Earlham Institute; Edinburgh Genomics; the University of Edinburgh; EMBL-EBI and others on sample collection, DNA sequencing, genome assembly and annotation, and data storage. Further, the Sanger Institute will coordinate with other groups contributing to the EBP, such as the G10K Vertebrate Genomes Project (VGP) and the 10,000 Genomes Plant Project, to ensure that there is no redundancy of effort and that the projects contribute to one another.

The Darwin Tree of Life project is estimated to cost approximately £100 million over the first five years, and the sequencing of 66,000 species’ genomes will take around 10 years.

To mark the 25th anniversary of the Wellcome Sanger Institute, the institute and its collaborators used PacBio® long-read technology and protocols developed by the VGP to sequence the genomes of 25 UK species for the first time*, including red and grey squirrels, the European robin, Fen raft spider and blackberry. The insights gained from the 25 Genomes Project form a basis for scaling up to sequence the genomes of 66,000 species.

The Darwin Tree of Life project is now possible due to recent and expected advances in sequencing and information technology that will enable the reading and interpretation of thousands of species’ genomes each year by the Sanger Institute and its partner institutions across the UK and internationally. All of the data will be stored in public domain databases and made freely available for research use.

Sequencing the eukaryotic species in the UK and worldwide will revolutionise our understanding of biology and evolution, bolster efforts to conserve, help protect and restore biodiversity, and in return create new benefits for society and human welfare.

“Globally, more than half of the vertebrate population has been lost in the past 40 years, and 23,000 species face the threat of extinction in the near future. Using the biological insights we will get from the genomes of all eukaryotic species, we can look to our responsibilities as custodians of life on this planet, tending life on Earth in a more informed manner using those genomes, at a time when nature is under considerable pressure, not least from us.”

Professor Sir Mike Stratton, Director of the Wellcome Sanger Institute

“The Darwin Tree of Life Project is a tremendously important advance for the Earth BioGenome Project and will serve as a model for other parallel national efforts. The Wellcome Sanger Institute brings decades of experience in genome sequencing and biology to help build the global capacity necessary to produce high quality genomes at scale. The Earth BioGenome Project and its partner organizations welcome the outstanding leadership that the Wellcome Sanger Institute brings to our efforts to sequence all known eukaryotic life on our planet.”

Professor Harris Lewin, University of California, Davis, United States and Chair of the Earth BioGenome Project

“When the Human Genome Project began 25 years ago, we could not imagine how the DNA sequence produced back then would transform research into human health and disease today. Embarking on a mission to sequence all life on Earth is no different. From nature we shall gain insights into how to develop new treatments for infectious diseases, identify drugs to slow ageing, generate new approaches to feeding the world or create new bio materials.”

Sir Jim Smith, Director of Science at Wellcome

Notes to Editors

Eukaryotic species are defined as organisms whose cells have a nucleus enclosed within membranes, unlike prokaryotes (Bacteria and Archaea), which are unicellular organisms that lack a membrane-bound nucleus, mitochondria or any other membrane-bound organelle.

The Earth BioGenome Project (EBP) aims to sequence, catalogue and categorise the genomes of all of Earth’s eukaryotic biodiversity over a period of ten years. The estimated cost of the EBP is $4.7 billion. Accounting for inflation, the Human Genome Project today would cost $5 billion.

* For more information on the Sanger 25 Genomes Anniversary Project news story, visit: https://www.sanger.ac.uk/news/view/25-uk-species-genomes-sequenced-first-time

Funding:

The Wellcome Sanger Institute will use core funding from Wellcome to introduce a research programme in Tree of Life genomics. Further funding support for sample collection, sequencing machines and data infrastructure is required.

Activities of the EBP are currently being funded by the participating organisations as well as private foundations, governmental organisations and crowd-funding sources. Participating institutions are committed to raising funds to complete the project in 10 years. Significant funds have already been raised by taxon-based communities and by national and regional projects to meet the $600 million goal necessary to complete Phase 1 of the project, which aims to produce approximately 9,000 reference-quality genomes across all taxonomic families.

Appendix of quotes from partners:

“The Natural History Museum is proud to be collaborating with the Wellcome Sanger Institute on the Darwin Tree of Life project. It is hoped that together we can uncover the blueprints of the diversity of UK life, which will effectively re-write what we know about these species. By comparing those blueprints within and between species we can understand the genetic diversity of fauna and flora from the UK and beyond. Sequencing the genomes of all life will reveal aspects of evolution we’ve not even dreamt of.”

Dr Tim Littlewood, Head of Life Sciences at the Natural History Museum, London

“This project not only has the potential to tell us a lot about the evolution of the diversity of life on Earth, as well as preserving valuable genetic information for future generations, but also to harness this data for the public good. This information will enable us to better protect ecosystems and understand how they function. We will also be able to mine genomic data for valuable new materials and medicines as well as new genetic diversity that can be used to protect crops from disease or climate change.”

Professor Federica Di Palma, Director of Science at the Earlham Institute


“The launch of the Darwin Tree of Life project is the realisation of a longstanding dream. Having the full genomes of all the organisms we share the planet with will change our ability to understand and care for them. The UK environmental and evolutionary research community has for many years been leading the way in sequencing the DNA of diverse species, and this revolutionary project will transform the science we can do.”

Professor Mark Blaxter, of Edinburgh Genomics and the University of Edinburgh


“Sequencing genomes will not only help us to understand better how species have evolved and diversified but will also provide vital insights into how they impact and influence ecosystem functioning and global change response. At Kew, we’re utilising our unique collections for large sequencing projects, where genome data can provide us with diagnostic tools to be able to effectively respond to disease outbreaks and minimise the impact on food security.”

Dr Ester Gaya, a senior mycologist, and Dr Felix Forest, a senior scientist, at Royal Botanic Gardens, Kew


“The Darwin Tree of Life project is an exciting opportunity to understand life, evolution, ecosystems and biodiversity by leveraging genomics and our experience in creating biological data resources that are freely available to everyone in the world.”

Dr Paul Flicek, a senior scientist and team leader at EMBL’s European Bioinformatics Institute

“PacBio recently provided the foundational technology to enable completion of the 25 Genomes Project at the Wellcome Sanger Institute, and we are honored to be an integral part of the Darwin Tree of Life project as it deploys the power of our sequencing technology on a much broader scale. With the recent and ongoing improvements in our technology, we are well positioned to support the needs for scaling the sequencing and assembling of the genomes for the large number of species targeted by this project as well as the Earth BioGenome Project.”

Dr Jonas Korlach, Chief Scientific Officer at Pacific Biosciences

Contact the Press Office

Dr Samantha Wynne, Media Officer

Tel +44 (0)1223 492 368

Emily Mobley, Media Officer

Tel +44 (0)1223 496 851

Wellcome Sanger Institute,
Hinxton,
Cambridgeshire,
CB10 1SA,
UK

Mobile +44 (0) 7900 607793
