The mission of the international Vertebrate Genomes Project (VGP) is to provide high-quality genome assemblies of all vertebrate species to address fundamental questions in biology and disease. The current phase 1 of the project aims to create reference-quality, near-gapless, chromosome-level, haplotype-phased assemblies of selected species representing all vertebrate orders.
The VGP at the Sanger Institute covers the sequencing, assembly generation and subsequent analysis of fish, caecilians and rodents. The current pilot focusses on cyprinids, cichliforms, notothenioid and anabantoid fishes, as well as select rodents and caecilians.
Samples to be sequenced are selected in consultation with collaborators and submitted to the Sanger Institute. A variety of sequencing and assembly technologies are currently being trialled, amongst them PacBio, Oxford Nanopore and Chromium sequencing and BioNano mapping.
The Genome Reference Informatics Team analyses genome assemblies to reveal and correct quality issues and to identify and add variation. It forms the Sanger division of the Genome Reference Consortium.
Many high-profile projects such as the Cancer Genome and 1000 Genomes projects need quality assemblies for downstream analysis using second- and third-generation sequencing data. Efficient bioinformatics tools for processing and analysing large quantities of genomic data play a crucial role in producing high-quality assemblies and in detecting variation. The High Performance Assembly Group (HPAG), headed by Zemin Ning, develops algorithms and software tools for genome analysis. Currently the team is also involved in data processing and quality evaluation of Oxford Nanopore sequencing technology.
The DNA Pipelines Research and Development group is the entry point for new technologies to the Institute, especially sequencing instruments. The team develops new methods and procedures to maximise the efficiency, quality and throughput of all incoming machinery for the benefit of all Institute researchers.
Data use policy
The Sanger Institute Vertebrate Genomes Project releases sequence data, assemblies, SNPs and other variant calls as a service to the research community. These data are released in accordance with the Fort Lauderdale and Toronto agreements, following Sanger Institute policies. We reserve the right to first publication of a genome-wide analysis of the data we have generated, including the use of genome-wide data for phylogenetic and evolutionary analysis, on behalf of ourselves as data producers, the sample providers and other collaborators. The pre-publication data that we release via this website and the relevant archives are embargoed for publication except for analyses of regions smaller than one chromosome in a single species or a maximum of 10 gene loci across multiple species, or for use as a reference for mapping reads from independent studies. We strongly encourage researchers to contact us with any queries about referencing or publishing analyses based on pre-publication data from this project.
If you have a query about using the project data in your studies or publications, please contact us at vgp-help [at] sanger.ac.uk.