Wellcome Sanger Institute

Public Health England reference collections

This project aims to provide annotated and assembled reference genomes for 3,000 bacteria and 500 viruses as part of a new eResource.

The project is split into two parts:

NCTC 3000: A collaboration between Public Health England, Pacific Biosciences and the Wellcome Sanger Institute to complete the sequencing of 3,000 bacterial strains from PHE’s National Collection of Type Cultures (NCTC) using Pacific Biosciences’ Single Molecule, Real-Time (SMRT) sequencing technology.

NCPV 500: A collaboration between PHE and Sanger to produce 500 viral genomes from PHE’s National Collection of Pathogenic Viruses (NCPV) using the Illumina sequencing platform.

Collectively, the data generated will be housed in a publicly accessible web-based eResource that integrates metadata and genome sequences for type and reference strains of biomedically important bacterial and viral pathogens. This resource will integrate accession, taxonomy and authentication information with publications, genome sequences, comparative analysis databases and other resources at EMBL and NCBI.

This is a community resource project. Data will be available from this page and from the NCTC. We will submit assembled, annotated sequences to the International Sequence Databases as they become available. We request that you cite this webpage in any publication using the data, and would appreciate it if you contacted us to discuss your use of the data.


If you need help or have any queries, please contact us using the details below.

External partners and funders



The National Collection of Type Cultures