The Genome Reference Consortium (GRC) was founded in 2007 to improve the reference genome assemblies of human, mouse and zebrafish. One of its first tasks was to modernise the assembly model so that complex variation within a species can be captured and represented. The GRC also guarantees INSDC submission and long-term maintenance of all assemblies it produces. This is achieved through genome analysis, additional sequencing and the collection of other data, for instance optical mapping. We collaborate with major players in the respective communities to obtain additional data that help us identify and correct issues in the existing genome assemblies.
The GRC genome annotations are available as a track hub at http://ngs.sanger.ac.uk/production/grit/track_hub/hub.txt and can also be viewed in the genome evaluation browser gEVAL. We provide a blog and an announcement list. You can report genome issues for review, or search regions under review.
The GRC is a collaboration between the Wellcome Sanger Institute, represented by the Genome Reference Informatics Team, the McDonnell Genome Institute at Washington University (MGI), the European Bioinformatics Institute (EBI) and the National Center for Biotechnology Information (NCBI). The NCBI hosts the GRC homepages.
Also available are FASTA format sequences for a genome assembly, packaged for convenient use by various next-generation sequencing read alignment pipelines. The sequence names, sequence order and format of the sequence definition lines were developed in consultation with several developers and major users of alignment pipelines, and include masking of, for example, the PAR region and the EBV sequence. Index files generated by BWA, Samtools and Bowtie are also provided. The set is available with alternate locus sequences (full assembly) and without them (primary assembly only).
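To illustrate the difference between the full and primary-only sets, the following minimal Python sketch reads FASTA records and drops alternate locus sequences. It rests on an assumption that is not stated above: that alternate locus sequences can be recognised by an `_alt` suffix in the sequence name. The function names and the toy sequences are hypothetical and serve only as illustration; check the definition lines of the actual files before relying on this.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token as the sequence name.
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def is_alternate_locus(name: str) -> bool:
    # ASSUMPTION: alternate loci are marked with an "_alt" name suffix.
    return name.endswith("_alt")


def primary_only(records: Iterable[Record]) -> List[Record]:
    """Keep every record that is not flagged as an alternate locus."""
    return [(n, s) for n, s in records if not is_alternate_locus(n)]


# Toy data, invented for this example; not real assembly sequence.
EXAMPLE = """\
>chr1 toy primary sequence
ACGTACGT
NNNNACGT
>chr1_KI270762v1_alt toy alternate locus
ACGTTTGA
>chrEBV toy EBV sequence
GGCC
"""

records = list(read_fasta(EXAMPLE.splitlines()))
primary = primary_only(records)

for seq_name, seq in primary:
    print(seq_name, len(seq))
```

For real data, reading the sequence names from an existing Samtools index file would likely be faster than parsing the whole FASTA, since only the names are needed to decide which records to keep.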
gEVAL is a genome browser for easy evaluation of the quality of genome assemblies, including the most current and not necessarily publicly released versions. It contains the GRC species plus a few others.
The Mouse Genomes Project is an ongoing effort to catalog all forms of genetic variation between the common laboratory mouse strains and to construct and annotate reference genomes for the key strains.
The zebrafish genome project led to the generation of the zebrafish reference assembly, based on the Tuebingen strain, which is now being updated and maintained by the Sanger Institute for the Genome Reference Consortium. Further strain assemblies will be generated.