Genome Reference Informatics


As the impact of the human reference genome assembly on biomedical research has shown, the availability of a high-quality reference genome assembly is essential for understanding a species' biology. Our team is responsible for further improving and extending the human reference assembly, as well as the mouse and zebrafish reference assemblies. To achieve this, we generate sequencing and mapping data from various technologies, develop software to analyse and interpret these data, and apply manual curation to resolve assembly issues. We are also generating new assemblies of strains and individuals that are important to the respective communities.

Current projects
  • improvement of the human reference assembly, correcting errors and adding variation
  • correcting and improving the draft assemblies of 16 mouse strains
  • adding missing gene loci to the mouse reference assembly
  • creating new zebrafish assemblies from different strains, developing techniques to enable haplotype resolution
  • improving the zebrafish reference assembly through addition of more clone sequence and integration with a new de novo assembly

We develop software to support our research, focussing on the genome evaluation browser (gEVAL). This browser allows easy assessment of genome assemblies and resolution of identified issues. It is one of the main tools used by the Genome Reference Consortium and is also used by others, such as the zebrafish community. We also provide Chromoview, a browser for viewing the latest assembly paths.


Genome Reference Consortium

The GRC aims to ensure that the human, mouse and zebrafish reference assemblies are biologically relevant by closing gaps, fixing errors and representing complex variation.

Zebrafish Genome Project

The zebrafish genome project led to the generation of the zebrafish reference assembly, based on the Tuebingen strain, which is now being updated and maintained by the Sanger Institute division of the Genome Reference Consortium. Further strain assemblies will be generated.

Mouse Genomes Project

The Mouse Genomes Project is an ongoing effort to catalog all forms of genetic variation between the common laboratory mouse strains and to construct and annotate reference genomes for the key strains.


The Genome Reference Informatics Team analyses genome assemblies to reveal and correct quality issues and to identify and add variation. It forms the Sanger division of the Genome Reference Consortium and the Vertebrate Genomes Project.


  • gEVAL - a web-based browser for evaluating genome assemblies.

    Chow W, Brugger K, Caccamo M, Sealy I, Torrance J and Howe K

    Bioinformatics (Oxford, England) 2016;32;16;2508-10

  • Using optical mapping data for the improvement of vertebrate genome assemblies.

    Howe K and Wood JM

    GigaScience 2015;4;10

  • The zebrafish reference genome sequence and its relationship to the human genome.

    Howe K, Clark MD, Torroja CF, Torrance J, Berthelot C et al.

    Nature 2013;496;7446;498-503

  • Modernizing reference genome assemblies.

    Church DM, Schneider VA, Graves T, Auger K, Cunningham F et al.

    PLoS biology 2011;9;7;e1001091