Genome Reference Informatics Team | Tree of Life

Genome Reference Informatics Team

Our Research and Approach

The Genome Reference Informatics Team analyses genome assemblies to reveal and correct quality issues and to identify and add variation. The team consists of Senior Bioinformaticians and Senior Computer Biologists, who cover software development, data analysis and genome curation. In addition to commissioning sequencing, the generation of genome data (e.g. optical mapping) and assembly generation, we develop bespoke software such as gEVAL to identify and visualize genome assembly issues. Our curators use this software to resolve those issues through changes or additions to the assemblies, thereby vastly improving assembly accuracy.

We evaluate and improve assemblies as members of the Darwin Tree of Life Project, the Vertebrate Genome Project, the Human Pangenome Project, the Genome Reference Consortium and others. We work closely with our consortium partners and other collaborators to ensure access to the latest data and analyses. Assembly improvements are submitted to INSDC on a regular schedule.


Dr Kerstin Howe
Group Leader

Kerstin is a computational biologist whose primary interest is in the provision of accurate reference genome sequences and structures to support biological, agricultural and clinical science.

Alumni


Dr Ying Yan
Former Senior Bioinformatician at the Sanger Institute

Key Projects, Collaborations, Tools & Data

The Genome Reference Informatics Team is responsible for the reference genome assemblies of human, mouse and zebrafish. We are also involved in the analysis and improvement of assemblies of 16 mouse strains, chicken, individual humans and additional zebrafish strains.


  • Sixteen diverse laboratory mouse reference genomes define strain-specific haplotypes and novel functional loci.

    Lilue J, Doran AG, Fiddes IT, Abrudan M, Armstrong J et al.

    Nature genetics 2018;50;11;1574-1583

  • Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly.

    Schneider VA, Graves-Lindsay T, Howe K, Bouk N, Chen HC et al.

    Genome research 2017;27;5;849-864

  • gEVAL - a web-based browser for evaluating genome assemblies.

    Chow W, Brugger K, Caccamo M, Sealy I, Torrance J and Howe K

    Bioinformatics (Oxford, England) 2016;32;16;2508-10

  • Using optical mapping data for the improvement of vertebrate genome assemblies.

    Howe K and Wood JM

    GigaScience 2015;4;10

  • The zebrafish reference genome sequence and its relationship to the human genome.

    Howe K, Clark MD, Torroja CF, Torrance J, Berthelot C et al.

    Nature 2013;496;7446;498-503

  • Modernizing reference genome assemblies.

    Church DM, Schneider VA, Graves T, Auger K, Cunningham F et al.

    PLoS biology 2011;9;7;e1001091