The Genome Reference Informatics Team analyses genome assemblies to reveal and correct quality issues and to identify and add variation.

The team consists of Senior Bioinformaticians and Senior Computer Biologists who cover software development, data analysis and genome curation. Besides commissioning sequencing, the generation of further genome data (e.g. optical mapping) and assembly generation, we develop bespoke software such as gEVAL to identify and visualize genome assembly issues. Our curators use this software to resolve those issues through changes or additions to the assemblies, thereby vastly improving assembly accuracy.

We evaluate and improve assemblies as members of the Darwin Tree of Life Project, the Vertebrate Genomes Project, the Human Pangenome Project, the Genome Reference Consortium and others. We work closely with our consortium partners and other collaborators to ensure access to the latest data and analyses. Assembly improvements are submitted to INSDC on a regular schedule.

We work with the following groups:


Genome Reference Consortium

The GRC aims to ensure that the human, mouse and zebrafish reference assemblies are biologically relevant by closing gaps, fixing errors and representing complex variation.






Vertebrate Genomes Project

The Vertebrate Genomes Project (VGP), a project of the G10K Consortium, aims to generate near error-free reference genome assemblies of all 66,000 extant vertebrate species.


Human Pangenome Project

Diverse Human References Driving Genomic Discoveries for Everyone


Darwin Tree of Life Project

Reading the genomes of all life: a new platform for understanding our biodiversity



