Genomes on a Tree (GoaT)

GoaT is a data aggregator and portal for exploring and reporting on the data underlying the eukaryotic tree of life. It indexes publicly available genomic metadata for all eukaryotic species and interpolates missing values through phylogenetic comparison.

GoaT currently holds direct or estimated values for over 70 taxon attributes and over 30 assembly attributes across 1.5 million eukaryotic species. It also holds target priority and sequencing status information for many projects affiliated to the Earth BioGenome Project (EBP) to aid project coordination. Metadata and status attributes in GoaT can be queried through a mature API, a web front end, and a command line interface. The web front end additionally provides summary visualisations for data exploration and reporting.
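As a rough illustration of querying the API, the sketch below only builds a search URL and does not send a request. The base URL, the `search` endpoint, the parameter names (`query`, `result`, `fields`, `size`), the query syntax and the attribute names are all assumptions made for illustration and should be checked against the official GoaT API documentation. The helper `build_search_url` is hypothetical and not part of any GoaT client.

```python
from urllib.parse import urlencode

# Assumed base URL for the public GoaT API; verify against the official docs.
GOAT_API_BASE = "https://goat.genomehubs.org/api/v2"


def build_search_url(query, result="taxon", fields=(), size=10):
    """Build a URL for the (assumed) GoaT `search` endpoint.

    query  -- search expression, e.g. a taxon filter (syntax is an assumption)
    result -- index to search, assumed to be "taxon" or "assembly"
    fields -- attribute names to return (names here are illustrative)
    size   -- maximum number of records to request
    """
    params = {"query": query, "result": result, "size": size}
    if fields:
        # Multiple attributes are assumed to be passed as a comma-separated list.
        params["fields"] = ",".join(fields)
    return f"{GOAT_API_BASE}/search?{urlencode(params)}"


# Hypothetical request: species-level genome sizes and chromosome numbers.
url = build_search_url(
    "tax_tree(Lepidoptera) AND tax_rank(species)",
    fields=["genome_size", "chromosome_number"],
)
print(url)
```

Keeping URL construction separate from the HTTP call makes it easy to inspect a query before sending it with any HTTP client.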


As genomic data transform our understanding of biodiversity, the EBP has set a goal of generating reference-quality genome assemblies for all ~1.9 million described eukaryotic taxa. Meeting this goal requires coordination among the many regional and taxon-focussed projects working under the EBP umbrella. Large-scale sequencing projects need ready access to validated genome-relevant metadata, such as genome sizes and karyotypes, but these data are dispersed across the literature, and directly measured values are lacking for most taxa.

To meet these needs, a team at the Wellcome Sanger Institute’s Tree of Life Programme have developed Genomes on a Tree (GoaT), an Elasticsearch-powered datastore and search index for genome-relevant metadata and sequencing project plans and statuses.

Sanger Institute Contributors

Professor Mark Blaxter

Programme Lead for Tree of Life Programme and Senior Group Leader

Dr Richard Challis

Senior Bioinformatician

Dr Cibele Sotero-Caio

Genomic Data Curator - Tree of Life Genomics

