The Ensembl project aims to automatically annotate genome sequences, integrate these data with other biological information, and make the results freely available to geneticists, molecular biologists, bioinformaticians and the wider research community. Ensembl is jointly headed by Dr Stephen Searle at the Wellcome Trust Sanger Institute and Dr Paul Flicek at the European Bioinformatics Institute (EBI).



Ensembl was established in 1999, towards the end of the Human Genome Project, in recognition that understanding the genetic code of organisms is as important as reading it. Purely manual curation of all genome sequences, however, is impractical, because such work is labour-intensive and time-consuming. To overcome this problem, the Ensembl project team developed new software pipelines that automatically generate evidence-based annotation of genome sequences.

Since its inception, the Ensembl project has expanded from the curation of the human genome to embrace more than 50 vertebrate species. These include many model organisms central to the study of human diseases. Ensembl has participated in many genome consortia, producing annotation used in the initial genomic analyses of newly sequenced organisms.

The project provides an expanding wealth of information for a diverse list of species, including:

  • intron and exon structure for protein-coding and non-coding genes
  • genomic variations and somatic mutations, and their consequences for genes and genotypes in populations and individuals
  • cross-species gene trees and genomic alignments
  • functional genomic data, including regulatory region annotation.

Ensembl website

Generating the annotation is just the start. To deliver the data in the format most useful to researchers, Ensembl offers several means of access, the foremost of which is the Ensembl website. This highly customisable, interactive site provides a track-based genome browser location view and many additional displays that offer highly integrated views of genomic annotation.

Rapid and open data access

Free and unrestricted access to the information held in Ensembl is one of the primary principles of the project, which was founded with a vision to promote rapid research into all areas of human disease.

All Ensembl code is open source.

