Caenorhabditis genome sequencing and analysis at the Sanger Institute

The Wellcome Trust Sanger Institute's work on mapping and sequencing the genome of Caenorhabditis elegans was one of the institute's early milestone projects. The informatics aspects of this project were led by Dr. Richard Durbin. Current C. elegans work at the Institute is focused on sequencing methodology development and is led by Dr. Matthew Berriman.

Caenorhabditis elegans (informally known as 'the worm') is a small, soil-dwelling nematode that is widely used as a model system for studies of metazoan biology. C. elegans' popularity results from the confluence of several factors: its developmental program is understood at the single-cell level; it is highly amenable to genetic manipulation, including RNAi intervention; and it has a complete, high-quality reference genome sequence.

Genomic data for C. elegans, C. briggsae and a host of other nematodes can be found at WormBase.

Caenorhabditis genome sequencing

C. elegans was the first animal to have its genome completely sequenced. The WTSI's contribution to this effort was significant. Indeed, the project was one of the flagship activities in the early life of the WTSI, and as such is one of the defining legacies of the institute itself.

The mapping and sequencing of the reference genome was a joint project between the Wellcome Trust Sanger Institute (WTSI) and The Genome Institute at Washington University in St. Louis (WUGI). The essentially complete sequence was formally published in December 1998, and data were made regularly and freely available in advance of publication. The last remaining gap was closed in 2002, although the genome sequence continues to be scrutinized and improved as new evidence is published.

In addition to C. elegans, the WTSI and the WUGI also collaborated on the genome sequencing of the related nematode Caenorhabditis briggsae. A whole-genome shotgun assembly was made available in July 2002 and formally published in November 2003.

WormBase

WormBase is a collaborative project to capture, curate and distribute information about C. elegans biology. It began life as ACeDB, a database software package developed jointly by Richard Durbin at the Sanger Institute and Jean Thierry-Mieg. ACeDB was used extensively during the course of the C. elegans sequencing project to coordinate the sequencing effort and to integrate the worm sequence with the genetic and physical maps.

WormBase was started in 2000 as a way to make data in ACeDB easily accessible via a web browser. From the outset, the project was heavily committed to the curation and interpretation of the C. elegans literature, and it rapidly moved from a genome-centric perspective to one that more evenly balances the worm genome with other aspects of its biology.

The original WormBase consortium consisted of four groups: one at the Sanger Institute, led by Richard Durbin; one at Cold Spring Harbor Laboratory, led by Lincoln Stein; one at Washington University in St. Louis, led by John Spieth; and one at the California Institute of Technology, led by Paul Sternberg (lead principal investigator for the project as a whole).

WormBase and parasite genomics

In 2010, Richard Durbin was appointed as joint head of human genetics at the Sanger Institute. In response to this, and consistent with a general shift in research interests over the last several years, Dr. Durbin took the decision to step down from the WormBase consortium. He retains a strong connection to the project in an advisory capacity.

The WTSI continues to participate in WormBase via new consortium member Matthew Berriman. Dr. Berriman's research programme into parasite genomics and neglected tropical diseases uses C. elegans as a model for the development of effective methodologies for the genome sequencing of parasitic worms. His involvement in WormBase aligns with one of the key strategic goals of the project: to provide a resource that is useful and accessible to scientists working on non-Caenorhabditis nematodes.

