Protozoan parasites of the species Trypanosoma brucei are the etiological agents of human sleeping sickness and of Nagana in animals. Infections are limited to patches of sub-Saharan Africa where insect vectors of the genus Glossina are endemic. The most recent estimates indicate that between 50,000 and 70,000 human cases currently exist, with 17,000 new cases each year (WHO Factsheet, 2006).
T. brucei possesses a two-unit genome, comprising a nuclear genome and a mitochondrial (kinetoplast) genome, with a total estimated size of 35 Mb per haploid genome. The nuclear genome is divided into three size classes of chromosomes, as resolved by pulsed-field gel electrophoresis: 11 pairs of megabase chromosomes (0.9-5.7 Mb), intermediate chromosomes (300-900 kb) and minichromosomes (50-100 kb). The T. brucei genome contains a ~0.5 Mb segmental duplication affecting chromosomes 4 and 8, which is responsible for some 75 gene duplicates unique to this species.
Published Genome Data
Megabase Chromosome Sequencing Project
In July 2005, the megabase complement of the genome of T. brucei strain 927 was published along with the genomes of Leishmania major and Trypanosoma cruzi as well as a paper describing the comparative gene content and architecture of the 3 parasites.
Beowulf Genomics funded a pilot scheme to sequence chromosome Ia of T. brucei as a collaboration with Professor Keith Gull (University of Oxford) and Dr. Sara Melville (University of Cambridge) within the framework of the World Health Organisation T. brucei Genome Project.
In April 2000, the Wellcome Trust, via its Beowulf Genomics Initiative, awarded further funding to the T. brucei genome project at the Sanger Institute to sequence chromosomes IX (3.5 Mb), X (4.4 Mb) and XI (5.2 Mb) in collaboration with Dr. Sara Melville and the WHO/TDR-supported T. brucei Genome Network.
At around the same time, NIAID funded TIGR to sequence chromosomes II to VIII in collaboration with Drs. John Donelson (University of Iowa), Sara Melville (University of Cambridge) and Elisabetta Ullu (Yale University).
Chromosomes I to VIII are complete and the Sanger Institute is actively continuing to finish chromosomes IX, X and XI in order to remove the few remaining gaps. Annotation and curation of the whole T. brucei genome sequence is ongoing.
Intermediate and Mini Chromosome Sequencing Project
T. brucei devotes 10-20% of its nuclear genome to chromosomes < 1 Mb in size. Pulsed-field gel electrophoresis of whole chromosome-sized DNAs from the genome of T. brucei strain 927 resolves 2 "intermediate-sized" bands of ~350kb, representing 2, or possibly 4, intermediate chromosomes. Unfortunately, the lack of specific markers prevents further karyotype analysis (pers. comm. S. Melville, University of Cambridge). The intermediate chromosomes of T. brucei strain 427 are known to encode bloodstream expression sites, 3 of which have been sequenced. These are: ES Bn-2 (TB13J3, AL670322); ES VO2 (TBN19B2, AL671256); ES221 (TBH25N7, AL671259).
In addition, the 927 genome contains ~100 minichromosomes of 30 to ~150 kb in size, which expand the parasite's repertoire of telomeric variant surface glycoprotein (VSG) genes. Both classes of chromosomes share a canonical structure based around a large central core of 177-bp repeats.
The intermediate chromosomes and minichromosomes will be sequenced as a collaboration between Dr Sara Melville (University of Cambridge) and Dr Bill Wickstead/Professor Keith Gull (University of Oxford). The intermediate chromosomes will be separated by pulsed-field gel electrophoresis and sequenced by a whole-chromosome shotgun approach.
The minichromosomes will be sequenced by two complementary approaches. Unique minichromosome sequence will be TAR cloned using the 177-bp repeat sequence in the S. cerevisiae vector. Where possible, resulting clones will be screened to reduce redundancy. In addition, the whole minichromosome population will be sequenced to a low coverage (1-2x), using a whole-chromosome shotgun approach.
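As a rough guide to what 1-2x coverage implies, the idealised Lander-Waterman (Poisson) model estimates the fraction of bases sampled by at least one read as 1 - e^(-c) for a coverage depth c. The sketch below is illustrative only and is not part of the project's own analysis: it assumes reads are sampled uniformly at random, which highly repetitive minichromosome sequence is unlikely to satisfy, and the function name is hypothetical.

```python
import math


def expected_fraction_covered(coverage_depth):
    """Expected fraction of bases hit by at least one read (Poisson model)."""
    if coverage_depth < 0:
        raise ValueError("coverage depth must be non-negative")
    # Probability that a given base receives zero reads is exp(-depth).
    return 1.0 - math.exp(-coverage_depth)


for depth in (1.0, 2.0):
    fraction = expected_fraction_covered(depth)
    print(f"{depth:.0f}x coverage: about {fraction:.0%} of bases sampled")
```

Under these assumptions, 1x and 2x coverage would sample roughly 63% and 86% of the sequence respectively, which is why a low-coverage shotgun is treated as a survey rather than a finished sequence.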
Telomere Sequencing Projects
The Pathogen Genomics group in collaboration with Drs Gloria Rudenko (University of Oxford) and Ed Louis (University of Nottingham) has initiated the sequencing of VSG expression site containing telomeres of two T. brucei strains.
T. brucei strain 927
Due to the under-representation of telomeric sequences in the BAC libraries, a separate cloning project targeting the 927 telomeric regions was initiated to complement the megabase genome sequencing project. Using transformation-associated recombination (TAR) cloning with internal, chromosome-specific sequences as targets, the telomeres of chromosomes 4 and 6 have so far been cloned. They are currently being subcloned into smaller-insert libraries. Clones from the small-insert library from the right-hand end of chromosome 4 are currently being shotgun sequenced.
T. brucei strain 427
The Pathogen Genomics group is also involved in the sequencing of VSG expression sites cloned from strain 427. The T. brucei 427 telomere-specific library of 182 bloodstream expression sites was constructed using the 427 dominant expression site promoter as recombinational target. These clones were further sorted into a minimal set of 17 groups, based on sequence determination of regions from the promoter and ESAG6.
Reads are available for download and searching and annotations are available via GeneDB.
T. brucei GSS Sequence Data
In addition to sequencing the megabase chromosomes of the T. brucei genome, the Wellcome Sanger Institute and TIGR have carried out extensive genome survey sequencing.
TIGR has provided 47,000 single-pass reads of randomly selected clones. These derive both from the ends of P1 and BAC genomic clones and from genomic DNA clones selected from a whole-genome sheared DNA library (average insert size 2-3 kb) constructed by TIGR from T. brucei TREU927 GUTat 10.1. These reads have proved an immensely useful resource to the research community for gene discovery. The end-sequences of the P1 and BAC clones have also been used in physical mapping.
The Sanger Institute has in turn submitted > 43,000 GSS sequences from the 2-kb sheared genomic DNA clones constructed by TIGR. These end sequences have since been clustered with ESTs available through public databases and some preliminary automated analysis has been carried out. The sequences can be obtained from our ftp site.
As an aid to the community, all GSS sequences were subjected to a BLASTX analysis of Swissprot/TrEMBL databases in February 2002. The summary data are shown below:
Applying a probability cut-off of 1e-10 to the BLAST output, 8196 (21%) had a hit. Detailed subsets of these results may be found on our ftp site.
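The cut-off step described above could be reproduced along the following lines. This is a minimal sketch, not the pipeline actually used: it assumes the BLASTX results are in the standard 12-column tab-separated tabular format (query ID in column 1, E-value in column 11), treats a value equal to the cut-off as a hit, and uses invented sequence and accession identifiers purely for illustration.

```python
import csv

EVALUE_CUTOFF = 1e-10  # cut-off quoted in the text


def queries_with_hits(tabular_lines, cutoff=EVALUE_CUTOFF):
    """Return the set of query IDs with at least one hit at or below the cut-off."""
    hits = set()
    for row in csv.reader(tabular_lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query_id, evalue = row[0], float(row[10])
        if evalue <= cutoff:
            hits.add(query_id)
    return hits


# Hypothetical example records (IDs and scores are made up).
example = [
    "gss_0001\tP00001\t45.0\t120\t60\t2\t1\t360\t10\t129\t1e-30\t130",
    "gss_0001\tP00002\t30.0\t100\t70\t0\t1\t300\t5\t104\t1e-5\t50",
    "gss_0002\tP00003\t28.0\t90\t65\t1\t10\t280\t1\t90\t0.002\t40",
    "gss_0003\tP00004\t60.0\t150\t60\t0\t1\t450\t1\t150\t1e-10\t200",
]
hit_ids = queries_with_hits(example)
all_ids = {line.split("\t")[0] for line in example}
print(f"{len(hit_ids)} of {len(all_ids)} queries had a hit")
```

Counting distinct query IDs rather than hit rows matters here, because a single GSS read can match several database entries and would otherwise be counted more than once.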
GSS and EST clustering
All T. brucei genome survey sequences plus approximately 5,500 EST/mRNA sequences were clustered using the sequence assembly program phrap. The ESTs were retrieved from EMBL in February 2001, using "Trypanosoma brucei" as the organism search term. The dataset therefore includes EST data generated from different Trypanosoma brucei subspecies and strains. The dataset totalled 96,474 sequences (~45.87 Mb). 12,251 contigs were generated, while 8,242 sequences could not be placed in a contig (singletons). The GSS/EST clusters have an estimated coverage of >95% of the T. brucei genome.
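The clustering figures quoted above can be turned into a few derived summary values. The sketch below is only back-of-the-envelope arithmetic on the published totals; it assumes that every sequence is either a singleton or a member of exactly one contig, and the function and key names are hypothetical.

```python
TOTAL_SEQUENCES = 96_474   # GSS plus EST/mRNA sequences clustered
TOTAL_BASES = 45.87e6      # ~45.87 Mb of input sequence
CONTIGS = 12_251
SINGLETONS = 8_242


def clustering_summary(total_sequences, total_bases, contigs, singletons):
    """Derive simple summary statistics from the reported clustering totals."""
    in_contigs = total_sequences - singletons
    return {
        "sequences_in_contigs": in_contigs,
        "mean_sequences_per_contig": in_contigs / contigs,
        "singleton_fraction": singletons / total_sequences,
        "mean_sequence_length_bp": total_bases / total_sequences,
    }


summary = clustering_summary(TOTAL_SEQUENCES, TOTAL_BASES, CONTIGS, SINGLETONS)
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

On these assumptions, about 88,000 sequences fall into contigs (roughly seven per contig on average) and fewer than one in ten remain as singletons.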
The Trypanosoma brucei gambiense Partial Genome Project
The Wellcome Sanger Institute Pathogen Genomics group has partially shotgun sequenced the nuclear genome of the human-infective Trypanosoma brucei gambiense. While the non-human-infective T. b. brucei is the preferred model organism for studying trypanosome biology, comparison with a human-infective organism is required to study mechanisms of disease. Human trypanosomiasis is caused by 2 other subspecies of T. brucei - T. b. rhodesiense and T. b. gambiense. T. b. rhodesiense is very similar to T. b. brucei, and so the genome sequence of T. b. brucei will provide information on both subspecies. However, T. b. gambiense stands apart, with profoundly different biological and genetic characteristics. The T. b. gambiense genome will serve as a useful comparative genomics resource to complement the genomes of Trypanosoma vivax, Trypanosoma congolense and Trypanosoma brucei. The T. b. gambiense partial shotgun project is being carried out in collaboration with Wendy Gibson (University of Bristol, UK).
The strain chosen for sequencing is Dal 972 clone 1 (MHOM/CI/86/DAL972) isolated from a patient in Cote d'Ivoire in 1986. This strain has been little passaged and has an extremely chronic phenotype in experimental rodents. It has been transmitted through both Glossina morsitans morsitans and G. palpalis gambiensis in the lab, and grows well as procyclics in vitro. By biochemical characterisation, Dal 972 is a typical group 1 T. b. gambiense.
The genome sequence of Trypanosoma brucei gambiense, causative agent of chronic human African trypanosomiasis.
PLoS neglected tropical diseases 2010;4;4;e658
The genome of the African trypanosome Trypanosoma brucei.
Science (New York, N.Y.) 2005;309;5733;416-22
The DNA sequence of chromosome I of an African trypanosome: gene content, chromosome organisation, recombination and polymorphism.
Nucleic acids research 2003;31;16;4864-73
Data Use Statement
This sequencing centre plans on publishing the completed and annotated sequences in a peer-reviewed journal as soon as possible. Permission of the principal investigator should be obtained before publishing analyses of the sequence/open reading frames/genes on a chromosome or genome scale. See our data sharing policy.
Please address all sequencing enquiries to: firstname.lastname@example.org