The Kinetoplastida (Euglenozoa) are unicellular flagellates that include the trypanosomatid parasites, most notably Trypanosoma brucei, T. cruzi and Leishmania spp. These organisms cause substantial mortality and morbidity in humans and their livestock worldwide as the causative agents of African sleeping sickness, Chagas disease and leishmaniasis, respectively. Draft genome sequences are available for several species of both Trypanosoma and Leishmania, many of which are described elsewhere in these pages.
Bodo saltans is a free-living heterotroph found worldwide in freshwater and marine habitats. It is a kinetoplastid but not a trypanosomatid, and it possesses the diagnostic kinetoplastid features: flagella sited within a specialised flagellar pocket, glycolytic processes confined to a dedicated organelle (the 'glycosome'), and the characteristic concentration of mitochondrial DNA at the base of the flagellum (the 'kinetoplast').
The purpose of a B. saltans genome sequence is to provide an 'outgroup' for comparative genomic studies. As it is among the closest bodonid relatives of the trypanosomatids, it will provide a model of the ancestral trypanosomatid to distinguish those derived parts of the parasite genomes (i.e., unique trypanosomatid adaptations) from those which are a legacy of the free-living ancestor. This objective can be resolved into three principal comparative issues:
1. Trypanosomatid disease: understanding how human trypanosomatid parasites acquired their distinct pathological strategies;
2. Evolution of parasitism: understanding how the ancestral trypanosomatid became parasitic in terms of derived innovations (e.g., cell surfaces) and loss of genomic repertoire;
3. Eukaryotic evolution: understanding how typical kinetoplastid features (e.g., glycosomes) evolved and how these might have been modified for parasitism.
Published Genome Data
A pilot study that described ~400kbp of B. saltans genome sequenced from a whole genome fosmid library has been published (Jackson, Quail & Berriman 2008, BMC Genomics, 9:594). A draft sequence for the entire B. saltans genome has now been produced using Illumina HiSeq technology. The data were derived from two libraries of 300bp and 3kbp insert sizes respectively. Sequence assembly was carried out using Velvet and subsequently corrected for misassembly using large-insert reads and custom scripts.
The draft genome sequence contains 39,864,435bp at ~110x coverage, arranged into 2,256 scaffolds. The average scaffold length is 16,804.54bp and the largest scaffold is 190,927bp long. N50 = 31,548bp (n=372).
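For readers who download the scaffolds and want to recompute summary statistics of this kind, the following is a minimal sketch in Python. It is not the pipeline used to produce the figures above. The function names fasta_lengths and n50 are hypothetical, and the sketch assumes that "n" denotes the number of largest scaffolds needed to reach half of the total assembly length, which is the usual convention but is not stated explicitly here.

```python
from typing import Iterable, List, Tuple


def fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of each sequence found in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, n) for a collection of scaffold lengths.

    N50 is the length of the scaffold at which the running total of
    lengths, taken from largest to smallest, first reaches half of the
    assembly. n is the number of scaffolds counted to get there
    (assumed meaning of "n" in the statistics above).
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths supplied")
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise AssertionError("unreachable: running total always reaches half")


if __name__ == "__main__":
    # Toy records, not B. saltans data.
    toy = [">s1", "ACGTACGTAC", ">s2", "ACGTACGT", ">s3", "ACGTAC", ">s4", "ACGT", ">s5", "AC"]
    sizes = fasta_lengths(toy)
    print(sizes, n50(sizes), sum(sizes) / len(sizes))
```

To apply this to a real file, pass an open file handle for the scaffold FASTA to fasta_lengths in place of the toy list. Values may differ slightly from those quoted, depending on how gap characters (N) within scaffolds are counted.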
The draft genome annotation contains 18,963 coding sequences. When analysed using CEGMA, the annotation covers 78% of the conserved sequence set.
In addition, we have produced a draft transcriptome using RNAseq technology; 85.5% of the reads mapped to the genome assembly. The RNAseq reads were assembled using Velvet, producing 8,901 putative transcripts and transcript fragments.
All data can be downloaded via our ftp site: ftp://ftp.sanger.ac.uk/pub/pathogens/Bodo/saltans
and searched using the GeneDB BLAST server: http://www.genedb.org/blast/submitblast/GeneDB_Bsaltans
Data Use Statement
This sequencing centre plans to publish the completed and annotated sequences in a peer-reviewed journal as soon as possible. Permission of the principal investigator should be obtained before publishing analyses of the sequence, open reading frames or genes on a chromosome or genome scale. See our data sharing policy.
Please address all sequencing enquiries to: email@example.com