Trypanoplasma (Cryptobia) borreli is a haematozoic endoparasite transmitted by leeches (Piscicola sp.). As a species of Cryptobia, T. borreli causes cryptobiosis in cyprinid fish, characterized by severe anemia and splenomegaly, and affects both wild fish and commercial fisheries. T. borreli is often found co-infecting fish with Trypanosoma spp.; both Cryptobia spp. (Parabodonidae) and Trypanosoma spp. (Trypanosomatidae) are members of the Kinetoplastida and represent independent origins of blood parasitism.

As part of our efforts to understand the evolution of trypanosomatid genomes, we have produced a draft genome sequence for T. borreli K-100 (ATCC 50432) using the Illumina HiSeq platform. Libraries with 400bp and 3kb inserts were created from whole genomic DNA isolated from an axenic T. borreli culture. The primary purpose of the genome sequence is to provide an outgroup for the comparison of trypanosomatids with Bodo saltans, a free-living kinetoplastid more closely related to trypanosomatids than T. borreli is. When comparing the free-living B. saltans with parasitic trypanosomatids, we need to distinguish losses of conserved kinetoplastid genes in trypanosomatids (genes present in T. borreli) from Bodo-specific gene gains (genes absent in T. borreli). Secondarily, we will compare the T. borreli and Trypanosoma brucei genomes to explore any similarities associated with the convergent evolution of blood parasitism.
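The outgroup reasoning above can be sketched as a small decision rule. This is only an illustration of the logic, assuming that presence or absence of each gene family has already been determined for each genome; the function name and labels are hypothetical and are not part of any pipeline described here.

```python
def classify_bodo_gene(in_trypanosomatids: bool, in_t_borreli: bool) -> str:
    """Classify a gene family that is present in B. saltans.

    Hypothetical helper: the outgroup (T. borreli) decides whether a gene
    missing from trypanosomatids was lost there or gained in Bodo.
    """
    if in_trypanosomatids:
        # Present in both B. saltans and trypanosomatids: nothing to resolve.
        return "shared with trypanosomatids"
    if in_t_borreli:
        # Present in the outgroup, so it is a conserved kinetoplastid gene
        # that was lost in trypanosomatids.
        return "lost in trypanosomatids"
    # Absent from the outgroup as well: treated as a Bodo-specific gain.
    return "Bodo-specific gain"


if __name__ == "__main__":
    for tryp, outgroup in [(True, True), (False, True), (False, False)]:
        print(tryp, outgroup, classify_bodo_gene(tryp, outgroup))
```

Because the T. borreli sequence is a draft, an apparent absence in the outgroup may also reflect missing sequence, so calls in the last category would need checking against the assembly.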

The T. borreli genome sequence was assembled with Velvet from 100bp paired-end Illumina reads from the 400bp-insert library. This assembly was then corrected for misassemblies and subsequently expanded with reads from the 3kb-insert library, using custom scripts and IMAGE.

The final assembly contains 25,816,007bp in 23,265 contigs (N50 = 12,100bp). The average contig length is 1,109.6bp and the largest contig is 133,333bp.
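For readers who want to recompute summary figures of this kind from a list of contig lengths, a minimal sketch follows. It uses the common definition of N50 (the length of the contig at which the cumulative length, counting from the largest contig downwards, first reaches half of the total); the function name is hypothetical and this is not the script used to produce the numbers above.

```python
def assembly_stats(lengths):
    """Return basic statistics for a list of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)  # largest contig first
    total = sum(ordered)
    running = 0
    n50 = ordered[-1]
    for length in ordered:
        running += length
        if 2 * running >= total:  # cumulative length has reached half the total
            n50 = length
            break
    return {
        "total_bp": total,
        "contigs": len(ordered),
        "mean_bp": total / len(ordered),
        "n50_bp": n50,
        "largest_bp": ordered[0],
    }


if __name__ == "__main__":
    # Toy lengths for illustration only, not taken from the assembly.
    print(assembly_stats([2, 2, 2, 3, 3, 4, 8, 8]))
```

The reported average is consistent with the other figures: 25,816,007bp divided by 23,265 contigs is approximately 1,109.65bp.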

This sequencing centre plans on publishing the completed and annotated sequences in a peer-reviewed journal as soon as possible. Permission of the principal investigator should be obtained before publishing analyses of the sequence/open reading frames/genes on a chromosome or genome scale. See our data sharing policy.

