Background
We are interested in studying the diversity of eukaryotic parasites and their complex interactions with their hosts. In particular, we wish to uncover the genomic basis for differences in the biology of parasites causing malaria and Neglected Tropical Diseases. Our approach starts with the establishment of a reference genome, followed by comparative sequencing of related strains or species to find candidate genes (or other sequences) relating to species-specific differences, such as disease tropisms.
Research
Helminths
Despite the global medical and economic importance of parasitic helminths (worms), research on them has remained relatively untouched by genomics. Worm infections account for morbidity equivalent to more than 100 million disability-adjusted life years from more than one billion infections globally. With this in mind, we have developed the Sanger Helminth Genomes Initiative. Initially we are using de novo sequencing projects to produce reference genomes for a cross-phyla list that includes hookworms, whipworms, threadworms, schistosomes, a tapeworm and the filarial parasite responsible for river blindness.
Protozoa
Amongst the protozoan parasites, we focus on two areas:
- The Apicomplexa, which include malaria parasites.
- The Kinetoplastida, which include Trypanosoma and Leishmania parasites.
In each area, we have built comparative genomic studies around high-quality reference genomes. Comparisons at the species level allow us to identify candidate genes that account for major biological differences. Comparisons deeper within species allow us to study the genomic consequences of natural and experimental variation.
Expression studies
Gene prediction is a major challenge in genome analysis. We are using next-generation sequencing technologies to sequence transcriptomes directly and to identify coding regions by aligning the resulting reads to the genome sequence.
Resources
Tool development
Software that supports our annotation and analysis is under constant development. In particular, we work with the Pathogen Informatics team to develop:
- Artemis and ACT. Portable and intuitive sequence viewing and browsing tools. A new Chado database version has recently been launched.
- GeneDB. A web-accessible database that provides a window on our annotation as it is produced.
- Drug Target Portfolio. We work with colleagues around the world to provide a single point of entry for identifying likely drug targets.
Selected Publications
-
Genome-wide discovery and verification of novel structured RNAs in Plasmodium falciparum.
Genome research 2008;18;2;281-92
PUBMED: 18096748; PMC: 2203626; DOI: 10.1101/gr.6836108
-
Schistosoma mansoni genome: closing in on a final gene set.
Experimental parasitology 2007;117;3;225-8
PUBMED: 17643433; DOI: 10.1016/j.exppara.2007.06.005
-
Comparative genomic analysis of three Leishmania species that cause diverse human disease.
Nature genetics 2007;39;7;839-47
PUBMED: 17572675; PMC: 2592530; DOI: 10.1038/ng2053
-
Genome variation and evolution of the malaria parasite Plasmodium falciparum.
Nature genetics 2007;39;1;120-5
PUBMED: 17159978; PMC: 2663918; DOI: 10.1038/ng1931
-
Schistosoma mansoni (Platyhelminthes, Trematoda) nuclear receptors: sixteen new members and a novel subfamily.
Gene 2006;366;2;303-15
PUBMED: 16406405; DOI: 10.1016/j.gene.2005.09.013
-
Plasmodium falciparum variant surface antigen expression patterns during malaria.
PLoS pathogens 2005;1;3;e26
PUBMED: 16304608; PMC: 1287908; DOI: 10.1371/journal.ppat.0010026
-
ACT: the Artemis Comparison Tool.
Bioinformatics (Oxford, England) 2005;21;16;3422-3
PUBMED: 15976072; DOI: 10.1093/bioinformatics/bti553
-
Comparative genomics of trypanosomatid parasitic protozoa.
Science (New York, N.Y.) 2005;309;5733;404-9
PUBMED: 16020724; DOI: 10.1126/science.1112181
-
The genome of the African trypanosome Trypanosoma brucei.
Science (New York, N.Y.) 2005;309;5733;416-22
PUBMED: 16020726; DOI: 10.1126/science.1112642
-
The genome of the kinetoplastid parasite, Leishmania major.
Science (New York, N.Y.) 2005;309;5733;436-42
PUBMED: 16020728; PMC: 1470643; DOI: 10.1126/science.1112680
-
Genome of the host-cell transforming parasite Theileria annulata compared with T. parva.
Science (New York, N.Y.) 2005;309;5731;131-3
PUBMED: 15994557; DOI: 10.1126/science.1110418
-
The genome of the protist parasite Entamoeba histolytica.
Nature 2005;433;7028;865-8
PUBMED: 15729342; DOI: 10.1038/nature03291
-
A comprehensive survey of the Plasmodium life cycle by genomic, transcriptomic, and proteomic analyses.
Science (New York, N.Y.) 2005;307;5706;82-6
PUBMED: 15637271; DOI: 10.1126/science.1103717
-
Viewing and annotating sequence data with Artemis.
Briefings in bioinformatics 2003;4;2;124-32
-
Genome sequence of the human malaria parasite Plasmodium falciparum.
Nature 2002;419;6906;498-511
PUBMED: 12368864; DOI: 10.1038/nature01097
-
The architecture of variant surface glycoprotein gene expression sites in Trypanosoma brucei.
Molecular and biochemical parasitology 2002;122;2;131-40
PUBMED: 12106867
Team
Team members
- James Cotton (jc17@sanger.ac.uk), Senior Staff Scientist
- Tim Downing, Postdoctoral Fellow
- Bernardo Foth, Senior Staff Scientist
- Thomas Otto, Senior Staff Scientist
- Adam Reid, Postdoctoral Fellow
- Jason Tsai (jit@sanger.ac.uk), Postdoctoral Fellow
- Magdalena Zarowiecki, Postdoctoral Fellow
James Cotton
jc17@sanger.ac.uk Senior Staff Scientist
I studied biology at Oxford, and then did a PhD on gene family evolution with Rod Page at the University of Glasgow, followed by post-docs at the Natural History Museum in London and at the National University of Ireland, Maynooth, working on various topics in phylogenetics and molecular evolution. I was subsequently an RCUK Fellow at Queen Mary, University of London for three years before joining the parasite genomics group in 2010.
Research
At the Sanger Institute, I'm involved in a range of projects across a diverse array of parasitic species, including nematodes, schistosomes and kinetoplastids. I play a leading role in a number of de novo genome sequencing projects, with a particular focus on projects that have a strong comparative or population genomics component.
References
-
Whole genome sequencing of multiple Leishmania donovani clinical isolates provides insights into population structure and mechanisms of drug resistance.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, United Kingdom.
Visceral leishmaniasis is a potentially fatal disease endemic to large parts of Asia and Africa, primarily caused by the protozoan parasite Leishmania donovani. Here, we report a high-quality reference genome sequence for a strain of L. donovani from Nepal, and use this sequence to study variation in a set of 16 related clinical lines, isolated from visceral leishmaniasis patients from the same region, which also differ in their response to in vitro drug susceptibility. We show that whole-genome sequence data reveals genetic structure within these lines not shown by multilocus typing, and suggests that drug resistance has emerged multiple times in this closely related set of lines. Sequence comparisons with other Leishmania species and analysis of single-nucleotide diversity within our sample showed evidence of selection acting in a range of surface- and transport-related genes, including genes associated with drug resistance. Against a background of relative genetic homogeneity, we found extensive variation in chromosome copy number between our lines. Other forms of structural variation were significantly associated with drug resistance, notably including gene dosage and the copy number of an experimentally verified circular episome present in all lines and described here for the first time. This study provides a basis for more powerful molecular profiling of visceral leishmaniasis, providing additional power to track the drug resistance and epidemiology of an important human pathogen.
Funded by: Wellcome Trust: 076355, 085775/Z/08/Z
Genome research 2011;21;12;2143-56
PUBMED: 22038251; PMC: 3227103; DOI: 10.1101/gr.123430.111
-
Genomic insights into the origin of parasitism in the emerging plant pathogen Bursaphelenchus xylophilus.
Forestry and Forest Products Research Institute, Tsukuba, Japan. kikuchit@affrc.go.jp
Bursaphelenchus xylophilus is the nematode responsible for a devastating epidemic of pine wilt disease in Asia and Europe, and represents a recent, independent origin of plant parasitism in nematodes, ecologically and taxonomically distinct from other nematodes for which genomic data is available. As well as being an important pathogen, the B. xylophilus genome thus provides a unique opportunity to study the evolution and mechanism of plant parasitism. Here, we present a high-quality draft genome sequence from an inbred line of B. xylophilus, and use this to investigate the biological basis of its complex ecology which combines fungal feeding, plant parasitic and insect-associated stages. We focus particularly on putative parasitism genes as well as those linked to other key biological processes and demonstrate that B. xylophilus is well endowed with RNA interference effectors, peptidergic neurotransmitters (including the first description of ins genes in a parasite) stress response and developmental genes and has a contracted set of chemosensory receptors. B. xylophilus has the largest number of digestive proteases known for any nematode and displays expanded families of lysosome pathway genes, ABC transporters and cytochrome P450 pathway genes. This expansion in digestive and detoxification proteins may reflect the unusual diversity in foods it exploits and environments it encounters during its life cycle. In addition, B. xylophilus possesses a unique complement of plant cell wall modifying proteins acquired by horizontal gene transfer, underscoring the impact of this process on the evolution of plant parasitism by nematodes. Together with the lack of proteins homologous to effectors from other plant parasitic nematodes, this confirms the distinctive molecular basis of plant parasitism in the Bursaphelenchus lineage. The genome sequence of B. xylophilus adds to the diversity of genomic data for nematodes, and will be an important resource in understanding the biology of this unusual parasite.
Funded by: Wellcome Trust: WT 085775/Z/08/Z
PLoS pathogens 2011;7;9;e1002219
PUBMED: 21909270; PMC: 3164644; DOI: 10.1371/journal.ppat.1002219
-
Cetaceans on a molecular fast track to ultrasonic hearing.
School of Life Sciences, East China Normal University, Shanghai, China.
The early radiation of cetaceans coincides with the origin of their defining ecological and sensory differences [1, 2]. Toothed whales (Odontoceti) evolved echolocation for hunting 36-34 million years ago, whereas baleen whales (Mysticeti) evolved filter feeding and do not echolocate [2]. Echolocation in toothed whales demands exceptional high-frequency hearing [3], and both echolocation and ultrasonic hearing have also evolved independently in bats [4, 5]. The motor protein Prestin that drives the electromotility of the outer hair cells (OHCs) is likely to be especially important in ultrasonic hearing, because it is the vibratory response of OHC to incoming sound waves that confers the enhanced sensitivity and selectivity of the mammalian auditory system [6, 7]. Prestin underwent adaptive change early in mammal evolution [8] and also shows sequence convergence between bats and dolphins [9, 10], as well as within bats [11]. Focusing on whales, we show for the first time that the extent of protein evolution in Prestin can be linked directly to the evolution of high-frequency hearing. Moreover, we find that independent cases of sequence convergence in mammals have involved numerous identical amino acid site replacements. Our findings shed new light on the importance of Prestin in the evolution of mammalian hearing.
Current biology : CB 2010;20;20;1834-9
PUBMED: 20933423; DOI: 10.1016/j.cub.2010.09.008
-
Eukaryotic genes of archaebacterial origin are more important than the more numerous eubacterial genes, irrespective of function.
Department of Biology, National University of Ireland, Maynooth, County Kildare, Ireland.
The traditional tree of life shows eukaryotes as a distinct lineage of living things, but many studies have suggested that the first eukaryotic cells were chimeric, descended from both Eubacteria (through the mitochondrion) and Archaebacteria. Eukaryote nuclei thus contain genes of both eubacterial and archaebacterial origins, and these genes have different functions within eukaryotic cells. Here we report that archaebacterium-derived genes are significantly more likely to be essential to yeast viability, are more highly expressed, and are significantly more highly connected and more central in the yeast protein interaction network. These findings hold irrespective of whether the genes have an informational or operational function, so that many features of eukaryotic genes with prokaryotic homologs can be explained by their origin, rather than their function. Taken together, our results show that genes of archaebacterial origin are in some senses more important to yeast metabolism than genes of eubacterial origin. This importance reflects these genes' origin as the ancestral nuclear component of the eukaryotic genome.
Proceedings of the National Academy of Sciences of the United States of America 2010;107;40;17252-5
PUBMED: 20852068; PMC: 2951413; DOI: 10.1073/pnas.1000265107
-
Convergent sequence evolution between echolocating bats and dolphins.
School of Life Sciences, East China Normal University, Shanghai 200062, China.
Cases of convergent evolution - where different lineages have evolved similar traits independently - are common and have proven central to our understanding of selection. Yet convincing examples of adaptive convergence at the sequence level are exceptionally rare [1]. The motor protein Prestin is expressed in mammalian outer hair cells (OHCs) and is thought to confer high frequency sensitivity and selectivity in the mammalian auditory system [2]. We previously reported that the Prestin gene has undergone sequence convergence among unrelated lineages of echolocating bat [3]. Here we report that this gene has also undergone convergent amino acid substitutions in echolocating dolphins, which group with echolocating bats in a phylogenetic tree of Prestin. Furthermore, we find evidence that these changes were driven by natural selection.
Current biology : CB 2010;20;2;R53-4
PUBMED: 20129036; DOI: 10.1016/j.cub.2009.11.058
-
The evolution of color vision in nocturnal mammals.
Institute of Zoology and Graduate University, Chinese Academy of Sciences, Beijing 100080, China.
Nonfunctional visual genes are usually associated with species that inhabit poor light environments (aquatic/subterranean/nocturnal), and these genes are believed to have lost function through relaxed selection acting on the visual system. Indeed, the visual system is so adaptive that the reconstruction of intact ancestral opsin genes has been used to reject nocturnality in ancestral primates. To test these assertions, we examined the functionality of the short and medium- to long-wavelength opsin genes in a group of mammals that are supremely adapted to a nocturnal niche: the bats. We sequenced the visual cone opsin genes in 33 species of bat with diverse sensory ecologies and reconstructed their evolutionary history spanning 65 million years. We found that, whereas the long-wave opsin gene was conserved in all species, the short-wave opsin gene has undergone dramatic divergence among lineages. The occurrence of gene defects in the short-wave opsin gene leading to loss of function was found to directly coincide with the origin of high-duty-cycle echolocation and changes in roosting ecology in some lineages. Our findings indicate that both opsin genes have been under purifying selection in the majority bats despite a long history of nocturnality. However, when spectacular losses do occur, these result from an evolutionary sensory modality tradeoff, most likely driven by subtle shifts in ecological specialization rather than a nocturnal lifestyle. Our results suggest that UV color vision plays a considerably more important role in nocturnal mammalian sensory ecology than previously appreciated and highlight the caveat of inferring light environments from visual opsins and vice versa.
Proceedings of the National Academy of Sciences of the United States of America 2009;106;22;8980-5
PUBMED: 19470491; PMC: 2690009; DOI: 10.1073/pnas.0813201106
-
Supertrees join the mainstream of phylogenetics.
School of Biological and Chemical Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK. j.a.cotton@qmul.ac.uk
Supertree methods are fairly widely used to build comprehensive phylogenies for particular groups, but concerns remain over the adequacy of existing approaches. Steel and Rodrigo recently introduced a statistical model of incongruence between trees, allowing maximum-likelihood supertree inference. This approach to supertree construction will enable hypothesis-testing and model-choice methods that are now routine in sequence phylogenetics to be applied in this setting, and might form an important part of future phylogenetic inference from genomic data.
Trends in ecology & evolution 2009;24;1;1-3
PUBMED: 19022523; DOI: 10.1016/j.tree.2008.08.006
-
The hearing gene Prestin reunites echolocating bats.
School of Life Science, East China Normal University, Shanghai 200062, China.
The remarkable high-frequency sensitivity and selectivity of the mammalian auditory system has been attributed to the evolution of mechanical amplification, in which sound waves are amplified by outer hair cells in the cochlea. This process is driven by the recently discovered protein prestin, encoded by the gene Prestin. Echolocating bats use ultrasound for orientation and hunting and possess the highest frequency hearing of all mammals. To test for the involvement of Prestin in the evolution of bat echolocation, we sequenced the coding region in echolocating and nonecholocating species. The resulting putative gene tree showed strong support for a monophyletic assemblage of echolocating species, conflicting with the species phylogeny in which echolocators are paraphyletic. We reject the possibilities that this conflict arises from either gene duplication and loss or relaxed selection in nonecholocating fruit bats. Instead, we hypothesize that the putative gene tree reflects convergence at stretches of functional importance. Convergence is supported by the recovery of the species tree from alignments of hydrophobic transmembrane domains, and the putative gene tree from the intra- and extracellular domains. We also found evidence that Prestin has undergone Darwinian selection associated with the evolution of specialized constant-frequency echolocation, which is characterized by sharp auditory tuning. Our study of a hearing gene in bats strongly implicates Prestin in the evolution of echolocation, and suggests independent evolution of high-frequency hearing in bats. These results highlight the potential problems of extracting phylogenetic signals from functional genes that may be prone to convergence.
Proceedings of the National Academy of Sciences of the United States of America 2008;105;37;13959-64
PUBMED: 18776049; PMC: 2544561; DOI: 10.1073/pnas.0802097105
-
The prokaryotic tree of life: past, present... and future?
Department of Biology, National University of Ireland Maynooth, Maynooth, County Kildare, Ireland. james.o.mcinerney@nuim.ie
No accepted phylogenetic scheme for prokaryotes emerged until the late 1970s. Prior to that, it was assumed that there was a phylogenetic tree uniting all prokaryotes, but no suitable data were available for its construction. For 20 years, through the 1980s and 1990s, rRNA phylogenies were the gold standard. However, beginning in the last decade, findings from genomic data have challenged this new consensus. Gene trees can conflict greatly, and strains of the same species can differ enormously in genome content. Horizontal gene transfer is now known to be a significant influence on genome evolution. The next decade is likely to resolve whether or not we retain the centuries-old metaphor of the tree for all of life.
Trends in ecology & evolution 2008;23;5;276-81
PUBMED: 18367290; DOI: 10.1016/j.tree.2008.01.008
-
The tree of genomes: an empirical comparison of genome-phylogeny reconstruction methods.
Bioinformatics laboratory, Department of Biology, National University of Ireland Maynooth, Maynooth, Co, Kildare, Ireland. angela.mccann@nuim.ie
Background: In the past decade or more, the emphasis for reconstructing species phylogenies has moved from the analysis of a single gene to the analysis of multiple genes and even completed genomes. The simplest method of scaling up is to use familiar analysis methods on a larger scale and this is the most popular approach. However, duplications and losses of genes along with horizontal gene transfer (HGT) can lead to a situation where there is only an indirect relationship between gene and genome phylogenies. In this study we examine five widely-used approaches and their variants to see if indeed they are more-or-less saying the same thing. In particular, we focus on Conditioned Reconstruction as it is a method that is designed to work well even if HGT is present.
Results: We confirm a previous suggestion that this method has a systematic bias. We show that no two methods produce the same results and most current methods of inferring genome phylogenies produce results that are significantly different to other methods.
Conclusion: We conclude that genome phylogenies need to be interpreted differently, depending on the method used to construct them.
BMC evolutionary biology 2008;8;312
PUBMED: 19014489; PMC: 2592249; DOI: 10.1186/1471-2148-8-312
Tim Downing
Postdoctoral Fellow
PhD in Avian Evolutionary Genetics (TCD)
MSc in Bioinformatics (DCU)
BA(Mod) in Human Genetics (TCD)
Research
Leishmania population and evolutionary epidemiology
References
-
Genome-wide SNP and microsatellite variation illuminate population-level epidemiology in the Leishmania donovani species complex.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK. Tim.Downing@sanger.ac.uk
The species of the Leishmania donovani species complex cause visceral leishmaniasis, a debilitating infectious disease transmitted by sandflies. Understanding molecular changes associated with population structure in these parasites can help unravel their epidemiology and spread in humans. In this study, we used a panel of standard microsatellite loci and genome-wide SNPs to investigate population-level diversity in L. donovani strains recently isolated from a small geographic area spanning India, Bihar and Nepal, and compared their variation to that found in diverse strains of the L. donovani complex isolates from Europe, Africa and Asia. Microsatellites and SNPs could clearly resolve the phylogenetic relationships of the strains between continents, and microsatellite phylogenies indicated that certain older Indian strains were closely related to African strains. In the context of the anti-malaria spraying campaigns in the 1960s, this was consistent with a pattern of episodic population size contractions and clonal expansions in these parasites that was supported by population history simulations. In sharp contrast to the low resolution provided by microsatellites, SNPs retained a much more fine-scale resolution of population-level variability to the extent that they identified four different lineages from the same region one of which was more closely related to African and European strains than to Indian or Nepalese ones. Joining results of in vitro testing the antimonial drug sensitivity with the phylogenetic signals from the SNP data highlighted protein-level mutations revealing a distinct drug-resistant group of Nepalese and Indian L. donovani. This study demonstrates the power of genomic data for exploring parasite population structure. Furthermore, markers defining different genetic groups have been discovered that could potentially be applied to investigate drug resistance in clinical Leishmania strains.
Funded by: Wellcome Trust: WT 085775/Z/08/Z
Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases 2012;12;1;149-59
PUBMED: 22119748; PMC: 3315668; DOI: 10.1016/j.meegid.2011.11.005
-
The battle of the SNPs.
Nature reviews. Microbiology 2012;10;1;6
PUBMED: 22138960; DOI: 10.1038/nrmicro2716
-
Whole genome sequencing of multiple Leishmania donovani clinical isolates provides insights into population structure and mechanisms of drug resistance.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, United Kingdom.
Visceral leishmaniasis is a potentially fatal disease endemic to large parts of Asia and Africa, primarily caused by the protozoan parasite Leishmania donovani. Here, we report a high-quality reference genome sequence for a strain of L. donovani from Nepal, and use this sequence to study variation in a set of 16 related clinical lines, isolated from visceral leishmaniasis patients from the same region, which also differ in their response to in vitro drug susceptibility. We show that whole-genome sequence data reveals genetic structure within these lines not shown by multilocus typing, and suggests that drug resistance has emerged multiple times in this closely related set of lines. Sequence comparisons with other Leishmania species and analysis of single-nucleotide diversity within our sample showed evidence of selection acting in a range of surface- and transport-related genes, including genes associated with drug resistance. Against a background of relative genetic homogeneity, we found extensive variation in chromosome copy number between our lines. Other forms of structural variation were significantly associated with drug resistance, notably including gene dosage and the copy number of an experimentally verified circular episome present in all lines and described here for the first time. This study provides a basis for more powerful molecular profiling of visceral leishmaniasis, providing additional power to track the drug resistance and epidemiology of an important human pathogen.
Funded by: Wellcome Trust: 076355, 085775/Z/08/Z
Genome research 2011;21;12;2143-56
PUBMED: 22038251; PMC: 3227103; DOI: 10.1101/gr.123430.111
-
The differential evolutionary dynamics of avian cytokine and TLR gene classes.
Smurfit Institute of Genetics, Dublin, Ireland.
The potential for investigating immune gene diversity has been greatly enhanced by recent advances in sequencing power. In this study, variation at two categories of avian immune genes with differing functional roles, pathogen detection and mediation of immune mechanisms, was examined using high-throughput sequencing. TLRs identify and alert the immune system by detecting molecular motifs that are conserved among pathogenic microorganisms, whereas cytokines act as mediators of resulting inflammation and immunity. Nine genes from each class were resequenced in a panel of domestic chickens and wild jungle fowl (JF). Tests on population-wide genetic variation between the gene classes indicated that allele frequency spectra at each group were distinctive. TLRs showed evidence pointing toward directional selection, whereas cytokines had signals more suggestive of frequency-dependent selection. This difference persisted between the distributions considering only coding sites, suggesting functional relevance. The unique patterns of variation at each gene class may be constrained by their different functional roles in the immune response. TLRs identify a relatively limited number of exogeneous pathogenic-related patterns and would be required to adapt quickly in response to evolving novel microbes encountered in new environmental niches. In contrast, cytokines interact with many molecules in mediating the power of immune mechanisms, and accordingly respond to the selective stimuli of many infectious diseases. Analyses also indicated that a general pattern of high variability has been enhanced by widespread genetic exchange between chicken and red JF, and possibly between chicken and gray JF at TLR1LA and TLR2A.
Journal of immunology (Baltimore, Md. : 1950) 2010;184;12;6993-7000
PUBMED: 20483729; DOI: 10.4049/jimmunol.0903092
-
Variation in chicken populations may affect the enzymatic activity of lysozyme.
Smurfit Institute of Genetics, Trinity College, University of Dublin, Ireland.
The chicken lysozyme gene encodes a hydrolase that has a key role in defence, especially in ovo. This gene was resequenced in global chicken populations [red, grey, Ceylon and green jungle fowl (JF)] and related bird species. Networks, summary statistics and tests of neutrality indicate that although there is extensive variation at the gene, little is present at coding sites, with the exception of one non-synonymous site. This segregating site and a further fixed non-synonymous change between red JF and domestic chicken populations are spatially close to the catalytic sites of the enzyme and so might affect its activity.
Animal genetics 2010;41;2;213-7
PUBMED: 19845599; DOI: 10.1111/j.1365-2052.2009.01974.x
-
The avian Toll-Like receptor pathway--subtle differences amidst general conformity.
School of Biochemistry and Immunology, Trinity College, University of Dublin, Dublin 2, Ireland. cormicp@tcd.ie
The Toll-Like receptor (TLR) pathway plays a core role in innate immunity and is maintained with remarkable consistency across all vertebrate species. Amidst this background of overall conservation, subtle differences in the components that make up this pathway may have important implications for species-specific defense against key pathogens. Here we employ a homology-based comparative method to characterize the TLR pathway in the recently sequenced chicken and zebra finch genomes, which represent two distantly related bird species. The key features of the TLR pathway are conserved in birds and mammals, although some clear differences exist. The TLR receptors show a pattern of gene duplication and gene loss in both avian species when compared to mammals. In particular, we observe avian specific duplication of both TLR1 and TLR2 as well and a recent duplication of the TLR7 gene in the zebra finch lineage. Both positive selection and gene conversion shape the evolution of the avian specific TLR2 genes. In addition, there are notable differences in the zebra finch repertoire of antimicrobial peptides (AMPs) when compared to those of the chicken. Bioinformatic analysis reveals no evidence of cathelicidins in the zebra finch genome but does identify a cluster of 12 novel defensins which map to the avian beta-defensin locus on chromosome 3. These findings contribute to the characterization of the differing immune response systems that have evolved in individual vertebrate species in response to their microbiological environment.
Developmental and comparative immunology 2009;33;9;967-73
PUBMED: 19539094; DOI: 10.1016/j.dci.2009.04.001
-
Contrasting evolution of diversity at two disease-associated chicken genes.
Smurfit Institute of Genetics, Trinity College, University of Dublin, Dublin, Ireland.
There have been significant evolutionary pressures on the chicken during both its speciation and its subsequent domestication by man. Infectious diseases are expected to have exerted strong selective pressures during these processes. Consequently, it is likely that genes associated with disease susceptibility or resistance have been subject to some form of selection. Two genes involved in the immune response (interferon-gamma and interleukin 1-beta) were selected for sequencing in diverse chicken populations from Pakistan, Sri Lanka, Bangladesh, Kenya, Senegal, Burkina Faso and Botswana, as well as six outgroup samples (grey, green, red and Ceylon jungle fowl and grey francolin and bamboo partridge). Haplotype frequencies, tests of neutrality, summary statistics, coalescent simulations and phylogenetic analysis by maximum likelihood were used to determine the population genetic characteristics of the genes. Networks indicate that these chicken genes are most closely related to the red jungle fowl. Interferon-gamma had lower diversity and considerable coding sequence conservation, which is consistent with its function as a key inflammatory cytokine of the immune response. In contrast, the pleiotropic cytokine interleukin 1-beta had higher diversity and showed signals of balancing selection moderated by recombination, yielding high numbers of diverse alleles, possibly reflecting broader functionality and potential roles in more diseases in different environments.
Immunogenetics 2009;61;4;303-14
PUBMED: 19247647; DOI: 10.1007/s00251-009-0359-x
-
Evidence of balanced diversity at the chicken interleukin 4 receptor alpha chain locus.
Smurfit Institute of Genetics, Trinity College, University of Dublin, Dublin, Ireland. downint@tcd.ie
Background: The comparative analysis of genome sequences emerging for several avian species with the fully sequenced chicken genome enables the genome-wide investigation of selective processes in functionally important chicken genes. In particular, because of pathogenic challenges it is expected that genes involved in the chicken immune system are subject to particularly strong adaptive pressure. Signatures of selection detected by inter-species comparison may then be investigated at the population level in global chicken populations to highlight potentially relevant functional polymorphisms.
Results: Comparative evolutionary analysis of chicken (Gallus gallus) and zebra finch (Taeniopygia guttata) genes identified interleukin 4 receptor alpha-chain (IL-4Ralpha), a key cytokine receptor as a candidate with a significant excess of substitutions at nonsynonymous sites, suggestive of adaptive evolution. Resequencing and detailed population genetic analysis of this gene in diverse village chickens from Asia and Africa, commercial broilers, and in outgroup species red jungle fowl (JF), grey JF, Ceylon JF, green JF, grey francolin and bamboo partridge, suggested elevated and balanced diversity across all populations at this gene, acting to preserve different high-frequency alleles at two nonsynonymous sites.
Conclusion: Haplotype networks indicate that red JF is the primary contributor of diversity at chicken IL-4Ralpha: the signature of variation observed here may be due to the effects of domestication, admixture and introgression, which produce high diversity. However, this gene is a key cytokine-binding receptor in the immune system, so balancing selection related to the host response to pathogens cannot be excluded.
BMC evolutionary biology 2009;9;136
PUBMED: 19527513; PMC: 3224688; DOI: 10.1186/1471-2148-9-136
-
Evidence of the adaptive evolution of immune genes in chicken.
Smurfit Institute of Genetics, Trinity College, University of Dublin, Ireland.
The basis for understanding the characteristics of gene functional categories in chicken has been enhanced by the ongoing sequencing of the zebra finch genome, the second bird species to be extensively sequenced. This sequence provides an avian context for examining how variation in chicken has evolved since its divergence from its common ancestor with zebra finch as well as a calibrating point for studying intraspecific diversity within chicken. Immune genes have been subject to many selective processes during their evolutionary history: this gene class was investigated here in a set of orthologous chicken and zebra finch genes with functions assigned from the human ortholog. Tests demonstrated that nonsynonymous sites at immune genes were highly conserved both in chicken and on the avian lineage. McDonald-Kreitman tests provided evidence of adaptive evolution and a higher rate of selection on fixation of nonsynonymous substitutions at immune genes compared to that at non-immune genes. Further analyses showed that GC content was much higher in chicken than in zebra finch genes, and was significantly elevated in both species' immune genes. Pathogen challenges are likely to have driven the selective forces that have shaped variation at chicken immune genes, and continue to restrict diversity in this functional class.
BMC research notes 2009;2;254
PUBMED: 20003477; PMC: 2804575; DOI: 10.1186/1756-0500-2-254
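The McDonald-Kreitman test mentioned in the abstract above compares the ratio of nonsynonymous to synonymous changes among fixed differences between species with the same ratio among polymorphisms within a species. The following is a minimal sketch of that calculation, not the authors' pipeline: the function name `mk_test` and the example counts are hypothetical, and the significance test is a plain two-sided Fisher's exact test written with the standard library only.

```python
from math import comb

def mk_test(dn, ds, pn, ps):
    """McDonald-Kreitman summary for one gene (illustrative sketch).

    dn, ds: fixed nonsynonymous / synonymous differences between species
    pn, ps: nonsynonymous / synonymous polymorphisms within a species
    Returns (neutrality_index, alpha, fisher_p). Assumes ps, dn, ds > 0.
    """
    # Neutrality index: NI < 1 suggests an excess of fixed nonsynonymous changes.
    ni = (pn / ps) / (dn / ds)
    # alpha: estimated proportion of nonsynonymous fixations driven by selection.
    alpha = 1.0 - ni

    # Two-sided Fisher's exact test on the 2x2 table [[dn, ds], [pn, ps]].
    row1, row2 = dn + ds, pn + ps
    col1 = dn + pn
    total = row1 + row2

    def prob(a):
        # Hypergeometric probability of seeing `a` in the top-left cell.
        return comb(row1, a) * comb(row2, col1 - a) / comb(total, col1)

    p_obs = prob(dn)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p = sum(q for q in (prob(a) for a in range(lo, hi + 1)) if q <= p_obs * (1 + 1e-9))
    return ni, alpha, min(p, 1.0)

# Hypothetical counts chosen for illustration; they are not data from the study.
ni, alpha, p = mk_test(dn=7, ds=17, pn=2, ps=42)
print(f"NI={ni:.3f} alpha={alpha:.3f} p={p:.4f}")
```

A neutrality index below 1 with a small p-value is the pattern usually read as evidence of adaptive evolution, although the published analysis may have used different corrections and filters.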
Bernardo Foth
- Senior Staff Scientist
I studied biology at the University of Erlangen in Germany, followed by PhD work on the relict plastid of malaria parasites in Melbourne with Geoff McFadden. I then carried out postdoctoral research in the labs of Dominique Soldati (on myosins and Toxoplasma gondii cell biology) and Zbynek Bozdech (on quantitative transcript-protein relationships in malaria parasites). I joined the Parasite Genomics group in December 2010.
Research
I am currently involved in a number of functional genomics projects, ranging from investigating the genetic basis of drug resistance in African trypanosomes to differential gene expression in the parasitic nematode Trichuris muris. I am also leading the group's renewed efforts to produce the de novo genome sequence of the avian malaria parasite Plasmodium gallinaceum.
References
-
Quantitative time-course profiling of parasite and host cell proteins in the human malaria parasite Plasmodium falciparum.
School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551.
Studies of the Plasmodium falciparum transcriptome have shown that the tightly controlled progression of the parasite through the intra-erythrocytic developmental cycle (IDC) is accompanied by a continuous gene expression cascade in which most expressed genes exhibit a single transcriptional peak. Because the biochemical and cellular functions of most genes are mediated by the encoded proteins, understanding the relationship between mRNA and protein levels is crucial for inferring biological activity from transcriptional gene expression data. Although studies on other organisms show that <50% of protein abundance variation may be attributable to corresponding mRNA levels, the situation in Plasmodium is further complicated by the dynamic nature of the cyclic gene expression cascade. In this study, we simultaneously determined mRNA and protein abundance profiles for P. falciparum parasites during the IDC at 2-hour resolution based on oligonucleotide microarrays and two-dimensional differential gel electrophoresis protein gels. We find that most proteins are represented by more than one isoform, presumably because of post-translational modifications. Like transcripts, most proteins exhibit cyclic abundance profiles with one peak during the IDC, whereas the presence of functionally related proteins is highly correlated. In contrast, the abundance of most parasite proteins peaks significantly later (median 11 h) than the corresponding transcripts and often decreases slowly in the second half of the IDC. Computational modeling indicates that the considerable and varied incongruence between transcript and protein abundance may largely be caused by the dynamics of translation and protein degradation. Furthermore, we present cyclic abundance profiles also for parasite-associated human proteins and confirm the presence of five human proteins with a potential role in antioxidant defense within the parasites. 
Together, our data provide fundamental insights into transcript-protein relationships in P. falciparum that are important for the correct interpretation of transcriptional data and that may facilitate the improvement and development of malaria diagnostics and drug therapy.
Molecular & cellular proteomics : MCP 2011;10;8;M110.006411
PUBMED: 21558492; PMC: 3149090; DOI: 10.1074/mcp.M110.006411
-
Mitochondrial translation in absence of local tRNA aminoacylation and methionyl tRNA Met formylation in Apicomplexa.
Department of Microbiology and Molecular Medicine, CMU, University of Geneva, 1 rue Michel-Servet, 1211 Geneva 4, Switzerland.
Apicomplexans possess three translationally active compartments: the cytosol, a single tubular mitochondrion, and a vestigial plastid organelle called apicoplast. Mitochondrion and apicoplast are of bacterial evolutionary origin and therefore depend on a bacterial-like translation machinery. The minimal mitochondrial genome contains only three ORFs, and in Toxoplasma gondii the absence of mitochondrial tRNA genes is compensated for by the import of cytosolic eukaryotic tRNAs. Although all compartments require a complete set of charged tRNAs, the apicomplexan nuclear genomes do not hold sufficient aminoacyl-tRNA synthetase (aaRSs) genes to be targeted individually to each compartment. This study reveals that aaRSs are either cytosolic, apicoplastic or shared between the two compartments by dual targeting but are absent from the mitochondrion. Consequently, tRNAs are very likely imported in their aminoacylated form. Furthermore, the unexpected absence of tRNA(Met) formyltransferase and peptide deformylase implies that the requirement for a specialized formylmethionyl-tRNA(Met) for translation initiation is bypassed in the mitochondrion of Apicomplexa.
Funded by: Howard Hughes Medical Institute; Wellcome Trust
Molecular microbiology 2010;76;3;706-18
PUBMED: 20374492; DOI: 10.1111/j.1365-2958.2010.07128.x
-
Evolution of malaria parasite plastid targeting sequences.
School of Botany, University of Melbourne, Melbourne, Victoria 3010, Australia.
The transfer of genes from an endosymbiont to its host typically requires acquisition of targeting signals by the gene product to ensure its return to the endosymbiont for function. Many hundreds of plastid-derived genes must have acquired transit peptides for successful relocation to the nucleus. Here, we explore potential evolutionary origins of plastid transit peptides in the malaria parasite Plasmodium falciparum. We show that exons of the P. falciparum genome could serve as transit peptides after exon shuffling. We further demonstrate that numerous randomized peptides and even whimsical sequences based on English words can also function as transit peptides in vivo. Thus, facile acquisition of transit peptides from existing sequence likely expedited endosymbiont integration through intracellular gene transfer.
Funded by: Howard Hughes Medical Institute
Proceedings of the National Academy of Sciences of the United States of America 2008;105;12;4781-5
PUBMED: 18353992; PMC: 2290815; DOI: 10.1073/pnas.0707827105
-
Quantitative protein expression profiling reveals extensive post-transcriptional regulation and post-translational modifications in schizont-stage malaria parasites.
School of Biological Sciences, Nanyang Technological University, Nanyang Drive, 637551 Singapore. BFoth@ntu.edu.sg
Background: Malaria is one of the most important infectious diseases and is caused by parasitic protozoa of the genus Plasmodium. Previously, quantitative characterization of the P. falciparum transcriptome demonstrated that the strictly controlled progression of these parasites through their intra-erythrocytic developmental cycle is accompanied by a continuous cascade of gene expression. Although such analyses have proven immensely useful, the correlations between abundance of transcripts and their cognate proteins remain poorly characterized.
Results: Here, we present a quantitative time-course analysis of relative protein abundance for schizont-stage parasites (34 to 46 hours after invasion) based on two-dimensional differential gel electrophoresis of protein samples labeled with fluorescent dyes. For this purpose we analyzed parasite samples taken at 4-hour intervals from a tightly synchronized culture and established more than 500 individual protein abundance profiles with high temporal resolution and quantitative reproducibility. Approximately half of all profiles exhibit a significant change in abundance and 12% display an expression peak during the observed 12-hour time interval. Intriguingly, identification of 54 protein spots by mass spectrometry revealed that 58% of the corresponding proteins--including actin-I, enolase, eukaryotic initiation factor (eIF)4A, eIF5A, and several heat shock proteins--are represented by more than one isoform, presumably caused by post-translational modifications, with the various isoforms of a given protein frequently showing different expression patterns. Furthermore, comparisons with transcriptome data generated from the same parasite samples reveal evidence of significant post-transcriptional gene expression regulation.
Conclusions: Together, our data indicate that both post-transcriptional and post-translational events are widespread and of presumably great biological significance during the intra-erythrocytic development of P. falciparum.
Genome biology 2008;9;12;R177
PUBMED: 19091060; PMC: 2646281; DOI: 10.1186/gb-2008-9-12-r177
-
Dual targeting of antioxidant and metabolic enzymes to the mitochondrion and the apicoplast of Toxoplasma gondii.
Department of Microbiology and Molecular Medicine, Centre Medical Universitaire, University of Geneva, Geneva, Switzerland.
Toxoplasma gondii is an aerobic protozoan parasite that possesses mitochondrial antioxidant enzymes to safely dispose of oxygen radicals generated by cellular respiration and metabolism. As with most Apicomplexans, it also harbors a chloroplast-like organelle, the apicoplast, which hosts various biosynthetic pathways and requires antioxidant protection. Most apicoplast-resident proteins are encoded in the nuclear genome and are targeted to the organelle via a bipartite N-terminal targeting sequence. We show here that two antioxidant enzymes, a superoxide dismutase (TgSOD2) and a thioredoxin-dependent peroxidase (TgTPX1/2), and an aconitase are dually targeted to both the apicoplast and the mitochondrion of T. gondii. In the case of TgSOD2, our results indicate that a single gene product is bimodally targeted due to an inconspicuous variation within the putative signal peptide of the organellar protein, which significantly alters its subcellular localization. Dual organellar targeting of proteins might occur frequently in Apicomplexans to serve important biological functions such as antioxidant protection and carbon metabolism.
Funded by: Wellcome Trust
PLoS pathogens 2007;3;8;e115
PUBMED: 17784785; PMC: 1959373; DOI: 10.1371/journal.ppat.0030115
-
New insights into myosin evolution and classification.
Department of Microbiology and Molecular Medicine, Centre Médical Universitaire, University of Geneva, 1 Rue Michel-Servet, 1211 Geneva, Switzerland. bernardo.foth@medecine.unige.ch
Myosins are eukaryotic actin-dependent molecular motors important for a broad range of functions like muscle contraction, vision, hearing, cell motility, and host cell invasion of apicomplexan parasites. Myosin heavy chains consist of distinct head, neck, and tail domains and have previously been categorized into 18 different classes based on phylogenetic analysis of their conserved heads. Here we describe a comprehensive phylogenetic examination of many previously unclassified myosins, with particular emphasis on sequences from apicomplexan and other chromalveolate protists including the model organism Toxoplasma, the malaria parasite Plasmodium, and the ciliate Tetrahymena. Using different phylogenetic inference methods and taking protein domain architectures, specific amino acid polymorphisms, and organismal distribution into account, we demonstrate a hitherto unrecognized common origin for ciliate and apicomplexan class XIV myosins. Our data also suggest common origins for some apicomplexan myosins and class VI, for classes II and XVIII, for classes XII and XV, and for some microsporidian myosins and class V, thereby reconciling evolutionary history and myosin structure in several cases and corroborating the common coevolution of myosin head, neck, and tail domains. Six novel myosin classes are established to accommodate sequences from chordate metazoans (class XIX), insects (class XX), kinetoplastids (class XXI), and apicomplexans and diatom algae (classes XXII, XXIII, and XXIV). These myosin (sub)classes include sequences with protein domains (FYVE, WW, UBA, ATS1-like, and WD40) previously unknown to be associated with myosin motors. Regarding the apicomplexan "myosome," we significantly update class XIV classification, propose a systematic naming convention, and discuss possible functions in these parasites.
Funded by: Wellcome Trust
Proceedings of the National Academy of Sciences of the United States of America 2006;103;10;3681-6
PUBMED: 16505385; PMC: 1533776; DOI: 10.1073/pnas.0506307103
-
The malaria parasite Plasmodium falciparum has only one pyruvate dehydrogenase complex, which is located in the apicoplast.
Plant Cell Biology Research Centre, School of Botany, University of Melbourne, Parkville, VIC 3010, Australia.
The relict plastid (apicoplast) of apicomplexan parasites synthesizes fatty acids and is a promising drug target. In plant plastids, a pyruvate dehydrogenase complex (PDH) converts pyruvate into acetyl-CoA, the major fatty acid precursor, whereas a second, distinct PDH fuels the tricarboxylic acid cycle in the mitochondria. In contrast, the presence of genes encoding PDH and related enzyme complexes in the genomes of five Plasmodium species and of Toxoplasma gondii indicates that these parasites contain only one single PDH. PDH complexes are comprised of four subunits (E1alpha, E1beta, E2, E3), and we confirmed four genes encoding a complete PDH in Plasmodium falciparum through sequencing of cDNA clones. In apicomplexan parasites, many nuclear-encoded proteins are targeted to the apicoplast courtesy of two-part N-terminal leader sequences, and the presence of such N-terminal sequences on all four PDH subunits as well as phylogenetic analyses strongly suggest that the P. falciparum PDH is located in the apicoplast. Fusion of the two-part leader sequences from the E1alpha and E2 genes to green fluorescent protein experimentally confirmed apicoplast targeting. Western blot analysis provided evidence for the expression of the E1alpha and E1beta PDH subunits in blood-stage malaria parasites. The recombinantly expressed catalytic domain of the PDH subunit E2 showed high enzymatic activity in vitro indicating that pyruvate is converted to acetyl-CoA in the apicoplast, possibly for use in fatty acid biosynthesis.
Molecular microbiology 2005;55;1;39-53
PUBMED: 15612915; DOI: 10.1111/j.1365-2958.2004.04407.x
-
Tropical infectious diseases: metabolic maps and functions of the Plasmodium falciparum apicoplast.
Institut Pasteur, Biology of Host-Parasite Interactions, 25 Rue du Docteur Roux, 75724, Paris, Cedex 15, France.
Nature reviews. Microbiology 2004;2;3;203-16
PUBMED: 15083156; DOI: 10.1038/nrmicro843
-
Dissecting apicoplast targeting in the malaria parasite Plasmodium falciparum.
Plant Cell Biology Research Centre, School of Botany, University of Melbourne, Parkville, VIC 3010, Australia.
Transit peptides mediate protein targeting into plastids and are only poorly understood. We extracted amino acid features from transit peptides that target proteins to the relict plastid (apicoplast) of malaria parasites. Based on these amino acid characteristics, we identified 466 putative apicoplast proteins in the Plasmodium falciparum genome. Altering the specific charge characteristics in a model transit peptide by site-directed mutagenesis severely disrupted organellar targeting in vivo. Similarly, putative Hsp70 (DnaK) binding sites present in the transit peptide proved to be important for correct targeting.
Science (New York, N.Y.) 2003;299;5607;705-8
PUBMED: 12560551; DOI: 10.1126/science.1078599
-
Regulated degradation of an endoplasmic reticulum membrane protein in a tubular lysosome in Leishmania mexicana.
Department of Biochemistry and Molecular Biology, The University of Melbourne, Victoria 3010, Australia.
The cell surface of the human parasite Leishmania mexicana is coated with glycosylphosphatidylinositol (GPI)-anchored macromolecules and free GPI glycolipids. We have investigated the intracellular trafficking of green fluorescent protein- and hemagglutinin-tagged forms of dolichol-phosphate-mannose synthase (DPMS), a key enzyme in GPI biosynthesis in L. mexicana promastigotes. These functionally active chimeras are found in the same subcompartment of the endoplasmic reticulum (ER) as endogenous DPMS but are degraded as logarithmically growing promastigotes reach stationary phase, coincident with the down-regulation of endogenous DPMS activity and GPI biosynthesis in these cells. We provide evidence that these chimeras are constitutively transported to and degraded in a novel multivesicular tubule (MVT) lysosome. This organelle is a terminal lysosome, which is labeled with the endocytic marker FM 4-64, contains lysosomal cysteine and serine proteases and is disrupted by lysomorphotropic agents. Electron microscopy and subcellular fractionation studies suggest that the DPMS chimeras are transported from the ER to the lumen of the MVT via the Golgi apparatus and a population of 200-nm multivesicular bodies. In contrast, soluble ER proteins are not detectably transported to the MVT lysosome in either log or stationary phase promastigotes. Finally, the increased degradation of the DPMS chimeras in stationary phase promastigotes coincides with an increase in the lytic capacity of the MVT lysosome and changes in the morphology of this organelle. We conclude that lysosomal degradation of DPMS may be important in regulating the cellular levels of this enzyme and the stage-dependent biosynthesis of the major surface glycolipids of these parasites.
Molecular biology of the cell 2001;12;8;2364-77
Thomas Otto
- Senior Staff Scientist
I studied informatics with bioinformatics as a minor in Lübeck, Germany. After a short project at Florida State University (analyzing functional magnetic resonance imaging data), I started to work at the Fundação Oswaldo Cruz in Rio de Janeiro, Brazil. My role was to provide bioinformatics support to the group and generate algorithmic solutions to biological problems. In 2008, I finished my PhD, presenting alternative ways to improve the assembly of the Brazilian tuberculosis genome.
Research
In 2008 I joined Matt Berriman’s group. My main role is to provide bioinformatics support to our team, to other groups at Sanger and to labs within the European EviMalaR network of malaria labs. My projects mostly involve developing algorithms to analyze next-generation sequencing data related to malaria.
References
-
Optimal enzymes for amplifying sequencing libraries.
Nature methods 2012;9;1;10-1
PUBMED: 22205512; DOI: 10.1038/nmeth.1814
-
A scalable pipeline for highly effective genetic modification of a malaria parasite.
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.
In malaria parasites, the systematic experimental validation of drug and vaccine targets by reverse genetics is constrained by the inefficiency of homologous recombination and by the difficulty of manipulating adenine and thymine (A+T)-rich DNA of most Plasmodium species in Escherichia coli. We overcame these roadblocks by creating a high-integrity library of Plasmodium berghei genomic DNA (>77% A+T content) in a bacteriophage N15-based vector that can be modified efficiently using the lambda Red method of recombineering. We built a pipeline for generating P. berghei genetic modification vectors at genome scale in serial liquid cultures on 96-well plates. Vectors have long homology arms, which increase recombination frequency up to tenfold over conventional designs. The feasibility of efficient genetic modification at scale will stimulate collaborative, genome-wide knockout and tagging programs for P. berghei.
Funded by: Medical Research Council: G0501670; Wellcome Trust: WT089085/Z/09/Z
Nature methods 2011;8;12;1078-82
PUBMED: 22020067; DOI: 10.1038/nmeth.1742
-
Genome sequence of Mycobacterium bovis BCG Moreau, the Brazilian vaccine strain against tuberculosis.
Laboratório de Genômica Funcional e Bioinformática, Pavilhão Leonidas Deane sala 104, Instituto Oswaldo Cruz, Fiocruz Av., Brasil 4365, Manguinhos, 21040-900 Rio de Janeiro, Brazil.
Mycobacterium bovis bacillus Calmette-Guérin (BCG) is the only vaccine available against tuberculosis, and the strains used worldwide represent a family of daughter strains with distinct genotypic characteristics. Here we report the complete genome sequence of M. bovis BCG Moreau, the strain in continuous use in Brazil for vaccine production since the 1920s.
Journal of bacteriology 2011;193;19;5600-1
PUBMED: 21914899; PMC: 3187452; DOI: 10.1128/JB.05827-11
-
Real-time sequencing.
Nature reviews. Microbiology 2011;9;9;633
PUBMED: 21836624; DOI: 10.1038/nrmicro2638
-
RATT: Rapid Annotation Transfer Tool.
Parasite Genomics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK. tdo@sanger.ac.uk
Second-generation sequencing technologies have made large-scale sequencing projects commonplace. However, making use of these datasets often requires gene function to be ascribed genome wide. Although tool development has kept pace with the changes in sequence production, for tasks such as mapping, de novo assembly or visualization, genome annotation remains a challenge. We have developed a method to rapidly provide accurate annotation for new genomes using previously annotated genomes as a reference. The method, implemented in a tool called RATT (Rapid Annotation Transfer Tool), transfers annotations from a high-quality reference to a new genome on the basis of conserved synteny. We demonstrate that a Mycobacterium tuberculosis genome or a single 2.5 Mb chromosome from a malaria parasite can be annotated in less than five minutes with only modest computational resources. RATT is available at http://ratt.sourceforge.net.
Funded by: Wellcome Trust: WT 085775/Z/08/Z
Nucleic acids research 2011;39;9;e57
PUBMED: 21306991; PMC: 3089447; DOI: 10.1093/nar/gkq1268
-
Two nonrecombining sympatric forms of the human malaria parasite Plasmodium ovale occur globally.
Health Protection Agency Malaria Reference Laboratory, Immunology Unit, London School of Hygiene and Tropical Medicine, London, United Kingdom. colin.sutherland@lshtm.ac.uk
Background: Malaria in humans is caused by apicomplexan parasites belonging to 5 species of the genus Plasmodium. Infections with Plasmodium ovale are widely distributed but rarely investigated, and the resulting burden of disease is not known. Dimorphism in defined genes has led to P. ovale parasites being divided into classic and variant types. We hypothesized that these dimorphs represent distinct parasite species.
Methods: Multilocus sequence analysis of 6 genetic characters was carried out among 55 isolates from 12 African and 3 Asia-Pacific countries.
Results: Each genetic character displayed complete dimorphism and segregated perfectly between the 2 types. Both types were identified in samples from Ghana, Nigeria, São Tomé, Sierra Leone, and Uganda and have been described previously in Myanmar. Splitting of the 2 lineages is estimated to have occurred between 1.0 and 3.5 million years ago in hominid hosts.
Conclusions: We propose that P. ovale comprises 2 nonrecombining species that are sympatric in Africa and Asia. We speculate on possible scenarios that could have led to this speciation. Furthermore, the relatively high frequency of imported cases of symptomatic P. ovale infection in the United Kingdom suggests that the morbidity caused by ovale malaria has been underestimated.
Funded by: Wellcome Trust
The Journal of infectious diseases 2010;201;10;1544-50
PUBMED: 20380562; DOI: 10.1086/652240
-
New insights into the blood-stage transcriptome of Plasmodium falciparum using RNA-Seq.
Parasite Genomics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
Recent advances in high-throughput sequencing present a new opportunity to deeply probe an organism's transcriptome. In this study, we used Illumina-based massively parallel sequencing to gain new insight into the transcriptome (RNA-Seq) of the human malaria parasite, Plasmodium falciparum. Using data collected at seven time points during the intraerythrocytic developmental cycle, we (i) detect novel gene transcripts; (ii) correct hundreds of gene models; (iii) propose alternative splicing events; and (iv) predict 5' and 3' untranslated regions. Approximately 70% of the unique sequencing reads map to previously annotated protein-coding genes. The RNA-Seq results greatly improve existing annotation of the P. falciparum genome with over 10% of gene models modified. Our data confirm 75% of predicted splice sites and identify 202 new splice sites, including 84 previously uncharacterized alternative splicing events. We also discovered 107 novel transcripts and expression of 38 pseudogenes, with many demonstrating differential expression across the developmental time series. Our RNA-Seq results correlate well with DNA microarray analysis performed in parallel on the same samples, and provide improved resolution over the microarray-based method. These data reveal new features of the P. falciparum transcriptional landscape and significantly advance our understanding of the parasite's red blood cell-stage transcriptome.
Funded by: NIGMS NIH HHS: P50 GM071508; Wellcome Trust: WT 085775/Z/08/Z
Molecular microbiology 2010;76;1;12-24
PUBMED: 20141604; PMC: 2859250; DOI: 10.1111/j.1365-2958.2009.07026.x
-
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes.
Laboratório de Genômica Funcional e Bioinformática, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, Brazil. otto@fiocruz.br
MOTIVATION: Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith-Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. AVAILABILITY: The database can be accessed through http://proteinworlddb.org
Bioinformatics (Oxford, England) 2010;26;5;705-7
PUBMED: 20089515; PMC: 2828119; DOI: 10.1093/bioinformatics/btq011
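The abstract above relies on the Smith-Waterman algorithm, a dynamic programming method that finds the best-scoring local alignment between two sequences without heuristics. The following is a minimal sketch of the scoring recurrence only. The function name and the match, mismatch and gap values are assumptions made for illustration; a protein comparison such as the one described would normally use a substitution matrix and affine gap penalties, which this sketch omits.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty).

    The scoring values are illustrative defaults, not those of any real project.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the diagonal with a match or mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets a new local alignment start anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

print(smith_waterman_score("XXACGTXX", "YYACGTYY"))
```

Keeping only two rows of the matrix is enough when just the score is needed; recovering the alignment coordinates, as the database stores them, would require a traceback over the full matrix.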
-
Improving draft assemblies by iterative mapping and assembly of short reads to eliminate gaps.
Parasite Genomics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK. jit@sanger.ac.uk
Advances in sequencing technology allow genomes to be sequenced at vastly decreased costs. However, the assembled data frequently are highly fragmented with many gaps. We present a practical approach that uses Illumina sequences to improve draft genome assemblies by aligning sequences against contig ends and performing local assemblies to produce gap-spanning contigs. The continuity of a draft genome can thus be substantially improved, often without the need to generate new data.
Funded by: Wellcome Trust: WT 085775/Z/08/Z
Genome biology 2010;11;4;R41
PUBMED: 20388197; PMC: 2884544; DOI: 10.1186/gb-2010-11-4-r41
-
ABACAS: algorithm-based automatic contiguation of assembled sequences.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SA, UK. sa4@sanger.ac.uk
SUMMARY: Due to the availability of new sequencing technologies, we are now increasingly interested in sequencing closely related strains of existing finished genomes. Recently a number of de novo and mapping-based assemblers have been developed to produce high quality draft genomes from new sequencing technology reads. New tools are necessary to take contigs from a draft assembly through to a fully contiguated genome sequence. ABACAS is intended as a tool to rapidly contiguate (align, order, orientate), visualize and design primers to close gaps on shotgun assembled contigs based on a reference sequence. The input to ABACAS is a set of contigs which will be aligned to the reference genome, ordered and orientated, visualized in the ACT comparative browser, and optimal primer sequences are automatically generated. Availability and Implementation: ABACAS is implemented in Perl and is freely available for download from http://abacas.sourceforge.net.
Funded by: Wellcome Trust: WT085775/Z/08/Z
Bioinformatics (Oxford, England) 2009;25;15;1968-9
PUBMED: 19497936; PMC: 2712343; DOI: 10.1093/bioinformatics/btp347
Adam Reid
- Postdoctoral Fellow
I studied for a Genetics BSc at the University of Sheffield and an MRes in Bioinformatics at the University of York. I subsequently worked for AstraZeneca, providing bioinformatics support to proteomics and genotyping projects. I then did my PhD with Prof. Christine Orengo at University College London looking at the evolution of protein domain families.
I joined the Parasite Genomics group in January 2009.
Research
1. I have led the analysis of the Neospora caninum genome and its comparison with the human pathogen Toxoplasma gondii.
2. I am leading analysis of another apicomplexan genome, the chicken parasite Eimeria tenella (and several related species).
3. I am working on various approaches that use gene expression analysis to investigate host-parasite interactions, principally in malaria, but also in helminths and trypanosomes.
References
-
Genomic insights into the origin of parasitism in the emerging plant pathogen Bursaphelenchus xylophilus.
Forestry and Forest Products Research Institute, Tsukuba, Japan. kikuchit@affrc.go.jp
Bursaphelenchus xylophilus is the nematode responsible for a devastating epidemic of pine wilt disease in Asia and Europe, and represents a recent, independent origin of plant parasitism in nematodes, ecologically and taxonomically distinct from other nematodes for which genomic data is available. As well as being an important pathogen, the B. xylophilus genome thus provides a unique opportunity to study the evolution and mechanism of plant parasitism. Here, we present a high-quality draft genome sequence from an inbred line of B. xylophilus, and use this to investigate the biological basis of its complex ecology which combines fungal feeding, plant parasitic and insect-associated stages. We focus particularly on putative parasitism genes as well as those linked to other key biological processes and demonstrate that B. xylophilus is well endowed with RNA interference effectors, peptidergic neurotransmitters (including the first description of ins genes in a parasite) stress response and developmental genes and has a contracted set of chemosensory receptors. B. xylophilus has the largest number of digestive proteases known for any nematode and displays expanded families of lysosome pathway genes, ABC transporters and cytochrome P450 pathway genes. This expansion in digestive and detoxification proteins may reflect the unusual diversity in foods it exploits and environments it encounters during its life cycle. In addition, B. xylophilus possesses a unique complement of plant cell wall modifying proteins acquired by horizontal gene transfer, underscoring the impact of this process on the evolution of plant parasitism by nematodes. Together with the lack of proteins homologous to effectors from other plant parasitic nematodes, this confirms the distinctive molecular basis of plant parasitism in the Bursaphelenchus lineage. The genome sequence of B. xylophilus adds to the diversity of genomic data for nematodes, and will be an important resource in understanding the biology of this unusual parasite.
Funded by: Wellcome Trust: WT 085775/Z/08/Z
PLoS pathogens 2011;7;9;e1002219
PUBMED: 21909270; PMC: 3164644; DOI: 10.1371/journal.ppat.1002219
-
Genome sequencing gets func-y.
Nature reviews. Microbiology 2011;9;6;401
PUBMED: 21572456; DOI: 10.1038/nrmicro2583
-
CODA: accurate detection of functional associations between proteins in eukaryotic genomes using domain fusion.
Wellcome Trust Sanger Institute, Cambridge, United Kingdom. ar11@sanger.ac.uk
Background: In order to understand how biological systems function it is necessary to determine the interactions and associations between proteins. Gene fusion prediction is one approach to detection of such functional relationships. Its use is however known to be problematic in higher eukaryotic genomes due to the presence of large homologous domain families. Here we introduce CODA (Co-Occurrence of Domains Analysis), a method to predict functional associations based on the gene fusion idiom.
We apply a novel scoring scheme which takes account of the genome-specific size of homologous domain families involved in fusion to improve accuracy in predicting functional associations. We show that CODA is able to accurately predict functional similarities in human with comparison to state-of-the-art methods and show that different methods can be complementary. CODA is used to produce evidence that a currently uncharacterised human protein may be involved in pathways related to depression and that another is involved in DNA replication.
The relative performance of different gene fusion methodologies has not previously been explored. We find that they are largely complementary, with different methods being more or less appropriate in different genomes. Our method is the only one currently available for download and can be run on an arbitrary dataset by the user. The CODA software and datasets are freely available from ftp://ftp.biochem.ucl.ac.uk/pub/gene3d_data/v6.1.0/CODA/. Predictions are also available via web services from http://funcnet.eu/.
Funded by: Biotechnology and Biological Sciences Research Council
PloS one 2010;5;6;e10908
PUBMED: 20532224; PMC: 2879367; DOI: 10.1371/journal.pone.0010908
-
Comparative evolutionary analysis of protein complexes in E. coli and yeast.
Research Department of Structural & Molecular Biology, University College London, London, WC1E 6BT, UK. ar11@sanger.ac.uk
Background: Proteins do not act in isolation; they frequently act together in protein complexes to carry out concerted cellular functions. The evolution of complexes is poorly understood, especially in organisms other than yeast, where little experimental data has been available.
Results: We generated accurate, high coverage datasets of protein complexes for E. coli and yeast in order to study differences in the evolution of complexes between these two species. We show that substantial differences exist in how complexes have evolved between these organisms. A previously proposed model of complex evolution identified complexes with cores of interacting homologues. We support findings of the relative importance of this mode of evolution in yeast, but find that it is much less common in E. coli. Additionally it is shown that those homologues which do cluster in complexes are involved in eukaryote-specific functions. Furthermore we identify correlated pairs of non-homologous domains which occur in multiple protein complexes. These were identified in both yeast and E. coli and we present evidence that these too may represent complex cores in yeast but not those of E. coli.
Conclusions: Our results suggest that there are differences in the way protein complexes have evolved in E. coli and yeast. Whereas some yeast complexes have evolved by recruiting paralogues, this is not apparent in E. coli. Furthermore, such complexes are involved in eukaryotic-specific functions. This implies that the increase in gene family sizes seen in eukaryotes in part reflects multiple family members being used within complexes. However, in general, in both E. coli and yeast, homologous domains are used in different complexes.
Funded by: Biotechnology and Biological Sciences Research Council
BMC genomics 2010;11;79
PUBMED: 20122144; PMC: 2837643; DOI: 10.1186/1471-2164-11-79
-
Methods of remote homology detection can be combined to increase coverage by 10% in the midnight zone.
Department of Biochemistry and Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK. reid@bioichem.ucl.ac.uk
Motivation: A recent development in sequence-based remote homologue detection is the introduction of profile-profile comparison methods. These are more powerful than previous technologies and can detect potentially homologous relationships missed by structural classifications such as CATH and SCOP. As structural classifications traditionally act as the gold standard of homology this poses a challenge in benchmarking them.
Results: We present a novel approach which allows an accurate benchmark of these methods against the CATH structural classification. We then apply this approach to assess the accuracy of a range of publicly available methods for remote homology detection including several profile-profile methods (COMPASS, HHSearch, PRC) from two perspectives. First, in distinguishing homologous domains from non-homologues and second, in annotating proteomes with structural domain families. PRC is shown to be the best method for distinguishing homologues. We show that SAM is the best practical method for annotating genomes, whilst using COMPASS for the most remote homologues would increase coverage. Finally, we introduce a simple approach to increase the sensitivity of remote homologue detection by up to 10%. This is achieved by combining multiple methods with a jury vote.
Supplementary data are available at Bioinformatics online.
Bioinformatics (Oxford, England) 2007;23;18;2353-60
PUBMED: 17709341; DOI: 10.1093/bioinformatics/btm355
Jason Tsai
jit@sanger.ac.uk Postdoctoral Fellow
I studied for a Biochemistry and Genetics degree at Nottingham University and was then awarded a Wellcome Trust PhD scholarship in Bioinformatics at Imperial College London. My PhD with Prof. Austin Burt investigated how biological mechanisms acting at different stages of the life cycle affect yeast genome evolution.
I joined the Parasite Genomics group at the Sanger Institute in December 2008.
Research
I am currently involved in many de novo parasite genome sequencing projects where my research focuses on three main topics:
1. Determining the genetic features underlying the diversity of biology and pathogenesis of blood fluke (Schistosoma) and tapeworm species by means of comparative genomics.
2. Life cycle dynamics and disease control: I hope to identify genes that regulate the development of individual life cycle stages of different parasite species through transcriptome sequencing and population genetics methods.
3. Algorithm development to improve de novo assemblies automatically using next generation sequencing.
References
-
Genomic insights into the origin of parasitism in the emerging plant pathogen Bursaphelenchus xylophilus.
Forestry and Forest Products Research Institute, Tsukuba, Japan. kikuchit@affrc.go.jp
Bursaphelenchus xylophilus is the nematode responsible for a devastating epidemic of pine wilt disease in Asia and Europe, and represents a recent, independent origin of plant parasitism in nematodes, ecologically and taxonomically distinct from other nematodes for which genomic data is available. As well as being an important pathogen, the B. xylophilus genome thus provides a unique opportunity to study the evolution and mechanism of plant parasitism. Here, we present a high-quality draft genome sequence from an inbred line of B. xylophilus, and use this to investigate the biological basis of its complex ecology which combines fungal feeding, plant parasitic and insect-associated stages. We focus particularly on putative parasitism genes as well as those linked to other key biological processes and demonstrate that B. xylophilus is well endowed with RNA interference effectors, peptidergic neurotransmitters (including the first description of ins genes in a parasite) stress response and developmental genes and has a contracted set of chemosensory receptors. B. xylophilus has the largest number of digestive proteases known for any nematode and displays expanded families of lysosome pathway genes, ABC transporters and cytochrome P450 pathway genes. This expansion in digestive and detoxification proteins may reflect the unusual diversity in foods it exploits and environments it encounters during its life cycle. In addition, B. xylophilus possesses a unique complement of plant cell wall modifying proteins acquired by horizontal gene transfer, underscoring the impact of this process on the evolution of plant parasitism by nematodes. Together with the lack of proteins homologous to effectors from other plant parasitic nematodes, this confirms the distinctive molecular basis of plant parasitism in the Bursaphelenchus lineage. The genome sequence of B. xylophilus adds to the diversity of genomic data for nematodes, and will be an important resource in understanding the biology of this unusual parasite.
Funded by: Wellcome Trust: WT 085775/Z/08/Z
PLoS pathogens 2011;7;9;e1002219
PUBMED: 21909270; PMC: 3164644; DOI: 10.1371/journal.ppat.1002219
-
Genome watch: Honey, I shrunk the mimiviral genome.
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK. microbes@sanger.ac.uk
This month's Genome Watch describes how the large size of the mimiviral genome is a result of the sympatric lifestyle of mimivirus in host amoebae.
Nature reviews. Microbiology 2011;9;8;563
PUBMED: 21747393; DOI: 10.1038/nrmicro2623
-
Conservation of recombination hotspots in yeast.
Division of Biology, Imperial College London, Silwood Park, Ascot, Berks SL5 7PY, United Kingdom.
Meiotic recombination does not occur randomly along a chromosome, but instead tends to be concentrated in small regions, known as "recombination hotspots." Recombination hotspots are thought to be short-lived in evolutionary time due to their self-destructive nature, as gene conversion favors recombination-suppressing alleles over recombination-promoting alleles during double-strand repair. Consistent with this expectation, hotspots in humans are highly dynamic, with little correspondence in location between humans and chimpanzees. Here, we identify recombination hotspots in two lineages of the yeast Saccharomyces paradoxus, and compare their locations to those found previously in Saccharomyces cerevisiae. Surprisingly, we find considerable overlap between the two species, despite the fact that they are at least 10 times more divergent than humans and chimpanzees. We attribute this unexpected result to the low frequency of sex and outcrossing in these yeasts, acting to reduce the population genetic effect of biased gene conversion. Traces from two other signatures of recombination, namely high mutagenicity and GC-biased gene conversion, are consistent with this interpretation. Thus, recombination hotspots are not inevitably short-lived, but rather their persistence through evolutionary time will be determined by the frequency of outcrossing events in the life cycle.
Funded by: Biotechnology and Biological Sciences Research Council; Wellcome Trust
Proceedings of the National Academy of Sciences of the United States of America 2010;107;17;7847-52
PUBMED: 20385822; PMC: 2867876; DOI: 10.1073/pnas.0908774107
-
Improving draft assemblies by iterative mapping and assembly of short reads to eliminate gaps.
Parasite Genomics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK. jit@sanger.ac.uk
Advances in sequencing technology allow genomes to be sequenced at vastly decreased costs. However, the assembled data frequently are highly fragmented with many gaps. We present a practical approach that uses Illumina sequences to improve draft genome assemblies by aligning sequences against contig ends and performing local assemblies to produce gap-spanning contigs. The continuity of a draft genome can thus be substantially improved, often without the need to generate new data.
Funded by: Wellcome Trust: WT 085775/Z/08/Z
Genome biology 2010;11;4;R41
PUBMED: 20388197; PMC: 2884544; DOI: 10.1186/gb-2010-11-4-r41
-
Population genomics of domestic and wild yeasts.
Institute of Genetics, Queen's Medical Centre, University of Nottingham, Nottingham NG7 2UH, UK.
Since the completion of the genome sequence of Saccharomyces cerevisiae in 1996 (refs 1, 2), there has been a large increase in complete genome sequences, accompanied by great advances in our understanding of genome evolution. Although little is known about the natural and life histories of yeasts in the wild, there are an increasing number of studies looking at ecological and geographic distributions, population structure and sexual versus asexual reproduction. Less well understood at the whole genome level are the evolutionary processes acting within populations and species that lead to adaptation to different environments, phenotypic differences and reproductive isolation. Here we present one- to fourfold or more coverage of the genome sequences of over seventy isolates of the baker's yeast S. cerevisiae and its closest relative, Saccharomyces paradoxus. We examine variation in gene content, single nucleotide polymorphisms, nucleotide insertions and deletions, copy numbers and transposable elements. We find that phenotypic variation broadly correlates with global genome-wide phylogenetic relationships. S. paradoxus populations are well delineated along geographic boundaries, whereas the variation among worldwide S. cerevisiae isolates shows less differentiation and is comparable to a single S. paradoxus population. Rather than one or two domestication events leading to the extant baker's yeasts, the population structure of S. cerevisiae consists of a few well-defined, geographically isolated lineages and many different mosaics of these lineages, supporting the idea that human influence provided the opportunity for cross-breeding and production of new combinations of pre-existing variations.
Funded by: Biotechnology and Biological Sciences Research Council: G10415; Wellcome Trust: 067008, 084507
Nature 2009;458;7236;337-41
PUBMED: 19212322; PMC: 2659681; DOI: 10.1038/nature07743
-
Population genomics of the wild yeast Saccharomyces paradoxus: Quantifying the life cycle.
Division of Biology, Imperial College London, Silwood Park, Ascot, Berks SL5 7PY, United Kingdom.
Most microbes have complex life cycles with multiple modes of reproduction that differ in their effects on DNA sequence variation. Population genomic analyses can therefore be used to estimate the relative frequencies of these different modes in nature. The life cycle of the wild yeast Saccharomyces paradoxus is complex, including clonal reproduction, outcrossing, and two different modes of inbreeding. To quantify these different aspects we analyzed DNA sequence variation in the third chromosome among 20 isolates from two populations. Measures of mutational and recombinational diversity were used to make two independent estimates of the population size. In an obligately sexual population these values should be approximately equal. Instead there is a discrepancy of about three orders of magnitude between our two estimates of population size, indicating that S. paradoxus goes through a sexual cycle approximately once in every 1,000 asexual generations. Chromosome III also contains the mating type locus (MAT), which is the most outbred part in the entire genome, and by comparing recombinational diversity as a function of distance from MAT we estimate the frequency of matings to be approximately 94% from within the same tetrad, 5% with a clonemate after switching the mating type, and 1% outcrossed. Our study illustrates the utility of population genomic data in quantifying life cycles.
Funded by: Biotechnology and Biological Sciences Research Council; Wellcome Trust
Proceedings of the National Academy of Sciences of the United States of America 2008;105;12;4957-62
PUBMED: 18344325; PMC: 2290798; DOI: 10.1073/pnas.0707314105
Magdalena Zarowiecki
- Postdoctoral Fellow
My research interests are tropical diseases, in particular the evolution of parasitism and host-parasite interactions. I did an M.Sc. in Zoological Systematics at Gothenburg University, Sweden, and an M.Res. in Biosystematics at the Natural History Museum and Imperial College, London. I have worked with many non-model worms: ribbon worms, oligochaetes, cestodes and trematodes. I also have interests in the wider field of tropical diseases, stemming from a Ph.D. in the population genetics of mosquitoes. I previously held a postdoctoral position funded by the SynTax scheme, working on the assembly and annotation of the Hymenolepis microstoma genome and on the comparative phylogeny of flatworms.
Research
My current research focuses on the genomics of parasitic flatworms, including important platyhelminth parasites of humans in the genera Taenia, Hymenolepis, Echinococcus and Schistosoma. These platyhelminths have a severe impact on the health and productivity of the poorest people in developing countries. The aim of my postdoctoral project is to develop comparative genomics of flatworms within the Parasite Genomics group. We use high-throughput approaches including RNAseq, gene prediction, methylome studies, re-sequencing and microRNA studies to increase the accuracy and biological depth of our platyhelminth genome annotations. Producing good-quality genomes, gene models and annotations is a vital underpinning for future translational research.
References
-
Cestode genomics - progress and prospects for advancing basic and applied aspects of flatworm biology.
Department of Zoology, The Natural History Museum, London, UK.
Characterization of the first tapeworm genome, Echinococcus multilocularis, is now nearly complete, and genome assemblies of E. granulosus, Taenia solium and Hymenolepis microstoma are in advanced draft versions. These initiatives herald the beginning of a genomic era in cestodology and underpin a diverse set of research agendas targeting both basic and applied aspects of tapeworm biology. We discuss the progress in the genomics of these species, provide insights into the presence and composition of immunologically relevant gene families, including the antigen B- and EG95/45W families, and discuss chemogenomic approaches toward the development of novel chemotherapeutics against cestode diseases. In addition, we discuss the evolution of tapeworm parasites and introduce the research programmes linked to genome initiatives that are aimed at understanding signalling systems involved in basic host-parasite interactions and morphogenesis.
Funded by: Biotechnology and Biological Sciences Research Council: BBG0038151
Parasite immunology 2012;34;2-3;130-50
PUBMED: 21793855; DOI: 10.1111/j.1365-3024.2011.01319.x
-
Animals learn new tricks from microorganisms.
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK. microbes@sanger.ac.uk
Nature reviews. Microbiology 2011;9;12;836
PUBMED: 22085859; DOI: 10.1038/nrmicro2694
-
Towards a new role for vector systematics in parasite control.
Dept. of Zoology, Natural History Museum, London SW75BD, UK. mz3@sanger.ac.uk
Vector systematics research is being transformed by the recent development of theoretical, experimental and analytical methods, as well as conceptual insights into speciation and reconstruction of evolutionary history. We review this progress using examples from the mosquito genus Anopheles. The conclusion is that recent progress, particularly in the development of better tools for understanding evolutionary history, makes systematics much more informative for vector control purposes, and has increasing potential to inform and improve targeted vector control programmes.
Parasitology 2011;138;13;1723-9
PUBMED: 21679487; DOI: 10.1017/S003118201100062X
-
Rapid evolution of yeast centromeres in the absence of drive.
Division of Biology, Imperial College London, Ascot SL5 7PY, United Kingdom.
To find the most rapidly evolving regions in the yeast genome we compared most of chromosome III from three closely related lineages of the wild yeast Saccharomyces paradoxus. Unexpectedly, the centromere appears to be the fastest-evolving part of the chromosome, evolving even faster than DNA sequences unlikely to be under selective constraint (i.e., synonymous sites after correcting for codon usage bias and remnant transposable elements). Centromeres on other chromosomes also show an elevated rate of nucleotide substitution. Rapid centromere evolution has also been reported for some plants and animals and has been attributed to selection for inclusion in the egg or the ovule at female meiosis. But Saccharomyces yeasts have symmetrical meioses with all four products surviving, thus providing no opportunity for meiotic drive. In addition, yeast centromeres show the high levels of polymorphism expected under a neutral model of molecular evolution. We suggest that yeast centromeres suffer an elevated rate of mutation relative to other chromosomal regions and they change through a process of "centromere drift," not drive.
Funded by: Biotechnology and Biological Sciences Research Council; Wellcome Trust
Genetics 2008;178;4;2161-7
PUBMED: 18430941; PMC: 2323805; DOI: 10.1534/genetics.107.083980
-
Making the most of mitochondrial genomes--markers for phylogeny, molecular ecology and barcodes in Schistosoma (Platyhelminthes: Digenea).
Wolfson Wellcome Biomedical Laboratories, Department of Zoology, Natural History Museum, Cromwell Road, London SW7 5BD, UK.
An increasing number of complete sequences of mitochondrial (mt) genomes provides the opportunity to optimise the choice of molecular markers for phylogenetic and ecological studies. This is particularly the case where mt genomes from closely related taxa have been sequenced; e.g., within Schistosoma. These blood flukes include species that are the causative agents of schistosomiasis, where there has been a need to optimise markers for species and strain recognition. For many phylogenetic and population genetic studies, the choice of nucleotide sequences depends primarily on suitable PCR primers. Complete mt genomes allow individual gene or other mt markers to be assessed relative to one another for potential information content, prior to broad-scale sampling. We assess the phylogenetic utility of individual genes and identify regions that contain the greatest interspecific variation for molecular ecological and diagnostic markers. We show that variable characters are not randomly distributed along the genome and there is a positive correlation between polymorphism and divergence. The mt genomes of African and Asian schistosomes were compared with the available intraspecific dataset of Schistosoma mansoni through sliding window analyses, in order to assess whether the observed polymorphism was at a level predicted from interspecific comparisons. We found a positive correlation except for the two genes (cox1 and nad1) adjoining the putative control region in S. mansoni. The genes nad1, nad4, nad5, cox1 and cox3 resolved phylogenies that were consistent with a benchmark phylogeny and in general, longer genes performed better in phylogenetic reconstruction. Considering the information content of entire mt genome sequences, partial cox1 would not be the ideal marker for either species identification (barcoding) or population studies with Schistosoma species. Instead, we suggest the use of cox3 and nad5 for both phylogenetic and population studies. 
Five primer pairs designed against Schistosoma mekongi and Schistosoma malayensis were tested successfully against Schistosoma japonicum. In combination, these fragments encompass 20-27% of the variation amongst the genomes (average total length approximately 14,000bp), thus providing an efficient means of encapsulating the greatest amount of variation within the shortest sequence. Comparative mitogenomics provides the basis of a rational approach to molecular marker selection and optimisation.
International journal for parasitology 2007;37;12;1401-18
PUBMED: 17570370; DOI: 10.1016/j.ijpara.2007.04.014

Dr Matt Berriman