Pathogen genomics

The Pathogen genomics team, headed by Julian Parkhill, sequences the DNA of small genomes and analyses the resulting data for information on genome structure and function.

The team works on a variety of microorganisms, most of which are pathogens, but it is also sequencing some model organisms – species which provide information about other species, including man, that are more difficult to study directly.

The team is investigating a wide range of human and animal pathogens, from human and bacterial viruses, through bacteria and protist parasites, to multicellular worms. It currently has over 100 ongoing projects, and collaborates widely within the UK and world scientific community to generate the best possible biological interpretation of the data. The team has sequenced the genomes of organisms that are of fundamental importance for human health, including the causative agents of tuberculosis, malaria, plague, typhoid fever, sleeping sickness, whooping cough, dengue fever, leprosy, diphtheria and meningitis.

Background

In 1995 the Wellcome Trust decided to set up a Pathogen Sequencing Unit (PSU) at what was then the Sanger Centre, to sequence the genomes of organisms relevant to human and animal health. It was initially funded through individual grants and later through the Wellcome Trust Beowulf Genomics Panel. The Beowulf panel no longer operates, and the PSU has been renamed Pathogen genomics. The team is now funded through the Sanger Institute Wellcome Trust envelope funding.

Research

Genome sequencing and analysis

The Pathogen genomics team’s approach to pathogen genome analysis is "broad and deep". "Broad" means that we are interested in a wide variety of human and animal pathogens, so that we can study the diversity of mechanisms used to infect a host and cause disease. These broad analyses cover, for example, related members of a group of organisms that can cause disease in humans, animals and even plants, as well as those that live in a host without causing disease. Comparing them allows us to identify genes that are of key importance for specifying common functions, and those that are accessory: for example, genes responsible for interaction with specific hosts or for causing specific pathologies. We have carried out analyses on a range of species including the enteric bacteria Salmonella, Escherichia, Yersinia and Erwinia, and the parasites Trypanosoma and Leishmania. Broad investigations also allow us to find the novel and unexpected in less well-studied pathogens, and to lay the foundations for investigating neglected diseases, such as those caused by helminths.
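The split between genes with common functions and accessory genes can be illustrated with simple set operations. The following is a minimal sketch only, assuming each genome has already been reduced to a set of gene-family labels; the function name `core_and_accessory` and the labels `fam1`, `genome_A` and so on are hypothetical and do not represent real annotation or the team's own pipeline.

```python
from typing import Dict, Set, Tuple


def core_and_accessory(
    gene_sets: Dict[str, Set[str]],
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Split gene families into a shared core and per-genome accessory sets.

    gene_sets maps a genome name to the set of gene-family labels found in it.
    """
    if not gene_sets:
        return set(), {}
    # Core: gene families present in every genome being compared.
    core = set.intersection(*gene_sets.values())
    # Accessory: whatever each genome carries beyond the core.
    accessory = {name: genes - core for name, genes in gene_sets.items()}
    return core, accessory


# Hypothetical input: labels are placeholders, not real gene families.
genomes = {
    "genome_A": {"fam1", "fam2", "fam3", "fam4"},
    "genome_B": {"fam1", "fam2", "fam5"},
    "genome_C": {"fam1", "fam2", "fam3", "fam6"},
}

core, accessory = core_and_accessory(genomes)
print("core:", sorted(core))
for name in sorted(accessory):
    print(name, "accessory:", sorted(accessory[name]))
```

In practice, deciding which genes in different genomes belong to the same family is the hard part and requires sequence comparison; the sketch assumes that step has already been done.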

"Deep" refers to multiple comparisons between very closely related strains within a species or group of species. Such comparisons allow us to look at the fine detail of how or why organisms specialise on particular hosts (for example, the host-restricted pathogen Salmonella Typhi), how they have evolved (for example, Bordetella pertussis or Yersinia pestis) and how variation in DNA sequence corresponds to virulence, the degree to which the organism can cause disease (for example, Streptococcus pneumoniae or Neisseria meningitidis). Fine-detail comparisons also give us DNA markers that allow us to follow transmission, virulence or drug resistance in related families of organisms such as Mycobacterium tuberculosis or Plasmodium falciparum.
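One simple kind of DNA marker from such close comparisons is a single-base difference between two strains. The sketch below is illustrative only and assumes the two sequences have already been aligned to the same length, with `-` marking gaps; the function name `snp_positions` and the short sequences are invented for the example and are not real data.

```python
from typing import List, Tuple


def snp_positions(ref: str, other: str) -> List[Tuple[int, str, str]]:
    """Return (position, ref_base, other_base) for each single-base difference.

    Positions are 0-based. Columns containing a gap or an ambiguous base
    are skipped, so only A/C/G/T substitutions are reported.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    bases = set("ACGT")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(ref.upper(), other.upper()))
        if a != b and a in bases and b in bases
    ]


# Hypothetical aligned fragments from two closely related strains.
strain_1 = "ATGCCGTAAG"
strain_2 = "ATGCTGTAGG"

for pos, a, b in snp_positions(strain_1, strain_2):
    print(f"position {pos}: {a} -> {b}")
```

Differences like these, collected across many strains, are the sort of markers that can be used to follow related lineages.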

Laboratory studies

For some of the organisms we are studying in depth, we also move beyond sequencing into lab-based studies. These include identifying and isolating specific areas of interest from related genomes; studying the protein-coding areas of DNA using the latest technologies, such as microarrays and ultra-high-throughput sequencing; analysing all the proteins in an organism; and mutagenesis studies.

Another area of growing interest is the contribution of bacteria to the health and development of the host. We are studying bacterial populations, primarily in the gut, in both humans and mice. Looking at how these populations vary between individuals, and between diseased and healthy organs, should shed light on the role of microorganisms in these processes.

Informatics

To support annotation (assigning function to pieces of the DNA sequence) in pathogens, and to present our data to the scientific community, we provide a set of software programs. These include our analysis tool Artemis, which is designed to be an intuitive and portable sequence viewer, and an extension of it called ACT, a powerful analysis tool that gives an interactive view of full genome comparisons. We also provide a set of web pages, GeneDB, which serves as a repository and source for our annotation and analysis.

Collaborations

Internal collaborations

To pursue these studies effectively we have built up very strong collaborations with the other groups within the Institute, particularly those of Gordon Dougan, Paul Kellam and Dominic Kwiatkowski, and we intend to expand these collaborations to include new pathogen faculty members. We also rely heavily on the support of the core sequencing and informatics teams.

External collaborations

As well as providing our data to the scientific community, we believe it is important to enable scientists to use that information to its fullest extent, especially in the developing world, where these diseases are most prevalent. In collaboration with the Wellcome Trust Advanced Courses group, we have established a series of bioinformatics training workshops in developing countries, most recently in Vietnam, Malawi, Uruguay and Kenya.

We are also a participating organisation in the MetaHIT Project, a European Community collaboration that aims to sequence and study a reference set of genes and genomes of a selection of intestinal microbes. We are primarily responsible for the workpackage WPS.2: Full genome sequencing of the selected microorganisms.

Selected Publications

  • Comparative genome analysis of Salmonella Enteritidis PT4 and Salmonella Gallinarum 287/91 provides insights into evolutionary and host adaptation pathways.

    Thomson NR, Clayton DJ, Windhorst D, Vernikos G, Davidson S, Churcher C, Quail MA, Stevens M, Jones MA, Watson M, Barron A, Layton A, Pickard D, Kingsley RA, Bignell A, Clark L, Harris B, Ormond D, Abdellah Z, Brooks K, Cherevach I, Chillingworth T, Woodward J, Norbertczak H, Lord A, Arrowsmith C, Jagels K, Moule S, Mungall K, Sanders M, Whitehead S, Chabalgoity JA, Maskell D, Humphrey T, Roberts M, Barrow PA, Dougan G and Parkhill J

    Genome research 2008;18;10;1624-37

  • Replacement of adenylate cyclase toxin in a lineage of Bordetella bronchiseptica.

    Buboltz AM, Nicholson TL, Parette MR, Hester SE, Parkhill J and Harvill ET

    Journal of bacteriology 2008;190;15;5502-11

  • Genome evolution of Wolbachia strain wPip from the Culex pipiens group.

    Klasson L, Walker T, Sebaihia M, Sanders MJ, Quail MA, Lord A, Sanders S, Earl J, O'Neill SL, Thomson N, Sinkins SP and Parkhill J

    Molecular biology and evolution 2008;25;9;1877-87

  • The complete genome, comparative and functional analysis of Stenotrophomonas maltophilia reveals an organism heavily shielded by drug resistance determinants.

    Crossman LC, Gould VC, Dow JM, Vernikos GS, Okazaki A, Sebaihia M, Saunders D, Arrowsmith C, Carver T, Peters N, Adlem E, Kerhornou A, Lord A, Murphy L, Seeger K, Squares R, Rutter S, Quail MA, Rajandream MA, Harris D, Churcher C, Bentley SD, Parkhill J, Thomson NR and Avison MB

    Genome biology 2008;9;4;R74

  • Insights from the complete genome sequence of Mycobacterium marinum on the evolution of Mycobacterium tuberculosis.

    Stinear TP, Seemann T, Harrison PF, Jenkin GA, Davies JK, Johnson PD, Abdellah Z, Arrowsmith C, Chillingworth T, Churcher C, Clarke K, Cronin A, Davis P, Goodhead I, Holroyd N, Jagels K, Lord A, Moule S, Mungall K, Norbertczak H, Quail MA, Rabbinowitsch E, Walker D, White B, Whitehead S, Small PL, Brosch R, Ramakrishnan L, Fischbach MA, Parkhill J and Cole ST

    Genome research 2008;18;5;729-41

  • Complete genome sequence of uropathogenic Proteus mirabilis, a master of both adherence and motility.

    Pearson MM, Sebaihia M, Churcher C, Quail MA, Seshasayee AS, Luscombe NM, Abdellah Z, Arrowsmith C, Atkin B, Chillingworth T, Hauser H, Jagels K, Moule S, Mungall K, Norbertczak H, Rabbinowitsch E, Walker D, Whitehead S, Thomson NR, Rather PN, Parkhill J and Mobley HL

    Journal of bacteriology 2008;190;11;4027-37

  • Genome of the actinomycete plant pathogen Clavibacter michiganensis subsp. sepedonicus suggests recent niche adaptation.

    Bentley SD, Corton C, Brown SE, Barron A, Clark L, Doggett J, Harris B, Ormond D, Quail MA, May G, Francis D, Knudson D, Parkhill J and Ishimaru CA

    Journal of bacteriology 2008;190;6;2150-60

  • Resolving the structural features of genomic islands: a machine learning approach.

    Vernikos GS and Parkhill J

    Genome research 2008;18;2;331-42

  • Sequence-based analysis of pQBR103; a representative of a unique, transfer-proficient mega plasmid resident in the microbial community of sugar beet.

    Tett A, Spiers AJ, Crossman LC, Ager D, Ciric L, Dow JM, Fry JC, Harris D, Lilley A, Oliver A, Parkhill J, Quail MA, Rainey PB, Saunders NJ, Seeger K, Snyder LA, Squares R, Thomas CM, Turner SL, Zhang XX, Field D and Bailey MJ

    The ISME journal 2007;1;4;331-40

  • Chlamydia trachomatis: genome sequence analysis of lymphogranuloma venereum isolates.

    Thomson NR, Holden MT, Carder C, Lennard N, Lockey SJ, Marsh P, Skipp P, O'Connor CD, Goodhead I, Norbertczak H, Harris B, Ormond D, Rance R, Quail MA, Parkhill J, Stephens RS and Clarke IN

    Genome research 2008;18;1;161-71