Pathogen genomics

The Pathogen genomics team, headed by Julian Parkhill, uses high-throughput sequencing and phenotype analysis of bacteria to understand their virulence, evolution, transmission and host interactions.

The team works on a variety of microorganisms, most of which are human pathogens, but it also analyses some veterinary pathogens and model organisms: species that provide information about other species that are more difficult to study directly.

The team currently has over 100 ongoing projects and collaborates widely within the UK and the global scientific community. The team has previously generated reference genomes of organisms that are of fundamental importance for human health, including the causative agents of tuberculosis, plague, typhoid fever, whooping cough, leprosy, diphtheria and meningitis. We are currently using population genomics to investigate transmission and evolution in many of these pathogens.

Research

Genome sequencing and analysis

The Pathogen genomics team's approach to pathogen genome analysis is "broad and deep". "Broad" means that we are interested in a wide variety of human and animal pathogens, so that we can study the diversity of mechanisms used to infect a host and cause disease. These broad analyses include, for example, related members of a group of organisms that can cause disease in humans, animals and even plants, and those that live in a host without causing disease. These comparisons allow us to identify genes that are of key importance for specifying common functions, and those that are accessory: for example, genes responsible for interaction with specific hosts or for causing specific pathologies. We have carried out analyses on a range of species including the enteric bacteria Salmonella, Escherichia, Yersinia and Erwinia, and others such as Streptococcus and Staphylococcus. Broad investigations also allow us to find the novel and unexpected in less well-studied pathogens, and to lay the foundations for investigating neglected diseases.

"Deep" refers to multiple comparisons between very closely related strains within a species, or group of species. Such comparisons allow us to look at the fine detail of how or why organisms specialise on particular hosts (for example, the host-restricted pathogen Salmonella Typhi), how they have evolved (for example, Bordetella pertussis or Yersinia pestis) and how variation in DNA sequence corresponds to virulence, the degree to which the organism can cause disease (for example, Streptococcus pneumoniae or Neisseria meningitidis). Fine-detail comparisons also give us DNA markers that allow studies following transmission, virulence or drug resistance in related families of organisms such as Mycobacterium tuberculosis or Staphylococcus aureus.

Laboratory studies

The team also applies several laboratory-based approaches to the organisms that it is studying in depth. These include transcriptome studies using high-throughput sequencing, proteomic analysis and saturation mutagenesis studies.

Another area of growing interest is the contribution of bacteria to the health and development of the host. We are studying bacterial populations, primarily in the gut, in both humans and mice. Looking at how these populations vary between individuals, and between diseased and healthy organs, should shed light on the role of microorganisms in these processes.

Informatics

To support annotation (assigning function to pieces of the DNA sequence) in pathogens, and to present our data to the scientific community, we make a group of software programs available. These include our analysis tool Artemis, which is designed to be an intuitive and portable sequence viewer, and an extension of it called ACT, a powerful analysis tool that allows an interactive view of full genome comparisons. We also provide a set of web pages, GeneDB, which serves as a repository and source for our annotation and analysis.

Collaborations

Internal collaborations

To pursue these studies effectively we have built up very strong collaborations with the other groups within the Institute, particularly Gordon Dougan, Paul Kellam, Trevor Lawley, Matthew Berriman and Dominic Kwiatkowski. We also rely heavily on the support of the core sequencing and informatics teams.

External collaborations

Along with providing our data to the scientific community, we believe it is important to enable scientists to use that information to its fullest extent, especially in the developing world, where these diseases are most prevalent. In collaboration with the Wellcome Trust Advanced Courses group, we have established a series of bioinformatics training workshops in developing countries, most recently in Vietnam, Malawi, Uruguay and Kenya.

We collaborate on specific projects with a broad section of the scientific community, in the UK, Europe, the US and the wider world, and we always welcome new collaborations.

Selected Publications

  • Distinguishable Epidemics of Multidrug-Resistant Salmonella Typhimurium DT104 in Different Hosts.

    Mather AE, Reid SW, Maskell DJ, Parkhill J, Fookes MC, Harris SR, Brown DJ, Coia JE, Mulvey MR, Gilmour MW, Petrovska L, de Pinna E, Kuroda M, Akiba M, Izumiya H, Connor TR, Suchard MA, Lemey P, Mellor DJ, Haydon DT and Thomson NR

    Science (New York, N.Y.) 2013

  • Out-of-Africa migration and Neolithic coexpansion of Mycobacterium tuberculosis with modern humans.

    Comas I, Coscolla M, Luo T, Borrell S, Holt KE, Kato-Maeda M, Parkhill J, Malla B, Berg S, Thwaites G, Yeboah-Manu D, Bothamley G, Mei J, Wei L, Bentley S, Harris SR, Niemann S, Diel R, Aseffa A, Gao Q, Young D and Gagneux S

    Nature genetics 2013

  • Whole-genome sequencing for rapid susceptibility testing of M. tuberculosis.

    Köser CU, Bryant JM, Becq J, Török ME, Ellington MJ, Marti-Renom MA, Carmichael AJ, Parkhill J, Smith GP and Peacock SJ

    The New England journal of medicine 2013;369;3;290-2

  • Genome-wide association study identifies vitamin B5 biosynthesis as a host specificity factor in Campylobacter.

    Sheppard SK, Didelot X, Meric G, Torralbo A, Jolley KA, Kelly DJ, Bentley SD, Maiden MC, Parkhill J and Falush D

    Proceedings of the National Academy of Sciences of the United States of America 2013;110;29;11923-7

  • Rapid Bacterial Whole-Genome Sequencing to Enhance Diagnostic and Public Health Microbiology.

    Reuter S, Ellington MJ, Cartwright EJ, Köser CU, Török ME, Gouliouris T, Harris SR, Brown NM, Holden MT, Quail M, Parkhill J, Smith GP, Bentley SD and Peacock SJ

    JAMA internal medicine 2013

  • Population genomics of post-vaccine changes in pneumococcal epidemiology.

    Croucher NJ, Finkelstein JA, Pelton SI, Mitchell PK, Lee GM, Parkhill J, Bentley SD, Hanage WP and Lipsitch M

    Nature genetics 2013;45;6;656-63

  • Whole-genome sequencing to identify transmission of Mycobacterium abscessus between patients with cystic fibrosis: a retrospective cohort study.

    Bryant JM, Grogono DM, Greaves D, Foweraker J, Roddick I, Inns T, Reacher M, Haworth CS, Curran MD, Harris SR, Peacock SJ, Parkhill J and Floto RA

    Lancet 2013;381;9877;1551-60

  • Whole-genome sequences of Chlamydia trachomatis directly from clinical samples without culture.

    Seth-Smith HM, Harris SR, Skilton RJ, Radebe FM, Golparian D, Shipitsyna E, Duy PT, Scott P, Cutcliffe LT, O'Neill C, Parmar S, Pitt R, Baker S, Ison CA, Marsh P, Jalal H, Lewis DA, Unemo M, Clarke IN, Parkhill J and Thomson NR

    Genome research 2013;23;5;855-66

  • A comparison of dense transposon insertion libraries in the Salmonella serovars Typhi and Typhimurium.

    Barquist L, Langridge GC, Turner DJ, Phan MD, Turner AK, Bateman A, Parkhill J, Wain J and Gardner PP

    Nucleic acids research 2013;41;8;4549-64

  • A genomic portrait of the emergence, evolution, and global spread of a methicillin-resistant Staphylococcus aureus pandemic.

    Holden MT, Hsu LY, Kurt K, Weinert LA, Mather AE, Harris SR, Strommenger B, Layer F, Witte W, de Lencastre H, Skov R, Westh H, Zemlicková H, Coombs G, Kearns AM, Hill RL, Edgeworth J, Gould I, Gant V, Cooke J, Edwards GF, McAdam PR, Templeton KE, McCann A, Zhou Z, Castillo-Ramírez S, Feil EJ, Hudson LO, Enright MC, Balloux F, Aanensen DM, Spratt BG, Fitzgerald JR, Parkhill J, Achtman M, Bentley SD and Nübel U

    Genome research 2013;23;4;653-64

  • Sequencing ancient calcified dental plaque shows changes in oral microbiota with dietary shifts of the Neolithic and Industrial revolutions.

    Adler CJ, Dobney K, Weyrich LS, Kaidonis J, Walker AW, Haak W, Bradshaw CJ, Townsend G, Sołtysiak A, Alt KW, Parkhill J and Cooper A

    Nature genetics 2013;45;4;450-5, 455e1

  • Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data.

    Bryant JM, Schürch AC, van Deutekom H, Harris SR, de Beer JL, de Jager V, Kremer K, van Hijum SA, Siezen RJ, Borgdorff M, Bentley SD, Parkhill J and van Soolingen D

    BMC infectious diseases 2013;13;1;110

  • Whole-genome sequencing for analysis of an outbreak of meticillin-resistant Staphylococcus aureus: a descriptive study.

    Harris SR, Cartwright EJ, Török ME, Holden MT, Brown NM, Ogilvy-Stuart AL, Ellington MJ, Quail MA, Bentley SD, Parkhill J and Peacock SJ

    The Lancet infectious diseases 2013;13;2;130-6

  • Emergence and global spread of epidemic healthcare-associated Clostridium difficile.

    He M, Miyajima F, Roberts P, Ellison L, Pickard DJ, Martin MJ, Connor TR, Harris SR, Fairley D, Bamford KB, D'Arc S, Brazier J, Brown D, Coia JE, Douce G, Gerding D, Kim HJ, Koh TH, Kato H, Senoh M, Louie T, Michell S, Butt E, Peacock SJ, Brown NM, Riley T, Songer G, Wilcox M, Pirmohamed M, Kuijper E, Hawkey P, Wren BW, Dougan G, Parkhill J and Lawley TD

    Nature genetics 2013;45;1;109-13

  • Intracontinental spread of human invasive Salmonella Typhimurium pathovariants in sub-Saharan Africa.

    Okoro CK, Kingsley RA, Connor TR, Harris SR, Parry CM, Al-Mashhadani MN, Kariuki S, Msefula CL, Gordon MA, de Pinna E, Wain J, Heyderman RS, Obaro S, Alonso PL, Mandomando I, MacLennan CA, Tapia MD, Levine MM, Tennant SM, Parkhill J and Dougan G

    Nature genetics 2012;44;11;1215-21

  • Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data.

    Carver T, Harris SR, Berriman M, Parkhill J and McQuillan JA

    Bioinformatics (Oxford, England) 2012;28;4;464-9

  • Targeted restoration of the intestinal microbiota with a simple, defined bacteriotherapy resolves relapsing Clostridium difficile disease in mice.

    Lawley TD, Clare S, Walker AW, Stares MD, Connor TR, Raisen C, Goulding D, Rad R, Schreiber F, Brandt C, Deakin LJ, Pickard DJ, Duncan SH, Flint HJ, Clark TG, Parkhill J and Dougan G

    PLoS pathogens 2012;8;10;e1002995

  • Evidence for several waves of global transmission in the seventh cholera pandemic.

    Mutreja A, Kim DW, Thomson NR, Connor TR, Lee JH, Kariuki S, Croucher NJ, Choi SY, Harris SR, Lebens M, Niyogi SK, Kim EJ, Ramamurthy T, Chun J, Wood JL, Clemens JD, Czerkinsky C, Nair GB, Holmgren J, Parkhill J and Dougan G

    Nature 2011;477;7365;462-5

  • Meticillin-resistant Staphylococcus aureus with a novel mecA homologue in human and bovine populations in the UK and Denmark: a descriptive study.

    García-Álvarez L, Holden MT, Lindsay H, Webb CR, Brown DF, Curran MD, Walpole E, Brooks K, Pickard DJ, Teale C, Parkhill J, Bentley SD, Edwards GF, Girvan EK, Kearns AM, Pichon B, Hill RL, Larsen AR, Skov RL, Peacock SJ, Maskell DJ and Holmes MA

    The Lancet infectious diseases 2011;11;8;595-603

  • Salmonella bongori provides insights into the evolution of the Salmonellae.

    Fookes M, Schroeder GN, Langridge GC, Blondel CJ, Mammina C, Connor TR, Seth-Smith H, Vernikos GS, Robinson KS, Sanders M, Petty NK, Kingsley RA, Bäumler AJ, Nuccio SP, Contreras I, Santiviago CA, Maskell D, Barrow P, Humphrey T, Nastasi A, Roberts M, Frankel G, Parkhill J, Dougan G and Thomson NR

    PLoS pathogens 2011;7;8;e1002191

  • The impact of recombination on dN/dS within recently emerged bacterial clones.

    Castillo-Ramírez S, Harris SR, Holden MT, He M, Parkhill J, Bentley SD and Feil EJ

    PLoS pathogens 2011;7;7;e1002129

  • Partitioning core and satellite taxa from within cystic fibrosis lung bacterial communities.

    van der Gast CJ, Walker AW, Stressmann FA, Rogers GB, Scott P, Daniels TW, Carroll MP, Parkhill J and Bruce KD

    The ISME journal 2011;5;5;780-91

  • Rapid pneumococcal evolution in response to clinical interventions.

    Croucher NJ, Harris SR, Fraser C, Quail MA, Burton J, van der Linden M, McGee L, von Gottberg A, Song JH, Ko KS, Pichon B, Baker S, Parry CM, Lambertsen LM, Shahinas D, Pillai DR, Mitchell TJ, Dougan G, Tomasz A, Klugman KP, Parkhill J, Hanage WP and Bentley SD

    Science (New York, N.Y.) 2011;331;6016;430-4

  • Evolutionary dynamics of Clostridium difficile over short and long time scales.

    He M, Sebaihia M, Lawley TD, Stabler RA, Dawson LF, Martin MJ, Holt KE, Seth-Smith HM, Quail MA, Rance R, Brooks K, Churcher C, Harris D, Bentley SD, Burrows C, Clark L, Corton C, Murray V, Rose G, Thurston S, van Tonder A, Walker D, Wren BW, Dougan G and Parkhill J

    Proceedings of the National Academy of Sciences of the United States of America 2010;107;16;7527-32

  • Evolution of MRSA during hospital transmission and intercontinental spread.

    Harris SR, Feil EJ, Holden MT, Quail MA, Nickerson EK, Chantratita N, Gardete S, Tavares A, Day N, Lindsay JA, Edgeworth JD, de Lencastre H, Parkhill J, Peacock SJ and Bentley SD

    Science (New York, N.Y.) 2010;327;5964;469-74

Team

Team members

Simon Harris
Staff Scientist

I completed both my undergraduate joint-honours degree in Biology and Geology and my PhD at the University of Bristol. My PhD focused on unravelling the origins and evolutionary history of turtles, tortoises and terrapins using morphological and fossil evidence. This involved using and developing phylogenetic and compatibility approaches for investigating structure in complex morphological data. My first post-doc was at Newcastle University under Martin Embley, where I was part of a group that employed complex phylogenetic models to try to answer deep evolutionary questions, such as the origins of eukaryotes and the mitochondrial endosymbiont.

Research

My current research involves analysing data from second- and third-generation sequencing technologies to study the evolution and population dynamics of bacterial pathogens. Over the last five years we have shown that genomic data have the potential not only to reconstruct the historical spread of some of the most important causes of transmissible disease, but also to play an important role in informing clinical practice in real time.

References

  • Read and assembly metrics inconsequential for clinical utility of whole-genome sequencing in mapping outbreaks.

    Harris SR, Török ME, Cartwright EJ, Quail MA, Peacock SJ and Parkhill J

    Nature biotechnology 2013;31;7;592-4

  • Whole-genome sequencing to identify transmission of Mycobacterium abscessus between patients with cystic fibrosis: a retrospective cohort study.

    Bryant JM, Grogono DM, Greaves D, Foweraker J, Roddick I, Inns T, Reacher M, Haworth CS, Curran MD, Harris SR, Peacock SJ, Parkhill J and Floto RA

    Wellcome Trust Sanger Institute, Hinxton, UK.

    Background: Increasing numbers of individuals with cystic fibrosis are becoming infected with the multidrug-resistant non-tuberculous mycobacterium (NTM) Mycobacterium abscessus, which causes progressive lung damage and is extremely challenging to treat. How this organism is acquired is not currently known, but there is growing concern that person-to-person transmission could occur. We aimed to define the mechanisms of acquisition of M abscessus in individuals with cystic fibrosis.

    Method: Whole genome sequencing and antimicrobial susceptibility testing were done on 168 consecutive isolates of M abscessus from 31 patients attending an adult cystic fibrosis centre in the UK between 2007 and 2011. In parallel, we undertook detailed environmental testing for NTM and defined potential opportunities for transmission between patients both in and out of hospital using epidemiological data and social network analysis.

    Findings: Phylogenetic analysis revealed two clustered outbreaks of near-identical isolates of the M abscessus subspecies massiliense (from 11 patients), differing by less than ten base pairs. This variation represents less diversity than that seen within isolates from a single individual, strongly indicating between-patient transmission. All patients within these clusters had numerous opportunities for within-hospital transmission from other individuals, while comprehensive environmental sampling, initiated during the outbreak, failed to detect any potential point source of NTM infection. The clusters of M abscessus subspecies massiliense showed evidence of transmission of mutations acquired during infection of an individual to other patients. Thus, isolates with constitutive resistance to amikacin and clarithromycin were isolated from several individuals never previously exposed to long-term macrolides or aminoglycosides, further indicating cross-infection.

    Interpretation: Whole genome sequencing has revealed frequent transmission of multidrug resistant NTM between patients with cystic fibrosis despite conventional cross-infection measures. Although the exact transmission route is yet to be established, our epidemiological analysis suggests that it could be indirect.

    Funding: The Wellcome Trust, Papworth Hospital, NIHR Cambridge Biomedical Research Centre, UK Health Protection Agency, Medical Research Council, and the UKCRC Translational Infection Research Initiative.

    Funded by: Medical Research Council; Wellcome Trust: 084953, 098051

    Lancet 2013;381;9877;1551-60

  • Whole-genome sequences of Chlamydia trachomatis directly from clinical samples without culture.

    Seth-Smith HM, Harris SR, Skilton RJ, Radebe FM, Golparian D, Shipitsyna E, Duy PT, Scott P, Cutcliffe LT, O'Neill C, Parmar S, Pitt R, Baker S, Ison CA, Marsh P, Jalal H, Lewis DA, Unemo M, Clarke IN, Parkhill J and Thomson NR

    Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, United Kingdom. hss@sanger.ac.uk

    The use of whole-genome sequencing as a tool for the study of infectious bacteria is of growing clinical interest. Chlamydia trachomatis is responsible for sexually transmitted infections and the blinding disease trachoma, which affect hundreds of millions of people worldwide. Recombination is widespread within the genome of C. trachomatis, thus whole-genome sequencing is necessary to understand the evolution, diversity, and epidemiology of this pathogen. Culture of C. trachomatis has, until now, been a prerequisite to obtain DNA for whole-genome sequencing; however, as C. trachomatis is an obligate intracellular pathogen, this procedure is technically demanding and time consuming. Discarded clinical samples represent a large resource for sequencing the genomes of pathogens, yet clinical swabs frequently contain very low levels of C. trachomatis DNA and large amounts of contaminating microbial and human DNA. To determine whether it is possible to obtain whole-genome sequences from bacteria without the need for culture, we have devised an approach that combines immunomagnetic separation (IMS) for targeted bacterial enrichment with multiple displacement amplification (MDA) for whole-genome amplification. Using IMS-MDA in conjunction with high-throughput multiplexed Illumina sequencing, we have produced the first whole bacterial genome sequences direct from clinical samples. We also show that this method can be used to generate genome data from nonviable archived samples. This method will prove a useful tool in answering questions relating to the biology of many difficult-to-culture or fastidious bacteria of clinical concern.

    Funded by: Wellcome Trust: 098051

    Genome research 2013;23;5;855-66

  • Whole-genome sequencing for analysis of an outbreak of meticillin-resistant Staphylococcus aureus: a descriptive study.

    Harris SR, Cartwright EJ, Török ME, Holden MT, Brown NM, Ogilvy-Stuart AL, Ellington MJ, Quail MA, Bentley SD, Parkhill J and Peacock SJ

    Wellcome Trust Sanger Institute, Cambridge, UK.

    Background: The emergence of meticillin-resistant Staphylococcus aureus (MRSA) that can persist in the community and replace existing hospital-adapted lineages of MRSA means that it is necessary to understand transmission dynamics in terms of hospitals and the community as one entity. We assessed the use of whole-genome sequencing to enhance detection of MRSA transmission between these settings.

    Methods: We studied a putative MRSA outbreak on a special care baby unit (SCBU) at a National Health Service Foundation Trust in Cambridge, UK. We used whole-genome sequencing to validate and expand findings from an infection-control team who assessed the outbreak through conventional analysis of epidemiological data and antibiogram profiles. We sequenced isolates from all colonised patients in the SCBU, and sequenced MRSA isolates from patients in the hospital or community with the same antibiotic susceptibility profile as the outbreak strain.

    Findings: The hospital infection-control team identified 12 infants colonised with MRSA in a 6 month period in 2011, who were suspected of being linked, but a persistent outbreak could not be confirmed with conventional methods. With whole-genome sequencing, we identified 26 related cases of MRSA carriage, and showed transmission occurred within the SCBU, between mothers on a postnatal ward, and in the community. The outbreak MRSA type was a new sequence type (ST) 2371, which is closely related to ST22, but contains genes encoding Panton-Valentine leucocidin. Whole-genome sequencing data were used to propose and confirm that MRSA carriage by a staff member had allowed the outbreak to persist during periods without known infection on the SCBU and after a deep clean.

    Interpretation: Whole-genome sequencing holds great promise for rapid, accurate, and comprehensive identification of bacterial transmission pathways in hospital and community settings, with concomitant reductions in infections, morbidity, and costs.

    Funding: UK Clinical Research Collaboration Translational Infection Research Initiative, Wellcome Trust, Health Protection Agency, and the National Institute for Health Research Cambridge Biomedical Research Centre.

    Funded by: Biotechnology and Biological Sciences Research Council; Chief Scientist Office; Department of Health; Medical Research Council: G1000803; Wellcome Trust: 098051

    The Lancet infectious diseases 2013;13;2;130-6

  • Rapid whole-genome sequencing for investigation of a neonatal MRSA outbreak.

    Köser CU, Holden MT, Ellington MJ, Cartwright EJ, Brown NM, Ogilvy-Stuart AL, Hsu LY, Chewapreecha C, Croucher NJ, Harris SR, Sanders M, Enright MC, Dougan G, Bentley SD, Parkhill J, Fraser LJ, Betley JR, Schulz-Trieglaff OB, Smith GP and Peacock SJ

    University of Cambridge, Cambridge, United Kingdom.

    Background: Isolates of methicillin-resistant Staphylococcus aureus (MRSA) belonging to a single lineage are often indistinguishable by means of current typing techniques. Whole-genome sequencing may provide improved resolution to define transmission pathways and characterize outbreaks.

    Methods: We investigated a putative MRSA outbreak in a neonatal intensive care unit. By using rapid high-throughput sequencing technology with a clinically relevant turnaround time, we retrospectively sequenced the DNA from seven isolates associated with the outbreak and another seven MRSA isolates associated with carriage of MRSA or bacteremia in the same hospital.

    Results: We constructed a phylogenetic tree by comparing single-nucleotide polymorphisms (SNPs) in the core genome to a reference genome (an epidemic MRSA clone, EMRSA-15 [sequence type 22]). This revealed a distinct cluster of outbreak isolates and clear separation between these and the nonoutbreak isolates. A previously missed transmission event was detected between two patients with bacteremia who were not part of the outbreak. We created an artificial "resistome" of antibiotic-resistance genes and demonstrated concordance between it and the results of phenotypic susceptibility testing; we also created a "toxome" consisting of toxin genes. One outbreak isolate had a hypermutator phenotype with a higher number of SNPs than the other outbreak isolates, highlighting the difficulty of imposing a simple threshold for the number of SNPs between isolates to decide whether they are part of a recent transmission chain.

    Conclusions: Whole-genome sequencing can provide clinically relevant data within a time frame that can influence patient care. The need for automated data interpretation and the provision of clinically meaningful reports represent hurdles to clinical implementation. (Funded by the U.K. Clinical Research Collaboration Translational Infection Research Initiative and others.).

    Funded by: Biotechnology and Biological Sciences Research Council; Chief Scientist Office; Department of Health; Medical Research Council: G1000803; Wellcome Trust

    The New England journal of medicine 2012;366;24;2267-75

  • Whole-genome analysis of diverse Chlamydia trachomatis strains identifies phylogenetic relationships masked by current clinical typing.

    Harris SR, Clarke IN, Seth-Smith HM, Solomon AW, Cutcliffe LT, Marsh P, Skilton RJ, Holland MJ, Mabey D, Peeling RW, Lewis DA, Spratt BG, Unemo M, Persson K, Bjartling C, Brunham R, de Vries HJ, Morré SA, Speksnijder A, Bébéar CM, Clerc M, de Barbeyrac B, Parkhill J and Thomson NR

    Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK. sh16@sanger.ac.uk

    Chlamydia trachomatis is responsible for both trachoma and sexually transmitted infections, causing substantial morbidity and economic cost globally. Despite this, our knowledge of its population and evolutionary genetics is limited. Here we present a detailed phylogeny based on whole-genome sequencing of representative strains of C. trachomatis from both trachoma and lymphogranuloma venereum (LGV) biovars from temporally and geographically diverse sources. Our analysis shows that predicting phylogenetic structure using ompA, which is traditionally used to classify Chlamydia, is misleading because extensive recombination in this region masks any true relationships present. We show that in many instances, ompA is a chimera that can be exchanged in part or as a whole both within and between biovars. We also provide evidence for exchange of, and recombination within, the cryptic plasmid, which is another key diagnostic target. We used our phylogenetic framework to show how genetic exchange has manifested itself in ocular, urogenital and LGV C. trachomatis strains, including the epidemic LGV serotype L2b.

    Funded by: Wellcome Trust: 080348, 098051

    Nature genetics 2012;44;4;413-9, S1

  • Optimal enzymes for amplifying sequencing libraries.

    Quail MA, Otto TD, Gu Y, Harris SR, Skelly TF, McQuillan JA, Swerdlow HP and Oyola SO

    Nature methods 2012;9;1;10-1

  • Evidence for several waves of global transmission in the seventh cholera pandemic.

    Mutreja A, Kim DW, Thomson NR, Connor TR, Lee JH, Kariuki S, Croucher NJ, Choi SY, Harris SR, Lebens M, Niyogi SK, Kim EJ, Ramamurthy T, Chun J, Wood JL, Clemens JD, Czerkinsky C, Nair GB, Holmgren J, Parkhill J and Dougan G

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Vibrio cholerae is a globally important pathogen that is endemic in many areas of the world and causes 3-5 million reported cases of cholera every year. Historically, there have been seven acknowledged cholera pandemics; recent outbreaks in Zimbabwe and Haiti are included in the seventh and ongoing pandemic. Only isolates in serogroup O1 (consisting of two biotypes known as 'classical' and 'El Tor') and the derivative O139 can cause epidemic cholera. It is believed that the first six cholera pandemics were caused by the classical biotype, but El Tor has subsequently spread globally and replaced the classical biotype in the current pandemic. Detailed molecular epidemiological mapping of cholera has been compromised by a reliance on sub-genomic regions such as mobile elements to infer relationships, making El Tor isolates associated with the seventh pandemic seem superficially diverse. To understand the underlying phylogeny of the lineage responsible for the current pandemic, we identified high-resolution markers (single nucleotide polymorphisms; SNPs) in 154 whole-genome sequences of globally and temporally representative V. cholerae isolates. Using this phylogeny, we show here that the seventh pandemic has spread from the Bay of Bengal in at least three independent but overlapping waves with a common ancestor in the 1950s, and identify several transcontinental transmission events. Additionally, we show how the acquisition of the SXT family of antibiotic resistance elements has shaped pandemic spread, and show that this family was first acquired at least ten years before its discovery in V. cholerae.

    Funded by: Wellcome Trust: 076962, 076964

    Nature 2011;477;7365;462-5

  • Rapid pneumococcal evolution in response to clinical interventions.

    Croucher NJ, Harris SR, Fraser C, Quail MA, Burton J, van der Linden M, McGee L, von Gottberg A, Song JH, Ko KS, Pichon B, Baker S, Parry CM, Lambertsen LM, Shahinas D, Pillai DR, Mitchell TJ, Dougan G, Tomasz A, Klugman KP, Parkhill J, Hanage WP and Bentley SD

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Epidemiological studies of the naturally transformable bacterial pathogen Streptococcus pneumoniae have previously been confounded by high rates of recombination. Sequencing 240 isolates of the PMEN1 (Spain(23F)-1) multidrug-resistant lineage enabled base substitutions to be distinguished from polymorphisms arising through horizontal sequence transfer. More than 700 recombinations were detected, with genes encoding major antigens frequently affected. Among these were 10 capsule-switching events, one of which accompanied a population shift as vaccine-escape serotype 19A isolates emerged in the USA after the introduction of the conjugate polysaccharide vaccine. The evolution of resistance to fluoroquinolones, rifampicin, and macrolides was observed to occur on multiple occasions. This study details how genomic plasticity within lineages of recombinogenic bacteria can permit adaptation to clinical interventions over remarkably short time scales.

    Funded by: Wellcome Trust: 076962, 076964

    Science (New York, N.Y.) 2011;331;6016;430-4

  • Evolution of MRSA during hospital transmission and intercontinental spread.

    Harris SR, Feil EJ, Holden MT, Quail MA, Nickerson EK, Chantratita N, Gardete S, Tavares A, Day N, Lindsay JA, Edgeworth JD, de Lencastre H, Parkhill J, Peacock SJ and Bentley SD

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Current methods for differentiating isolates of predominant lineages of pathogenic bacteria often do not provide sufficient resolution to define precise relationships. Here, we describe a high-throughput genomics approach that provides a high-resolution view of the epidemiology and microevolution of a dominant strain of methicillin-resistant Staphylococcus aureus (MRSA). This approach reveals the global geographic structure within the lineage, its intercontinental transmission through four decades, and the potential to trace person-to-person transmission within a hospital environment. The ability to interrogate and resolve bacterial populations is applicable to a range of infectious diseases, as well as microbial ecology.

    Funded by: Department of Health; Wellcome Trust: 076964

    Science (New York, N.Y.) 2010;327;5964;469-74

Background

In 1995 the Wellcome Trust decided to set up a Pathogen sequencing unit (PSU) at what was then the Sanger Centre, to sequence the genomes of organisms relevant to human and animal health. It was initially funded through individual grants and later through the Wellcome Trust Beowulf Genomics Panel. The Beowulf panel no longer operates, and the PSU has been renamed Pathogen genomics. The team is now funded through the Sanger Institute's Wellcome Trust envelope funding.

* quick link - http://q.sanger.ac.uk/pathgen