Sanger Institute - Publications 2016

Number of papers published in 2016: 657

  • Whole-Genome Sequencing for Routine Pathogen Surveillance in Public Health: a Population Snapshot of Invasive Staphylococcus aureus in Europe.

    Aanensen DM, Feil EJ, Holden MT, Dordel J, Yeats CA, Fedosejev A, Goater R, Castillo-Ramírez S, Corander J, Colijn C, Chlebowicz MA, Schouls L, Heck M, Pluister G, Ruimy R, Kahlmeter G, Åhman J, Matuschek E, Friedrich AW, Parkhill J, Bentley SD, Spratt BG, Grundmann H and European SRL Working Group

    Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, United Kingdom; The Centre for Genomic Pathogen Surveillance, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom.

    The implementation of routine whole-genome sequencing (WGS) promises to transform our ability to monitor the emergence and spread of bacterial pathogens. Here we combined WGS data from 308 invasive Staphylococcus aureus isolates corresponding to a pan-European population snapshot, with epidemiological and resistance data. Geospatial visualization of the data is made possible by a generic software tool designed for public health purposes that is available at the project URL. Our analysis demonstrates that high-risk clones can be identified on the basis of population level properties such as clonal relatedness, abundance, and spatial structuring and by inferring virulence and resistance properties on the basis of gene content. We also show that in silico predictions of antibiotic resistance profiles are at least as reliable as phenotypic testing. We argue that this work provides a comprehensive road map illustrating the three vital components for future molecular epidemiological surveillance: (i) large-scale structured surveys, (ii) WGS, and (iii) community-oriented database infrastructure and analysis tools.

    Importance: The spread of antibiotic-resistant bacteria is a public health emergency of global concern, threatening medical intervention at every level of health care delivery. Several recent studies have demonstrated the promise of routine whole-genome sequencing (WGS) of bacterial pathogens for epidemiological surveillance, outbreak detection, and infection control. However, as this technology becomes more widely adopted, the key challenges of generating representative national and international data sets and the development of bioinformatic tools to manage and interpret the data become increasingly pertinent. This study provides a road map for the integration of WGS data into routine pathogen surveillance. We emphasize the importance of large-scale routine surveys to provide the population context for more targeted or localized investigation and the development of open-access bioinformatic tools to provide the means to combine and compare independently generated data with publicly available data sets.

    Funded by: Medical Research Council: G1000803; Wellcome Trust: 089472, 098051, 099202

    mBio 2016;7;3

  • Genomic prediction of coronary heart disease.

    Abraham G, Havulinna AS, Bhalala OG, Byars SG, De Livera AM, Yetukuri L, Tikkanen E, Perola M, Schunkert H, Sijbrands EJ, Palotie A, Samani NJ, Salomaa V, Ripatti S and Inouye M

    Centre for Systems Genomics, School of BioSciences, The University of Melbourne, Parkville, Victoria 3010, Australia.

    Aims: Genetics plays an important role in coronary heart disease (CHD) but the clinical utility of genomic risk scores (GRSs) relative to clinical risk scores, such as the Framingham Risk Score (FRS), is unclear. Our aim was to construct and externally validate a CHD GRS, in terms of lifetime CHD risk and relative to traditional clinical risk scores.

    Methods and results: We generated a GRS of 49 310 SNPs based on a CARDIoGRAMplusC4D Consortium meta-analysis of CHD, then independently tested it using five prospective population cohorts (three FINRISK cohorts, combined n = 12 676, 757 incident CHD events; two Framingham Heart Study cohorts (FHS), combined n = 3406, 587 incident CHD events). The GRS was associated with incident CHD (FINRISK HR = 1.74, 95% confidence interval (CI) 1.61-1.86 per S.D. of GRS; Framingham HR = 1.28, 95% CI 1.18-1.38), and was largely unchanged by adjustment for known risk factors, including family history. Integration of the GRS with the FRS or ACC/AHA13 scores improved the 10-year risk prediction (meta-analysis C-index: +1.5-1.6%, P < 0.001), particularly for individuals ≥60 years old (meta-analysis C-index: +4.6-5.1%, P < 0.001). Importantly, the GRS captured substantially different trajectories of absolute risk, with men in the top 20% of the GRS distribution attaining 10% cumulative CHD risk 12-18 y earlier than those in the bottom 20%. High genomic risk was partially compensated for by low systolic blood pressure, low cholesterol level, and non-smoking.

    Conclusions: A GRS based on a large number of SNPs improves CHD risk prediction and encodes different trajectories of lifetime risk not captured by traditional clinical risk scores.

    Funded by: NHLBI NIH HHS: R01 HL087676; Wellcome Trust: 076113, 085475

    European heart journal 2016;37;43;3267-3278

  • αv Integrins combine with LC3 and atg5 to regulate Toll-like receptor signalling in B cells.

    Acharya M, Sokolovska A, Tam JM, Conway KL, Stefani C, Raso F, Mukhopadhyay S, Feliu M, Paul E, Savill J, Hynes RO, Xavier RJ, Vyas JM, Stuart LM and Lacy-Hulbert A

    Immunology Program, Benaroya Research Institute, 1201 Ninth Avenue, Seattle, Washington 98101, USA.

    Integrin signalling triggers cytoskeletal rearrangements, including endocytosis and exocytosis of integrins and other membrane proteins. In addition to recycling integrins, this trafficking can also regulate intracellular signalling pathways. Here we describe a role for αv integrins in regulating Toll-like receptor (TLR) signalling by modulating intracellular trafficking. We show that deletion of αv or β3 causes increased B-cell responses to TLR stimulation in vitro, and αv-conditional knockout mice have elevated antibody responses to TLR-ligand-associated antigens. αv regulates TLR signalling by promoting recruitment of the autophagy component LC3 (microtubule-associated proteins 1 light chain 3) to TLR-containing endosomes, which is essential for progression from NF-κB to IRF signalling, and ultimately for traffic to lysosomes where signalling is terminated. Disruption of LC3 recruitment leads to prolonged NF-κB signalling and increased B-cell proliferation and antibody production. This work identifies a previously unrecognized role for αv and the autophagy components LC3 and atg5 in regulating TLR signalling and B-cell immunity.

    Funded by: NIDDK NIH HHS: R01 DK093695

    Nature communications 2016;7;10917

  • G9a inhibition potentiates the anti-tumour activity of DNA double-strand break inducing agents by impairing DNA repair independent of p53 status.

    Agarwal P and Jackson SP

    The Wellcome Trust/Cancer Research UK Gurdon Institute and Department of Biochemistry, University of Cambridge, Cambridge CB2 1QN, UK.

    Cancer cells often exhibit altered epigenetic signatures that can misregulate genes involved in processes such as transcription, proliferation, apoptosis and DNA repair. As regulation of chromatin structure is crucial for DNA repair processes, and both DNA repair and epigenetic controls are deregulated in many cancers, we speculated that simultaneously targeting both might provide new opportunities for cancer therapy. Here, we describe a focused screen that profiled small-molecule inhibitors targeting epigenetic regulators in combination with DNA double-strand break (DSB) inducing agents. We identify UNC0638, a catalytic inhibitor of histone lysine N-methyl-transferase G9a, as hypersensitising tumour cells to low doses of DSB-inducing agents without affecting the growth of the non-tumorigenic cells tested. Similar effects are also observed with another, structurally distinct, G9a inhibitor A-366. We also show that small-molecule inhibition of G9a or siRNA-mediated G9a depletion induces tumour cell death under low DNA damage conditions by impairing DSB repair in a p53 independent manner. Furthermore, we establish that G9a promotes DNA non-homologous end-joining in response to DSB-inducing genotoxic stress. This study thus highlights the potential for using G9a inhibitors as anti-cancer therapeutic agents in combination with DSB-inducing chemotherapeutic drugs such as etoposide.

    Cancer letters 2016;380;2;467-475

  • Human Rhinovirus B and C Genomes from Rural Coastal Kenya.

    Agoti CN, Kiyuka PK, Kamau E, Munywoki PK, Bett A, van der Hoek L, Kellam P, Nokes DJ and Cotten M

    Epidemiology and Demography Department, KEMRI-Wellcome Trust Research Programme, Kilifi, Kenya; School of Health and Human Sciences, Pwani University, Kilifi, Kenya.

    Primer-independent agnostic deep sequencing was used to generate three human rhinovirus (HRV) B genomes and one HRV C genome from samples collected in a household respiratory survey in rural coastal Kenya. The study provides the first rhinovirus genomes from Kenya and will help improve the sensitivity of local molecular diagnostics.

    Genome announcements 2016;4;4

  • Sleeping Beauty screen reveals Pparg activation in metastatic prostate cancer.

    Ahmad I, Mui E, Galbraith L, Patel R, Tan EH, Salji M, Rust AG, Repiscak P, Hedley A, Markert E, Loveridge C, van der Weyden L, Edwards J, Sansom OJ, Adams DJ and Leung HY

    Cancer Research UK Beatson Institute, Bearsden, Glasgow G61 1BD, United Kingdom; Institute of Cancer Sciences, University of Glasgow, Glasgow G61 1QH, United Kingdom.

    Prostate cancer (CaP) is the most common adult male cancer in the developed world. The paucity of biomarkers to predict prostate tumor biology makes it important to identify key pathways that confer poor prognosis and guide potential targeted therapy. Using a murine forward mutagenesis screen in a Pten-null background, we identified peroxisome proliferator-activated receptor gamma (Pparg), encoding a ligand-activated transcription factor, as a promoter of metastatic CaP through activation of lipid signaling pathways, including up-regulation of lipid synthesis enzymes [fatty acid synthase (FASN), acetyl-CoA carboxylase (ACC), ATP citrate lyase (ACLY)]. Importantly, inhibition of PPARG suppressed tumor growth in vivo, with down-regulation of the lipid synthesis program. We show that elevated levels of PPARG strongly correlate with elevation of FASN in human CaP and that high levels of PPARG/FASN and PI3K/pAKT pathway activation confer a poor prognosis. These data suggest that CaP patients could be stratified in terms of PPARG/FASN and PTEN levels to identify patients with aggressive CaP who may respond favorably to PPARG/FASN inhibition.

    Funded by: Cancer Research UK: 13031; Medical Research Council: MR/L017997/1

    Proceedings of the National Academy of Sciences of the United States of America 2016;113;29;8290-5

  • Established BMI-associated genetic variants and their prospective associations with BMI and other cardiometabolic traits: the GLACIER Study.

    Ahmad S, Poveda A, Shungin D, Barroso I, Hallmans G, Renström F and Franks PW

    Department of Clinical Sciences, Genetic and Molecular Epidemiology Unit, Lund University Diabetes Center, Lund University, Malmö, Sweden.

    Background: Recent cross-sectional genome-wide scans have reported associations of 97 independent loci with body mass index (BMI). In 3541 middle-aged adult participants from the GLACIER Study, we tested whether these loci are associated with 10-year changes in BMI and other cardiometabolic traits (fasting and 2-h glucose, triglycerides, total cholesterol, and systolic and diastolic blood pressures).

    Methods: A BMI-specific genetic risk score (GRS) was calculated by summing the BMI-associated effect alleles at each locus. Trait-specific cardiometabolic GRSs comprised only the loci that show nominal association (P≤0.10) with the respective trait in the original cross-sectional study. In longitudinal genetic association analyses, the second visit trait measure (assessed ~10 years after baseline) was used as the dependent variable and the models were adjusted for the baseline measure of the outcome trait, age, age², fasting time (for glucose and lipid traits), sex, follow-up time and population substructure.

    Results: The BMI-specific GRS was associated with increased BMI at follow-up (β=0.014 kg m⁻² per allele per 10-year follow-up, s.e.=0.006, P=0.019) as were three loci (PARK2 rs13191362, P=0.005; C6orf106 rs205262, P=0.043; and C9orf93 rs4740619, P=0.01). Although not withstanding Bonferroni correction, a handful of single-nucleotide polymorphisms were nominally associated with changes in blood pressure, glucose and lipid levels.

    Conclusions: Collectively, established BMI-associated loci convey modest but statistically significant time-dependent associations with long-term changes in BMI, suggesting a role for effect modification by factors that change with time in this population.

    International journal of obesity (2005) 2016;40;9;1346-52

  • Quantitation of next generation sequencing library preparation protocol efficiencies using droplet digital PCR assays - a systematic comparison of DNA library preparation kits for Illumina sequencing.

    Aigrain L, Gu Y and Quail MA

    Wellcome Trust Sanger Institute, Wellcome Trust Campus, Hinxton, Cambs, CB10 1SA, UK.

    Background: The emergence of next-generation sequencing (NGS) technologies in the past decade has allowed the democratization of DNA sequencing both in terms of price per sequenced base and ease of producing DNA libraries. When it comes to preparing DNA sequencing libraries for Illumina, the current market leader, a plethora of kits are available and it can be difficult for users to determine which kit is the most appropriate and efficient for their applications, the main concerns being not only cost but also minimal bias, yield and time efficiency.

    Results: We compared 9 commercially available library preparation kits in a systematic manner using the same DNA sample by probing the amount of DNA remaining after each protocol step using a new droplet digital PCR (ddPCR) assay. This method allows the precise quantification of fragments bearing either adaptors or P5/P7 sequences on both ends just after ligation or PCR enrichment. We also investigated the potential influence of DNA input and DNA fragment size on the final library preparation efficiency. The overall library preparation efficiencies show important variations between the different kits, with those combining several steps into a single one exhibiting final yields 4 to 7 times higher than the other kits. Detailed ddPCR data also reveal that the adaptor ligation yield itself varies by more than a factor of 10 between kits, certain ligation efficiencies being so low that they could impair the original library complexity and impoverish the sequencing results. When a PCR enrichment step is necessary, lower adaptor-ligated DNA inputs lead to greater amplification yields, hiding the latent disparity between kits.

    Conclusion: We describe a ddPCR assay that allows us to probe the efficiency of the most critical step in library preparation, ligation, and to draw conclusions on which kits are more likely to preserve the sample heterogeneity and reduce the need for amplification.

    Funded by: Wellcome Trust: 098051

    BMC genomics 2016;17;458

  • Ensembl 2017.

    Aken BL, Achuthan P, Akanni W, Amode MR, Bernsdorff F, Bhai J, Billis K, Carvalho-Silva D, Cummins C, Clapham P, Gil L, Girón CG, Gordon L, Hourlier T, Hunt SE, Janacek SH, Juettemann T, Keenan S, Laird MR, Lavidas I, Maurel T, McLaren W, Moore B, Murphy DN, Nag R, Newman V, Nuhn M, Ong CK, Parker A, Patricio M, Riat HS, Sheppard D, Sparrow H, Taylor K, Thormann A, Vullo A, Walts B, Wilder SP, Zadissa A, Kostadima M, Martin FJ, Muffato M, Perry E, Ruffier M, Staines DM, Trevanion SJ, Cunningham F, Yates A, Zerbino DR and Flicek P

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

    Ensembl is a database and genome browser for enabling research on vertebrate genomes. We import, analyse, curate and integrate a diverse collection of large-scale reference data to create a more comprehensive view of genome biology than would be possible from any individual dataset. Our extensive data resources include evidence-based gene and regulatory region annotation, genome variation and gene trees. An accompanying suite of tools, infrastructure and programmatic access methods ensures uniform data analysis and distribution for all supported species. Together, these provide a comprehensive solution for large-scale and targeted genomics applications alike. Among many other developments over the past year, we have improved our resources for gene regulation and comparative genomics, and added CRISPR/Cas9 target sites. We released new browser functionality and tools, including improved filtering and prioritization of genome variation, Manhattan plot visualization for linkage disequilibrium and eQTL data, and an ontology search for phenotypes, traits and disease. We have also enhanced data discovery and access with a track hub registry and a selection of new REST end points. All Ensembl data are freely released to the scientific community and our source code is available via the open source Apache 2.0 license.

    Nucleic acids research 2016

  • The Ensembl gene annotation system.

    Aken BL, Ayling S, Barrell D, Clarke L, Curwen V, Fairley S, Fernandez Banet J, Billis K, García Girón C, Hourlier T, Howe K, Kähäri A, Kokocinski F, Martin FJ, Murphy DN, Nag R, Ruffier M, Schuster M, Tang YA, Vogel JH, White S, Zadissa A, Flicek P and Searle SM

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    The Ensembl gene annotation system has been used to annotate over 70 different vertebrate species across a wide range of genome projects. Furthermore, it generates the automatic alignment-based annotation for the human and mouse GENCODE gene sets. The system is based on the alignment of biological sequences, including cDNAs, proteins and RNA-seq reads, to the target genome in order to construct candidate transcript models. Careful assessment and filtering of these candidate transcripts ultimately leads to the final gene set, which is made available on the Ensembl website. Here, we describe the annotation process in detail.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/E011640/1, BB/I025360/1, BB/I025360/2, BB/I025506/1, BB/K009524/1, BB/M011461/1, BB/M011615/1, BB/M018458/1, BBS/B/13446, BBS/B/13470; NHGRI NIH HHS: U41 HG007234, U54 HG004555; NICHD NIH HHS: R01 HD074078; Wellcome Trust: WT095908, WT098051

    Database : the journal of biological databases and curation 2016;2016

  • FHF1 (FGF12) epileptic encephalopathy.

    Al-Mehmadi S, Splitt M, For DDD Study group, Ramesh V, DeBrosse S, Dessoffy K, Xia F, Yang Y, Rosenfeld JA, Cossette P, Michaud JL, Hamdan FF, Campeau PM, Minassian BA and For CENet Study group

    Program in Genetics and Genome Biology and Division of Neurology (S.A.-M., B.A.M.), Department of Paediatrics, The Hospital for Sick Children, and University of Toronto, Ontario, Canada; Institute of Genetic Medicine (M.S.), International Centre for Life, Pediatric Neurology (V.R.), Newcastle General Hospital, UK; Center for Human Genetics (S.D., K.D.), UH Case Medical Center, Cleveland, OH; Department of Molecular and Human Genetics (F.X., Y.Y., J.A.R.), Baylor College of Medicine, Houston, TX; Baylor Miraca Genetics Laboratories (F.X., Y.Y.), Houston, TX; The Deciphering Developmental Disorders (DDD) Study, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK; Division of Neurology (P.C.), CHUM Notre-Dame, Hospital University of Montreal, Quebec, Canada; Department of Pediatrics (J.L.M., P.M.C.), Department of Neurosciences (J.L.M., P.M.C.), Université de Montréal, Québec, Canada; and CHU Sainte-Justine Research Center (J.L.M., F.A.H., P.M.C.), Montreal, Quebec, Canada.

    Voltage-gated sodium channels (Navs) are mainstays of neuronal function, and mutations in the genes encoding CNS Navs (Nav1.1 [SCN1A], Nav1.2 [SCN2A], Nav1.3 [SCN3A], and Nav1.6 [SCN8A]) are causes of some of the most common and severe genetic epilepsies and epileptic encephalopathies (EE).¹ Fibroblast-growth-factor homologous factors (FHFs) compose a family of 4 proteins that interact with the C-terminal tails of Navs to modulate the channels' fast, and long-term, inactivations.² FHF2 mutation is a rare cause of generalized epilepsy with febrile seizures plus (GEFS+).³ Recently, a de novo FHF1 mutation (p.R52H) was reported in early-onset EE in 2 siblings.⁴ We report 3 patients from unrelated families with the same FHF1 p.R52H mutation. The 5 cases together frame the FHF1 R52H EE from infancy to adulthood. As discussed below, this gain-of-function disease may be amenable to personalized therapy.

    Funded by: NINDS NIH HHS: U54 NS078059

    Neurology. Genetics 2016;2;6;e115

  • Complete Genome Sequence of Neisseria weaveri Strain NCTC13585.

    Alexander S, Fazal MA, Burnett E, Deheer-Graham A, Oliver K, Holroyd N, Parkhill J and Russell JE

    Culture Collections, Public Health England, London, United Kingdom

    Neisseria weaveri is a commensal organism of the canine oral cavity and an occasional opportunistic human pathogen which is associated with dog bite wounds. Here we report the first complete genomic sequence of the N. weaveri NCTC13585 (CCUG30381) strain, which was originally isolated from a patient with a canine bite wound.

    Genome announcements 2016;4;4

  • Complete Genome Sequence of Plesiomonas shigelloides Type Strain NCTC10360.

    Alexander S, Fazal MA, Burnett E, Deheer-Graham A, Oliver K, Holroyd N, Parkhill J and Russell JE

    Culture Collections, Public Health England, London, United Kingdom

    Plesiomonas shigelloides is a Gram-negative rod within the Enterobacteriaceae family. It is a gastrointestinal pathogen of increasing notoriety, often associated with diarrheal disease. P. shigelloides is waterborne, and infection is often linked to the consumption of seafood. Here, we describe the first complete genome for P. shigelloides type strain NCTC10360.

    Genome announcements 2016;4;5

  • Mutational signatures associated with tobacco smoking in human cancer.

    Alexandrov LB, Ju YS, Haase K, Van Loo P, Martincorena I, Nik-Zainal S, Totoki Y, Fujimoto A, Nakagawa H, Shibata T, Campbell PJ, Vineis P, Phillips DH and Stratton MR

    Theoretical Biology and Biophysics (T-6), Los Alamos National Laboratory, Los Alamos, NM 87545, USA.

    Tobacco smoking increases the risk of at least 17 classes of human cancer. We analyzed somatic mutations and DNA methylation in 5243 cancers of types for which tobacco smoking confers an elevated risk. Smoking is associated with increased mutation burdens of multiple distinct mutational signatures, which contribute to different extents in different cancers. One of these signatures, mainly found in cancers derived from tissues directly exposed to tobacco smoke, is attributable to misreplication of DNA damage caused by tobacco carcinogens. Others likely reflect indirect activation of DNA editing by APOBEC cytidine deaminases and of an endogenous clocklike mutational process. Smoking is associated with limited differences in methylation. The results are consistent with the proposition that smoking increases cancer risk by increasing the somatic mutation load, although direct evidence for this mechanism is lacking in some smoking-related cancer types.

    Funded by: Cancer Research UK; Department of Health; Wellcome Trust

    Science (New York, N.Y.) 2016;354;6312;618-622

  • Do Genetic Factors Modify the Relationship Between Obesity and Hypertriglyceridemia? Findings From the GLACIER and the MDC Studies.

    Ali A, Varga TV, Stojkovic IA, Schulz CA, Hallmans G, Barroso I, Poveda A, Renström F, Orho-Melander M and Franks PW

    From the Department of Clinical Sciences, Genetic & Molecular Epidemiology Unit (A.A., T.V.V., A.P., F.R., P.W.F.) and Department of Clinical Sciences, Diabetes & Cardiovascular Disease-Genetic Epidemiology (I.A.S., C.-A.S., M.O.-M.), Lund University, Malmö, Sweden; Department of Systems Medicine, Steno Diabetes Center, Gentofte, Denmark (A.A.); Department of Biobank Research (G.H., F.R.) and Department of Public Health & Clinical Medicine (P.W.F.), Umeå University, Umeå, Sweden; Human Genetics Programme, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton (I.B.); NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science (I.B.) and University of Cambridge, Metabolic Research Laboratories Institute of Metabolic Science (I.B.), Addenbrooke's Hospital, Cambridge, United Kingdom; Department of Genetics, Physical Anthropology & Animal Physiology, University of the Basque Country (UPV/EHU), Bilbao, Spain (A.P.); and Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA (P.W.F.).

    Background: Obesity is a major risk factor for dyslipidemia, but this relationship is highly variable. Recently published data from 2 Danish cohorts suggest that genetic factors may underlie some of this variability.

    Methods and results: We tested whether established triglyceride-associated loci modify the relationship of body mass index (BMI) and triglyceride concentrations in 2 Swedish cohorts (the Gene-Lifestyle Interactions and Complex Traits Involved in Elevated Disease Risk [GLACIER Study; N=4312] and the Malmö Diet and Cancer Study [N=5352]). The genetic loci were amalgamated into a weighted genetic risk score (WGRS_TG) by summing the triglyceride-elevating alleles (weighted by their established marginal effects) for all loci. Both BMI and the WGRS_TG were strongly associated with triglyceride concentrations in GLACIER, with each additional BMI unit (kg/m²) associated with 2.8% (P=8.4×10⁻⁸⁴) higher triglyceride concentration and each additional WGRS_TG unit with 2% (P=7.6×10⁻⁴⁸) higher triglyceride concentration. Each unit of the WGRS_TG was associated with 1.5% higher triglyceride concentrations in normal weight and 2.4% higher concentrations in overweight/obese participants (P_interaction=0.056). Meta-analyses of results from the Swedish cohorts yielded a statistically significant WGRS_TG×BMI interaction effect (P_interaction=6.0×10⁻⁴), which was strengthened by including data from the Danish cohorts (P_interaction=6.5×10⁻⁷). In the meta-analysis of the Swedish cohorts, nominal evidence of a 3-way interaction (WGRS_TG×BMI×sex) was observed (P_interaction=0.03), where the WGRS_TG×BMI interaction was only statistically significant in females. Using protein-protein interaction network analyses, we identified molecular interactions and pathways elucidating the metabolic relationships between BMI and triglyceride-associated loci.

    Conclusions: Our findings provide evidence that body fatness accentuates the effects of genetic susceptibility variants in hypertriglyceridemia, effects that are most evident in females.

    Funded by: Medical Research Council; Wellcome Trust: 098051

    Circulation. Cardiovascular genetics 2016;9;2;162-71

  • Decreased Rate of Plasma Arginine Appearance in Murine Malaria May Explain Hypoargininemia in Children With Cerebral Malaria.

    Alkaitis MS, Wang H, Ikeda AK, Rowley CA, MacCormick IJ, Chertow JH, Billker O, Suffredini AF, Roberts DJ, Taylor TE, Seydel KB and Ackerman HC

    Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville.

    Background: Plasmodium infection depletes arginine, the substrate for nitric oxide synthesis, and impairs endothelium-dependent vasodilation. Increased conversion of arginine to ornithine by parasites or host arginase is a proposed mechanism of arginine depletion.

    Methods: We used high-performance liquid chromatography to measure plasma arginine, ornithine, and citrulline levels in Malawian children with cerebral malaria and in mice infected with Plasmodium berghei ANKA with or without the arginase gene. Heavy isotope-labeled tracers measured by quadrupole time-of-flight liquid chromatography-mass spectrometry were used to quantify the in vivo rate of appearance and interconversion of plasma arginine, ornithine, and citrulline in infected mice.

    Results: Children with cerebral malaria and P. berghei-infected mice demonstrated depletion of plasma arginine, ornithine, and citrulline. Knock out of Plasmodium arginase did not alter arginine depletion in infected mice. Metabolic tracer analysis demonstrated that plasma arginase flux was unchanged by P. berghei infection. Instead, infected mice exhibited decreased rates of plasma arginine, ornithine, and citrulline appearance and decreased conversion of plasma citrulline to arginine. Notably, plasma arginine use by nitric oxide synthase was decreased in infected mice.

    Conclusions: Simultaneous arginine and ornithine depletion in malaria parasite-infected children cannot be fully explained by plasma arginase activity. Our mouse model studies suggest that plasma arginine depletion is driven primarily by a decreased rate of appearance.

    Funded by: Department of Health: RP-PG-0310-1004; NIGMS NIH HHS: T32 GM007753; Wellcome Trust

    The Journal of infectious diseases 2016;214;12;1840-1849

  • Ebola virus disease cluster — Northern Sierra Leone, January 2016

    Alpren C, Sloan M, Boegler KA, Martin DW, Ervin E, Washburn F, Rickert R, Singh T, Redd JT, Bangalie A, Bass M, Bennett SD, Boateng IA, Campbell D, Cassell C, Cotton M, Duffy N, Goodfellow I, Hersey S, Jackson EL, Jah U, Jimissa AS, Kamara AS, Kamara F, Kellam P, Levine R, Meredith L, Miller LA, Moody-Geissler S, Musoke R, Naidoo D, Ndyahikayo J, Njie G, Phan M, Rambaut A and Sesay F

    Morbidity and Mortality Weekly Report 2016;65;26;681-3

  • Dihydroartemisinin-piperaquine resistance in Plasmodium falciparum malaria in Cambodia: a multisite prospective cohort study.

    Amaratunga C, Lim P, Suon S, Sreng S, Mao S, Sopha C, Sam B, Dek D, Try V, Amato R, Blessborn D, Song L, Tullo GS, Fay MP, Anderson JM, Tarning J and Fairhurst RM

    Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville, MD, USA.

    Background: Artemisinin resistance in Plasmodium falciparum threatens to reduce the efficacy of artemisinin combination therapies (ACTs), thus compromising global efforts to eliminate malaria. Recent treatment failures with dihydroartemisinin-piperaquine, the current first-line ACT in Cambodia, suggest that piperaquine resistance may be emerging in this country. We explored the relation between artemisinin resistance and dihydroartemisinin-piperaquine failures, and sought to confirm the presence of piperaquine-resistant P falciparum infections in Cambodia.

    Methods: In this prospective cohort study, we enrolled patients aged 2-65 years with uncomplicated P falciparum malaria in three Cambodian provinces: Pursat, Preah Vihear, and Ratanakiri. Participants were given standard 3-day courses of dihydroartemisinin-piperaquine. Peripheral blood parasite densities were measured until parasites cleared and then weekly to 63 days. The primary outcome was recrudescent P falciparum parasitaemia within 63 days. We measured piperaquine plasma concentrations at baseline, 7 days, and day of recrudescence. We assessed phenotypic and genotypic markers of drug resistance in parasite isolates. The study is registered, number NCT01736319.

    Findings: Between Sept 4, 2012, and Dec 31, 2013, we enrolled 241 participants. In Pursat, where artemisinin resistance is entrenched, 37 (46%) of 81 patients had parasite recrudescence. In Preah Vihear, where artemisinin resistance is emerging, ten (16%) of 63 patients had recrudescence and in Ratanakiri, where artemisinin resistance is rare, one (2%) of 60 patients did. Patients with recrudescent P falciparum infections were more likely to have detectable piperaquine plasma concentrations at baseline compared with non-recrudescent patients, but did not differ significantly in age, initial parasite density, or piperaquine plasma concentrations at 7 days. Recrudescent parasites had a higher prevalence of kelch13 mutations, higher piperaquine 50% inhibitory concentration (IC50) values, and lower mefloquine IC50 values; none had multiple pfmdr1 copies, a genetic marker of mefloquine resistance.

    Interpretation: Dihydroartemisinin-piperaquine failures are caused by both artemisinin and piperaquine resistance, and commonly occur in places where dihydroartemisinin-piperaquine has been used in the private sector. In Cambodia, artesunate plus mefloquine may be a viable option to treat dihydroartemisinin-piperaquine failures, and a more effective first-line ACT in areas where dihydroartemisinin-piperaquine failures are common. The use of single low-dose primaquine to eliminate circulating gametocytes is needed in areas where artemisinin and ACT resistance is prevalent.

    Funding: National Institute of Allergy and Infectious Diseases.

    Funded by: Intramural NIH HHS: Z01 AI001000-01, Z01 AI001000-02; Wellcome Trust: 089275/Z/09/2

    The Lancet. Infectious diseases 2016;16;3;357-65

  • Genetic markers associated with dihydroartemisinin-piperaquine failure in Plasmodium falciparum malaria in Cambodia: a genotype-phenotype association study.

    Amato R, Lim P, Miotto O, Amaratunga C, Dek D, Pearson RD, Almagro-Garcia J, Neal AT, Sreng S, Suon S, Drury E, Jyothi D, Stalker J, Kwiatkowski DP and Fairhurst RM

    Wellcome Trust Sanger Institute, Hinxton, UK; Centre for Genomics and Global Health, Wellcome Trust Centre for Human Genetics, Oxford, UK.

    Background: As the prevalence of artemisinin-resistant Plasmodium falciparum malaria increases in the Greater Mekong subregion, emerging resistance to partner drugs in artemisinin combination therapies seriously threatens global efforts to treat and eliminate this disease. Molecular markers that predict failure of artemisinin combination therapy are urgently needed to monitor the spread of partner drug resistance, and to recommend alternative treatments in southeast Asia and beyond.

    Methods: We did a genome-wide association study of 297 P falciparum isolates from Cambodia to investigate the relationship of 11 630 exonic single-nucleotide polymorphisms (SNPs) and 43 copy number variations (CNVs) with in-vitro piperaquine 50% inhibitory concentrations (IC50s), and tested whether these genetic variants are markers of treatment failure with dihydroartemisinin-piperaquine. We then did a survival analysis of 133 patients to determine whether candidate molecular markers predicted parasite recrudescence following dihydroartemisinin-piperaquine treatment.

    Findings: Piperaquine IC50s increased significantly from 2011 to 2013 in three Cambodian provinces (2011 vs 2013 median IC50s: 20·0 nmol/L [IQR 13·7-29·0] vs 39·2 nmol/L [32·8-48·1] for Ratanakiri, 19·3 nmol/L [15·1-26·2] vs 66·2 nmol/L [49·9-83·0] for Preah Vihear, and 19·6 nmol/L [11·9-33·9] vs 81·1 nmol/L [61·3-113·1] for Pursat; all p≤10⁻³; Kruskal-Wallis test). Genome-wide analysis of SNPs identified a chromosome 13 region that associates with raised piperaquine IC50s. A non-synonymous SNP (encoding a Glu415Gly substitution) in this region, within a gene encoding an exonuclease, associates with parasite recrudescence following dihydroartemisinin-piperaquine treatment. Genome-wide analysis of CNVs revealed that a single copy of the mdr1 gene on chromosome 5 and a novel amplification of the plasmepsin 2 and plasmepsin 3 genes on chromosome 14 also associate with raised piperaquine IC50s. After adjusting for covariates, both exo-E415G and plasmepsin 2-3 markers significantly associate (p=3·0 × 10⁻⁸ and p=1·7 × 10⁻⁷, respectively) with decreased treatment efficacy (survival rates 0·38 [95% CI 0·25-0·51] and 0·41 [0·28-0·53], respectively).

    Interpretation: The exo-E415G SNP and plasmepsin 2-3 amplification are markers of piperaquine resistance and dihydroartemisinin-piperaquine failures in Cambodia, and can help monitor the spread of these phenotypes into other countries of the Greater Mekong subregion, and elucidate the mechanism of piperaquine resistance. Since plasmepsins are involved in the parasite's haemoglobin-to-haemozoin conversion pathway, targeted by related antimalarials, plasmepsin 2-3 amplification probably mediates piperaquine resistance.

    Funding: Intramural Research Program of the US National Institute of Allergy and Infectious Diseases, National Institutes of Health, Wellcome Trust, Bill & Melinda Gates Foundation, Medical Research Council, and UK Department for International Development.

    Funded by: Medical Research Council: G0600718, MR/M006212/1; Wellcome Trust

    The Lancet. Infectious diseases 2016;17;2;164-173

  • Voices of biotech.

    Amit I, Baker D, Barker R, Berger B, Bertozzi C, Bhatia S, Biffi A, Demichelis F, Doudna J, Dowdy SF, Endy D, Helmstaedter M, Junca H, June C, Kamb S, Khvorova A, Kim DH, Kim JS, Krishnan Y, Lakadamyali M, Lappalainen T, Lewin S, Liao J, Loman N, Lundberg E, Lynd L, Martin C, Mellman I, Miyawaki A, Mummery C, Nelson K, Paz J, Peralta-Yahya P, Picotti P, Polyak K, Prather K, Qin J, Quake S, Regev A, Rogers JA, Shetty R, Sommer M, Stevens M, Stolovitzky G, Takahashi M, Tang F, Teichmann S, Torres-Padilla ME, Tripathi L, Vemula P, Verdine G, Vollmer F, Wang J, Ying JY, Zhang F and Zhang T

    Weizmann Institute of Science, Rehovot, Israel.

    Nature biotechnology 2016;34;3;270-5

  • Chlamydia trachomatis from Australian Aboriginal people with trachoma are polyphyletic composed of multiple distinctive lineages.

    Andersson P, Harris SR, Smith HMBS, Hadfield J, O'Neill C, Cutcliffe LT, Douglas FP, Asche LV, Mathews JD, Hutton SI, Sarovich DS, Tong SYC, Clarke IN, Thomson NR and Giffard PM

    Global and Tropical Health Division, Menzies School of Health Research, Charles Darwin University, Darwin, Casuarina, Northern Territory 0811, Australia.

    Chlamydia trachomatis causes sexually transmitted infections and the blinding disease trachoma. Current data on C. trachomatis phylogeny show that there is only a single trachoma-causing clade, which is distinct from the lineages causing urogenital tract (UGT) and lymphogranuloma venereum diseases. Here we report the whole-genome sequences of ocular C. trachomatis isolates obtained from young children with clinical signs of trachoma in a trachoma endemic region of northern Australia. The isolates form two lineages that fall outside the classical trachoma lineage, instead being placed within UGT clades of the C. trachomatis phylogenetic tree. The Australian trachoma isolates appear to be recombinants with UGT C. trachomatis genome backbones, in which loci that encode immunodominant surface proteins (ompA and pmpEFGH) have been replaced by those characteristic of classical ocular isolates. This suggests that ocular tropism and association with trachoma are functionally associated with some sequence variants of ompA and pmpEFGH.

    Funded by: Wellcome Trust: 098051

    Nature communications 2016;7;10688

  • Notes on the implementation of FAM


    CEUR Workshop Proceedings 2016;1661;46-58

  • Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity.

    Angermueller C, Clark SJ, Lee HJ, Macaulay IC, Teng MJ, Hu TX, Krueger F, Smallwood S, Ponting CP, Voet T, Kelsey G, Stegle O and Reik W

    European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK.

    We report scM&T-seq, a method for parallel single-cell genome-wide methylome and transcriptome sequencing that allows for the discovery of associations between transcriptional and epigenetic variation. Profiling of 61 mouse embryonic stem cells confirmed known links between DNA methylation and transcription. Notably, the method revealed previously unrecognized associations between heterogeneously methylated distal regulatory elements and transcription of key pluripotency genes.

    Funded by: Biotechnology and Biological Sciences Research Council; Medical Research Council: MC_PC_15075, MC_U137761446, MC_UU_12021/1, MR/K011332/1; Wellcome Trust: 095645, 105031, 105031REIK, 105045

    Nature methods 2016;13;3;229-232

  • Deep learning for computational biology.

    Angermueller C, Pärnamaa T, Parts L and Stegle O

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton Cambridge, UK.

    Technological advances in genomics and imaging have led to an explosion of molecular and cellular profiling data from large numbers of samples. This rapid increase in biological data dimension and acquisition rate is challenging conventional analysis strategies. Modern machine learning methods, such as deep learning, promise to leverage very large data sets for finding hidden structure within them, and for making accurate predictions. In this review, we discuss applications of this new breed of analysis approaches in regulatory genomics and cellular imaging. We provide background of what deep learning is, and the settings in which it can be successfully applied to derive biological insights. In addition to presenting specific applications and providing tips for practical use, we also highlight possible pitfalls and limitations to guide computational biologists when and how to make the most use of this new technology.

    Funded by: Wellcome Trust

    Molecular systems biology 2016;12;7;878

  • Phase variation of a Type IIG restriction-modification enzyme alters site-specific methylation patterns and gene expression in Campylobacter jejuni strain NCTC11168.

    Anjum A, Brathwaite KJ, Aidley J, Connerton PL, Cummings NJ, Parkhill J, Connerton I and Bayliss CD

    Department of Genetics, University of Leicester, Leicester LE1 7RH, UK.

    Phase-variable restriction-modification systems are a feature of a diverse range of bacterial species. Stochastic, reversible switches in expression of the methyltransferase produce variation in methylation of specific sequences. Phase-variable methylation by both Type I and Type III methyltransferases is associated with altered gene expression and phenotypic variation. One phase-variable gene of Campylobacter jejuni encodes a homologue of an unusual Type IIG restriction-modification system in which the endonuclease and methyltransferase are encoded by a single gene. Using both inhibition of restriction and PacBio-derived methylome analyses of mutants and phase-variants, the cj0031c allele in C. jejuni strain NCTC11168 was demonstrated to specifically methylate adenine in 5'CCCGA and 5'CCTGA sequences. Alterations in the levels of specific transcripts were detected using RNA-Seq in phase-variants and mutants of cj0031c but these changes did not correlate with observed differences in phenotypic behaviour. Alterations in restriction of phage growth were also associated with phase variation (PV) of cj0031c and correlated with presence of sites in the genomes of these phages. We conclude that PV of a Type IIG restriction-modification system causes changes in site-specific methylation patterns and gene expression patterns that may indirectly change adaptive traits.

    Nucleic acids research 2016;44;10;4581-94

  • Species Mash-up.

    Argimón S and Aanensen DM

    Centre for Genomic Pathogen Surveillance, The Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Nature reviews. Microbiology 2016;14;12;730

  • Microreact: visualizing and sharing data for genomic epidemiology and phylogeography.

    Argimón S, Abudahab K, Goater RJE, Fedosejev A, Bhai J, Glasner C, Feil EJ, Holden MTG, Yeats CA, Grundmann H, Spratt BG and Aanensen DM

    The Centre for Genomic Pathogen Surveillance, Wellcome Genome Campus, Hinxton CB10 1SA, UK.

    Visualization is frequently used to aid our interpretation of complex datasets. Within microbial genomics, visualizing the relationships between multiple genomes as a tree provides a framework onto which associated data (geographical, temporal, phenotypic and epidemiological) are added to generate hypotheses and to explore the dynamics of the system under investigation. Selected static images are then used within publications to highlight the key findings to a wider audience. However, these images are a very inadequate way of exploring and interpreting the richness of the data. There is, therefore, a need for flexible, interactive software that presents the population genomic outputs and associated data in a user-friendly manner for a wide range of end users, from trained bioinformaticians to front-line epidemiologists and health workers. Here, we present Microreact, a web application for the easy visualization of datasets consisting of any combination of trees, geographical, temporal and associated metadata. Data files can be uploaded to Microreact directly via the web browser or by linking to their location (e.g. from Google Drive/Dropbox or via API), and an integrated visualization via trees, maps, timelines and tables provides interactive querying of the data. The visualization can be shared as a permanent web link among collaborators, or embedded within publications to enable readers to explore and download the data. Microreact can act as an end point for any tool or bioinformatic pipeline that ultimately generates a tree, and provides a simple, yet powerful, visualization method that will aid research and discovery and the open sharing of datasets.

    Microbial genomics 2016;2;11;e000093

  • Rapid outbreak sequencing of Ebola virus in Sierra Leone identifies transmission chains linked to sporadic cases.

    Arias A, Watson SJ, Asogun D, Tobin EA, Lu J, Phan MVT, Jah U, Wadoum REG, Meredith L, Thorne L, Caddy S, Tarawalie A, Langat P, Dudas G, Faria NR, Dellicour S, Kamara A, Kargbo B, Kamara BO, Gevao S, Cooper D, Newport M, Horby P, Dunning J, Sahr F, Brooks T, Simpson AJH, Groppelli E, Liu G, Mulakken N, Rhodes K, Akpablie J, Yoti Z, Lamunu M, Vitto E, Otim P, Owilli C, Boateng I, Okoror L, Omomoh E, Oyakhilome J, Omiunu R, Yemisis I, Adomeh D, Ehikhiametalor S, Akhilomen P, Aire C, Kurth A, Cook N, Baumann J, Gabriel M, Wölfel R, Di Caro A, Carroll MW, Günther S, Redd J, Naidoo D, Pybus OG, Rambaut A, Kellam P, Goodfellow I and Cotten M

    Division of Virology, Department of Pathology, University of Cambridge, Cambridge, United Kingdom.

    To end the largest known outbreak of Ebola virus disease (EVD) in West Africa and to prevent new transmissions, rapid epidemiological tracing of cases and contacts was required. The ability to quickly identify unknown sources and chains of transmission is key to ending the EVD epidemic and of even greater importance in the context of recent reports of Ebola virus (EBOV) persistence in survivors. Phylogenetic analysis of complete EBOV genomes can provide important information on the source of any new infection. A local deep sequencing facility was established at the Mateneh Ebola Treatment Centre in central Sierra Leone. The facility included all wetlab and computational resources to rapidly process EBOV diagnostic samples into full genome sequences. We produced 554 EBOV genomes from EVD cases across Sierra Leone. These genomes provided a detailed description of EBOV evolution and facilitated phylogenetic tracking of new EVD cases. Importantly, we show that linked genomic and epidemiological data can not only support contact tracing but also identify unconventional transmission chains involving body fluids, including semen. Rapid EBOV genome sequencing, when linked to epidemiological information and a comprehensive database of virus sequences across the outbreak, provided a powerful tool for public health epidemic control efforts.

    Funded by: Wellcome Trust; World Health Organization: 001

    Virus evolution 2016;2;1;vew016

  • Origin of modern syphilis and emergence of a pandemic Treponema pallidum cluster.

    Arora N, Schuenemann VJ, Jäger G, Peltzer A, Seitz A, Herbig A, Strouhal M, Grillová L, Sánchez-Busó L, Kühnert D, Bos KI, Davis LR, Mikalová L, Bruisten S, Komericki P, French P, Grant PR, Pando MA, Vaulet LG, Fermepin MR, Martinez A, Centurion Lara A, Giacani L, Norris SJ, Šmajs D, Bosshard PP, González-Candelas F, Nieselt K, Krause J and Bagheri HC

    Institute for Evolutionary Biology and Environmental Studies, University of Zurich, 8057 Zurich, Switzerland.

    The abrupt onslaught of the syphilis pandemic that started in the late fifteenth century established this devastating infectious disease as one of the most feared in human history (ref. 1). Surprisingly, despite the availability of effective antibiotic treatment since the mid-twentieth century, this bacterial infection, which is caused by Treponema pallidum subsp. pallidum (TPA), has been re-emerging globally in the last few decades with an estimated 10.6 million cases in 2008 (ref. 2). Although resistance to penicillin has not yet been identified, an increasing number of strains fail to respond to the second-line antibiotic azithromycin (ref. 3). Little is known about the genetic patterns in current infections or the evolutionary origins of the disease due to the low quantities of treponemal DNA in clinical samples and difficulties in cultivating the pathogen (ref. 4). Here, we used DNA capture and whole-genome sequencing to successfully interrogate genome-wide variation from syphilis patient specimens, combined with laboratory samples of TPA and two other subspecies. Phylogenetic comparisons based on the sequenced genomes indicate that the TPA strains examined share a common ancestor after the fifteenth century, within the early modern era. Moreover, most contemporary strains are azithromycin-resistant and are members of a globally dominant cluster, named here as SS14-Ω. The cluster diversified from a common ancestor in the mid-twentieth century subsequent to the discovery of antibiotics. Its recent phylogenetic divergence and global presence point to the emergence of a pandemic strain cluster.

    Nature microbiology 2016;2;16245

  • Trans-ethnic study design approaches for fine-mapping.

    Asimit JL, Hatzikotoulas K, McCarthy M, Morris AP and Zeggini E

    Wellcome Trust Sanger Institute, Cambridge, UK.

    Studies that traverse ancestrally diverse populations may increase power to detect novel loci and improve fine-mapping resolution of causal variants by leveraging linkage disequilibrium differences between ethnic groups. The inclusion of African ancestry samples may yield further improvements because of low linkage disequilibrium and high genetic heterogeneity. We investigate the fine-mapping resolution of trans-ethnic fixed-effects meta-analysis for five type II diabetes loci, under various settings of ancestral composition (European, East Asian, African), allelic heterogeneity, and causal variant minor allele frequency. In particular, three settings of ancestral composition were compared: (1) single ancestry (European), (2) moderate ancestral diversity (European and East Asian), and (3) high ancestral diversity (European, East Asian, and African). Our simulations suggest that the European/Asian and European ancestry-only meta-analyses consistently attain similar fine-mapping resolution. The inclusion of African ancestry samples in the meta-analysis leads to a marked improvement in fine-mapping resolution.

    Funded by: Medical Research Council: MR/K021486/1; NIDDK NIH HHS: U01 DK085545; Wellcome Trust: 098017, 098051

    European journal of human genetics : EJHG 2016;24;9;1330-6

  • The Allelic Landscape of Human Blood Cell Trait Variation and Links to Common Complex Disease.

    Astle WJ, Elding H, Jiang T, Allen D, Ruklisa D, Mann AL, Mead D, Bouman H, Riveros-Mckay F, Kostadima MA, Lambourne JJ, Sivapalaratnam S, Downes K, Kundu K, Bomba L, Berentsen K, Bradley JR, Daugherty LC, Delaneau O, Freson K, Garner SF, Grassi L, Guerrero J, Haimel M, Janssen-Megens EM, Kaan A, Kamat M, Kim B, Mandoli A, Marchini J, Martens JHA, Meacham S, Megy K, O'Connell J, Petersen R, Sharifi N, Sheard SM, Staley JR, Tuna S, van der Ent M, Walter K, Wang SY, Wheeler E, Wilder SP, Iotchkova V, Moore C, Sambrook J, Stunnenberg HG, Di Angelantonio E, Kaptoge S, Kuijpers TW, Carrillo-de-Santa-Pau E, Juan D, Rico D, Valencia A, Chen L, Ge B, Vasquez L, Kwan T, Garrido-Martín D, Watt S, Yang Y, Guigo R, Beck S, Paul DS, Pastinen T, Bujold D, Bourque G, Frontini M, Danesh J, Roberts DJ, Ouwehand WH, Butterworth AS and Soranzo N

    Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK; National Health Service (NHS) Blood and Transplant, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK; Medical Research Council Biostatistics Unit, Cambridge Institute of Public Health, Cambridge Biomedical Campus, Forvie Site, Robinson Way, Cambridge CB2 0SR, UK; MRC/BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Strangeways Research Laboratory, Wort's Causeway, Cambridge CB1 8RN, UK.

    Many common variants have been associated with hematological traits, but identification of causal genes and pathways has proven challenging. We performed a genome-wide association analysis in the UK Biobank and INTERVAL studies, testing 29.5 million genetic variants for association with 36 red cell, white cell, and platelet properties in 173,480 European-ancestry participants. This effort yielded hundreds of low frequency (<5%) and rare (<1%) variants with a strong impact on blood cell phenotypes. Our data highlight general properties of the allelic architecture of complex traits, including the proportion of the heritable component of each blood trait explained by the polygenic signal across different genome regulatory domains. Finally, through Mendelian randomization, we provide evidence of shared genetic pathways linking blood cell indices with complex pathologies, including autoimmune diseases, schizophrenia, and coronary heart disease and evidence suggesting previously reported population associations between blood cell indices and cardiovascular disease may be non-causal.

    Funded by: British Heart Foundation: RG/09/012/28096; Department of Health: RP-PG-0310-1002, RP-PG-0310-1004; European Research Council: 268834; Medical Research Council: MC_QA137853, MR/L003120/1

    Cell 2016;167;5;1415-1429.e19

  • A new Plasmodium vivax reference sequence with improved assembly of the subtelomeres reveals an abundance of pir genes.

    Auburn S, Böhme U, Steinbiss S, Trimarsanto H, Hostetler J, Sanders M, Gao Q, Nosten F, Newbold CI, Berriman M, Price RN and Otto TD

    Global and Tropical Health Division, Menzies School of Health Research and Charles Darwin University, Darwin, Australia.

    Plasmodium vivax is now the predominant cause of malaria in the Asia-Pacific, South America and Horn of Africa. Laboratory studies of this species are constrained by the inability to maintain the parasite in continuous ex vivo culture, but genomic approaches provide an alternative and complementary avenue to investigate the parasite's biology and epidemiology. To date, molecular studies of P. vivax have relied on the Salvador-I reference genome sequence, derived from a monkey-adapted strain from South America. However, the Salvador-I reference remains highly fragmented with over 2500 unassembled scaffolds. Using high-depth Illumina sequence data, we assembled and annotated a new reference sequence, PvP01, sourced directly from a patient from Papua Indonesia. Draft assemblies of isolates from China (PvC01) and Thailand (PvT01) were also prepared for comparative purposes. The quality of the PvP01 assembly is improved greatly over Salvador-I, with fragmentation reduced to 226 scaffolds. Detailed manual curation has ensured highly comprehensive annotation, with functions attributed to 58% core genes in PvP01 versus 38% in Salvador-I. The assemblies of PvP01, PvC01 and PvT01 are larger than that of Salvador-I (28-30 versus 27 Mb), owing to improved assembly of the subtelomeres. An extensive repertoire of over 1200 Plasmodium interspersed repeat (pir) genes were identified in PvP01 compared to 346 in Salvador-I, suggesting a vital role in parasite survival or development. The manually curated PvP01 reference and PvC01 and PvT01 draft assemblies are important new resources to study vivax malaria. PvP01 is maintained at GeneDB and ongoing curation will ensure continual improvements in assembly and annotation quality.

    Funded by: Wellcome Trust: 091625, 098051, 099198

    Wellcome open research 2016;1;4

  • Genomic Analysis Reveals a Common Breakpoint in Amplifications of the Plasmodium vivax Multidrug Resistance 1 Locus in Thailand.

    Auburn S, Serre D, Pearson RD, Amato R, Sriprawat K, To S, Handayuni I, Suwanarusk R, Russell B, Drury E, Stalker J, Miotto O, Kwiatkowski DP, Nosten F and Price RN

    Global and Tropical Health Division, Menzies School of Health Research, Charles Darwin University, Australia.

    In regions of coendemicity for Plasmodium falciparum and Plasmodium vivax where mefloquine is used to treat P. falciparum infection, drug pressure mediated by increased copy numbers of the multidrug resistance 1 gene (pvmdr1) may select for mefloquine-resistant P. vivax. Surveillance is not undertaken routinely owing in part to methodological challenges in detection of gene amplification. Using genomic data on 88 P. vivax samples from western Thailand, we identified pvmdr1 amplification in 17 isolates, all exhibiting tandem copies of a 37.6-kilobase pair region with identical breakpoints. A novel breakpoint-specific polymerase chain reaction assay was designed to detect the amplification. The assay demonstrated high sensitivity, identifying amplifications in 13 additional, polyclonal infections. Application to 132 further samples identified the common breakpoint in all years tested (2003-2015), with a decline in prevalence after 2012 corresponding to local discontinuation of mefloquine regimens. Assessment of the structure of pvmdr1 amplification in other geographic regions will yield information about the population-specificity of the breakpoints and underlying amplification mechanisms.

    Funded by: NIAID NIH HHS: R01 AI103228; Wellcome Trust: 091625

    The Journal of infectious diseases 2016;214;8;1235-42

  • Making sense of big data in health research: Towards an EU action plan.

    Auffray C, Balling R, Barroso I, Bencze L, Benson M, Bergeron J, Bernal-Delgado E, Blomberg N, Bock C, Conesa A, Del Signore S, Delogne C, Devilee P, Di Meglio A, Eijkemans M, Flicek P, Graf N, Grimm V, Guchelaar HJ, Guo YK, Gut IG, Hanbury A, Hanif S, Hilgers RD, Honrado Á, Hose DR, Houwing-Duistermaat J, Hubbard T, Janacek SH, Karanikas H, Kievits T, Kohler M, Kremer A, Lanfear J, Lengauer T, Maes E, Meert T, Müller W, Nickel D, Oledzki P, Pedersen B, Petkovic M, Pliakos K, Rattray M, I Màs JR, Schneider R, Sengstag T, Serra-Picamal X, Spek W, Vaas LA, van Batenburg O, Vandelaer M, Varnai P, Villoslada P, Vizcaíno JA, Wubbe JP and Zanetti G

    European Institute for Systems Biology and Medicine, 1 avenue Claude Vellefaux, 75010, Paris, France.

    Medicine and healthcare are undergoing profound changes. Whole-genome sequencing and high-resolution imaging technologies are key drivers of this rapid and crucial transformation. Technological innovation combined with automation and miniaturization has triggered an explosion in data production that will soon reach exabyte proportions. How are we going to deal with this exponential increase in data production? The potential of "big data" for improving health is enormous but, at the same time, we face a wide range of challenges to overcome urgently. Europe is very proud of its cultural diversity; however, exploitation of the data made available through advances in genomic medicine, imaging, and a wide range of mobile health applications or connected devices is hampered by numerous historical, technical, legal, and political barriers. European health systems and databases are diverse and fragmented. There is a lack of harmonization of data formats, processing, analysis, and data transfer, which leads to incompatibilities and lost opportunities. Legal frameworks for data sharing are evolving. Clinicians, researchers, and citizens need improved methods, tools, and training to generate, analyze, and query data effectively. Addressing these barriers will contribute to creating the European Single Market for health, which will improve health and healthcare for all Europeans.

    Genome medicine 2016;8;1;71

  • Whole-genome sequencing of multidrug-resistant Mycobacterium tuberculosis isolates from Myanmar.

    Aung HL, Tun T, Moradigaravand D, Köser CU, Nyunt WW, Aung ST, Lwin T, Thinn KK, Crump JA, Parkhill J, Peacock SJ, Cook GM and Hill PC

    Department of Microbiology and Immunology, Otago School of Medical Sciences, University of Otago, Dunedin, New Zealand; Maurice Wilkins Centre for Molecular Biodiscovery, University of Auckland, Auckland, New Zealand.

    Drug-resistant tuberculosis (TB) is a major health threat in Myanmar. An initial study was conducted to explore the potential utility of whole-genome sequencing (WGS) for the diagnosis and management of drug-resistant TB in Myanmar. Fourteen multidrug-resistant Mycobacterium tuberculosis isolates were sequenced. Known resistance genes for a total of nine antibiotics commonly used in the treatment of drug-susceptible and multidrug-resistant TB (MDR-TB) in Myanmar were interrogated through WGS. All 14 isolates were MDR-TB, consistent with the results of phenotypic drug susceptibility testing (DST), and the Beijing lineage predominated. Based on the results of WGS, 9 of the 14 isolates were potentially resistant to at least one of the drugs used in the standard MDR-TB regimen but for which phenotypic DST is not conducted in Myanmar. This study highlights a need for the introduction of second-line DST as part of routine TB diagnosis in Myanmar as well as new classes of TB drugs to construct effective regimens.

    Funded by: Wellcome Trust: 098600

    Journal of global antimicrobial resistance 2016;6;113-117

  • An Integrated Genome-Wide Systems Genetics Screen for Breast Cancer Metastasis Susceptibility Genes.

    Bai L, Yang HH, Hu Y, Shukla A, Ha NH, Doran A, Faraji F, Goldberger N, Lee MP, Keane T and Hunter KW

    Laboratory of Cancer Biology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America.

    Metastasis remains the primary cause of patient morbidity and mortality in solid tumors and is due to the action of a large number of tumor-autonomous and non-autonomous factors. Here we report the results of a genome-wide integrated strategy to identify novel metastasis susceptibility candidate genes and molecular pathways in breast cancer metastasis. This analysis implicates a number of transcriptional regulators and suggests cell-mediated immunity is an important determinant. Moreover, the analysis identified novel or FDA-approved drugs as potentially useful for anti-metastatic therapy. Further explorations implementing this strategy may therefore provide a variety of information for clinical applications in the control and treatment of advanced neoplastic disease.

    Funded by: Intramural NIH HHS

    PLoS genetics 2016;12;4;e1005989

  • Travel- and Community-Based Transmission of Multidrug-Resistant Shigella sonnei Lineage among International Orthodox Jewish Communities.

    Baker KS, Dallman TJ, Behar A, Weill FX, Gouali M, Sobel J, Fookes M, Valinsky L, Gal-Mor O, Connor TR, Nissan I, Bertrand S, Parkhill J, Jenkins C, Cohen D and Thomson NR

    Shigellae are sensitive indicator species for studying trends in the international transmission of antimicrobial-resistant Enterobacteriaceae. Orthodox Jewish communities (OJCs) are a known risk group for shigellosis; Shigella sonnei is cyclically epidemic in OJCs in Israel, and sporadic outbreaks occur in OJCs elsewhere. We generated whole-genome sequences for 437 isolates of S. sonnei from OJCs and non-OJCs collected over 22 years in Europe (the United Kingdom, France, and Belgium), the United States, Canada, and Israel and analyzed these within a known global genomic context. Through phylogenetic and genomic analysis, we showed that strains from outbreaks in OJCs outside of Israel are distinct from strains in the general population and relate to a single multidrug-resistant sublineage of S. sonnei that prevails in Israel. Further Bayesian phylogenetic analysis showed that this strain emerged approximately 30 years ago, demonstrating the speed at which antimicrobial drug-resistant pathogens can spread widely through geographically dispersed, but internationally connected, communities.

    Funded by: Wellcome Trust: 098051, 106690/A/14/Z

    Emerging infectious diseases 2016;22;9;1545-53

  • Single-cell sequencing reveals karyotype heterogeneity in murine and human malignancies.

    Bakker B, Taudt A, Belderbos ME, Porubsky D, Spierings DC, de Jong TV, Halsema N, Kazemier HG, Hoekstra-Wakker K, Bradley A, de Bont ES, van den Berg A, Guryev V, Lansdorp PM, Colomé-Tatché M and Foijer F

    European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, A. Deusinglaan 1, Groningen, 9713 AV, The Netherlands.

    Background: Chromosome instability leads to aneuploidy, a state in which cells have abnormal numbers of chromosomes, and is found in two out of three cancers. In a chromosomal instable p53 deficient mouse model with accelerated lymphomagenesis, we previously observed whole chromosome copy number changes affecting all lymphoma cells. This suggests that chromosome instability is somehow suppressed in the aneuploid lymphomas or that selection for frequently lost/gained chromosomes out-competes the CIN-imposed mis-segregation.

    Results: To distinguish between these explanations and to examine karyotype dynamics in chromosome instable lymphoma, we use a newly developed single-cell whole genome sequencing (scWGS) platform that provides a complete and unbiased overview of copy number variations (CNV) in individual cells. To analyse these scWGS data, we develop AneuFinder, which allows annotation of copy number changes in a fully automated fashion and quantification of CNV heterogeneity between cells. Single-cell sequencing and AneuFinder analysis reveals high levels of copy number heterogeneity in chromosome instability-driven murine T-cell lymphoma samples, indicating ongoing chromosome instability. Application of this technology to human B cell leukaemias reveals different levels of karyotype heterogeneity in these cancers.

    Conclusion: Our data show that even though aneuploid tumours select for particular and recurring chromosome combinations, single-cell analysis using AneuFinder reveals copy number heterogeneity. This suggests ongoing chromosome instability that other platforms fail to detect. As chromosome instability might drive tumour evolution, karyotype analysis using single-cell sequencing technology could become an essential tool for cancer treatment stratification.

    Genome biology 2016;17;1;115

  • Synthetic lethality between PAXX and XLF in mammalian development.

    Balmus G, Barros AC, Wijnhoven PW, Lescale C, Hasse HL, Boroviak K, le Sage C, Doe B, Speak AO, Galli A, Jacobsen M, Deriano L, Adams DJ, Blackford AN and Jackson SP

    Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge CB2 1QN, United Kingdom.

    PAXX was identified recently as a novel nonhomologous end-joining DNA repair factor in human cells. To characterize its physiological roles, we generated Paxx-deficient mice. Like Xlf⁻/⁻ mice, Paxx⁻/⁻ mice are viable, grow normally, and are fertile but show mild radiosensitivity. Strikingly, while Paxx loss is epistatic with Ku80, Lig4, and Atm deficiency, Paxx/Xlf double-knockout mice display embryonic lethality associated with genomic instability, cell death in the central nervous system, and an almost complete block in lymphogenesis, phenotypes that closely resemble those of Xrcc4⁻/⁻ and Lig4⁻/⁻ mice. Thus, combined loss of Paxx and Xlf is synthetic-lethal in mammals.

    Funded by: Cancer Research UK: 13031; European Research Council: 310917

    Genes & development 2016;30;19;2152-2157

  • Dawning of the age of genomics for platelet granule disorders: improving insight, diagnosis and management.

    Bariana TK, Ouwehand WH, Guerrero JA, Gomez K and BRIDGE Bleeding, Thrombotic and Platelet Disorders and ThromboGenomics Consortia

    Katharine Dormandy Haemophilia Centre and Thrombosis Unit, Royal Free London NHS Foundation Trust, London, UK.

    Inherited disorders of platelet granules are clinically heterogeneous and their prevalence is underestimated because most patients do not undergo a complete diagnostic work-up. The lack of a genetic diagnosis limits the ability to tailor management, screen family members, aid with family planning, predict clinical progression and detect serious consequences, such as myelofibrosis, lung fibrosis and malignancy, in a timely manner. This is set to change with the introduction of high throughput sequencing (HTS) as a routine clinical diagnostic test. HTS diagnostic tests are now available, affordable and allow parallel screening of DNA samples for variants in all of the 80 known bleeding, thrombotic and platelet genes. Increased genetic diagnosis and curation of variants is, in turn, improving our understanding of the pathobiology and clinical course of inherited platelet disorders. Our understanding of the genetic causes of platelet granule disorders and the regulation of granule biogenesis is a work in progress and has been significantly enhanced by recent genomic discoveries from high-powered genome-wide association studies and genome sequencing projects. In the era of whole genome and epigenome sequencing, new strategies are required to integrate multiple sources of big data in the search for elusive, novel genes underlying granule disorders.

    British journal of haematology 2016

  • Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families.

    Barquist L, Burge SW and Gardner PP

    Institute for Molecular Infection Biology, University of Würzburg, Würzburg, Germany.

    Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult. This unit introduces methods developed by the Rfam database for identifying "families" of homologous ncRNAs starting from single "seed" sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs and then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. © 2016 by John Wiley & Sons, Inc.

    Funded by: Wellcome Trust: 098051

    Current protocols in bioinformatics 2016;54;12.13.1-12.13.25

  • The TraDIS toolkit: sequencing and analysis for dense transposon mutant libraries.

    Barquist L, Mayho M, Cummins C, Cain AK, Boinett CJ, Page AJ, Langridge GC, Quail MA, Keane JA and Parkhill J

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK and Institute for Molecular Infection Biology, University of Würzburg, Würzburg D-97080, Germany.

    Unlabelled: Transposon insertion sequencing is a high-throughput technique for assaying large libraries of otherwise isogenic transposon mutants providing insight into gene essentiality, gene function and genetic interactions. We previously developed the Transposon Directed Insertion Sequencing (TraDIS) protocol for this purpose, which utilizes shearing of genomic DNA followed by specific PCR amplification of transposon-containing fragments and Illumina sequencing. Here we describe an optimized high-yield library preparation and sequencing protocol for TraDIS experiments and a novel software pipeline for analysis of the resulting data. The Bio-Tradis analysis pipeline is implemented as an extensible Perl library which can either be used as is, or as a basis for the development of more advanced analysis tools. This article can serve as a general reference for the application of the TraDIS methodology.

    Availability and implementation: The optimized sequencing protocol is included as supplementary information. The Bio-Tradis analysis pipeline is available under a GPL license at


    Supplementary information: Supplementary data are available at Bioinformatics online.

    Funded by: Medical Research Council: G1100100; Wellcome Trust: WT098051

    Bioinformatics (Oxford, England) 2016;32;7;1109-11

  • The need for an integrated approach for chronic disease research and care in Africa

    Barr AL, Young EH, Smeeth L, Newton R, Seeley J, Ripullone K, Hird TR, Thornton JRM, Nyirenda MJ, Kapiga S, Adebamowo CA, Amoah AG, Wareham N, Rotimi CN, Levitt NS, Ramaiya K, Hennig BJ, Mbanya JC, Tollman S, Motala AA, Kaleebu P and Sandhu MS

    Global Health, Epidemiology and Genomics 2016;1;e19

  • Genome-wide association studies of quantitative glycaemic traits

    Barroso I and Scott R

    The Genetics of Type 2 Diabetes and Related Traits: Biology, Physiology and Translation 2016;63-89

  • SeqTools: visual tools for manual analysis of sequence alignments.

    Barson G and Griffiths E

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Background: Manual annotation is essential to create high-quality reference alignments and annotation. Annotators need to be able to view sequence alignments in detail. The SeqTools package provides three tools for viewing different types of sequence alignment: Blixem is a many-to-one browser of pairwise alignments, displaying multiple match sequences aligned against a single reference sequence; Dotter provides a graphical dot-plot view of a single pairwise alignment; and Belvu is a multiple sequence alignment viewer, editor, and phylogenetic tool. These tools were originally part of the AceDB genome database system but have been completely rewritten to make them generally available as a standalone package of greatly improved function.

    Findings: Blixem is used by annotators to give a detailed view of the evidence for particular gene models. Blixem displays the gene model positions and the match sequences aligned against the genomic reference sequence. Annotators use this for many reasons, including to check the quality of an alignment, to find missing/misaligned sequence and to identify splice sites and polyA sites and signals. Dotter is used to give a dot-plot representation of a particular pairwise alignment. This is used to identify sequence that is not represented (or is misrepresented) and to quickly compare annotated gene models with transcriptional and protein evidence that putatively supports them. Belvu is used to analyse conservation patterns in multiple sequence alignments and to perform a combination of manual and automatic processing of the alignment. High-quality reference alignments are essential if they are to be used as a starting point for further automatic alignment generation.

    Conclusions: While there are many different alignment tools available, the SeqTools package provides unique functionality that annotators have found to be essential for analysing sequence alignments as part of the manual annotation process.

    Funded by: NHGRI NIH HHS: 5U54HG00455-04; Wellcome Trust: 098051

    BMC research notes 2016;9;39

  • The Accessory Genome of Shiga Toxin-Producing Escherichia coli Defines a Persistent Colonization Type in Cattle.

    Barth SA, Menge C, Eichhorn I, Semmler T, Wieler LH, Pickard D, Belka A, Berens C and Geue L

    Friedrich-Loeffler-Institut/Federal Research Institute for Animal Health, Institute of Molecular Pathogenesis, Jena, Germany.

    Unlabelled: Shiga toxin-producing Escherichia coli (STEC) strains can colonize cattle for several months and may, thus, serve as gene reservoirs for the genesis of highly virulent zoonotic enterohemorrhagic E. coli (EHEC). Attempts to reduce the human risk for acquiring EHEC infections should include strategies to control such STEC strains persisting in cattle. We therefore aimed to identify genetic patterns associated with the STEC colonization type in the bovine host. We included 88 persistent colonizing STEC (STEC(per)) (shedding for ≥4 months) and 74 sporadically colonizing STEC (STEC(spo)) (shedding for ≤2 months) isolates from cattle and 16 bovine STEC isolates with unknown colonization types. Genoserotypes and multilocus sequence types (MLSTs) were determined, and the isolates were probed with a DNA microarray for virulence-associated genes (VAGs). All STEC(per) isolates belonged to only four genoserotypes (O26:H11, O156:H25, O165:H25, O182:H25), which formed three genetic clusters (ST21/396/1705, ST300/688, ST119). In contrast, STEC(spo) isolates were scattered among 28 genoserotypes and 30 MLSTs, with O157:H7 (ST11) and O6:H49 (ST1079) being the most prevalent. The microarray analysis identified 139 unique gene patterns that clustered with the genoserotypes and MLSTs of the strains. While the STEC(per) isolates possessed heterogeneous phylogenetic backgrounds, the accessory genome clustered these isolates together, separating them from the STEC(spo) isolates. Given the vast genetic heterogeneity of bovine STEC strains, defining the genetic patterns distinguishing STEC(per) from STEC(spo) isolates will facilitate the targeted design of new intervention strategies to counteract these zoonotic pathogens at the farm level.

    Importance: Ruminants, especially cattle, are sources of food-borne infections by Shiga toxin-producing Escherichia coli (STEC) in humans. Some STEC strains persist in cattle for longer periods of time, while others are detected only sporadically. Persisting strains can serve as gene reservoirs that supply E. coli with virulence factors, thereby generating new outbreak strains. Attempts to reduce the human risk for acquiring STEC infections should therefore include strategies to control such persisting STEC strains. By analyzing representative genes of their core and accessory genomes, we show that bovine STEC with a persistent colonization type emerged independently from sporadically colonizing isolates and evolved in parallel evolutionary branches. However, persistent colonizing strains share similar sets of accessory genes. Defining the genetic patterns that distinguish persistent from sporadically colonizing STEC isolates will facilitate the targeted design of new intervention strategies to counteract these zoonotic pathogens at the farm level.

    Applied and environmental microbiology 2016;82;17;5455-64

  • Eye on the B-ALL: B-cell receptor repertoires reveal persistence of numerous B-lymphoblastic leukemia subclones from diagnosis to relapse.

    Bashford-Rogers RJ, Nicolaou KA, Bartram J, Goulden NJ, Loizou L, Koumas L, Chi J, Hubank M, Kellam P, Costeas PA and Vassiliou GS

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    The strongest predictor of relapse in B-cell acute lymphoblastic leukemia (B-ALL) is the level of persistence of tumor cells after initial therapy. The high mutation rate of the B-cell receptor (BCR) locus allows high-resolution tracking of the architecture, evolution and clonal dynamics of B-ALL. Using longitudinal BCR repertoire sequencing, we find that the BCR undergoes an unexpectedly high level of clonal diversification in B-ALL cells through both somatic hypermutation and secondary rearrangements, which can be used for tracking the subclonal composition of the disease and detect minimal residual disease with unprecedented sensitivity. We go on to investigate clonal dynamics of B-ALL using BCR phylogenetic analyses of paired diagnosis-relapse samples and find that large numbers of small leukemic subclones present at diagnosis re-emerge at relapse alongside a dominant clone. Our findings suggest that in all informative relapsed patients, the survival of large numbers of clonogenic cells beyond initial chemotherapy is a surrogate for inherent partial chemoresistance or inadequate therapy, providing an increased opportunity for subsequent emergence of fully resistant clones. These results frame early cytoreduction as an important determinant of long-term outcome.

    Funded by: Medical Research Council: MC_PC_12009; Wellcome Trust: WT095663MA, WT098051, WT106068AIA

    Leukemia 2016;30;12;2312-2321

  • Six-Year Incidence of Blindness and Visual Impairment in Kenya: The Nakuru Eye Disease Cohort Study.

    Bastawrous A, Mathenge W, Wing K, Rono H, Gichangi M, Weiss HA, Macleod D, Foster A, Burton MJ and Kuper H

    International Centre for Eye Health, Clinical Research Department, London School of Hygiene and Tropical Medicine, London, United Kingdom.

    Purpose: To describe the cumulative 6-year incidence of visual impairment (VI) and blindness in an adult Kenyan population. The Nakuru Posterior Segment Eye Disease Study is a population-based sample of 4414 participants aged ≥50 years, enrolled in 2007-2008. Of these, 2170 (50%) were reexamined in 2013-2014.

    Methods: The World Health Organization (WHO) and US definitions were used to calculate presenting visual acuity classifications based on logMAR visual acuity tests at baseline and follow-up. Detailed ophthalmic and anthropometric examinations as well as a questionnaire, which included past medical and ophthalmic history, were used to assess risk factors for study participation and vision loss. Cumulative incidence of VI and blindness, and factors associated with these outcomes, were estimated. Inverse probability weighting was used to adjust for nonparticipation.

    Results: Visual acuity measurements were available for 2164 (99.7%) participants. Using WHO definitions, the 6-year cumulative incidence of VI was 11.9% (95%CI [confidence interval]: 10.3-13.8%) and blindness was 1.51% (95%CI: 1.0-2.2%); using the US classification, the cumulative incidence of blindness was 2.70% (95%CI: 1.8-3.2%). Incidence of VI increased strongly with older age, and independently with being diabetic. There are an estimated 21 new cases of VI per year in people aged ≥50 years per 1000 people, of whom 3 are blind. Therefore in Kenya we estimate that there are 92,000 new cases of VI in people aged ≥50 years per year, of whom 11,600 are blind, out of a total population of approximately 4.3 million people aged 50 and above.

    Conclusions: The incidence of VI and blindness in this older Kenyan population was considerably higher than in comparable studies worldwide. A continued effort to strengthen the eye health system is necessary to support the growing unmet need in an aging and growing population.

    Funded by: Medical Research Council: G1001934

    Investigative ophthalmology & visual science 2016;57;14;5974-5983

  • Circulation of multiple genotypes of H1N2 viruses in a swine farm in Italy over a two-month period.

    Beato MS, Tassoni L, Milani A, Salviato A, Di Martino G, Mion M, Bonfanti L, Monne I, Watson SJ and Fusaro A

    Istituto Zooprofilattico Sperimentale delle Venezie, Legnaro, PD, Italy.

    In August 2012 repeated respiratory outbreaks caused by swine influenza A virus (swIAV) were registered for a whole year in a breeding farm in northeast Italy that supplied piglets for fattening. The virus, initially characterized in the farm, was a reassortant Eurasian avian-like H1N1 (H1avN1) genotype, containing a haemagglutinin segment derived from the pandemic H1N1 (A(H1N1)pdm09) lineage. To control infection, a vaccination program using vaccines against the A(H1N1)pdm09, human-like H1N2 (H1huN2), human-like H3N2 (H3N2), and H1avN1 viruses was implemented in sows in November 2013. Vaccine efficacy was assessed by sampling nasal swabs for two months in 35-75 day-old piglets born from vaccinated sows. Complete genome sequencing of eight swIAV-positive nasal swabs collected longitudinally from piglets after the implementation of the vaccination program was conducted to investigate the virus characteristics. Over the two-month period, two different genotypes involving multiple reassortment events were detected. The unexpected circulation of multiple reassortant genotypes in such a short time highlights the complexity of the genetic diversity of swIAV and the need for a better surveillance plan, based on the combination of clinical signs, epidemiological data and whole genome characterization.

    Veterinary microbiology 2016;195;25-29

  • Retracing embryological fate.

    Behjati S

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    Science (New York, N.Y.) 2016;354;6316;1109

  • Mutational signatures of ionizing radiation in second malignancies.

    Behjati S, Gundem G, Wedge DC, Roberts ND, Tarpey PS, Cooke SL, Van Loo P, Alexandrov LB, Ramakrishna M, Davies H, Nik-Zainal S, Hardy C, Latimer C, Raine KM, Stebbings L, Menzies A, Jones D, Shepherd R, Butler AP, Teague JW, Jorgensen M, Khatri B, Pillay N, Shlien A, Futreal PA, Badie C, ICGC Prostate Group, McDermott U, Bova GS, Richardson AL, Flanagan AM, Stratton MR and Campbell PJ

    Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA UK.

    Ionizing radiation is a potent carcinogen, inducing cancer through DNA damage. The signatures of mutations arising in human tissues following in vivo exposure to ionizing radiation have not been documented. Here, we searched for signatures of ionizing radiation in 12 radiation-associated second malignancies of different tumour types. Two signatures of somatic mutation characterize ionizing radiation exposure irrespective of tumour type. Compared with 319 radiation-naive tumours, radiation-associated tumours carry a median extra 201 deletions genome-wide, sized 1-100 base pairs often with microhomology at the junction. Unlike deletions of radiation-naive tumours, these show no variation in density across the genome or correlation with sequence context, replication timing or chromatin structure. Furthermore, we observe a significant increase in balanced inversions in radiation-associated tumours. Both small deletions and inversions generate driver mutations. Thus, ionizing radiation generates distinctive mutational signatures that explain its carcinogenic potential.

    Funded by: Cancer Research UK: 14835; Wellcome Trust

    Nature communications 2016;7;12605

  • FINEMAP: efficient variable selection using summary data from genome-wide association studies.

    Benner C, Spencer CC, Havulinna AS, Salomaa V, Ripatti S and Pirinen M

    Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland, Department of Public Health, University of Helsinki, Helsinki, Finland.

    Motivation: The goal of fine-mapping in genomic regions associated with complex diseases and traits is to identify causal variants that point to molecular mechanisms behind the associations. Recent fine-mapping methods using summary data from genome-wide association studies rely on exhaustive search through all possible causal configurations, which is computationally expensive.

    Results: We introduce FINEMAP, a software package to efficiently explore a set of the most important causal configurations of the region via a shotgun stochastic search algorithm. We show that FINEMAP produces accurate results in a fraction of processing time of existing approaches and is therefore a promising tool for analyzing growing amounts of data produced in genome-wide association studies and emerging sequencing projects.

    Availability and implementation: FINEMAP v1.0 is freely available for Mac OS X and Linux at

    Bioinformatics (Oxford, England) 2016;32;10;1493-501

  • Stage-Specific Transcriptome and Proteome Analyses of the Filarial Parasite Onchocerca volvulus and Its Wolbachia Endosymbiont.

    Bennuru S, Cotton JA, Ribeiro JM, Grote A, Harsha B, Holroyd N, Mhashilkar A, Molina DM, Randall AZ, Shandling AD, Unnasch TR, Ghedin E, Berriman M, Lustigman S and Nutman TB

    Laboratory of Parasitic Diseases, NIAID, NIH, Bethesda, Maryland, USA.

    Onchocerciasis (river blindness) is a neglected tropical disease that has been successfully targeted by mass drug treatment programs in the Americas and small parts of Africa. Achieving the long-term goal of elimination of onchocerciasis, however, requires additional tools, including drugs, vaccines, and biomarkers of infection. Here, we describe the transcriptome and proteome profiles of the major vector and the human host stages (L1, L2, L3, molting L3, L4, adult male, and adult female) of Onchocerca volvulus along with the proteome of each parasitic stage and of its Wolbachia endosymbiont (wOv). In so doing, we have identified stage-specific pathways important to the parasite's adaptation to its human host during its early development. Further, we generated a protein array that, when screened with well-characterized human samples, identified novel diagnostic biomarkers of O. volvulus infection and new potential vaccine candidates. This immunomic approach not only demonstrates the power of this postgenomic discovery platform but also provides additional tools for onchocerciasis control programs.

    Importance: The global onchocerciasis (river blindness) elimination program will have to rely on the development of new tools (drugs, vaccines, biomarkers) to achieve its goals by 2025. As an adjunct to the completed genomic sequencing of O. volvulus, we used a comprehensive proteomic and transcriptomic profiling strategy to gain a comprehensive understanding of both the vector-derived and human host-derived parasite stages. In so doing, we have identified proteins and pathways that enable novel drug targeting studies and the discovery of novel vaccine candidates, as well as useful biomarkers of active infection.

    Funded by: NIA NIH HHS: R24 AG042328; NIAID NIH HHS: R21 AI126466, T32 AI007180, ZIA AI000512; Wellcome Trust: 098051

    mBio 2016;7;6

  • Deep Roots for Aboriginal Australian Y Chromosomes.

    Bergström A, Nagle N, Chen Y, McCarthy S, Pollard MO, Ayub Q, Wilcox S, Wilcox L, van Oorschot RA, McAllister P, Williams L, Xue Y, Mitchell RJ and Tyler-Smith C

    The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    Australia was one of the earliest regions outside Africa to be colonized by fully modern humans, with archaeological evidence for human presence by 47,000 years ago (47 kya) widely accepted [1, 2]. However, the extent of subsequent human entry before the European colonial age is less clear. The dingo reached Australia about 4 kya, indirectly implying human contact, which some have linked to changes in language and stone tool technology to suggest substantial cultural changes at the same time [3]. Genetic data of two kinds have been proposed to support gene flow from the Indian subcontinent to Australia at this time, as well: first, signs of South Asian admixture in Aboriginal Australian genomes have been reported on the basis of genome-wide SNP data [4]; and second, a Y chromosome lineage designated haplogroup C*, present in both India and Australia, was estimated to have a most recent common ancestor around 5 kya and to have entered Australia from India [5]. Here, we sequence 13 Aboriginal Australian Y chromosomes to re-investigate their divergence times from Y chromosomes in other continents, including a comparison of Aboriginal Australian and South Asian haplogroup C chromosomes. We find divergence times dating back to ∼50 kya, thus excluding the Y chromosome as providing evidence for recent gene flow from India into Australia.

    Funded by: Wellcome Trust: 098051

    Current biology : CB 2016;26;6;809-13

  • Chemokine (C-C Motif) Receptor 2 Mediates Dendritic Cell Recruitment to the Human Colon but Is Not Responsible for Differences Observed in Dendritic Cell Subsets, Phenotype, and Function Between the Proximal and Distal Colon.

    Bernardo D, Durant L, Mann ER, Bassity E, Montalvillo E, Man R, Vora R, Reddi D, Bayiroglu F, Fernández-Salazar L, English NR, Peake ST, Landy J, Lee GH, Malietzis G, Siaw YH, Murugananthan AU, Hendy P, Sánchez-Recio E, Phillips RK, Garrote JA, Scott P, Parkhill J, Paulsen M, Hart AL, Al-Hassi HO, Arranz E, Walker AW, Carding SR and Knight SC

    Antigen Presentation Research Group, Imperial College London, Harrow, United Kingdom.

    Background & aims: Most knowledge about gastrointestinal (GI)-tract dendritic cells (DC) relies on murine studies where CD103⁺ DC specialize in generating immune tolerance with the functionality of CD11b⁺/⁻ subsets being unclear. Information about human GI-DC is scarce, especially regarding regional specifications. Here, we characterized human DC properties throughout the human colon.

    Methods: Paired proximal (right/ascending) and distal (left/descending) human colonic biopsies from 95 healthy subjects were taken; DC were assessed by flow cytometry and microbiota composition assessed by 16S rRNA gene sequencing.

    Results: Colonic DC identified were myeloid (mDC, CD11c⁺CD123⁻) and further divided based on CD103 and SIRPα (human analog of murine CD11b) expression. CD103⁻SIRPα⁺ DC were the major population and with CD103⁺SIRPα⁺ DC were CD1c⁺ILT3⁺CCR2⁺ (although CCR2 was not expressed on all CD103⁺SIRPα⁺ DC). CD103⁺SIRPα⁻ DC constituted a minor subset that were CD141⁺ILT3⁻CCR2⁻. Proximal colon samples had higher total DC counts and fewer CD103⁺SIRPα⁺ cells. Proximal colon DC were more mature than distal DC with higher stimulatory capacity for CD4⁺CD45RA⁺ T-cells. However, DC and DC-invoked T-cell expression of mucosal homing markers (β7, CCR9) was lower for proximal DC. CCR2 was expressed on circulating CD1c⁺, but not CD141⁺ mDC, and mediated DC recruitment by colonic culture supernatants in transwell assays. Proximal colon DC produced higher levels of cytokines. Mucosal microbiota profiling showed a lower microbiota load in the proximal colon, but with no differences in microbiota composition between compartments.

    Conclusions: Proximal colonic DC subsets differ from those in distal colon and are more mature. Targeted immunotherapy using DC in T-cell mediated GI tract inflammation may therefore need to reflect this immune compartmentalization.

    Funded by: NIDDK NIH HHS: T32 DK007632; Worldwide Cancer Research: 12-0234

    Cellular and molecular gastroenterology and hepatology 2016;2;1;22-39.e5

  • Optimized inducible shRNA and CRISPR/Cas9 platforms for in vitro studies of human development using hPSCs.

    Bertero A, Pawlowski M, Ortmann D, Snijders K, Yiangou L, Cardoso de Brito M, Brown S, Bernard WG, Cooper JD, Giacomelli E, Gambardella L, Hannan NR, Iyer D, Sampaziotis F, Serrano F, Zonneveld MC, Sinha S, Kotter M and Vallier L

    Wellcome Trust-MRC Stem Cell Institute, Anne McLaren Laboratory, University of Cambridge, Cambridge, CB2 0SZ, UK

    Inducible loss of gene function experiments are necessary to uncover mechanisms underlying development, physiology and disease. However, current methods are complex, lack robustness and do not work in multiple cell types. Here we address these limitations by developing single-step optimized inducible gene knockdown or knockout (sOPTiKD or sOPTiKO) platforms. These are based on genetic engineering of human genomic safe harbors combined with an improved tetracycline-inducible system and CRISPR/Cas9 technology. We exemplify the efficacy of these methods in human pluripotent stem cells (hPSCs), and show that generation of sOPTiKD/KO hPSCs is simple, rapid and allows tightly controlled individual or multiplexed gene knockdown or knockout in hPSCs and in a wide variety of differentiated cells. Finally, we illustrate the general applicability of this approach by investigating the function of transcription factors (OCT4 and T), cell cycle regulators (cyclin D family members) and epigenetic modifiers (DPY30). Overall, sOPTiKD and sOPTiKO provide a unique opportunity for functional analyses in multiple cell types relevant for the study of human development.

    Funded by: British Heart Foundation: FS/11/77/39327, FS/13/29/30024; European Research Council: 281335; Medical Research Council: MC_PC_12009, MR/L016761/1

    Development (Cambridge, England) 2016;143;23;4405-4418

  • A detailed clinical analysis of 13 patients with AUTS2 syndrome further delineates the phenotypic spectrum and underscores the behavioural phenotype.

    Beunders G, van de Kamp J, Vasudevan P, Morton J, Smets K, Kleefstra T, de Munnik SA, Schuurs-Hoeijmakers J, Ceulemans B, Zollino M, Hoffjan S, Wieczorek S, So J, Mercer L, Walker T, Velsher L, DDD study, Parker MJ, Magee AC, Elffers B, Kooy RF, Yntema HG, Meijers-Heijboer EJ and Sistermans EA

    Department of Clinical Genetics, VU University Medical Center Amsterdam, The Netherlands.

    Background: AUTS2 syndrome is an 'intellectual disability (ID) syndrome' caused by genomic rearrangements, deletions, intragenic duplications or mutations disrupting AUTS2. So far, 50 patients with AUTS2 syndrome have been described, but clinical data are limited and almost all cases involved young children.

    Methods: We present a detailed clinical description of 13 patients (including six adults) with AUTS2 syndrome who have a pathogenic mutation or deletion in AUTS2. All patients were systematically evaluated by the same clinical geneticist.

    Results: All patients have borderline to severe ID/developmental delay; 83-100% have microcephaly and feeding difficulties. Congenital malformations are rare, but mild heart defects, contractures and genital malformations do occur. There are no major health issues in the adults, the oldest of whom is now 59 years of age. Behaviour is marked by a friendly, outgoing social interaction. Specific features of autism (like obsessive behaviour) are seen frequently (83%), but classical autism was not diagnosed in any. A mild clinical phenotype is associated with small in-frame 5' deletions, which are often inherited. Deletions and other mutations causing haploinsufficiency of the full-length AUTS2 transcript give a more severe phenotype and occur de novo.

    Conclusions: The 13 patients with AUTS2 syndrome with unique pathogenic deletions scattered around the AUTS2 locus confirm a phenotype-genotype correlation. Despite individual variations, AUTS2 syndrome emerges as a specific ID syndrome with microcephaly, feeding difficulties, dysmorphic features and a specific behavioural phenotype.

    Journal of medical genetics 2016;53;8;523-32

  • Sperm Meets Egg: The Genetics of Mammalian Fertilization.

    Bianchi E and Wright GJ

    Cell Surface Signalling Laboratory, Wellcome Trust Sanger Institute, Cambridge, CB10 1SA, United Kingdom.

    Fertilization is the culminating event of sexual reproduction, which involves the union of the sperm and egg to form a single, genetically distinct organism. Despite the fundamental role of fertilization, the basic mechanisms involved have remained poorly understood. However, these mechanisms must involve an ordered schedule of cellular recognition events between the sperm and egg to ensure successful fusion. In this article, we review recent progress in our molecular understanding of mammalian fertilization, highlighting the areas in which genetic approaches have been particularly informative and focusing especially on the roles of secreted and cell surface proteins, expressed in a sex-specific manner, that mediate sperm-egg interactions. We discuss how the sperm interacts with the female reproductive tract, zona pellucida, and the oolemma. Finally, we review recent progress made in elucidating the mechanisms that reduce polyspermy and ensure that eggs normally fuse with only a single sperm. Expected final online publication date for the Annual Review of Genetics Volume 50 is November 23, 2016.

    Annual review of genetics 2016

  • Interferon-driven alterations of the host's amino acid metabolism in the pathogenesis of typhoid fever.

    Blohmke CJ, Darton TC, Jones C, Suarez NM, Waddington CS, Angus B, Zhou L, Hill J, Clare S, Kane L, Mukhopadhyay S, Schreiber F, Duque-Correa MA, Wright JC, Roumeliotis TI, Yu L, Choudhary JS, Mejias A, Ramilo O, Shanyinde M, Sztein MB, Kingsley RA, Lockhart S, Levine MM, Lynn DJ, Dougan G and Pollard AJ

    Oxford Vaccine Group, Department of Paediatrics, University of Oxford and the NIHR Oxford Biomedical Research Centre, Oxford OX3 7LE, England, UK

    Enteric fever, caused by Salmonella enterica serovar Typhi, is an important public health problem in resource-limited settings and, despite decades of research, human responses to the infection are poorly understood. In 41 healthy adults experimentally infected with wild-type S. Typhi, we detected significant cytokine responses within 12 h of bacterial ingestion. These early responses did not correlate with subsequent clinical disease outcomes and likely indicate initial host-pathogen interactions in the gut mucosa. In participants developing enteric fever after oral infection, marked transcriptional and cytokine responses during acute disease reflected dominant type I/II interferon signatures, which were significantly associated with bacteremia. Using a murine and macrophage infection model, we validated the pivotal role of this response in the expression of proteins of the host tryptophan metabolism during Salmonella infection. Corresponding alterations in tryptophan catabolites with immunomodulatory properties in serum of participants with typhoid fever confirmed the activity of this pathway, and implicate a central role of host tryptophan metabolism in the pathogenesis of typhoid fever.

    Funded by: Medical Research Council: MR/M02637X/1; NIAID NIH HHS: R01 AI036525, U01 AI082210, U19 AI057234, U19 AI082655, U19 AI089987, U19 AI109776; Wellcome Trust: 092661

    The Journal of experimental medicine 2016;213;6;1061-77

  • Tissue-specific mutation accumulation in human adult stem cells during life.

    Blokzijl F, de Ligt J, Jager M, Sasselli V, Roerink S, Sasaki N, Huch M, Boymans S, Kuijk E, Prins P, Nijman IJ, Martincorena I, Mokry M, Wiegerinck CL, Middendorp S, Sato T, Schwank G, Nieuwenhuis EE, Verstegen MM, van der Laan LJ, de Jonge J, IJzermans JN, Vries RG, van de Wetering M, Stratton MR, Clevers H, Cuppen E and van Boxtel R

    Center for Molecular Medicine, Cancer Genomics Netherlands, Department of Genetics, University Medical Center Utrecht, Heidelberglaan 100, 3584CX Utrecht, The Netherlands.

    The gradual accumulation of genetic mutations in human adult stem cells (ASCs) during life is associated with various age-related diseases, including cancer. Extreme variation in cancer risk across tissues was recently proposed to depend on the lifetime number of ASC divisions, owing to unavoidable random mutations that arise during DNA replication. However, the rates and patterns of mutations in normal ASCs remain unknown. Here we determine genome-wide mutation patterns in ASCs of the small intestine, colon and liver of human donors with ages ranging from 3 to 87 years by sequencing clonal organoid cultures derived from primary multipotent cells. Our results show that mutations accumulate steadily over time in all of the assessed tissue types, at a rate of approximately 40 novel mutations per year, despite the large variation in cancer incidence among these tissues. Liver ASCs, however, have different mutation spectra compared to those of the colon and small intestine. Mutational signature analysis reveals that this difference can be attributed to spontaneous deamination of methylated cytosine residues in the colon and small intestine, probably reflecting their high ASC division rate. In liver, a signature with an as-yet-unknown underlying mechanism is predominant. Mutation spectra of driver genes in cancer show high similarity to the tissue-specific ASC mutation spectra, suggesting that intrinsic mutational processes in ASCs can initiate tumorigenesis. Notably, the inter-individual variation in mutation rate and spectra is low, suggesting tissue-specific activity of common mutational processes throughout life.

    Funded by: Worldwide Cancer Research: 16-0193

    Nature 2016;538;7624;260-264

  • Mutation Rates and Discriminating Power for 13 Rapidly-Mutating Y-STRs between Related and Unrelated Individuals.

    Boattini A, Sarno S, Bini C, Pesci V, Barbieri C, De Fanti S, Quagliariello A, Pagani L, Ayub Q, Ferri G, Pettener D, Luiselli D and Pelotti S

    Dipartimento di Scienze Biologiche, Geologiche e Ambientali (BiGeA), Università di Bologna, Bologna, Italy.

    Rapidly Mutating Y-STRs (RM Y-STRs) were recently introduced in forensics in order to increase the differentiation of Y-chromosomal profiles even in case of close relatives. We estimate RM Y-STRs mutation rates and their power to discriminate between related individuals by using samples extracted from a wide set of paternal pedigrees and by comparing RM Y-STRs results with those obtained from the Y-filer set. In addition, we tested the ability of RM Y-STRs to discriminate between unrelated individuals carrying the same Y-filer haplotype, using the haplogroup R-M269 (reportedly characterised by a strong resemblance in Y-STR profiles) as a case study. Our results, despite confirming the high mutability of RM Y-STRs, show significantly lower mutation rates than reference germline ones. Consequently, their power to discriminate between related individuals, despite being higher than the one of Y-filer, does not seem to improve significantly the performance of the latter. On the contrary, when considering R-M269 unrelated individuals, RM Y-STRs reveal significant discriminatory power and retain some phylogenetic signal, allowing the correct classification of individuals for some R-M269-derived sub-lineages. These results have important implications not only for forensics, but also for molecular anthropology, suggesting that RM Y-STRs are useful tools for exploring subtle genetic variability within Y-chromosomal haplogroups.

    Funded by: European Research Council: 295733

    PloS one 2016;11;11;e0165678

  • An organelle-specific protein landscape identifies novel diseases and molecular mechanisms.

    Boldt K, van Reeuwijk J, Lu Q, Koutroumpas K, Nguyen TM, Texier Y, van Beersum SE, Horn N, Willer JR, Mans DA, Dougherty G, Lamers IJ, Coene KL, Arts HH, Betts MJ, Beyer T, Bolat E, Gloeckner CJ, Haidari K, Hetterschijt L, Iaconis D, Jenkins D, Klose F, Knapp B, Latour B, Letteboer SJ, Marcelis CL, Mitic D, Morleo M, Oud MM, Riemersma M, Rix S, Terhal PA, Toedt G, van Dam TJ, de Vrieze E, Wissinger Y, Wu KM, Apic G, Beales PL, Blacque OE, Gibson TJ, Huynen MA, Katsanis N, Kremer H, Omran H, van Wijk E, Wolfrum U, Kepes F, Davis EE, Franco B, Giles RH, Ueffing M, Russell RB, Roepman R and UK10K Rare Diseases Group

    Medical Proteome Center, Institute for Ophthalmic Research, University of Tuebingen, 72074 Tuebingen, Germany.

    Cellular organelles provide opportunities to relate biological mechanisms to disease. Here we use affinity proteomics, genetics and cell biology to interrogate cilia: poorly understood organelles, where defects cause genetic diseases. Two hundred and seventeen tagged human ciliary proteins create a final landscape of 1,319 proteins, 4,905 interactions and 52 complexes. Reverse tagging, repetition of purifications and statistical analyses, produce a high-resolution network that reveals organelle-specific interactions and complexes not apparent in larger studies, and links vesicle transport, the cytoskeleton, signalling and ubiquitination to ciliary signalling and proteostasis. We observe sub-complexes in exocyst and intraflagellar transport complexes, which we validate biochemically, and by probing structurally predicted, disruptive, genetic variants from ciliary disease patients. The landscape suggests other genetic diseases could be ciliary including 3M syndrome. We show that 3M genes are involved in ciliogenesis, and that patient fibroblasts lack cilia. Overall, this organelle-specific targeting strategy shows considerable promise for Systems Medicine.

    Funded by: NEI NIH HHS: R01 EY021872; NICHD NIH HHS: R01 HD042601; NIDDK NIH HHS: R01 DK072301, R01 DK075972; NIGMS NIH HHS: R01 GM121317

    Nature communications 2016;7;11491

  • A DNA target-enrichment approach to detect mutations, copy number changes and immunoglobulin translocations in multiple myeloma.

    Bolli N, Li Y, Sathiaseelan V, Raine K, Jones D, Ganly P, Cocito F, Bignell G, Chapman MA, Sperling AS, Anderson KC, Avet-Loiseau H, Minvielle S, Campbell PJ and Munshi NC

    Cancer Genome Project, Wellcome Trust Sanger Institute, Cambridge, UK.

    Genomic lesions are not investigated during routine diagnostic workup for multiple myeloma (MM). Cytogenetic studies are performed to assess prognosis but with limited impact on therapeutic decisions. Recently, several recurrently mutated genes have been described, but their clinical value remains to be defined. Therefore, clinical-grade strategies to investigate the genomic landscape of myeloma samples are needed to integrate new and old prognostic markers. We developed a target-enrichment strategy followed by next-generation sequencing (NGS) to streamline simultaneous analysis of gene mutations, copy number changes and immunoglobulin heavy chain (IGH) translocations in MM in a high-throughput manner, and validated it in a panel of cell lines. We identified 548 likely oncogenic mutations in 182 genes. By integrating published data sets of NGS in MM, we retrieved a list of genes with significant relevance to myeloma and found that the mutational spectrum of primary samples and MM cell lines is partially overlapping. Gains and losses of chromosomes, chromosomal segments and gene loci were identified with accuracy comparable to conventional arrays, allowing identification of lesions with known prognostic significance. Furthermore, we identified IGH translocations with high positive and negative predictive value. Our approach could allow the identification of novel biomarkers with clinical relevance in myeloma.

    Funded by: NCI NIH HHS: P01 CA155258, P50 CA100707; Wellcome Trust: 077012/Z/05/Z

    Blood cancer journal 2016;6;9;e467

  • Mouse model of chromosome mosaicism reveals lineage-specific depletion of aneuploid cells and normal developmental potential.

    Bolton H, Graham SJL, Van der Aa N, Kumar P, Theunis K, Fernandez Gallardo E, Voet T and Zernicka-Goetz M

    Department of Physiology, Development and Neuroscience and Gurdon Institute, University of Cambridge, Downing Street, Cambridge CB2 3EG, UK.

    Most human pre-implantation embryos are mosaics of euploid and aneuploid cells. To determine the fate of aneuploid cells and the developmental potential of mosaic embryos, here we generate a mouse model of chromosome mosaicism. By treating embryos with a spindle assembly checkpoint inhibitor during the four- to eight-cell division, we efficiently generate aneuploid cells, resulting in embryo death during peri-implantation development. Live-embryo imaging and single-cell tracking in chimeric embryos, containing aneuploid and euploid cells, reveal that the fate of aneuploid cells depends on lineage: aneuploid cells in the fetal lineage are eliminated by apoptosis, whereas those in the placental lineage show severe proliferative defects. Overall, the proportion of aneuploid cells is progressively depleted from the blastocyst stage onwards. Finally, we show that mosaic embryos have full developmental potential, provided they contain sufficient euploid cells, a finding of significance for the assessment of embryo vitality in the clinic.

    Funded by: Wellcome Trust

    Nature communications 2016;7;11165

  • The influence of a short-term gluten-free diet on the human gut microbiome.

    Bonder MJ, Tigchelaar EF, Cai X, Trynka G, Cenit MC, Hrdlickova B, Zhong H, Vatanen T, Gevers D, Wijmenga C, Wang Y and Zhernakova A

    Department of Genetics, University of Groningen, University Medical Centre Groningen, Groningen, The Netherlands.

    Background: A gluten-free diet (GFD) is the most commonly adopted special diet worldwide. It is an effective treatment for coeliac disease and is also often followed by individuals to alleviate gastrointestinal complaints. It is known there is an important link between diet and the gut microbiome, but it is largely unknown how a switch to a GFD affects the human gut microbiome.

    Methods: We studied changes in the gut microbiomes of 21 healthy volunteers who followed a GFD for four weeks. We collected nine stool samples from each participant: one at baseline, four during the GFD period, and four when they returned to their habitual diet (HD), making a total of 189 samples. We determined microbiome profiles using 16S rRNA sequencing and then processed the samples for taxonomic and imputed functional composition. Additionally, in all 189 samples, six gut health-related biomarkers were measured.

    Results: Inter-individual variation in the gut microbiota remained stable during this short-term GFD intervention. A number of taxon-specific differences were seen during the GFD: the most striking shift was seen for the family Veillonellaceae (class Clostridia), which was significantly reduced during the intervention (p = 2.81 × 10<sup>-05</sup>). Seven other taxa also showed significant changes; the majority of them are known to play a role in starch metabolism. We saw stronger differences in pathway activities: 21 predicted pathway activity scores showed significant association to the change in diet. We observed strong relations between the predicted activity of pathways and biomarker measurements.

    Conclusions: A GFD changes the gut microbiome composition and alters the activity of microbial pathways.

    Funded by: European Research Council: 322698; Wellcome Trust: WT098051

    Genome medicine 2016;8;1;45

  • Chromosome engineering in zygotes with CRISPR/Cas9.

    Boroviak K, Doe B, Banerjee R, Yang F and Bradley A

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, Cambridge, United Kingdom.

    Deletions, duplications, and inversions of large genomic regions covering several genes are an important class of disease-causing variants in humans. Modeling these structural variants in mice requires multistep processes in ES cells, which has limited their availability. Mutant mice containing small insertions, deletions, and single nucleotide polymorphisms can be reliably generated using CRISPR/Cas9 directly in mouse zygotes. Large structural variants can be generated using CRISPR/Cas9 in ES cells, but it has not been possible to generate these directly in zygotes. We now demonstrate the direct generation of deletions, duplications and inversions of up to one million base pairs by zygote injection.

    Funded by: NIH HHS: U42OD011174; Wellcome Trust: WT098051

    Genesis (New York, N.Y. : 2000) 2016;54;2;78-85

  • Complete Genome Sequence of MIDG2331, a Genetically Tractable Serovar 8 Clinical Isolate of Actinobacillus pleuropneumoniae.

    Bossé JT, Chaudhuri RR, Li Y, Leanse LG, Fernandez Crespo R, Coupland P, Holden MT, Bazzolli DM, Maskell DJ, Tucker AW, Wren BW, Rycroft AN and Langford PR

    Department of Medicine, Section of Paediatrics, Imperial College London, London, United Kingdom.

    We report here the complete annotated genome sequence of a clinical serovar 8 isolate Actinobacillus pleuropneumoniae MIDG2331. Unlike the serovar 8 reference strain 405, MIDG2331 is amenable to genetic manipulation via natural transformation as well as conjugation, making it ideal for studies of gene function.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/G018553/1, BB/G019177/1, BB/G019274/1, BB/G020744/1, EGA16167

    Genome announcements 2016;4;1

  • ICEApl1, an Integrative Conjugative Element Related to ICEHin1056, Identified in the Pig Pathogen Actinobacillus pleuropneumoniae.

    Bossé JT, Li Y, Fernandez Crespo R, Chaudhuri RR, Rogers J, Holden MT, Maskell DJ, Tucker AW, Wren BW, Rycroft AN, Langford PR and the BRaDP1T Consortium

    Section of Paediatrics, Department of Medicine, Imperial College London, London, UK.

    ICEApl1 was identified in the whole genome sequence of MIDG2331, a tetracycline-resistant (MIC = 8 mg/L) serovar 8 clinical isolate of Actinobacillus pleuropneumoniae, the causative agent of porcine pleuropneumonia. PCR amplification of virB4, one of the core genes involved in conjugation, was used to identify other A. pleuropneumoniae isolates potentially carrying ICEApl1. MICs for tetracycline were determined for virB4 positive isolates, and shotgun whole genome sequence analysis was used to confirm presence of the complete ICEApl1. The sequence of ICEApl1 is 56083 bp long and contains 67 genes including a Tn10 element encoding tetracycline resistance. Comparative sequence analysis was performed with similar integrative conjugative elements (ICEs) found in other members of the Pasteurellaceae. ICEApl1 is most similar to the 59393 bp ICEHin1056, from Haemophilus influenzae strain 1056. Although initially identified only in serovar 8 isolates of A. pleuropneumoniae (31 from the UK and 1 from Cyprus), conjugal transfer of ICEApl1 to representative isolates of other serovars was confirmed. All isolates carrying ICEApl1 had a MIC for tetracycline of 8 mg/L. This is, to our knowledge, the first description of an ICE in A. pleuropneumoniae, and the first report of a member of the ICEHin1056 subfamily in a non-human pathogen. ICEApl1 confers resistance to tetracycline, currently one of the more commonly used antibiotics for treatment and control of porcine pleuropneumonia.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/G018553/1, BB/G019177/1, BB/G019274/1, BB/G020744/1

    Frontiers in microbiology 2016;7;810

  • GFI1(36N) as a therapeutic and prognostic marker for myelodysplastic syndrome.

    Botezatu L, Michel LC, Makishima H, Schroeder T, Germing U, Haas R, van der Reijden B, Marneth AE, Bergevoet SM, Jansen JH, Przychodzen B, Wlodarski M, Niemeyer C, Platzbecker U, Ehninger G, Unnikrishnan A, Beck D, Pimanda J, Hellström-Lindberg E, Malcovati L, Boultwood J, Pellagatti A, Papaemmanuil E, Le Coutre P, Kaeda J, Opalka B, Möröy T, Dührsen U, Maciejewski J and Khandanpour C

    Department of Hematology, West German Cancer Center, University Hospital Essen, University Duisburg-Essen, Essen, Germany.

    Inherited gene variants play an important role in malignant diseases. The transcriptional repressor growth factor independence 1 (GFI1) regulates hematopoietic stem cell (HSC) self-renewal and differentiation. A single-nucleotide polymorphism of GFI1 (rs34631763) generates a protein with an asparagine (N) instead of a serine (S) at position 36 (GFI1(36N)) and has a prevalence of 3%-5% among Caucasians. Because GFI1 regulates myeloid development, we examined the role of GFI1(36N) on the course of MDS disease. To this end, we determined allele frequencies of GFI1(36N) in four independent MDS cohorts from the Netherlands and Belgium, Germany, the ICGC consortium, and the United States. The GFI1(36N) allele frequency in the 723 MDS patients genotyped ranged between 9% and 12%. GFI1(36N) was an independent adverse prognostic factor for overall survival, acute myeloid leukemia-free survival, and event-free survival in a univariate analysis. After adjustment for age, bone marrow blast percentage, IPSS score, mutational status, and cytogenetic findings, GFI1(36N) remained an independent adverse prognostic marker. GFI1(36S) homozygous patients exhibited a sustained response to treatment with hypomethylating agents, whereas GFI1(36N) patients had a poor sustained response to this therapy. Because allele status of GFI1(36N) is readily determined using basic molecular techniques, we propose inclusion of GFI1(36N) status in future prospective studies for MDS patients to better predict prognosis and guide therapeutic decisions.

    Funded by: CIHR: MOP-111011, MOP-84238

    Experimental hematology 2016;44;7;590-595.e1

  • A multi-factorial analysis of response to warfarin in a UK prospective cohort.

    Bourgeois S, Jorgensen A, Zhang EJ, Hanson A, Gillman MS, Bumpstead S, Toh CH, Williamson P, Daly AK, Kamali F, Deloukas P and Pirmohamed M

    Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.

    Background: Warfarin is the most widely used oral anticoagulant worldwide, but it has a narrow therapeutic index which necessitates constant monitoring of anticoagulation response. Previous genome-wide studies have focused on identifying factors explaining variance in stable dose, but have not explored the initial patient response to warfarin, and a wider range of clinical and biochemical factors affecting both initial and stable dosing with warfarin.

    Methods: A prospective cohort of 711 patients starting warfarin was followed up for 6 months with analyses focusing on both non-genetic and genetic factors. The outcome measures used were mean weekly warfarin dose (MWD), stable mean weekly dose (SMWD) and international normalised ratio (INR) > 4 during the first week. Samples were genotyped on the Illumina Human610-Quad chip. Statistical analyses were performed using Plink and R.

    Results: VKORC1 and CYP2C9 were the major genetic determinants of warfarin MWD and SMWD, with CYP4F2 having a smaller effect. Age, height, weight, cigarette smoking and interacting medications accounted for less than 20 % of the variance. Our multifactorial analysis explained 57.89 % and 56.97 % of the variation for MWD and SMWD, respectively. Genotypes for VKORC1 and CYP2C9*3, age, height and weight, as well as other clinical factors such as alcohol consumption, loading dose and concomitant drugs were important for the initial INR response to warfarin. In a small subset of patients for whom data were available, levels of the coagulation factors VII and IX (highly correlated) also played a role.

    Conclusion: Our multifactorial analysis in a prospectively recruited cohort has shown that multiple factors, genetic and clinical, are important in determining the response to warfarin. VKORC1 and CYP2C9 genetic polymorphisms are the most important determinants of warfarin dosing, and it is highly unlikely that other common variants of clinical importance influencing warfarin dosage will be found. Both VKORC1 and CYP2C9*3 are important determinants of the initial INR response to warfarin. Other novel variants, which did not reach genome-wide significance, were identified for the different outcome measures, but need replication.

    Funded by: British Heart Foundation: RG/14/5/30893; Department of Health; Medical Research Council: G0700654; Wellcome Trust

    Genome medicine 2016;8;1;2

  • The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons.

    Braasch I, Gehrke AR, Smith JJ, Kawasaki K, Manousaki T, Pasquier J, Amores A, Desvignes T, Batzel P, Catchen J, Berlin AM, Campbell MS, Barrell D, Martin KJ, Mulley JF, Ravi V, Lee AP, Nakamura T, Chalopin D, Fan S, Wcisel D, Cañestro C, Sydes J, Beaudry FE, Sun Y, Hertel J, Beam MJ, Fasold M, Ishiyama M, Johnson J, Kehr S, Lara M, Letaw JH, Litman GW, Litman RT, Mikami M, Ota T, Saha NR, Williams L, Stadler PF, Wang H, Taylor JS, Fontenot Q, Ferrara A, Searle SM, Aken B, Yandell M, Schneider I, Yoder JA, Volff JN, Meyer A, Amemiya CT, Venkatesh B, Holland PW, Guiguen Y, Bobe J, Shubin NH, Di Palma F, Alföldi J, Lindblad-Toh K and Postlethwait JH

    Institute of Neuroscience, University of Oregon, Eugene, Oregon, USA.

    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.

    Nature genetics 2016

  • Maternal DNA Methylation Regulates Early Trophoblast Development.

    Branco MR, King M, Perez-Garcia V, Bogutz AB, Caley M, Fineberg E, Lefebvre L, Cook SJ, Dean W, Hemberger M and Reik W

    Blizard Institute, Barts and The London School of Medicine and Dentistry, QMUL, London E1 2AT, UK.

    Critical roles for DNA methylation in embryonic development are well established, but less is known about its roles during trophoblast development, the extraembryonic lineage that gives rise to the placenta. We dissected the role of DNA methylation in trophoblast development by performing mRNA and DNA methylation profiling of Dnmt3a/3b mutants. We find that oocyte-derived methylation plays a major role in regulating trophoblast development but that imprinting of the key placental regulator Ascl2 is only partially responsible for these effects. We have identified several methylation-regulated genes associated with trophoblast differentiation that are involved in cell adhesion and migration, potentially affecting trophoblast invasion. Specifically, trophoblast-specific DNA methylation is linked to the silencing of Scml2, a Polycomb Repressive Complex 1 protein that drives loss of cell adhesion in methylation-deficient trophoblast. Our results reveal that maternal DNA methylation controls multiple differentiation-related and physiological processes in trophoblast via both imprinting-dependent and -independent mechanisms.

    Funded by: Biotechnology and Biological Sciences Research Council: BBS/E/B/000C0417; Canadian Institutes of Health Research: MOP-119357; Medical Research Council: MR/L00027X/1; Wellcome Trust: 095645, 101225, 101225/Z/13/Z

    Developmental cell 2016;36;2;152-63

  • eFORGE: A Tool for Identifying Cell Type-Specific Signal in Epigenomic Data.

    Breeze CE, Paul DS, van Dongen J, Butcher LM, Ambrose JC, Barrett JE, Lowe R, Rakyan VK, Iotchkova V, Frontini M, Downes K, Ouwehand WH, Laperle J, Jacques PÉ, Bourque G, Bergmann AK, Siebert R, Vellenga E, Saeed S, Matarese F, Martens JHA, Stunnenberg HG, Teschendorff AE, Herrero J, Birney E, Dunham I and Beck S

    UCL Cancer Institute, University College London, London WC1E 6BT, UK.

    Epigenome-wide association studies (EWAS) provide an alternative approach for studying human disease through consideration of non-genetic variants such as altered DNA methylation. To advance the complex interpretation of EWAS, we developed eFORGE, a new standalone and web-based tool for the analysis and interpretation of EWAS data. eFORGE determines the cell type-specific regulatory component of a set of EWAS-identified differentially methylated positions. This is achieved by detecting enrichment of overlap with DNase I hypersensitive sites across 454 samples (tissues, primary cell types, and cell lines) from the ENCODE, Roadmap Epigenomics, and BLUEPRINT projects. Application of eFORGE to 20 publicly available EWAS datasets identified disease-relevant cell types for several common diseases, a stem cell-like signature in cancer, and demonstrated the ability to detect cell-composition effects for EWAS performed on heterogeneous tissues. Our approach bridges the gap between large-scale epigenomics data and EWAS-derived target selection to yield insight into disease etiology.

    Funded by: British Heart Foundation: RG/09/012/28096; Department of Health: RP-PG-0310-1002; Medical Research Council: G0800270; Wellcome Trust: 99148

    Cell reports 2016;17;8;2137-2150

  • Efficient identification of CRISPR/Cas9-induced insertions/deletions by direct germline screening in zebrafish.

    Brocal I, White RJ, Dooley CM, Carruthers SN, Clark R, Hall A, Busch-Nentwich EM, Stemple DL and Kettleborough RN

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Background: The CRISPR/Cas9 system is a prokaryotic immune system that confers resistance to foreign genetic material and is a sort of 'adaptive immunity'. It has been adapted to enable high throughput genome editing and has revolutionised the generation of targeted mutations.

    Results: We have developed a scalable analysis pipeline to identify CRISPR/Cas9-induced mutations in hundreds of samples using next generation sequencing (NGS) of amplicons. We have used this system to investigate the best way to screen mosaic zebrafish founder individuals for germline transmission of induced mutations. Screening sperm samples from potential founders provides much better information on germline transmission rates and, crucially, the sequence of the particular insertions/deletions (indels) that will be transmitted. This enables us to combine screening with archiving to create a library of cryopreserved samples carrying known mutations. It also allows us to design efficient genotyping assays, making identifying F1 carriers straightforward.

    Conclusions: The methods described will streamline the production of large numbers of knockout alleles in selected genes for phenotypic analysis, complementing existing efforts using random mutagenesis.

    Funded by: Wellcome Trust: WT098051

    BMC genomics 2016;17;259

  • Calcium signalling in malaria parasites.

    Brochet M and Billker O

    Faculty of Medicine, University of Geneva, 1 Rue Michel-Servet, CH-1211 Geneva 4, Switzerland.

    Ca(2+) is a ubiquitous intracellular messenger in malaria parasites with important functions in asexual blood stages responsible for malaria symptoms, the preceding liver-stage infection and transmission through the mosquito. Intracellular messengers amplify signals by binding to effector molecules that trigger physiological changes. The characterisation of some Ca(2+) effector proteins has begun to provide insights into the vast range of biological processes controlled by Ca(2+) signalling in malaria parasites, including host cell egress and invasion, protein secretion, motility, and cell cycle regulation. Despite the importance of Ca(2+) signalling during the life cycle of malaria parasites, little is known about Ca(2+) homeostasis. Recent findings highlighted that upstream of stage-specific Ca(2+) effectors is a conserved interplay between second messengers to control critical intracellular Ca(2+) signals throughout the life cycle. The identification of the molecular mechanisms integrating stage-transcending mechanisms of Ca(2+) homeostasis in a network of stage-specific regulator and effector pathways now represents a major challenge for a meaningful understanding of Ca(2+) signalling in malaria parasites.

    Molecular microbiology 2016

  • Whole-genome sequencing reveals transmission of vancomycin-resistant Enterococcus faecium in a healthcare network.

    Brodrick HJ, Raven KE, Harrison EM, Blane B, Reuter S, Török ME, Parkhill J and Peacock SJ

    Department of Medicine, University of Cambridge, Addenbrooke's Hospital, Box 157, Hills Road, Cambridge, CB2 0QQ, UK.

    Background: Bacterial whole-genome sequencing (WGS) has the potential to identify reservoirs of multidrug-resistant organisms and transmission of these pathogens across healthcare networks. We used WGS to define transmission of vancomycin-resistant enterococci (VRE) within a long-term care facility (LTCF), and between this and an acute hospital in the United Kingdom (UK).

    Methods: A longitudinal prospective observational study of faecal VRE carriage was conducted in a LTCF in Cambridge, UK. Stool samples were collected at recruitment, and then repeatedly until the end of the study period, discharge or death. Selective culture media were used to isolate VRE, which were subsequently sequenced and analysed. We also analysed the genomes of 45 Enterococcus faecium bloodstream isolates collected at Cambridge University Hospitals NHS Foundation Trust (CUH).

    Results: Forty-five residents were recruited during a 6-month period in 2014, and 693 stool samples were collected at intervals of at least 1 week. Fifty-one stool samples from 3/45 participants (7%) were positive for vancomycin-resistant E. faecium. Two residents carried multiple VRE lineages, and one carried a single VRE lineage. Genome analyses based on single nucleotide polymorphisms (SNPs) in the core genome indicated that VRE carried by each of the three residents were unrelated. Participants had extensive contact with the local healthcare network. We found that VRE genomes from LTCF residents and hospital-associated bloodstream infection were interspersed throughout the phylogenetic tree, with several instances of closely related VRE strains from the two settings.

    Conclusions: A proportion of LTCF residents are long-term carriers of VRE. Evidence for genetic relatedness between these and VRE associated with bloodstream infection in a nearby acute NHS Trust indicates a shared bacterial population.

    Funded by: Department of Health: HICF-T5-342; Wellcome Trust: 098600

    Genome medicine 2016;8;1;4

  • Quantitative insertion-site sequencing (QIseq) for high throughput phenotyping of transposon mutants.

    Bronner IF, Otto TD, Zhang M, Udenze K, Wang C, Quail MA, Jiang RH, Adams JH and Rayner JC

    Malaria Programme, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridgeshire CB10 1SA, United Kingdom.

    Genetic screening using random transposon insertions has been a powerful tool for uncovering biology in prokaryotes, where whole-genome saturating screens have been performed in multiple organisms. In eukaryotes, such screens have proven more problematic, in part because of the lack of a sensitive and robust system for identifying transposon insertion sites. We here describe quantitative insertion-site sequencing, or QIseq, which uses custom library preparation and Illumina sequencing technology and is able to identify insertion sites from both the 5' and 3' ends of the transposon, providing an inbuilt level of validation. The approach was developed using piggyBac mutants in the human malaria parasite Plasmodium falciparum but should be applicable to many other eukaryotic genomes. QIseq proved accurate, confirming known sites in >100 mutants, and sensitive, identifying and monitoring sites over a >10,000-fold dynamic range of sequence counts. Applying QIseq to uncloned parasites shortly after transfections revealed multiple insertions in mixed populations and suggests that >4000 independent mutants could be generated from relatively modest scales of transfection, providing a clear pathway to genome-scale screens in P. falciparum. QIseq was also used to monitor the growth of pools of previously cloned mutants and reproducibly differentiated between deleterious and neutral mutations in competitive growth. Among the mutants with fitness defects was a mutant with a piggyBac insertion immediately upstream of the kelch protein K13 gene associated with artemisinin resistance, implying mutants in this gene may have competitive fitness costs. QIseq has the potential to enable the scale-up of piggyBac-mediated genetics across multiple eukaryotic systems.

    Genome research 2016;26;7;980-9

  • Antibiotics, gut bugs and the young.

    Browne H

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Two recent studies have investigated the effects of antibiotic use on the intestinal microbiota of preterm infants and young children.

    Nature reviews. Microbiology 2016;14;6;336

  • Culturing of 'unculturable' human microbiota reveals novel taxa and extensive sporulation.

    Browne HP, Forster SC, Anonye BO, Kumar N, Neville BA, Stares MD, Goulding D and Lawley TD

    Host-Microbiota Interactions Laboratory, Wellcome Trust Sanger Institute, Hinxton, UK.

    Our intestinal microbiota harbours a diverse bacterial community required for our health, sustenance and wellbeing. Intestinal colonization begins at birth and climaxes with the acquisition of two dominant groups of strict anaerobic bacteria belonging to the Firmicutes and Bacteroidetes phyla. Culture-independent, genomic approaches have transformed our understanding of the role of the human microbiome in health and many diseases. However, owing to the prevailing perception that our indigenous bacteria are largely recalcitrant to culture, many of their functions and phenotypes remain unknown. Here we describe a novel workflow based on targeted phenotypic culturing linked to large-scale whole-genome sequencing, phylogenetic analysis and computational modelling that demonstrates that a substantial proportion of the intestinal bacteria are culturable. Applying this approach to healthy individuals, we isolated 137 bacterial species from characterized and candidate novel families, genera and species that were archived as pure cultures. Whole-genome and metagenomic sequencing, combined with computational and phenotypic analysis, suggests that at least 50-60% of the bacterial genera from the intestinal microbiota of a healthy individual produce resilient spores, specialized for host-to-host transmission. Our approach unlocks the human intestinal microbiota for phenotypic analysis and reveals how a marked proportion of oxygen-sensitive intestinal bacteria can be transmitted between individuals, affecting microbiota heritability.

    Funded by: Medical Research Council: G1000214, MR/K000551/1, PF451; Wellcome Trust: 098051

    Nature 2016;533;7604;543-546

  • A Biobank of Breast Cancer Explants with Preserved Intra-tumor Heterogeneity to Screen Anticancer Compounds.

    Bruna A, Rueda OM, Greenwood W, Batra AS, Callari M, Batra RN, Pogrebniak K, Sandoval J, Cassidy JW, Tufegdzic-Vidakovic A, Sammut SJ, Jones L, Provenzano E, Baird R, Eirew P, Hadfield J, Eldridge M, McLaren-Douglas A, Barthorpe A, Lightfoot H, O'Connor MJ, Gray J, Cortes J, Baselga J, Marangoni E, Welm AL, Aparicio S, Serra V, Garnett MJ and Caldas C

    Department of Oncology and Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge CB2 0RE, UK.

    The inter- and intra-tumor heterogeneity of breast cancer needs to be adequately captured in pre-clinical models. We have created a large collection of breast cancer patient-derived tumor xenografts (PDTXs), in which the morphological and molecular characteristics of the originating tumor are preserved through passaging in the mouse. An integrated platform combining in vivo maintenance of these PDTXs along with short-term cultures of PDTX-derived tumor cells (PDTCs) was optimized. Remarkably, the intra-tumor genomic clonal architecture present in the originating breast cancers was mostly preserved upon serial passaging in xenografts and in short-term cultured PDTCs. We assessed drug responses in PDTCs on a high-throughput platform and validated several ex vivo responses in vivo. The biobank represents a powerful resource for pre-clinical breast cancer pharmacogenomic studies, including identification of biomarkers of response or resistance.

    Funded by: Cancer Research UK: 9675; Marie Skłodowska-Curie Individual Fellowships: 660060; Medical Research Council: MR/M008975/1; NCI NIH HHS: P30 CA008748, R01 CA166422; Wellcome Trust

    Cell 2016;167;1;260-274.e22

  • Sugar-sweetened beverage consumption and genetic predisposition to obesity in 2 Swedish cohorts.

    Brunkwall L, Chen Y, Hindy G, Rukh G, Ericson U, Barroso I, Johansson I, Franks PW, Orho-Melander M and Renström F

    Diabetes and Cardiovascular Disease-Genetic Epidemiology and.

    Background: The consumption of sugar-sweetened beverages (SSBs), which has increased substantially during the last decades, has been associated with obesity and weight gain.

    Objective: Common genetic susceptibility to obesity has been shown to modify the association between SSB intake and obesity risk in 3 prospective cohorts from the United States. We aimed to replicate these findings in 2 large Swedish cohorts.

    Design: Data were available for 21,824 healthy participants from the Malmö Diet and Cancer study and 4902 healthy participants from the Gene-Lifestyle Interactions and Complex Traits Involved in Elevated Disease Risk Study. Self-reported SSB intake was categorized into 4 levels (seldom, low, medium, and high). Unweighted and weighted genetic risk scores (GRSs) were constructed based on 30 body mass index [(BMI) in kg/m(2)]-associated loci, and effect modification was assessed in linear regression equations by modeling the product and marginal effects of the GRS and SSB intake adjusted for age-, sex-, and cohort-specific covariates, with BMI as the outcome. In a secondary analysis, models were additionally adjusted for putative confounders (total energy intake, alcohol consumption, smoking status, and physical activity).

    Results: In an inverse variance-weighted fixed-effects meta-analysis, each SSB intake category increment was associated with a 0.18 higher BMI (SE = 0.02; P = 1.7 × 10(-20); n = 26,726). In the fully adjusted model, a nominally significant interaction between SSB intake category and the unweighted GRS was observed (P-interaction = 0.03). Among participants within the top and bottom quartiles of the GRS, each increment in SSB intake was associated with 0.24 (SE = 0.04; P = 2.9 × 10(-8); n = 6766) and 0.15 (SE = 0.04; P = 1.3 × 10(-4); n = 6835) higher BMIs, respectively.

    Conclusions: The interaction observed in the Swedish cohorts is similar in magnitude to the previous analysis in US cohorts and indicates that the relation of SSB intake and BMI is stronger in people genetically predisposed to obesity.
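
    As an illustration of the pooling step named in the Results above, the following is a minimal sketch of an inverse variance-weighted fixed-effects meta-analysis. It is not the authors' code. The function name and the per-cohort numbers are hypothetical and are not taken from the study; each cohort estimate is weighted by 1/SE^2, and the pooled standard error is the square root of the reciprocal of the summed weights.

```python
import math


def fixed_effects_meta(estimates, standard_errors):
    """Pool per-cohort effect estimates using inverse variance weights."""
    if not estimates or len(estimates) != len(standard_errors):
        raise ValueError("need one standard error per estimate")
    # Each cohort is weighted by the reciprocal of its sampling variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical per-cohort effects and standard errors, for illustration only.
pooled, pooled_se = fixed_effects_meta([0.30, 0.10], [0.1, 0.2])
print(f"pooled effect = {pooled:.3f}, SE = {pooled_se:.3f}")
```

    The more precise cohort (smaller standard error) dominates the pooled estimate, which is the defining property of the fixed-effects approach.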

    Funded by: Wellcome Trust

    The American journal of clinical nutrition 2016;104;3;809-15

  • Emergence and spread of a human-transmissible multidrug-resistant nontuberculous mycobacterium.

    Bryant JM, Grogono DM, Rodriguez-Rincon D, Everall I, Brown KP, Moreno P, Verma D, Hill E, Drijkoningen J, Gilligan P, Esther CR, Noone PG, Giddings O, Bell SC, Thomson R, Wainwright CE, Coulter C, Pandey S, Wood ME, Stockwell RE, Ramsay KA, Sherrard LJ, Kidd TJ, Jabbour N, Johnson GR, Knibbs LD, Morawska L, Sly PD, Jones A, Bilton D, Laurenson I, Ruddy M, Bourke S, Bowler IC, Chapman SJ, Clayton A, Cullen M, Daniels T, Dempsey O, Denton M, Desai M, Drew RJ, Edenborough F, Evans J, Folb J, Humphrey H, Isalska B, Jensen-Fangel S, Jönsson B, Jones AM, Katzenstein TL, Lillebaek T, MacGregor G, Mayell S, Millar M, Modha D, Nash EF, O'Brien C, O'Brien D, Ohri C, Pao CS, Peckham D, Perrin F, Perry A, Pressler T, Prtak L, Qvist T, Robb A, Rodgers H, Schaffer K, Shafi N, van Ingen J, Walshaw M, Watson D, West N, Whitehouse J, Haworth CS, Harris SR, Ordway D, Parkhill J and Floto RA

    Wellcome Trust Sanger Institute, Hinxton, UK.

    Lung infections with Mycobacterium abscessus, a species of multidrug-resistant nontuberculous mycobacteria, are emerging as an important global threat to individuals with cystic fibrosis (CF), in whom M. abscessus accelerates inflammatory lung damage, leading to increased morbidity and mortality. Previously, M. abscessus was thought to be independently acquired by susceptible individuals from the environment. However, using whole-genome analysis of a global collection of clinical isolates, we show that the majority of M. abscessus infections are acquired through transmission, potentially via fomites and aerosols, of recently emerged dominant circulating clones that have spread globally. We demonstrate that these clones are associated with worse clinical outcomes, show increased virulence in cell-based and mouse infection models, and thus represent an urgent international infection challenge.

    Funded by: Medical Research Council: G1001712; Wellcome Trust

    Science (New York, N.Y.) 2016;354;6313;751-757

  • Phylogenomic exploration of the relationships between strains of Mycobacterium avium subspecies paratuberculosis.

    Bryant JM, Thibault VC, Smith DG, McLuckie J, Heron I, Sevilla IA, Biet F, Harris SR, Maskell DJ, Bentley SD, Parkhill J and Stevenson K

    Wellcome Trust Sanger Institute, Genome Campus, Cambridge, UK.

    Background: Mycobacterium avium subspecies paratuberculosis (Map) is an infectious enteric pathogen that causes Johne's disease in livestock. Determining genetic diversity is a prerequisite to understanding the epidemiology and biology of Map. We performed the first whole genome sequencing (WGS) of 141 global Map isolates that encompass the main molecular strain types currently reported. We investigated the phylogeny of the Map strains, the diversity of the genome and the limitations of commonly used genotyping methods.

    Results: Single nucleotide polymorphism (SNP) and phylogenetic analyses confirmed two major lineages concordant with the former Type S and Type C designations. The Type I and Type III strain groups are subtypes of Type S, and Type B strains are a subtype of Type C and not restricted to Bison species. We found that the genome-wide SNPs detected provided greater resolution between isolates than currently employed genotyping methods. Furthermore, the SNP used for IS1311 typing is not informative, as it is likely to have occurred after Type S and C strains diverged and does not assign all strains to the correct lineage. Mycobacterial Interspersed Repetitive Unit-Variable Number Tandem Repeat (MIRU-VNTR) differentiates Type S from Type C but provides limited resolution between isolates within these lineages and the polymorphisms detected do not necessarily accurately reflect the phylogenetic relationships between strains. WGS of passaged strains and coalescent analysis of the collection revealed a very high level of genetic stability, with the substitution rate estimated to be less than 0.5 SNPs per genome per year.

    Conclusions: This study clarifies the phylogenetic relationships between the previously described Map strain groups, and highlights the limitations of current genotyping techniques. Map isolates exhibit restricted genetic diversity and a substitution rate consistent with a monomorphic pathogen. WGS provides the ultimate level of resolution for differentiation between strains. WGS alone will not be sufficient for tracing and tracking Map infections, but it can provide a phylogenetic context for affirming epidemiological connections.

    Funded by: Wellcome Trust: 098051

    BMC genomics 2016;17;79

  • Wbp2 is required for normal glutamatergic synapses in the cochlea and is crucial for hearing.

    Buniello A, Ingham NJ, Lewis MA, Huma AC, Martinez-Vega R, Varela-Nieto I, Vizcay-Barrena G, Fleck RA, Houston O, Bardhan T, Johnson SL, White JK, Yuan H, Marcotti W and Steel KP

    Wolfson Centre For Age-Related Diseases, King's College London, London, UK; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK.

    WBP2 encodes the WW domain-binding protein 2 that acts as a transcriptional coactivator for estrogen receptor α (ESR1) and progesterone receptor (PGR). We reported that the loss of Wbp2 expression leads to progressive high-frequency hearing loss in mouse, as well as in two deaf children, each carrying two different variants in the WBP2 gene. The earliest abnormality we detect in Wbp2-deficient mice is a primary defect at inner hair cell afferent synapses. This study defines a new gene involved in the molecular pathway linking hearing impairment to hormonal signalling and provides new therapeutic targets.

    Funded by: Medical Research Council: G0300212, MC_QA137918; Wellcome Trust: 098051, 100669, 102892

    EMBO molecular medicine 2016;8;3;191-207

  • Mitochondrial Protein Lipoylation and the 2-Oxoglutarate Dehydrogenase Complex Controls HIF1α Stability in Aerobic Conditions.

    Burr SP, Costa AS, Grice GL, Timms RT, Lobb IT, Freisinger P, Dodd RB, Dougan G, Lehner PJ, Frezza C and Nathan JA

    Department of Medicine, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, CB2 0XY, UK.

    Hypoxia-inducible transcription factors (HIFs) control adaptation to low oxygen environments by activating genes involved in metabolism, angiogenesis, and redox homeostasis. The finding that HIFs are also regulated by small molecule metabolites highlights the need to understand the complexity of their cellular regulation. Here we use a forward genetic screen in near-haploid human cells to identify genes that stabilize HIFs under aerobic conditions. We identify two mitochondrial genes, oxoglutarate dehydrogenase (OGDH) and lipoic acid synthase (LIAS), which when mutated stabilize HIF1α in a non-hydroxylated form. Disruption of OGDH complex activity in OGDH or LIAS mutants promotes L-2-hydroxyglutarate formation, which inhibits the activity of the HIFα prolyl hydroxylases (PHDs) and TET 2-oxoglutarate dependent dioxygenases. We also find that PHD activity is decreased in patients with homozygous germline mutations in lipoic acid synthesis, leading to HIF1 activation. Thus, mutations affecting OGDHC activity may have broad implications for epigenetic regulation and tumorigenesis.

    Funded by: Medical Research Council; Wellcome Trust: 084957/Z/08/Z, 102770/Z/13/Z

    Cell metabolism 2016;24;5;740-752

  • Admixture into and within sub-Saharan Africa.

    Busby GB, Band G, Si Le Q, Jallow M, Bougama E, Mangano VD, Amenga-Etego LN, Enimil A, Apinjoh T, Ndila CM, Manjurano A, Nyirongo V, Doumba O, Rockett KA, Kwiatkowski DP, Spencer CC and Malaria Genomic Epidemiology Network

    Wellcome Trust Centre for Human Genetics, Oxford, United Kingdom.

    Similarity between two individuals in the combination of genetic markers along their chromosomes indicates shared ancestry and can be used to identify historical connections between different population groups due to admixture. We use a genome-wide, haplotype-based, analysis to characterise the structure of genetic diversity and gene-flow in a collection of 48 sub-Saharan African groups. We show that coastal populations experienced an influx of Eurasian haplotypes over the last 7000 years, and that Eastern and Southern Niger-Congo speaking groups share ancestry with Central West Africans as a result of recent population expansions. In fact, most sub-Saharan populations share ancestry with groups from outside of their current geographic region as a result of gene-flow within the last 4000 years. Our in-depth analysis provides insight into haplotype sharing across different ethno-linguistic groups and the recent movement of alleles into new environments, both of which are relevant to studies of genetic epidemiology.

    Funded by: Medical Research Council: G0600718, MR/M006212/1; Wellcome Trust

    eLife 2016;5

  • Non-CG DNA methylation is a biomarker for assessing endodermal differentiation capacity in pluripotent stem cells.

    Butcher LM, Ito M, Brimpari M, Morris TJ, Soares FAC, Ährlund-Richter L, Carey N, Vallier L, Ferguson-Smith AC and Beck S

    UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, UK.

    Non-CG methylation is an unexplored epigenetic hallmark of pluripotent stem cells. Here we report that a reduction in non-CG methylation is associated with impaired differentiation capacity into endodermal lineages. Genome-wide analysis of 2,670 non-CG sites in a discovery cohort of 25 phenotyped human induced pluripotent stem cell (hiPSC) lines revealed unidirectional loss (Δβ=13%, P<7.4 × 10(-4)) of non-CG methylation that correctly identifies endodermal differentiation capacity in 23 out of 25 (92%) hiPSC lines. Translation into a simplified assay of only nine non-CG sites maintains predictive power in the discovery cohort (Δβ=23%, P<9.1 × 10(-6)) and correctly identifies endodermal differentiation capacity in nine out of ten pluripotent stem cell lines in an independent replication cohort consisting of hiPSCs reprogrammed from different cell types and different delivery systems, as well as human embryonic stem cell (hESC) lines. This finding implicates non-CG methylation at these sites as a biomarker for assessing endodermal differentiation capacity.

    Funded by: Medical Research Council: G0800784, G1000847, MC_PC_12009, MR/J001597/1; Wellcome Trust: 084071, 095606, WT098503

    Nature communications 2016;7;10458

  • C13orf31 (FAMIN) is a central regulator of immunometabolic function.

    Cader MZ, Boroviak K, Zhang Q, Assadi G, Kempster SL, Sewell GW, Saveljeva S, Ashcroft JW, Clare S, Mukhopadhyay S, Brown KP, Tschurtschenthaler M, Raine T, Doe B, Chilvers ER, Griffin JL, Kaneider NC, Floto RA, D'Amato M, Bradley A, Wakelam MJ, Dougan G and Kaser A

    Division of Gastroenterology and Hepatology, Department of Medicine, Addenbrooke's Hospital, University of Cambridge, Cambridge, UK.

    Single-nucleotide variations in C13orf31 (LACC1) that encode p.C284R and p.I254V in a protein of unknown function (called 'FAMIN' here) are associated with increased risk for systemic juvenile idiopathic arthritis, leprosy and Crohn's disease. Here we set out to identify the biological mechanism affected by these coding variations. FAMIN formed a complex with fatty acid synthase (FASN) on peroxisomes and promoted flux through de novo lipogenesis to concomitantly drive high levels of fatty-acid oxidation (FAO) and glycolysis and, consequently, ATP regeneration. FAMIN-dependent FAO controlled inflammasome activation, mitochondrial and NADPH-oxidase-dependent production of reactive oxygen species (ROS), and the bactericidal activity of macrophages. As p.I254V and p.C284R resulted in diminished function and loss of function, respectively, FAMIN determined resilience to endotoxin shock. Thus, we have identified a central regulator of the metabolic function and bioenergetic state of macrophages that is under evolutionary selection and determines the risk of inflammatory and infectious disease.

    Funded by: Biotechnology and Biological Sciences Research Council: BBS/E/B/000C0411, BBS/E/B/000C0413, BBS/E/B/000C0415, BBS/E/B/000C0417; European Research Council: 260961; Medical Research Council: MC_UP_A090_1006; NHGRI NIH HHS: U54 HG006348; NIH HHS: U42 OD011174; Wellcome Trust: 079643, 100675, 100891, 103077, 106260, 106260/Z/14/Z

    Nature immunology 2016;17;9;1046-56

  • The AMP-activated protein kinase beta 1 subunit modulates erythrocyte integrity.

    Cambridge EL, McIntyre Z, Clare S, Arends MJ, Goulding D, Isherwood C, Caetano SS, Reviriego CB, Swiatkowska A, Kane L, Harcourt K, Sanger Mouse Genetics Project, Adams DJ, White JK and Speak AO

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, UK.

    Failure to maintain a normal in vivo erythrocyte half-life results in the development of hemolytic anemia. Half-life is affected by numerous factors, including energy balance, electrolyte gradients, reactive oxygen species, and membrane plasticity. The heterotrimeric AMP-activated protein kinase (AMPK) is an evolutionarily conserved serine/threonine kinase that acts as a critical regulator of cellular energy balance. Previous roles for the alpha 1 and gamma 1 subunits in the control of erythrocyte survival have been reported. In the work described here, we studied the role of the beta 1 subunit in erythrocytes and observed microcytic anemia with compensatory extramedullary hematopoiesis together with splenomegaly and increased osmotic resistance.

    Funded by: Cancer Research UK: 13031; Wellcome Trust

    Experimental hematology 2016;45;64-68.e5

  • Genomic variation in two gametocyte non-producing Plasmodium falciparum clonal lines.

    Campino S, Benavente ED, Assefa S, Thompson E, Drought LG, Taylor CJ, Gorvett Z, Carret CK, Flueck C, Ivens AC, Kwiatkowski DP, Alano P, Baker DA and Clark TG

    Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, London, UK.

    Background: Transmission of the malaria parasite Plasmodium falciparum from humans to the mosquito vector requires differentiation of a sub-population of asexual forms replicating within red blood cells into non-dividing male and female gametocytes. The nature of the molecular mechanism underlying this key differentiation event required for malaria transmission is not fully understood.

    Methods: Whole genome sequencing was used to examine the genomic diversity of the gametocyte non-producing 3D7-derived lines F12 and A4. These lines were used in the recent detection of the PF3D7_1222600 locus (encoding PfAP2-G), which acts as a genetic master switch that triggers gametocyte development.

    Results: The evolutionary changes from the 3D7 parental strain through its derivatives F12 (culture-passage derived cloned line) and A4 (transgenic cloned line) were identified. The genetic differences including the formation of chimeric var genes are presented.

    Conclusion: A genomics resource is provided for the further study of gametocytogenesis or other phenotypes using these parasite lines.

    Funded by: Medical Research Council: MR/K000551/1, MR/M006212/1, MR/M01360X/1, MR/N010469/1; Wellcome Trust: 090770, 098051, 106240, WT094752

    Malaria journal 2016;15;229

  • A CRISPR outlook for apicomplexans.

    Carrasquilla M and Owusu CK

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Nature reviews. Microbiology 2016;14;11;668

  • Whole Genome Sequence Analysis of a Large Isoniazid-Resistant Tuberculosis Outbreak in London: A Retrospective Observational Study.

    Casali N, Broda A, Harris SR, Parkhill J, Brown T and Drobniewski F

    Department of Infectious Diseases and Immunity, Imperial College London, London, United Kingdom.

    Background: A large isoniazid-resistant tuberculosis outbreak centred on London, United Kingdom, has been ongoing since 1995. The aim of this study was to investigate the power and value of whole genome sequencing (WGS) to resolve the transmission network compared to current molecular strain typing approaches, including analysis of intra-host diversity within a specimen, across body sites, and over time, with identification of genetic factors underlying the epidemiological success of this cluster.

    Methods and findings: We sequenced 344 outbreak isolates from individual patients collected over 14 y (2 February 1998-22 June 2012). This demonstrated that 96 (27.9%) were indistinguishable, and only one differed from this major clone by more than five single nucleotide polymorphisms (SNPs). The maximum number of SNPs between any pair of isolates was nine SNPs, and the modal distance between isolates was two SNPs. WGS was able to reveal the direction of transmission of tuberculosis in 16 cases within the outbreak (4.7%), including within a multidrug-resistant cluster that carried a rare rpoB mutation associated with rifampicin resistance. Eleven longitudinal pairs of patient pulmonary isolates collected up to 48 mo apart differed from each other by between zero and four SNPs. Extrapulmonary dissemination resulted in acquisition of a SNP in two of five cases. WGS analysis of 27 individual colonies cultured from a single patient specimen revealed ten loci differed amongst them, with a maximum distance between any pair of six SNPs. A limitation of this study, as in previous studies, is that indels and SNPs in repetitive regions were not assessed due to the difficulty in reliably determining this variation.

    Conclusions: Our study suggests that (1) certain paradigms need to be revised, such as the 12 SNP distance as the gold standard upper threshold to identify plausible transmissions; (2) WGS technology is helpful to rule out the possibility of direct transmission when isolates are separated by a substantial number of SNPs; (3) the concept of a transmission chain or network may not be useful in institutional or household settings; (4) the practice of isolating single colonies prior to sequencing is likely to lead to an overestimation of the number of SNPs between cases resulting from direct transmission; and (5) despite appreciable genomic diversity within a host, transmission of tuberculosis rarely results in minority variants becoming dominant. Thus, whilst WGS provided some increased resolution over variable number tandem repeat (VNTR)-based clustering, it was insufficient for inferring transmission in the majority of cases.

    PLoS medicine 2016;13;10;e1002137
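    The pairwise SNP statistics quoted in the abstract above (maximum and modal distance between isolates) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy sequences, and the choice to skip any position that is not A, C, G or T are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations

def snp_distance(a, b):
    """Count aligned positions where both sequences carry an unambiguous, differing base."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: gaps and ambiguous bases (e.g. '-', 'N') are skipped, not counted as SNPs.
    return sum(1 for x, y in zip(a, b)
               if x != y and x in "ACGT" and y in "ACGT")

def summarize(isolates):
    """Return the maximum and modal pairwise SNP distance for a dict of name -> sequence."""
    dists = [snp_distance(isolates[i], isolates[j])
             for i, j in combinations(sorted(isolates), 2)]
    counts = Counter(dists)
    top = max(counts.values())
    # If several distances tie for most frequent, report the smallest one.
    modal = min(d for d, c in counts.items() if c == top)
    return {"max": max(dists), "modal": modal, "pairs": len(dists)}

# Hypothetical toy alignment, not data from the study.
isolates = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAC",
    "C": "ACGTACGTAT",
    "D": "ACGAACGTAT",
}
summary = summarize(isolates)
```

    A threshold on these distances (such as the 12 SNP cut-off discussed in the abstract) would then be applied to each pair to flag plausible transmissions.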

  • Novel Genetic Variants for Cartilage Thickness and Hip Osteoarthritis.

    Castaño-Betancourt MC, Evans DS, Ramos YF, Boer CG, Metrustry S, Liu Y, den Hollander W, van Rooij J, Kraus VB, Yau MS, Mitchell BD, Muir K, Hofman A, Doherty M, Doherty S, Zhang W, Kraaij R, Rivadeneira F, Barrett-Connor E, Maciewicz RA, Arden N, Nelissen RG, Kloppenburg M, Jordan JM, Nevitt MC, Slagboom EP, Hart DJ, Lafeber F, Styrkarsdottir U, Zeggini E, Evangelou E, Spector TD, Uitterlinden AG, Lane NE, Meulenbelt I, Valdes AM and van Meurs JB

    Department of Internal Medicine, Erasmus Medical Center, Rotterdam, The Netherlands.

    Osteoarthritis is one of the most frequent and disabling diseases of the elderly. Only a few genetic variants have been identified for osteoarthritis, which is partly due to large phenotype heterogeneity. To reduce heterogeneity, we here examined cartilage thickness, one of the structural components of joint health. We conducted a genome-wide association study of minimal joint space width (mJSW), a proxy for cartilage thickness, in a discovery set of 13,013 participants from five different cohorts and replication in 8,227 individuals from seven independent cohorts. We identified five genome-wide significant (GWS, P ≤ 5.0×10^-8) SNPs annotated to four distinct loci. In addition, we found two additional loci that were significantly replicated, but results of combined meta-analysis fell just below the genome-wide significance threshold. The four novel associated genetic loci were located in/near TGFA (rs2862851), PIK3R1 (rs10471753), SLBP/FGFR3 (rs2236995), and TREH/DDX6 (rs496547), while the other two (DOT1L and SUPT3H/RUNX2) were previously identified. A systematic prioritization for underlying causal genes was performed using diverse lines of evidence. Exome sequencing data (n = 2,050 individuals) indicated that there were no rare exonic variants that could explain the identified associations. In addition, TGFA, FGFR3 and PIK3R1 were differentially expressed in OA cartilage lesions versus non-lesioned cartilage in the same individuals. In conclusion, we identified four novel loci (TGFA, PIK3R1, FGFR3 and TREH) and confirmed two loci known to be associated with cartilage thickness. The identified associations were not caused by rare exonic variants. This is the first report linking TGFA to human OA, which may serve as a new target for future therapies.

    Funded by: NIA NIH HHS: R01 AG027574, T32 AG023480

    PLoS genetics 2016;12;10;e1006260

  • EphrinB1/EphB3b Coordinate Bidirectional Epithelial-Mesenchymal Interactions Controlling Liver Morphogenesis and Laterality.

    Cayuso J, Dzementsei A, Fischer JC, Karemore G, Caviglia S, Bartholdson J, Wright GJ and Ober EA

    Division of Developmental Biology, Mill Hill Laboratories, The Francis Crick Institute, London NW7 1AA, UK.

    Positioning organs in the body often requires the movement of multiple tissues, yet the molecular and cellular mechanisms coordinating such movements are largely unknown. Here, we show that bidirectional signaling between EphrinB1 and EphB3b coordinates the movements of the hepatic endoderm and adjacent lateral plate mesoderm (LPM), resulting in asymmetric positioning of the zebrafish liver. EphrinB1 in hepatoblasts regulates directional migration and mediates interactions with the LPM, where EphB3b controls polarity and movement of the LPM. EphB3b in the LPM concomitantly repels hepatoblasts to move leftward into the liver bud. Cellular protrusions controlled by Eph/Ephrin signaling mediate hepatoblast motility and long-distance cell-cell contacts with the LPM beyond immediate tissue interfaces. Mechanistically, intracellular EphrinB1 domains mediate EphB3b-independent hepatoblast extension formation, while EphB3b interactions cause their destabilization. We propose that bidirectional short- and long-distance cell interactions between epithelial and mesenchyme-like tissues coordinate liver bud formation and laterality via cell repulsion.

    Developmental cell 2016;39;3;316-328

  • Recombination in Streptococcus pneumoniae Lineages Increases with Carriage Duration and Size of the Polysaccharide Capsule.

    Chaguza C, Andam CP, Harris SR, Cornick JE, Yang M, Bricio-Moreno L, Kamng'ona AW, Parkhill J, French N, Heyderman RS, Kadioglu A, Everett DB, Bentley SD and Hanage WP

    Department of Clinical Infection, Microbiology and Immunology, Institute of Infection and Global Health, University of Liverpool, Liverpool, United Kingdom; Microbial Ecology, Malawi-Liverpool-Wellcome Trust Clinical Research Programme, University of Malawi, College of Medicine, Blantyre, Malawi

    Streptococcus pneumoniae causes a high burden of invasive pneumococcal disease (IPD) globally, especially in children from resource-poor settings. Like many bacteria, the pneumococcus can import DNA from other strains or even species by transformation and homologous recombination, which has allowed the pneumococcus to evade clinical interventions such as antibiotics and pneumococcal conjugate vaccines (PCVs). Pneumococci are enclosed in a complex polysaccharide capsule that determines the serotype; the capsule varies in size and is associated with properties including carriage prevalence and virulence. We determined and quantified the association between capsule and recombination events using genomic data from a diverse collection of serotypes sampled in Malawi. We determined both the amount of variation introduced by recombination relative to mutation (the relative rate) and how many individual recombination events occur per isolate (the frequency). Using univariate analyses, we found an association between both recombination measures and multiple factors associated with the capsule, including duration and prevalence of carriage. Because many capsular factors are correlated, we used multivariate analysis to correct for collinearity. Capsule size and carriage duration remained positively associated with recombination, although with a reduced P value, and this effect may be mediated through some unassayed additional property associated with larger capsules. This work describes an important impact of serotype on recombination that has been previously overlooked. While the details of how this effect is achieved remain to be determined, it may have important consequences for the serotype-specific response to vaccines and other interventions.

    Importance: The capsule determines >90 different pneumococcal serotypes, which vary in capsule size, virulence, duration, and prevalence of carriage. Current serotype-specific vaccines elicit anticapsule antibodies. Pneumococcus can take up exogenous DNA by transformation and insert it into its chromosome by homologous recombination. This mechanism has disseminated drug resistance and generated vaccine escape variants. It is hence crucial to pneumococcal evolutionary response to interventions, but there has been no systematic study quantifying whether serotypes vary in recombination and whether this is associated with serotype-specific properties such as capsule size or carriage duration. Larger capsules could physically inhibit DNA uptake, or given the longer carriage duration for larger capsules, this may promote recombination. We find that recombination varies among capsules and is associated with capsule size, carriage duration, and carriage prevalence and negatively associated with invasiveness. The consequence of this work is that serotypes with different capsules may respond differently to selective pressures like vaccines.

    Funded by: NIAID NIH HHS: R01 AI106786; Wellcome Trust

    mBio 2016;7;5

  • Understanding pneumococcal serotype 1 biology through population genomic analysis.

    Chaguza C, Cornick JE, Harris SR, Andam CP, Bricio-Moreno L, Yang M, Yalcin F, Ousmane S, Govindpersad S, Senghore M, Ebruke C, Du Plessis M, Kiran AM, Pluschke G, Sigauque B, McGee L, Klugman KP, Turner P, Corander J, Parkhill J, Collard JM, Antonio M, von Gottberg A, Heyderman RS, French N, Kadioglu A, Hanage WP, Everett DB, Bentley SD and PAGe Consortium

    Department of Clinical Infection, Microbiology and Immunology, Institute of Infection and Global Health, University of Liverpool, Liverpool, L69 7BE, UK.

    Background: Pneumococcus kills over one million children annually, and over 90 % of these deaths occur in low-income countries, especially in Sub-Saharan Africa (SSA), where HIV exacerbates the disease burden. In SSA, serotype 1 pneumococci, particularly the endemic ST217 clone, cause the majority of the pneumococcal disease burden. To understand the evolution of the virulent ST217 clone, we analysed ST217 whole genomes from isolates sampled from African and Asian countries.

    Methods: We analysed 226 whole genome sequences from the ST217 lineage sampled from 9 African and 4 Asian countries. We constructed a whole genome alignment and used it for phylogenetic and coalescent analyses. We also screened the genomes to determine presence of antibiotic resistance conferring genes.

    Results: Population structure analysis grouped the ST217 isolates into five sequence clusters (SCs), which were highly associated with different geographical regions and showed limited intracontinental and intercontinental spread. The SCs showed lower than expected genomic sequence diversity, which suggested strong purifying selection and small population sizes caused by bottlenecks. Recombination rates varied between the SCs but were lower than in other successful clones such as PMEN1. African isolates showed higher prevalence of antibiotic resistance genes than Asian isolates. Interestingly, certain West African isolates harbored a defective chloramphenicol and tetracycline resistance-conferring element (Tn5253) with a deletion in the loci encoding the chloramphenicol resistance gene (catpC194), which caused lower chloramphenicol than tetracycline resistance. Furthermore, certain genes that promote colonisation were absent in the isolates, which may contribute to serotype 1's rarity in carriage and consequently its lower recombination rates.

    Conclusions: The high phylogeographic diversity of the ST217 clone shows that this clone has been in circulation globally for a long time, which allowed its diversification and adaptation in different geographical regions. Such geographic adaptation reflects local variations in selection pressures in different locales. Further studies will be required to fully understand the biological mechanisms that make the ST217 clone highly invasive but unable to successfully colonise the human nasopharynx for long durations, which results in lower recombination rates.

    Funded by: Medical Research Council: MC_U190074190, MC_U190081991, MC_UP_A900_1122; NIAID NIH HHS: R01 AI106786; Wellcome Trust: 084679/Z/08/Z, 100891

    BMC infectious diseases 2016;16;1;649

  • Dataset for a Dugesia japonica de novo transcriptome assembly, utilized for defining the voltage-gated like ion channel superfamily.

    Chan JD, Zhang D, Liu X, Zarowiecki MZ, Berriman M and Marchant JS

    Department of Pharmacology, University of Minnesota Medical School, MN 55455, USA.

    This data article provides a transcriptomic resource for the free living planarian flatworm Dugesia japonica related to the research article entitled 'Utilizing the planarian voltage-gated ion channel transcriptome to resolve a role for a Ca(2+) channel in neuromuscular function and regeneration' (J.D. Chan, D. Zhang, X. Liu, M. Zarowiecki, M. Berriman, J.S. Marchant, 2016) [1]. Data provided in this submission comprise sequence information for the unfiltered de novo assembly, the filtered assembly and a curated analysis of voltage-gated like (VGL) ion channel sequences mined from this resource. Availability of these data should facilitate further adoption of this model by laboratories interested in studying the role of individual genes of interest in planarian physiology and regenerative biology.

    Funded by: NIGMS NIH HHS: T32 GM113846

    Data in brief 2016;9;1044-1047

  • Utilizing the planarian voltage-gated ion channel transcriptome to resolve a role for a Ca(2+) channel in neuromuscular function and regeneration.

    Chan JD, Zhang D, Liu X, Zarowiecki MZ, Berriman M and Marchant JS

    Department of Pharmacology, University of Minnesota Medical School, MN 55455, USA.

    The robust regenerative capacity of planarian flatworms depends on the orchestration of signaling events from early wounding responses through the stem cell enacted differentiative outcomes that restore appropriate tissue types. Acute signaling events in excitable cells play an important role in determining regenerative polarity, rationalized by the discovery that sub-epidermal muscle cells express critical patterning genes known to control regenerative outcomes. These data imply a dual conductive (neuromuscular signaling) and instructive (anterior-posterior patterning) role for Ca(2+) signaling in planarian regeneration. Here, to facilitate study of acute signaling events in the excitable cell niche, we provide a de novo transcriptome assembly from the planarian Dugesia japonica allowing characterization of the diverse ionotropic portfolio of this model organism. We demonstrate the utility of this resource by proceeding to characterize the individual role of each of the planarian voltage-operated Ca(2+) channels during regeneration, and demonstrate that knockdown of a specific voltage operated Ca(2+) channel (Cav1B) that impairs muscle function uniquely creates an environment permissive for anteriorization. Provision of the full transcriptomic dataset should facilitate further investigations of molecules within the planarian voltage-gated channel portfolio to explore the role of excitable cell physiology on regenerative outcomes. This article is part of a Special Issue entitled: ECS Meeting edited by Claus Heizmann, Joachim Krebs and Jacques Haiech.

    Biochimica et biophysica acta 2016

  • Chromosome organisation during ageing and senescence.

    Chandra T and Kirschner K

    Epigenetics Programme, The Babraham Institute, Cambridge CB22 3AT, UK; The Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK.

    Acute cellular stress caused by oncogene activation or high levels of DNA damage can engage a tumour suppressive response, which can lead to cellular senescence. Chronic cellular stress evoked by low levels of DNA damage or telomere erosion is involved in the ageing process. In oncogene induced senescence in fibroblasts, a dramatic rearrangement of heterochromatin into foci and accumulation of constitutive heterochromatin is well documented. In contrast, a loss of heterochromatin has been described in replicative senescence and premature ageing syndromes. The distinct nuclear phenotypes that accompany the stress response highlight the differences between acute and chronic stress models, and this review will address the differences and similarities between these models with a focus on chromosome organisation and heterochromatin.

    Current opinion in cell biology 2016;40;161-167

  • Phenotypic insights into ADCY5-associated disease.

    Chang FC, Westenberger A, Dale RC, Smith M, Pall HS, Perez-Dueñas B, Grattan-Smith P, Ouvrier RA, Mahant N, Hanna BC, Hunter M, Lawson JA, Max C, Sachdev R, Meyer E, Crimmins D, Pryor D, Morris JG, Münchau A, Grozeva D, Carss KJ, Raymond L, Kurian MA, Klein C and Fung VS

    Movement Disorders Unit, Department of Neurology, Westmead Hospital, Sydney, Australia.

    Background: Adenylyl cyclase 5 (ADCY5) mutations are associated with heterogeneous syndromes: familial dyskinesia and facial myokymia; paroxysmal chorea and dystonia; autosomal-dominant chorea and dystonia; and benign hereditary chorea. We provide detailed clinical data on 7 patients from six new kindreds with mutations in the ADCY5 gene, in order to expand and define the phenotypic spectrum of ADCY5 mutations.

    Methods: In 5 of the 7 patients, followed over a period of 9 to 32 years, ADCY5 was sequenced by Sanger sequencing. The other 2 unrelated patients participated in studies for undiagnosed pediatric hyperkinetic movement disorders and underwent whole-exome sequencing.

    Results: Five patients had the previously reported p.R418W ADCY5 mutation; we also identified two novel mutations at p.R418G and p.R418Q. All patients presented with motor milestone delay, infantile-onset action-induced generalized choreoathetosis, dystonia, or myoclonus, with episodic exacerbations during drowsiness being a characteristic feature. Axial hypotonia, impaired upward saccades, and intellectual disability were variable features. The p.R418G and p.R418Q mutation patients had a milder phenotype. Six of seven patients had mild functional gain with clonazepam or clobazam. One patient had bilateral globus pallidal DBS at the age of 33 with marked reduction in dyskinesia, which resulted in mild functional improvement.

    Conclusion: We further delineate the clinical features of ADCY5 gene mutations and illustrate its wide phenotypic expression. We describe mild improvement after treatment with clonazepam, clobazam, and bilateral pallidal DBS. ADCY5-associated dyskinesia may be under-recognized, and its diagnosis has important prognostic, genetic, and therapeutic implications. © 2016 The Authors. Movement Disorders published by Wiley Periodicals, Inc. on behalf of International Parkinson and Movement Disorder Society.

    Movement disorders : official journal of the Movement Disorder Society 2016

  • Identifying the effect of patient sharing on between-hospital genetic differentiation of methicillin-resistant Staphylococcus aureus.

    Chang HH, Dordel J, Donker T, Worby CJ, Feil EJ, Hanage WP, Bentley SD, Huang SS and Lipsitch M

    Department of Epidemiology, Center for Communicable Disease Dynamics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

    Background: Methicillin-resistant Staphylococcus aureus (MRSA) is one of the most common healthcare-associated pathogens. To examine the role of inter-hospital patient sharing on MRSA transmission, a previous study collected 2,214 samples from 30 hospitals in Orange County, California and showed by spa typing that genetic differentiation decreased significantly with increased patient sharing. In the current study, we focused on the 986 samples with spa type t008 from the same population.

    Methods: We used genome sequencing to determine the effect of patient sharing on genetic differentiation between hospitals. Genetic differentiation was measured by between-hospital genetic diversity, FST, and the proportion of nearly identical isolates between hospitals.

    Results: Surprisingly, we found very similar genetic diversity within and between hospitals, and no significant association between patient sharing and genetic differentiation measured by FST. However, in contrast to FST, there was a significant association between patient sharing and the proportion of nearly identical isolates between hospitals. We propose that the proportion of nearly identical isolates is more powerful at determining transmission dynamics than traditional estimators of genetic differentiation (FST) when gene flow between populations is high, since it is more responsive to recent transmission events. Our hypothesis was supported by the results from coalescent simulations.

    Conclusions: Our results suggested that there was a high level of gene flow between hospitals facilitated by patient sharing, and that the proportion of nearly identical isolates is more sensitive to population structure than FST when gene flow is high.

    Funded by: NIGMS NIH HHS: U54 GM088558

    Genome medicine 2016;8;1;18
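    The two kinds of measure contrasted in the abstract above can be sketched on toy data. This is a rough illustration under stated assumptions, not the study's method: the abstract does not say which FST estimator was used, so the sketch assumes a simple diversity-based form, (between - within) / between, and the SNP threshold that defines "nearly identical" is a hypothetical parameter.

```python
from itertools import combinations, product

def snp_distance(a, b):
    """Number of differing positions between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def within_diversity(seqs):
    """Mean pairwise distance among isolates from one hospital (needs >= 2 isolates)."""
    return mean(snp_distance(a, b) for a, b in combinations(seqs, 2))

def between_diversity(seqs1, seqs2):
    """Mean pairwise distance across isolates from two hospitals."""
    return mean(snp_distance(a, b) for a, b in product(seqs1, seqs2))

def fst(seqs1, seqs2):
    """Assumed estimator: (between - mean within) / between; may be negative on small samples."""
    pi_w = (within_diversity(seqs1) + within_diversity(seqs2)) / 2
    pi_b = between_diversity(seqs1, seqs2)
    return (pi_b - pi_w) / pi_b if pi_b else 0.0

def near_identical_proportion(seqs1, seqs2, threshold=1):
    """Share of between-hospital pairs separated by at most `threshold` SNPs (hypothetical cut-off)."""
    pairs = list(product(seqs1, seqs2))
    return sum(snp_distance(a, b) <= threshold for a, b in pairs) / len(pairs)

# Hypothetical toy isolates from two hospitals, not data from the study.
hospital_1 = ["AAAAAAAA", "AAAAAAAT", "AAAACCAA"]
hospital_2 = ["AAAAAAAA", "AAAAGGAA", "AAAAAATT"]

fst_value = fst(hospital_1, hospital_2)
shared = near_identical_proportion(hospital_1, hospital_2)
```

    In this toy case the diversity-based estimate shows no differentiation, while a third of the between-hospital pairs are nearly identical, which mirrors the qualitative contrast the abstract describes when gene flow is high.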

  • Coordinated nuclease activities counteract Ku at single-ended DNA double-strand breaks.

    Chanut P, Britton S, Coates J, Jackson SP and Calsou P

    Institut de Pharmacologie et de Biologie Structurale, Université de Toulouse, CNRS, UPS, 31077 Toulouse, France.

    Repair of single-ended DNA double-strand breaks (seDSBs) by homologous recombination (HR) requires the generation of a 3' single-strand DNA overhang by exonuclease activities in a process called DNA resection. However, it is anticipated that the highly abundant DNA end-binding protein Ku sequesters seDSBs and shields them from exonuclease activities. Despite pioneering works in yeast, it is unclear how mammalian cells counteract Ku at seDSBs to allow HR to proceed. Here we show that in human cells, ATM-dependent phosphorylation of CtIP and the epistatic and coordinated actions of MRE11 and CtIP nuclease activities are required to limit the stable loading of Ku on seDSBs. We also provide evidence for a hitherto unsuspected additional mechanism that contributes to prevent Ku accumulation at seDSBs, acting downstream of MRE11 endonuclease activity and in parallel with MRE11 exonuclease activity. Finally, we show that Ku persistence at seDSBs compromises Rad51 focus assembly but not DNA resection.

    Funded by: Cancer Research UK: 11224; Wellcome Trust

    Nature communications 2016;7;12889

  • Extensive Proliferation of a Subset of Differentiated, yet Plastic, Medial Vascular Smooth Muscle Cells Contributes to Neointimal Formation in Mouse Injury and Atherosclerosis Models.

    Chappell J, Harman JL, Narasimhan VM, Yu H, Foote K, Simons BD, Bennett MR and Jørgensen HF

    From the Cardiovascular Medicine Division, Department of Medicine (J.C., J.L.H., H.Y., K.F., M.R.B., H.F.J.), Cavendish Laboratory, Department of Physics (B.D.S.), The Wellcome Trust/Cancer Research UK Gurdon Institute (B.D.S.), and Wellcome Trust-Medical Research Council Stem Cell Institute (B.D.S.), University of Cambridge, United Kingdom; and The Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom (V.M.N.).

    Rationale: Vascular smooth muscle cell (VSMC) accumulation is a hallmark of atherosclerosis and vascular injury. However, fundamental aspects of proliferation and the phenotypic changes within individual VSMCs, which underlie vascular disease, remain unresolved. In particular, it is not known whether all VSMCs proliferate and display plasticity or whether individual cells can switch to multiple phenotypes.

    Objective: To assess whether proliferation and plasticity in disease is a general characteristic of VSMCs or a feature of a subset of cells.

    Methods and results: Using multicolor lineage labeling, we demonstrate that VSMCs in injury-induced neointimal lesions and in atherosclerotic plaques are oligoclonal, derived from few expanding cells. Lineage tracing also revealed that the progeny of individual VSMCs contributes to both alpha smooth muscle actin (aSma)-positive fibrous cap and Mac3-expressing macrophage-like plaque core cells. Costaining for phenotypic markers further identified a double-positive aSma+ Mac3+ cell population, which is specific to VSMC-derived plaque cells. In contrast, VSMC-derived cells generating the neointima after vascular injury generally retained the expression of VSMC markers and the upregulation of Mac3 was less pronounced. Monochromatic regions in atherosclerotic plaques and injury-induced neointima did not contain VSMC-derived cells expressing a different fluorescent reporter protein, suggesting that proliferation-independent VSMC migration does not make a major contribution to VSMC accumulation in vascular disease.

    Conclusions: We demonstrate that extensive proliferation of a low proportion of highly plastic VSMCs results in the observed VSMC accumulation after injury and in atherosclerotic plaques. Therapeutic targeting of these hyperproliferating VSMCs might effectively reduce vascular disease without affecting vascular integrity.

    Funded by: British Heart Foundation: FS/15/38/31516, PG/12/86/29930, PG/13/25/30014, RG/13/14/30314

    Circulation research 2016;119;12;1313-1323

  • Whole-genome sequencing of a quarter-century melioidosis outbreak in temperate Australia uncovers a region of low-prevalence endemicity.

    Chapple SNJ, Sarovich DS, Holden MTG, Peacock SJ, Buller N, Golledge C, Mayo M, Currie BJ and Price EP

    Melbourne Medical School, University of Melbourne, Melbourne, Victoria, Australia.

    Melioidosis, caused by the highly recombinogenic bacterium Burkholderia pseudomallei, is a disease with high mortality. Tracing the origin of melioidosis outbreaks and understanding how the bacterium spreads and persists in the environment are essential to protecting public and veterinary health and reducing mortality associated with outbreaks. We used whole-genome sequencing to compare isolates from a historical quarter-century outbreak that occurred between 1966 and 1991 in the Avon Valley, Western Australia, a region far outside the known range of B. pseudomallei endemicity. All Avon Valley outbreak isolates shared the same multilocus sequence type (ST-284), which has not been identified outside this region. We found substantial genetic diversity among isolates based on a comparison of genome-wide variants, with no clear correlation between genotypes and temporal, geographical or source data. We observed little evidence of recombination in the outbreak strains, indicating that genetic diversity among these isolates has primarily accrued by mutation. Phylogenomic analysis demonstrated that the isolates confidently grouped within the Australian B. pseudomallei clade, thereby ruling out introduction from a melioidosis-endemic region outside Australia. Collectively, our results point to B. pseudomallei ST-284 being present in the Avon Valley for longer than previously recognized, with its persistence and genomic diversity suggesting long-term, low-prevalence endemicity in this temperate region. Our findings provide a concerning demonstration of the potential for environmental persistence of B. pseudomallei far outside the conventional endemic regions. An expected increase in extreme weather events may reactivate latent B. pseudomallei populations in this region.

    Funded by: Wellcome Trust: 098051

    Microbial genomics 2016;2;7;e000067

  • Canalization of genetic and pharmacological perturbations in developing primary neuronal activity patterns.

    Charlesworth P, Morton A, Eglen SJ, Komiyama NH and Grant SG

    Genes to Cognition Programme, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.

    The function of the nervous system depends on the integrity of synapses and the patterning of electrical activity in brain circuits. The rapid advances in genome sequencing reveal a large number of mutations disrupting synaptic proteins, which potentially result in diseases known as synaptopathies. However, it is also evident that every normal individual carries hundreds of potentially damaging mutations. Although genetic studies in several organisms show that mutations can be masked during development by a process known as canalization, it is unknown if this occurs in the development of the electrical activity in the brain. Using longitudinal recordings of primary cultured neurons on multi-electrode arrays from mice carrying knockout mutations we report evidence of canalization in development of spontaneous activity patterns. Phenotypes in the activity patterns in young cultures from mice lacking the Gria1 subunit of the AMPA receptor were ameliorated as cultures matured. Similarly, the effects of chronic pharmacological NMDA receptor blockade diminished as cultures matured. Moreover, disturbances in activity patterns by simultaneous disruption of Gria1 and NMDA receptors were also canalized by three weeks in culture. Additional mutations and genetic variations also appeared to be canalized to varying degrees. These findings indicate that neuronal network canalization is a form of nervous system plasticity that provides resilience to developmental disruption. This article is part of the Special Issue entitled 'Synaptopathy--from Biology to Therapy'.

    Neuropharmacology 2016;100;47-55

  • Genetic Drivers of Epigenetic and Transcriptional Variation in Human Immune Cells.

    Chen L, Ge B, Casale FP, Vasquez L, Kwan T, Garrido-Martín D, Watt S, Yan Y, Kundu K, Ecker S, Datta A, Richardson D, Burden F, Mead D, Mann AL, Fernandez JM, Rowlston S, Wilder SP, Farrow S, Shao X, Lambourne JJ, Redensek A, Albers CA, Amstislavskiy V, Ashford S, Berentsen K, Bomba L, Bourque G, Bujold D, Busche S, Caron M, Chen SH, Cheung W, Delaneau O, Dermitzakis ET, Elding H, Colgiu I, Bagger FO, Flicek P, Habibi E, Iotchkova V, Janssen-Megens E, Kim B, Lehrach H, Lowy E, Mandoli A, Matarese F, Maurano MT, Morris JA, Pancaldi V, Pourfarzad F, Rehnstrom K, Rendon A, Risch T, Sharifi N, Simon MM, Sultan M, Valencia A, Walter K, Wang SY, Frontini M, Antonarakis SE, Clarke L, Yaspo ML, Beck S, Guigo R, Rico D, Martens JHA, Ouwehand WH, Kuijpers TW, Paul DS, Stunnenberg HG, Stegle O, Downes K, Pastinen T and Soranzo N

    Department of Human Genetics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1HH, UK; Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK.

    Characterizing the multifaceted contribution of genetic and epigenetic factors to disease phenotypes is a major challenge in human genetics and medicine. We carried out high-resolution genetic, epigenetic, and transcriptomic profiling in three major human immune cell types (CD14<sup>+</sup> monocytes, CD16<sup>+</sup> neutrophils, and naive CD4<sup>+</sup> T cells) from up to 197 individuals. We assess, quantitatively, the relative contribution of cis-genetic and epigenetic factors to transcription and evaluate their impact as potential sources of confounding in epigenome-wide association studies. Further, we characterize highly coordinated genetic effects on gene expression, methylation, and histone variation through quantitative trait locus (QTL) mapping and allele-specific (AS) analyses. Finally, we demonstrate colocalization of molecular trait QTLs at 345 unique immune disease loci. This expansive, high-resolution atlas of multi-omics changes yields insights into cell-type-specific correlation between diverse genomic inputs, more generalizable correlations between these inputs, and defines molecular events that may underpin complex disease risk.

    Funded by: British Heart Foundation: RG/09/012/28096; Department of Health: RP-PG-0310-1002; Medical Research Council: G0800270; Wellcome Trust

    Cell 2016;167;5;1398-1414.e24

  • Single-cell analysis at the threshold.

    Chen X, Love JC, Navin NE, Pachter L, Stubbington MJ, Svensson V, Sweedler JV and Teichmann SA

    Wellcome Trust Sanger Institute, Cambridge, UK.

    Nature biotechnology 2016;34;11;1111-1118

  • Genome-Wide Association Analysis of Young-Onset Stroke Identifies a Locus on Chromosome 10q25 Near HABP2.

    Cheng YC, Stanne TM, Giese AK, Ho WK, Traylor M, Amouyel P, Holliday EG, Malik R, Xu H, Kittner SJ, Cole JW, O'Connell JR, Danesh J, Rasheed A, Zhao W, Engelter S, Grond-Ginsbach C, Kamatani Y, Lathrop M, Leys D, Thijs V, Metso TM, Tatlisumak T, Pezzini A, Parati EA, Norrving B, Bevan S, Rothwell PM, Sudlow C, Slowik A, Lindgren A, Walters MR, WTCCC-2 Consortium, Jannes J, Shen J, Crosslin D, Doheny K, Laurie CC, Kanse SM, Bis JC, Fornage M, Mosley TH, Hopewell JC, Strauch K, Müller-Nurasyid M, Gieger C, Waldenberger M, Peters A, Meisinger C, Ikram MA, Longstreth WT, Meschia JF, Seshadri S, Sharma P, Worrall B, Jern C, Levi C, Dichgans M, Boncoraglio GB, Markus HS, Debette S, Rolfs A, Saleheen D and Mitchell BD

    From the Veterans Affairs Maryland Health Care System, Baltimore, MD (Y.-C.C., S.J.K., J.W.C., B.D.M.); University of Maryland School of Medicine, Baltimore (Y.-C.C., H.X., S.J.K., J.W.C., J.R.O., B.D.M.); The University of Gothenburg, Gothenburg, Sweden (T.M.S., C.J.); University of Rostock, Rostock, Germany (A.-K.G., A. Rolfs); University of Nottingham Malaysia Campus, Selangor Darul Ehsa, Malaysia (W.K.H.); University of Cambridge, Cambridge, UK (M.T., J.D., S.B., H.S.M., S.D., D.S.); Institut Pasteur de Lille, F-59000 Lille, France (P.A.); University of Newcastle, Australia (E.G.H.); Ludwig-Maximilians-Universität, Munich, Germany (R.M., K.S., M.D.); Wellcome Trust Sanger Institute, Cambridge, UK (J.D.); Center for Non-Communicable Diseases, Karachi, Pakistan (A. Rasheed, D.S.); University of Pennsylvania (W.Z., D.S.); Basel University Hospital, Switzerland (S.E.); Heidelberg University Hospital, Germany (C.G.-G.); Centre d'Étude du Polymorphisme Humain, Paris, France (Y.K.); RIKEN Center for Integrative Medical Sciences, Yokohama, Japan (Y.K.); National Genotyping Center, Evry, France (M.L.); Genome Quebec, McGill University, Montreal, Canada (M.L.); Lille University Hospital, France (D.L., S.D.); KU Leuven - University of Leuven, Leuven, Belgium (V.T.); Vesalius Research Center, VIB, Leuven, Belgium (V.T.); University Hospitals Leuven, Leuven, Belgium (V.T.); Helsinki University Central Hospital, Helsinki, Finland (T.M.M., T.T.); Università degli Studi di Brescia, Brescia, Italy (A. Pezzini); Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy (E.A.P., G.B.B.); University of Lund, Sweden (B.N.); University of Oxford, John Radcliffe Hospital (P.M.R.); University of Edinburgh, Edinburgh, UK (C.S.); Jagiellonian University Medical College, Krakow, Poland (A.S.); Lund University, Lund, Sweden (A.L.); Skåne University Hospital, Lund, Sweden (A.L.); University of Glasgow, Glasgow, UK (M.R.W.); University of Adelaide, Australia (J.J.); Mount Sinai Hos

    Background and purpose: Although a genetic contribution to ischemic stroke is well recognized, only a handful of stroke loci have been identified by large-scale genetic association studies to date. Hypothesizing that genetic effects might be stronger for early- versus late-onset stroke, we conducted a 2-stage meta-analysis of genome-wide association studies, focusing on stroke cases with an age of onset <60 years.

    Methods: The discovery stage of our genome-wide association studies included 4505 cases and 21 968 controls of European, South-Asian, and African ancestry, drawn from 6 studies. In Stage 2, we selected the lead genetic variants at loci with association P<5×10<sup>-6</sup> and performed in silico association analyses in an independent sample of ≤1003 cases and 7745 controls.

    Results: One stroke susceptibility locus at 10q25 reached genome-wide significance in the combined analysis of all samples from the discovery and follow-up stages (rs11196288; odds ratio =1.41; P=9.5×10<sup>-9</sup>). The associated locus is in an intergenic region between TCF7L2 and HABP2. In a further analysis in an independent sample, we found that 2 single nucleotide polymorphisms in high linkage disequilibrium with rs11196288 were significantly associated with total plasma factor VII-activating protease levels, a product of HABP2.

    Conclusions: HABP2, which encodes an extracellular serine protease involved in coagulation, fibrinolysis, and inflammatory pathways, may be a genetic susceptibility locus for early-onset stroke.

    Stroke; a journal of cerebral circulation 2016;47;2;307-16

  • Pathogen hide-and-'seq'.

    Chewapreecha C, Bénard A and Reuter S

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Nature reviews. Microbiology 2016;14;5;271

  • CRISPR-Cas9(D10A) nickase-based genotypic and phenotypic screening to enhance genome editing.

    Chiang TW, le Sage C, Larrieu D, Demir M and Jackson SP

    Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge CB2 1QN, UK.

    The RNA-guided Cas9 nuclease is being widely employed to engineer the genomes of various cells and organisms. Despite the efficient mutagenesis induced by Cas9, off-target effects have raised concerns over the system's specificity. Recently a "double-nicking" strategy using catalytic mutant Cas9(D10A) nickase has been developed to minimise off-target effects. Here, we describe a Cas9(D10A)-based screening approach that combines an All-in-One Cas9(D10A) nickase vector with fluorescence-activated cell sorting enrichment followed by high-throughput genotypic and phenotypic clonal screening strategies to generate isogenic knockouts and knock-ins highly efficiently, with minimal off-target effects. We validated this approach by targeting genes for the DNA-damage response (DDR) proteins MDC1, 53BP1, RIF1 and P53, plus the nuclear architecture proteins Lamin A/C, in three different human cell lines. We also efficiently obtained biallelic knock-in clones, using single-stranded oligodeoxynucleotides as homologous templates, for insertion of an EcoRI recognition site at the RIF1 locus and introduction of a point mutation at the histone H2AFX locus to abolish assembly of DDR factors at sites of DNA double-strand breaks. This versatile screening approach should facilitate research aimed at defining gene functions, modelling of cancers and other diseases underpinned by genetic factors, and exploring new therapeutic opportunities.

    Scientific reports 2016;6;24356

  • gEVAL - a web-based browser for evaluating genome assemblies.

    Chow W, Brugger K, Caccamo M, Sealy I, Torrance J and Howe K

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

    Motivation: For most research approaches, genome analyses are dependent on the existence of a high quality genome reference assembly. However, the local accuracy of an assembly remains difficult to assess and improve. The gEVAL browser allows the user to interrogate an assembly in any region of the genome by comparing it to different datasets and evaluating the concordance. These analyses include: a wide variety of sequence alignments, comparative analyses of multiple genome assemblies, and consistency with optical and other physical maps. gEVAL highlights allelic variations, regions of low complexity, abnormal coverage, and potential sequence and assembly errors, and offers strategies for improvement. Although gEVAL focuses primarily on sequence integrity, it can also display arbitrary annotation including from Ensembl or TrackHub sources. We provide gEVAL web sites for many human, mouse, zebrafish and chicken assemblies to support the Genome Reference Consortium, and gEVAL is also downloadable to enable its use for any organism and assembly.

    Availability and implementation: Web Browser:, Plugin:


    Supplementary information: Supplementary data are available at Bioinformatics online.

    Funded by: Wellcome Trust: 098051

    Bioinformatics (Oxford, England) 2016;32;16;2508-10

  • South Asia as a Reservoir for the Global Spread of Ciprofloxacin-Resistant Shigella sonnei: A Cross-Sectional Study.

    Chung The H, Rabaa MA, Pham Thanh D, De Lappe N, Cormican M, Valcanis M, Howden BP, Wangchuk S, Bodhidatta L, Mason CJ, Nguyen Thi Nguyen T, Vu Thuy D, Thompson CN, Phu Huong Lan N, Voong Vinh P, Ha Thanh T, Turner P, Sar P, Thwaites G, Thomson NR, Holt KE and Baker S

    The Hospital for Tropical Diseases, Wellcome Trust Major Overseas Programme, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam.

    Background: Antimicrobial resistance is a major issue in the Shigellae, particularly as a specific multidrug-resistant (MDR) lineage of Shigella sonnei (lineage III) is becoming globally dominant. Ciprofloxacin is a recommended treatment for Shigella infections. However, ciprofloxacin-resistant S. sonnei are being increasingly isolated in Asia and sporadically reported on other continents. We hypothesized that Asia is a primary hub for the recent international spread of ciprofloxacin-resistant S. sonnei.

    Methods and findings: We performed whole-genome sequencing on a collection of 60 contemporaneous ciprofloxacin-resistant S. sonnei isolated in four countries within Asia (Vietnam, n = 11; Bhutan, n = 12; Thailand, n = 1; Cambodia, n = 1) and two outside of Asia (Australia, n = 19; Ireland, n = 16). We reconstructed the recent evolutionary history of these organisms and combined these data with their geographical location of isolation. Placing these sequences into a global phylogeny, we found that all ciprofloxacin-resistant S. sonnei formed a single clade within a Central Asian expansion of lineage III. Furthermore, our data show that resistance to ciprofloxacin within S. sonnei may be globally attributed to a single clonal emergence event, encompassing sequential gyrA-S83L, parC-S80I, and gyrA-D87G mutations. Geographical data predict that South Asia is the likely primary source of these organisms, which are being regularly exported across Asia and intercontinentally into Australia, the United States and Europe. Our analysis was limited by the number of S. sonnei sequences available from diverse geographical areas and time periods, and we cannot discount the potential existence of other unsampled reservoir populations of antimicrobial-resistant S. sonnei.

    Conclusions: This study suggests that a single clone, which is widespread in South Asia, is likely driving the current intercontinental surge of ciprofloxacin-resistant S. sonnei and is capable of establishing endemic transmission in new locations. Despite being limited in geographical scope, our work has major implications for understanding the international transfer of antimicrobial-resistant pathogens, with S. sonnei acting as a tractable model for studying how antimicrobial-resistant Gram-negative bacteria spread globally.

    Funded by: Wellcome Trust: 098051, 100087/Z/12/Z

    PLoS medicine 2016;13;8;e1002055

  • Modulation of the human gut microbiota by dietary fibres occurs at the species level.

    Chung WS, Walker AW, Louis P, Parkhill J, Vermeiren J, Bosscher D, Duncan SH and Flint HJ

    Microbiology Group, Rowett Institute of Nutrition and Health, University of Aberdeen, Greenburn Road, Bucksburn, Aberdeen, Scotland, AB21 9SB, UK.

    Background: Dietary intake of specific non-digestible carbohydrates (including prebiotics) is increasingly seen as a highly effective approach for manipulating the composition and activities of the human gut microbiota to benefit health. Nevertheless, surprisingly little is known about the global response of the microbial community to particular carbohydrates. Recent in vivo dietary studies have demonstrated that the species composition of the human faecal microbiota is influenced by dietary intake. There is now potential to gain insights into the mechanisms involved by using in vitro systems that produce highly controlled conditions of pH and substrate supply.

    Results: We supplied two alternative non-digestible polysaccharides as energy sources to three different human gut microbial communities in anaerobic, pH-controlled continuous-flow fermentors. Community analysis showed that supply of apple pectin or inulin resulted in the highly specific enrichment of particular bacterial operational taxonomic units (OTUs; based on 16S rRNA gene sequences). Of the eight most abundant Bacteroides OTUs detected, two were promoted specifically by inulin and six by pectin. Among the Firmicutes, Eubacterium eligens in particular was strongly promoted by pectin, while several species were stimulated by inulin. Responses were influenced by pH, which was stepped up, and down, between 5.5, 6.0, 6.4 and 6.9 in parallel vessels within each experiment. In particular, several experiments involving downshifts to pH 5.5 resulted in Faecalibacterium prausnitzii replacing Bacteroides spp. as the dominant sequences observed. Community diversity was greater in the pectin-fed than in the inulin-fed fermentors, presumably reflecting the differing complexity of the two substrates.

    Conclusions: We have shown that particular non-digestible dietary carbohydrates have enormous potential for modifying the gut microbiota, but these modifications occur at the level of individual strains and species and are not easily predicted a priori. Furthermore, the gut environment, especially pH, plays a key role in determining the outcome of interspecies competition. This makes it crucial to put greater effort into identifying the range of bacteria that may be stimulated by a given prebiotic approach. Both for reasons of efficacy and of safety, the development of prebiotics intended to benefit human health has to take account of the highly individual species profiles that may result.

    Funded by: Biotechnology and Biological Sciences Research Council; Wellcome Trust: 098051

    BMC biology 2016;14;3

  • metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis.

    Cichonska A, Rousu J, Marttinen P, Kangas AJ, Soininen P, Lehtimäki T, Raitakari OT, Järvelin MR, Salomaa V, Ala-Korpela M, Ripatti S and Pirinen M

    Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland, Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland.

    Motivation: A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests.

    Results: We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies.

    Availability and implementation: Code is available at

    Contact: matti.pirinen@helsinki.fi

    Supplementary information: Supplementary data are available at Bioinformatics online.

    Bioinformatics (Oxford, England) 2016

  • Single-cell epigenomics: powerful new methods for understanding gene regulation and cell identity.

    Clark SJ, Lee HJ, Smallwood SA, Kelsey G and Reik W

    Epigenetics Programme, Babraham Institute, Cambridge, CB22 3AT, UK.

    Emerging single-cell epigenomic methods are being developed with the exciting potential to transform our knowledge of gene regulation. Here we review available techniques and future possibilities, arguing that the full potential of single-cell epigenetic studies will be realized through parallel profiling of genomic, transcriptional, and epigenetic information.

    Genome biology 2016;17;1;72

  • Comparative genomics of carriage and disease isolates of Streptococcus pneumoniae serotype 22F reveals lineage specific divergence and niche adaptation.

    Cleary DW, Devine VT, Jefferies J, Webb JS, Bentley SD, Gladstone RA, Faust SN and Clarke SC

    1. Academic Unit of Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, UK 2. Institute for Life Sciences, University of Southampton, Southampton, UK.

    Streptococcus pneumoniae is a major cause of meningitis, sepsis and pneumonia worldwide. Pneumococcal conjugate vaccines (PCV) have been part of the UK's childhood immunisation programme since 2006 and have significantly reduced the incidence of disease due to vaccine efficacy in reducing carriage in the population. Here we isolated two clones of 22F (an emerging serotype of clinical concern, multilocus sequence types (MLST) 433 and 698) and conducted comparative genomic analysis on four isolates, paired by ST with one of each pair being derived from carriage and the other disease (sepsis). The most compelling observation was of non-synonymous mutations in pgdA, encoding peptidoglycan N-acetylglucosamine deacetylase A, which were found in the carriage isolates of both ST433 and 698. Deacetylation of pneumococcal peptidoglycan is known to enable resistance to lysozyme upon invasion. Whilst no other clear genotypic signatures related to disease or carriage could be determined, additional intriguing comparisons between the two STs were possible. These include the presence of an intact prophage, in addition to numerous additional phage insertions, within the carriage isolate of ST433. Contrasting gene repertoires related to virulence and colonisation, including bacteriocins, lantibiotics, and toxin-antitoxin systems, were also observed.

    Genome biology and evolution 2016

  • Cytomegalovirus-Specific IL-10-Producing CD4+ T Cells Are Governed by Type-I IFN-Induced IL-27 and Promote Virus Persistence.

    Clement M, Marsden M, Stacey MA, Abdul-Karim J, Gimeno Brias S, Costa Bento D, Scurr MJ, Ghazal P, Weaver CT, Carlesso G, Clare S, Jones SA, Godkin A, Jones GW and Humphreys IR

    Division of Infection & Immunity, Cardiff University, Cardiff, United Kingdom.

    CD4+ T cells support host defence against herpesviruses and other viral pathogens. We identified that CD4+ T cells from systemic and mucosal tissues of hosts infected with the β-herpesviridae human cytomegalovirus (HCMV) or murine cytomegalovirus (MCMV) express the regulatory cytokine interleukin (IL)-10. IL-10+CD4+ T cells co-expressed TH1-associated transcription factors and chemokine receptors. Mice lacking T cell-derived IL-10 elicited enhanced antiviral T cell responses and restricted MCMV persistence in salivary glands and secretion in saliva. Thus, IL-10+CD4+ T cells suppress antiviral immune responses against CMV. Expansion of this T-cell population in the periphery was promoted by IL-27 whereas mucosal IL-10+ T cell responses were ICOS-dependent. Infected Il27rα-deficient mice with reduced peripheral IL-10+CD4+ T cell accumulation displayed robust T cell responses and restricted MCMV persistence and shedding. Temporal inhibition experiments revealed that IL-27R signaling during initial infection was required for the suppression of T cell immunity and control of virus shedding during MCMV persistence. IL-27 production was promoted by type-I IFN, suggesting that β-herpesviridae exploit the immune-regulatory properties of this antiviral pathway to establish chronicity. Further, our data reveal that cytokine signaling events during initial infection profoundly influence virus chronicity.

    PLoS pathogens 2016;12;12;e1006050

  • Common polygenic variation in coeliac disease and confirmation of ZNF335 and NFIA as disease susceptibility loci.

    Coleman C, Quinn EM, Ryan AW, Conroy J, Trimble V, Mahmud N, Kennedy N, Corvin AP, Morris DW, Donohoe G, O'Morain C, MacMathuna P, Byrnes V, Kiat C, Trynka G, Wijmenga C, Kelleher D, Ennis S, Anney RJ and McManus R

    Department of Medicine, Institute of Molecular Medicine, Trinity College Dublin, St. James's Hospital, Dublin, Ireland.

    Coeliac disease (CD) is a chronic immune-mediated disease triggered by the ingestion of gluten. It has an estimated prevalence of approximately 1% in European populations. Specific HLA-DQA1 and HLA-DQB1 alleles are established coeliac susceptibility genes and are required for the presentation of gliadin to the immune system resulting in damage to the intestinal mucosa. In the largest association analysis of CD to date, 39 non-HLA risk loci were identified, 13 of which were new, in a sample of 12 014 individuals with CD and 12 228 controls using the Immunochip genotyping platform. Including the HLA, this brings the total number of known CD loci to 40. We have replicated this study in an independent Irish CD case-control population of 425 CD and 453 controls using the Immunochip platform. Using a binomial sign test, we show that the direction of the effects of previously described risk alleles were highly correlated with those reported in the Irish population (P=2.2 × 10<sup>-16</sup>). Using the Polygene Risk Score (PRS) approach, we estimated that up to 35% of the genetic variance could be explained by loci present on the Immunochip (P=9 × 10<sup>-75</sup>). When this is limited to non-HLA loci, we explain a maximum of 4.5% of the genetic variance (P=3.6 × 10<sup>-18</sup>). Finally, we performed a meta-analysis of our data with the previous reports, identifying two further loci harbouring the ZNF335 and NFIA genes which now exceed genome-wide significance, taking the total number of CD susceptibility loci to 42.

    European journal of human genetics : EJHG 2016;24;2;291-7

  • Clonal analysis of stem cells in differentiation and disease.

    Colom B and Jones PH

    Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    Tracking the fate of individual cells and their progeny by clonal analysis has redefined the concept of stem cells and their role in health and disease. The maintenance of cell turnover in adult tissues is achieved by the collective action of populations of stem cells with an equal likelihood of self-renewal or differentiation. Following injury stem cells exhibit striking plasticity, switching from homeostatic behavior in order to repair damaged tissues. The effects of disease states on stem cells are also being uncovered, with new insights into how somatic mutations trigger clonal expansion in early neoplasia.

    Funded by: Cancer Research UK: C609/A17257; Wellcome Trust

    Current opinion in cell biology 2016;43;14-21

  • A survey of best practices for RNA-seq data analysis.

    Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X and Mortazavi A

    Institute for Food and Agricultural Sciences, Department of Microbiology and Cell Science, University of Florida, Gainesville, FL, 32603, USA.

    RNA-sequencing (RNA-seq) has a wide variety of applications, but no single analysis pipeline can be used in all cases. We review all of the major steps in RNA-seq data analysis, including experimental design, quality control, read alignment, quantification of gene and transcript levels, visualization, differential gene expression, alternative splicing, functional analysis, gene fusion detection and eQTL mapping. We highlight the challenges associated with each step. We discuss the analysis of small RNAs and the integration of RNA-seq with other functional genomics techniques. Finally, we discuss the outlook for novel technologies that are changing the state of the art in transcriptomics.

    Funded by: Medical Research Council: MC_PC_12009; NIGMS NIH HHS: DP2 GM111100; Wellcome Trust

    Genome biology 2016;17;13

  • What's in a Name? Species-Wide Whole-Genome Sequencing Resolves Invasive and Noninvasive Lineages of Salmonella enterica Serotype Paratyphi B.

    Connor TR, Owen SV, Langridge G, Connell S, Nair S, Reuter S, Dallman TJ, Corander J, Tabing KC, Le Hello S, Fookes M, Doublet B, Zhou Z, Feltwell T, Ellington MJ, Herrera S, Gilmour M, Cloeckaert A, Achtman M, Parkhill J, Wain J, De Pinna E, Weill FX, Peters T and Thomson N

    Cardiff University School of Biosciences, Cardiff University, Cardiff, United Kingdom; Wellcome Trust Sanger Institute, Hinxton, United Kingdom.

    Unlabelled: For 100 years, it has been obvious that Salmonella enterica strains sharing the serotype with the formula 1,4,[5],12:b:1,2 (now known as Paratyphi B) can cause diseases ranging from serious systemic infections to self-limiting gastroenteritis. Despite considerable predicted diversity between strains carrying the common Paratyphi B serotype, there remain few methods that subdivide the serotype into groups that are congruent with their disease phenotypes. Paratyphi B therefore represents one of the canonical examples in Salmonella where serotyping combined with classical microbiological tests fails to provide clinically useful information. Here, we use genomics to provide the first high-resolution view of this serotype, placing it into a wider genomic context of the Salmonella enterica species. These analyses reveal why it has been impossible to subdivide this serotype based upon phenotypic and limited molecular approaches. By examining the genomic data in detail, we are able to identify common features that correlate with strains of clinical importance. The results presented here provide new diagnostic targets, as well as posing important new questions about the basis for the invasive disease phenotype observed in a subset of strains.

    Importance: Salmonella enterica strains carrying the serotype Paratyphi B have long been known to possess Jekyll and Hyde characteristics; some cause gastroenteritis, while others cause serious invasive disease. Understanding what makes up the population of strains carrying this serotype, as well as the source of their invasive disease, is a 100-year-old puzzle that we address here using genomics. Our analysis provides the first high-resolution view of this serotype, placing strains carrying serotype Paratyphi B into the wider genomic context of the Salmonella enterica species. This work reveals a history of disease dating back to the middle ages, caused by a group of distinct lineages with various abilities to cause invasive disease. By quantifying the key genomic differences between the invasive and noninvasive populations, we are able to identify key virulence-related targets that can form the basis of simple, rapid, point-of-care tests.

    Funded by: Medical Research Council: MR/L015080/1; Wellcome Trust: 098051

    mBio 2016;7;4

  • From clinical sample to complete genome: Comparing methods for the extraction of HIV-1 RNA for high-throughput deep sequencing.

    Cornelissen M, Gall A, Vink M, Zorgdrager F, Binter Š, Edwards S, Jurriaans S, Bakker M, Ong SH, Gras L, van Sighem A, Bezemer D, de Wolf F, Reiss P, Kellam P, Berkhout B, Fraser C, van der Kuyl AC and BEEHIVE Consortium

    Laboratory of Experimental Virology, Department of Medical Microbiology, Center for Infection and Immunity Amsterdam (CINIMA), Academic Medical Center of the University of Amsterdam, Meibergdreef 15, 1105 AZ Amsterdam, The Netherlands.

    The BEEHIVE (Bridging the Evolution and Epidemiology of HIV in Europe) project aims to analyse nearly-complete viral genomes from >3000 HIV-1 infected Europeans using high-throughput deep sequencing techniques to investigate the virus genetic contribution to virulence. Following the development of a computational pipeline, including a new de novo assembler for RNA virus genomes, to generate larger contiguous sequences (contigs) from the abundance of short sequence reads that characterise the data, another area that determines genome sequencing success is the quality and quantity of the input RNA. A pilot experiment with 125 patient plasma samples was performed to investigate the optimal method for isolation of HIV-1 viral RNA for long amplicon genome sequencing. Manual isolation with the QIAamp Viral RNA Mini Kit (Qiagen) was superior to robotic extraction using either the QIAcube robotic system, the mSample Preparation Systems RNA kit with automated extraction by the m2000sp system (Abbott Molecular), or the MagNA Pure 96 System in combination with the MagNA Pure 96 Instrument (Roche Diagnostics). We scored amplification of a set of four HIV-1 amplicons of ∼1.9, 3.6, 3.0 and 3.5kb, and subsequent recovery of near-complete viral genomes. Subsequently, 616 BEEHIVE patient samples were analysed to determine factors that influence successful amplification of the genome in four overlapping amplicons using the QIAamp Viral RNA Kit for viral RNA isolation. Both low plasma viral load and high sample age (stored before 1999) negatively influenced the amplification of viral amplicons >3kb. A plasma viral load of >100,000 copies/ml resulted in successful amplification of all four amplicons for 86% of the samples; this value dropped to only 46% for samples with viral loads of <20,000 copies/ml.

    Virus research 2016;239;10-16

  • The genome of Onchocerca volvulus, agent of river blindness.

    Cotton JA, Bennuru S, Grote A, Harsha B, Tracey A, Beech R, Doyle SR, Dunn M, Hotopp JC, Holroyd N, Kikuchi T, Lambert O, Mhashilkar A, Mutowo P, Nursimulu N, Ribeiro JM, Rogers MB, Stanley E, Swapna LS, Tsai IJ, Unnasch TR, Voronin D, Parkinson J, Nutman TB, Ghedin E, Berriman M and Lustigman S

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    Human onchocerciasis is a serious neglected tropical disease caused by the filarial nematode Onchocerca volvulus that can lead to blindness and chronic disability. Control of the disease relies largely on mass administration of a single drug, and the development of new drugs and vaccines depends on a better knowledge of parasite biology. Here, we describe the chromosomes of O. volvulus and its Wolbachia endosymbiont. We provide the highest-quality sequence assembly for any parasitic nematode to date, giving a glimpse into the evolution of filarial parasite chromosomes and proteomes. This resource was used to investigate gene families with key functions that could be potentially exploited as targets for future drugs. Using metabolic reconstruction of the nematode and its endosymbiont, we identified enzymes that are likely to be essential for O. volvulus viability. In addition, we have generated a list of proteins that could be targeted by Federal-Drug-Agency-approved but repurposed drugs, providing starting points for anti-onchocerciasis drug development.

    Funded by: NIAID NIH HHS: R01 AI078314, T32 AI007180, U19 AI110820; NIH HHS: DP2 OD007372

    Nature microbiology 2016;2;16216

  • An expressed, endogenous Nodavirus-like element captured by a retrotransposon in the genome of the plant parasitic nematode Bursaphelenchus xylophilus.

    Cotton JA, Steinbiss S, Yokoi T, Tsai IJ and Kikuchi T

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Recently, nematode viruses infecting Caenorhabditis elegans have been reported from the family Nodaviridae, the first nematode viruses described. Here, we report the observation of a novel endogenous viral element (EVE) in the genome of Bursaphelenchus xylophilus, a plant parasitic nematode unrelated to other nematodes from which viruses have been characterised. This element derives from a different clade of nodaviruses to the previously reported nematode viruses. This represents the first endogenous nodavirus sequence, the first nematode endogenous viral element, and significantly extends our knowledge of the potential diversity of the Nodaviridae. A search for endogenous elements related to the Nodaviridae did not reveal any elements in other available nematode genomes. Further surveillance for endogenous viral elements is warranted as our knowledge of nematode genome diversity, and in particular of free-living nematodes, expands.

    Funded by: Wellcome Trust: WT098051, WT099198MA

    Scientific reports 2016;6;39749

  • RLZAP: Relative Lempel-Ziv with adaptive pointers.

    Cox AJ, Farruggia A, Gagie T, Puglisi SJ and Siren J

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2016;9954 LNCS;1-14

  • Whole genome resequencing of the human parasite Schistosoma mansoni reveals population history and effects of selection.

    Crellen T, Allan F, David S, Durrant C, Huckvale T, Holroyd N, Emery AM, Rollinson D, Aanensen DM, Berriman M, Webster JP and Cotton JA

    Department of Infectious Disease Epidemiology, Imperial College London, St Mary's Campus, Norfolk Place, London W2 1PG, United Kingdom.

    Schistosoma mansoni is a parasitic fluke that infects millions of people in the developing world. This study presents the first application of population genomics to S. mansoni based on high-coverage resequencing data from 10 global isolates and an isolate of the closely-related Schistosoma rodhaini, which infects rodents. Using population genetic tests, we document genes under directional and balancing selection in S. mansoni that may facilitate adaptation to the human host. Coalescence modeling reveals the speciation of S. mansoni and S. rodhaini as 107.5-147.6KYA, a period which overlaps with the earliest archaeological evidence for fishing in Africa. Our results indicate that S. mansoni originated in East Africa and experienced a decline in effective population size 20-90KYA, before dispersing across the continent during the Holocene. In addition, we find strong evidence that S. mansoni migrated to the New World with the 16-19th Century Atlantic Slave Trade.

    Funded by: Medical Research Council; Wellcome Trust: 098051

    Scientific reports 2016;6;20954

  • Reduced Efficacy of Praziquantel Against Schistosoma mansoni Is Associated With Multiple Rounds of Mass Drug Administration.

    Crellen T, Walker M, Lamberton PH, Kabatereine NB, Tukahebwa EM, Cotton JA and Webster JP

    Department of Infectious Disease Epidemiology and the London Centre for Neglected Tropical Disease Research, Imperial College London, St Mary's Campus; Wellcome Trust Sanger Institute, Hinxton; Department of Pathology and Pathogen Biology, Royal Veterinary College, University of London, Hertfordshire.

    Background: Mass drug administration (MDA) with praziquantel is the cornerstone of schistosomiasis control in sub-Saharan Africa. The effectiveness of this strategy is dependent on the continued high efficacy of praziquantel; however, drug efficacy is rarely monitored using appropriate statistical approaches that can detect early signs of wane.

    Methods: We conducted a repeated cross-sectional study, examining children infected with Schistosoma mansoni from 6 schools in Uganda that had previously received between 1 and 9 rounds of MDA with praziquantel. We collected up to 12 S. mansoni egg counts from 414 children aged 6-12 years before and 25-27 days after treatment with praziquantel. We estimated individual patient egg reduction rates (ERRs) using a statistical model to explore the influence of covariates, including the number of prior MDA rounds.

    Results: The average ERR among children within schools that had received 8 or 9 previous rounds of MDA (95% Bayesian credible interval [BCI], 88.23%-93.64%) was statistically significantly lower than the average in schools that had received 5 rounds (95% BCI, 96.13%-99.08%) or 1 round (95% BCI, 95.51%-98.96%) of MDA. We estimate that 5.11%, 4.55%, and 16.42% of children from schools that had received 1, 5, and 8-9 rounds of MDA, respectively, had ERRs below the 90% threshold of optimal praziquantel efficacy set by the World Health Organization.
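
    The egg reduction rate used as the outcome above can be illustrated with a minimal sketch. It assumes the simple arithmetic definition (percentage drop in mean egg count between pre- and post-treatment samples), not the model-based individual estimates used in the study; the function name and the egg counts are hypothetical.

```python
# Minimal sketch of an egg reduction rate (ERR) calculation.
# Assumption: ERR = 100 * (1 - mean post-treatment count / mean pre-treatment count).
# This is NOT the Bayesian model used in the study; all counts are made up.

WHO_THRESHOLD = 90.0  # ERR threshold (%) cited in the abstract


def egg_reduction_rate(pre_counts, post_counts):
    """Return the percentage reduction in mean egg count after treatment."""
    if not pre_counts or not post_counts:
        raise ValueError("need at least one count before and after treatment")
    mean_pre = sum(pre_counts) / len(pre_counts)
    mean_post = sum(post_counts) / len(post_counts)
    if mean_pre == 0:
        raise ValueError("ERR is undefined when the pre-treatment mean is zero")
    return 100.0 * (1.0 - mean_post / mean_pre)


# Hypothetical repeated egg counts for one child.
pre = [120, 96, 144, 108]
post = [12, 0, 24, 12]

err = egg_reduction_rate(pre, post)
below_threshold = err < WHO_THRESHOLD
print(f"ERR = {err:.2f}%, below 90% threshold: {below_threshold}")
```

    A child with an ERR under 90% would count towards the proportions reported above, although the study derived those proportions from posterior estimates rather than from raw ratios.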

    Conclusions: The reduced efficacy of praziquantel in schools with a higher exposure to MDA may pose a threat to the effectiveness of schistosomiasis control programs. We call for the efficacy of anthelmintic drugs used in MDA to be closely monitored.

    Funded by: Medical Research Council; Wellcome Trust: 098051

    Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 2016;63;9;1151-1159

  • Binding of Plasmodium falciparum Merozoite Surface Proteins DBLMSP and DBLMSP2 to Human Immunoglobulin M Is Conserved among Broadly Diverged Sequence Variants.

    Crosnier C, Iqbal Z, Knuepfer E, Maciuca S, Perrin AJ, Kamuyu G, Goulding D, Bustamante LY, Miles A, Moore SC, Dougan G, Holder AA, Kwiatkowski DP, Rayner JC, Pleass RJ and Wright GJ

    From the Cell Surface Signalling Laboratory, the Malaria Programme, and.

    Diversity at pathogen genetic loci can be driven by host adaptive immune selection pressure and may reveal proteins important for parasite biology. Population-based genome sequencing of Plasmodium falciparum, the parasite responsible for the most severe form of malaria, has highlighted two related polymorphic genes called dblmsp and dblmsp2, which encode Duffy binding-like (DBL) domain-containing proteins located on the merozoite surface but whose function remains unknown. Using recombinant proteins and transgenic parasites, we show that DBLMSP and DBLMSP2 directly and avidly bind human IgM via their DBL domains. We used whole genome sequence data from over 400 African and Asian P. falciparum isolates to show that dblmsp and dblmsp2 exhibit extreme protein polymorphism in their DBL domain, with multiple variants of two major allelic classes present in every population tested. Despite this variability, the IgM binding function was retained across diverse sequence representatives. Although this interaction did not seem to have an effect on the ability of the parasite to invade red blood cells, binding of DBLMSP and DBLMSP2 to IgM inhibited the overall immunoreactivity of these proteins to IgG from patients who had been exposed to the parasite. This suggests that IgM binding might mask these proteins from the host humoral immune system.

    Funded by: Medical Research Council: MC_PC_12017, MR/M006212/1

    The Journal of biological chemistry 2016;291;27;14285-99

  • Horizontal DNA Transfer Mechanisms of Bacteria as Weapons of Intragenomic Conflict.

    Croucher NJ, Mostowy R, Wymant C, Turner P, Bentley SD and Fraser C

    Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom.

    Horizontal DNA transfer (HDT) is a pervasive mechanism of diversification in many microbial species, but its primary evolutionary role remains controversial. Much recent research has emphasised the adaptive benefit of acquiring novel DNA, but here we argue instead that intragenomic conflict provides a coherent framework for understanding the evolutionary origins of HDT. To test this hypothesis, we developed a mathematical model of a clonally descended bacterial population undergoing HDT through transmission of mobile genetic elements (MGEs) and genetic transformation. Including the known bias of transformation toward the acquisition of shorter alleles into the model suggested it could be an effective means of counteracting the spread of MGEs. Both constitutive and transient competence for transformation were found to provide an effective defence against parasitic MGEs; transient competence could also be effective at permitting the selective spread of MGEs conferring a benefit on their host bacterium. The coordination of transient competence with cell-cell killing, observed in multiple species, was found to result in synergistic blocking of MGE transmission through releasing genomic DNA for homologous recombination while simultaneously reducing horizontal MGE spread by lowering the local cell density. To evaluate the feasibility of the functions suggested by the modelling analysis, we analysed genomic data from longitudinal sampling of individuals carrying Streptococcus pneumoniae. This revealed the frequent within-host coexistence of clonally descended cells that differed in their MGE infection status, a necessary condition for the proposed mechanism to operate. Additionally, we found multiple examples of MGEs inhibiting transformation through integrative disruption of genes encoding the competence machinery across many species, providing evidence of an ongoing "arms race." Reduced rates of transformation have also been observed in cells infected by MGEs that reduce the concentration of extracellular DNA through secretion of DNases. Simulations predicted that either mechanism of limiting transformation would benefit individual MGEs, but also that this tactic's effectiveness was limited by competition with other MGEs coinfecting the same cell. A further observed behaviour we hypothesised to reduce elimination by transformation was MGE activation when cells become competent. Our model predicted that this response was effective at counteracting transformation independently of competing MGEs. Therefore, this framework is able to explain both common properties of MGEs, and the seemingly paradoxical bacterial behaviours of transformation and cell-cell killing within clonally related populations, as the consequences of intragenomic conflict between self-replicating chromosomes and parasitic MGEs. The antagonistic nature of the different mechanisms of HDT over short timescales means their contribution to bacterial evolution is likely to be substantially greater than previously appreciated.

    PLoS biology 2016;14;3;e1002394

  • Practical Experience of the Application of a Weighted Burden Test to Whole Exome Sequence Data for Obesity and Schizophrenia.

    Curtis D and UK10K Consortium

    UCL Genetics Institute, UCL, Darwin Building, Gower Street, London, WC1E 6BT, UK.

    For biological and statistical reasons it makes sense to combine information from variants at the level of the gene. One may wish to give more weight to variants which are rare and those that are more likely to affect function. A combined weighting scheme, implemented in the SCOREASSOC program, was applied to whole exome sequence data for 1392 subjects with schizophrenia and 982 with obesity from the UK10K project. Results conformed fairly well with null hypothesis expectations and no individual gene was strongly implicated. However, a number of the higher ranked genes appear plausible candidates as being involved in one or other phenotype and may warrant further investigation. These include MC4R, NLGN2, CRP, DONSON, GTF3A, IL36B, ADCYAP1R1, ARSA, DLG1, SIK2, SLAIN1, UBE2Q2, ZNF507, CRHR1, MUSK, NSF, SNORD115, GDF3 and HIBADH. Some individual variants in these genes have different frequencies between cohorts and could be genotyped in additional subjects. For other genes, there is a general excess of variants at many different sites so attempts at replication would be more difficult. Overall, the weighted burden test provides a convenient method for using sequence data to highlight genes of interest.
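
    The idea of a gene-level weighted burden score described above can be sketched as follows. This is a hedged illustration only: the rarity weight 1/sqrt(maf(1-maf)), the functional weights, the genotypes and the use of a Welch t statistic are all assumptions made for the example and are not taken from the SCOREASSOC program.

```python
# Sketch of a gene-level weighted burden test.
# Assumptions (not from SCOREASSOC): rarity weight = 1 / sqrt(maf * (1 - maf)),
# arbitrary functional weights, and a Welch t statistic comparing the two groups.
from math import sqrt
from statistics import mean, variance


def rarity_weight(maf):
    """Give rarer variants a larger weight (hypothetical weighting function)."""
    return 1.0 / sqrt(maf * (1.0 - maf))


def burden_scores(genotypes, mafs, functional_weights):
    """Sum weighted alternate-allele counts across a gene for each subject."""
    weights = [rarity_weight(m) * f for m, f in zip(mafs, functional_weights)]
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


def welch_t(a, b):
    """Welch t statistic for the difference in mean score between two groups."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# Two hypothetical variants in one gene: a rare damaging one and a common one.
mafs = [0.01, 0.2]
functional_weights = [2.0, 1.0]

# Rows are subjects, columns are alternate-allele counts per variant (made up).
case_genotypes = [[1, 0], [0, 1], [1, 1]]
control_genotypes = [[0, 0], [0, 1], [0, 0]]

case_scores = burden_scores(case_genotypes, mafs, functional_weights)
control_scores = burden_scores(control_genotypes, mafs, functional_weights)
t_stat = welch_t(case_scores, control_scores)
print(case_scores, control_scores, t_stat)
```

    Because the rare, putatively damaging variant carries most of the weight, subjects who carry it dominate the gene-level score, which is the behaviour the weighting is intended to produce.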

    Funded by: Wellcome Trust: WT091310

    Annals of human genetics 2016;80;1;38-49

  • Respiratory microbiota resistance and resilience to pulmonary exacerbation and subsequent antimicrobial intervention.

    Cuthbertson L, Rogers GB, Walker AW, Oliver A, Green LE, Daniels TW, Carroll MP, Parkhill J, Bruce KD and van der Gast CJ

    NERC Centre for Ecology & Hydrology, Wallingford, UK.

    Pulmonary symptoms in cystic fibrosis (CF) begin in early life with chronic lung infections and concomitant airway inflammation leading to progressive loss of lung function. Gradual pulmonary function decline is interspersed with periods of acute worsening of respiratory symptoms known as CF pulmonary exacerbations (CFPEs). Cumulatively, CFPEs are associated with more rapid disease progression. In this study multiple sputum samples were collected from adult CF patients over the course of CFPEs to better understand how changes in microbiota are associated with CFPE onset and management. Data were divided into five clinical periods: pre-CFPE baseline, CFPE, antibiotic treatment, recovery, and post-CFPE baseline. Samples were treated with propidium monoazide prior to DNA extraction, to remove the impact of bacterial cell death artefacts following antibiotic treatment, and then characterised by 16S rRNA gene-targeted high-throughput sequencing. Partitioning CF microbiota into core and rare groups revealed compositional resistance to CFPE and resilience to antibiotics interventions. Mixed effects modelling of core microbiota members revealed no significant negative impact on the relative abundance of Pseudomonas aeruginosa across the exacerbation cycle. Our findings have implications for current CFPE management strategies, supporting reassessment of existing antimicrobial treatment regimens, as antimicrobial resistance by pathogens and other members of the microbiota may be significant contributing factors.

    The ISME journal 2016;10;5;1081-91

  • Mechanisms of fate decision and lineage commitment during haematopoiesis.

    Cvejic A

    Department of Haematology, University of Cambridge, Cambridge, UK.

    Blood stem cells need to both perpetuate themselves (self-renew) and differentiate into all mature blood cells to maintain blood formation throughout life. However, it is unclear how the underlying gene regulatory network maintains this population of self-renewing and differentiating stem cells and how it accommodates the transition from a stem cell to a mature blood cell. Our current knowledge of transcriptomes of various blood cell types has mainly been advanced by population-level analysis. However, a population of seemingly homogenous blood cells may include many distinct cell types with substantially different transcriptomes and abilities to make diverse fate decisions. Therefore, understanding the cell-intrinsic differences between individual cells is necessary for a deeper understanding of the molecular basis of their behaviour. Here we review recent single-cell studies in the haematopoietic system and their contribution to our understanding of the mechanisms governing cell fate choices and lineage commitment.

    Immunology and cell biology 2016;94;3;230-5

  • Exome sequencing identifies rare variants in multiple genes in atrioventricular septal defect.

    D'Alessandro LC, Al Turki S, Manickaraj AK, Manase D, Mulder BJ, Bergin L, Rosenberg HC, Mondal T, Gordon E, Lougheed J, Smythe J, Devriendt K, Bhattacharya S, Watkins H, Bentham J, Bowdin S, Hurles ME and Mital S

    Division of Cardiology, Department of Pediatrics, Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada.

    Purpose: The genetic etiology of atrioventricular septal defect (AVSD) is unknown in 40% of cases. Conventional sequencing and arrays have identified the etiology in only a minority of nonsyndromic individuals with AVSD.

    Methods: Whole-exome sequencing was performed in 81 unrelated probands with AVSD to identify potentially causal variants in a comprehensive set of 112 genes with strong biological relevance to AVSD.

    Results: A significant enrichment of rare and rare damaging variants was identified in the gene set, compared with controls (odds ratio (OR): 1.52; 95% confidence interval (CI): 1.35-1.71; P = 4.8 × 10^-11). The enrichment was specific to AVSD probands, compared with a cohort without AVSD with tetralogy of Fallot (OR: 2.25; 95% CI: 1.84-2.76; P = 2.2 × 10^-16). Six genes (NIPBL, CHD7, CEP152, BMPR1a, ZFPM2, and MDM4) were enriched for rare variants in AVSD compared with controls, including three syndrome-associated genes (NIPBL, CHD7, and CEP152). The findings were confirmed in a replication cohort of 81 AVSD probands.

    Conclusion: Mutations in genes with strong biological relevance to AVSD, including syndrome-associated genes, can contribute to AVSD, even in those with isolated heart disease. The identification of a gene set associated with AVSD will facilitate targeted genetic screening in this cohort.

    Funded by: British Heart Foundation: CH/09/003/26631, PG/07/045/22690, RG/10/17/28553; Wellcome Trust: 090532, 098051, WT098051

    Genetics in medicine : official journal of the American College of Medical Genetics 2016;18;2;189-98

  • A multiple-phenotype imputation method for genetic studies.

    Dahl A, Iotchkova V, Baud A, Johansson Å, Gyllensten U, Soranzo N, Mott R, Kranis A and Marchini J

    Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK.

    Genetic association studies have yielded a wealth of biological discoveries. However, these studies have mostly analyzed one trait and one SNP at a time, thus failing to capture the underlying complexity of the data sets. Joint genotype-phenotype analyses of complex, high-dimensional data sets represent an important way to move beyond simple genome-wide association studies (GWAS) with great potential. The move to high-dimensional phenotypes will raise many new statistical problems. Here we address the central issue of missing phenotypes in studies with any level of relatedness between samples. We propose a multiple-phenotype mixed model and use a computationally efficient variational Bayesian algorithm to fit the model. On a variety of simulated and real data sets from a range of organisms and trait types, we show that our method outperforms existing state-of-the-art methods from the statistics and machine learning literature and can boost signals of association.

    Nature genetics 2016;48;4;466-72

  • A Method for Checking Genomic Integrity in Cultured Cell Lines from SNP Genotyping Data.

    Danecek P, McCarthy SA, HipSci Consortium and Durbin R

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, United Kingdom.

    Genomic screening for chromosomal abnormalities is an important part of quality control when establishing and maintaining stem cell lines. We present a new method for sensitive detection of copy number alterations, aneuploidy, and contamination in cell lines using genome-wide SNP genotyping data. In contrast to other methods designed for identifying copy number variations in a single sample or in a sample composed of a mixture of normal and tumor cells, this new method is tailored for determining differences between cell lines and the starting material from which they were derived, which allows us to distinguish between normal and novel copy number variation. We implemented the method in the freely available BCFtools package and present results based on induced pluripotent stem cell lines obtained in the HipSci project.

    Funded by: Wellcome Trust: WT098051, WT098503

    PloS one 2016;11;5;e0155014

  • Using a Human Challenge Model of Infection to Measure Vaccine Efficacy: A Randomised, Controlled Trial Comparing the Typhoid Vaccines M01ZH09 with Placebo and Ty21a.

    Darton TC, Jones C, Blohmke CJ, Waddington CS, Zhou L, Peters A, Haworth K, Sie R, Green CA, Jeppesen CA, Moore M, Thompson BA, John T, Kingsley RA, Yu LM, Voysey M, Hindle Z, Lockhart S, Sztein MB, Dougan G, Angus B, Levine MM and Pollard AJ

    Oxford Vaccine Group, Department of Paediatrics, and the NIHR Oxford Biomedical Research Centre, University of Oxford, Oxford, United Kingdom.

    Background: Typhoid persists as a major cause of global morbidity. While several licensed vaccines to prevent typhoid are available, they are of only moderate efficacy and unsuitable for use in children less than two years of age. Development of new efficacious vaccines is complicated by the human host-restriction of Salmonella enterica serovar Typhi (S. Typhi) and lack of clear correlates of protection. In this study, we aimed to evaluate the protective efficacy of a single dose of the oral vaccine candidate, M01ZH09, in susceptible volunteers by direct typhoid challenge.

    Methods and findings: We performed a randomised, double-blind, placebo-controlled trial in healthy adult participants at a single centre in Oxford (UK). Participants were allocated to receive one dose of double-blinded M01ZH09 or placebo or 3 doses of open-label Ty21a. Twenty-eight days after vaccination, participants were challenged with 10^4 CFU S. Typhi Quailes strain. The efficacy of M01ZH09 compared with placebo (primary outcome) was assessed as the percentage of participants reaching pre-defined endpoints constituting typhoid diagnosis (fever and/or bacteraemia) during the 14 days after challenge. Ninety-nine participants were randomised to receive M01ZH09 (n = 33), placebo (n = 33) or 3 doses of Ty21a (n = 33). After challenge, typhoid was diagnosed in 18/31 (58.1% [95% CI 39.1 to 75.5]) M01ZH09, 20/30 (66.7% [47.2 to 87.2]) placebo, and 13/30 (43.3% [25.5 to 62.6]) Ty21a vaccine recipients. Vaccine efficacy (VE) for one dose of M01ZH09 was 13% [95% CI -29 to 41] and 35% [-5 to 60] for 3 doses of Ty21a. Retrospective multivariable analyses demonstrated that pre-existing anti-Vi antibody significantly reduced susceptibility to infection after challenge; a 1 log increase in anti-Vi IgG resulted in a 71% decrease in the hazard ratio of typhoid diagnosis ([95% CI 30 to 88%], p = 0.006) during the 14 day challenge period. Limitations to the study included the requirement to limit the challenge period prior to treatment to 2 weeks, the intensity of the study procedures and the high challenge dose used resulting in a stringent model.

    Conclusions: Despite successfully demonstrating the use of a human challenge study to directly evaluate vaccine efficacy, a single-dose M01ZH09 failed to demonstrate significant protection after challenge with virulent Salmonella Typhi in this model. Anti-Vi antibody detected prior to vaccination played a major role in outcome after challenge.

    Trial registration: ClinicalTrials.gov (NCT01405521) and EudraCT (number 2011-000381-35).
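
    The vaccine efficacy point estimates reported above can be reproduced from the attack rates with a short worked sketch. It assumes the conventional definition VE = 1 - (attack rate in vaccinees / attack rate in placebo recipients); the function name is hypothetical, and the confidence intervals are not recomputed because the abstract does not state how they were derived.

```python
# Worked sketch: vaccine efficacy (VE) from attack rates.
# Assumption: VE (%) = 100 * (1 - attack_rate_vaccine / attack_rate_placebo).
# Case counts and denominators are those given in the abstract.


def vaccine_efficacy(cases_vaccine, n_vaccine, cases_placebo, n_placebo):
    """Return VE as a percentage from diagnosed cases and group sizes."""
    attack_rate_vaccine = cases_vaccine / n_vaccine
    attack_rate_placebo = cases_placebo / n_placebo
    return 100.0 * (1.0 - attack_rate_vaccine / attack_rate_placebo)


ve_m01zh09 = vaccine_efficacy(18, 31, 20, 30)  # one dose of M01ZH09 vs placebo
ve_ty21a = vaccine_efficacy(13, 30, 20, 30)    # three doses of Ty21a vs placebo

print(f"M01ZH09: {ve_m01zh09:.1f}%")
print(f"Ty21a:   {ve_ty21a:.1f}%")
```

    Rounded to whole percentages, these values agree with the 13% and 35% quoted in the abstract.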

    PLoS neglected tropical diseases 2016;10;8;e0004926

  • Evaluation of an Optimal Epidemiological Typing Scheme for Legionella pneumophila with Whole-Genome Sequence Data Using Validation Guidelines.

    David S, Mentasti M, Tewolde R, Aslett M, Harris SR, Afshar B, Underwood A, Fry NK, Parkhill J and Harrison TG

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom; Public Health England, London, United Kingdom.

    Sequence-based typing (SBT), analogous to multilocus sequence typing (MLST), is the current "gold standard" typing method for investigation of legionellosis outbreaks caused by Legionella pneumophila. However, as common sequence types (STs) cause many infections, some investigations remain unresolved. In this study, various whole-genome sequencing (WGS)-based methods were evaluated according to published guidelines, including (i) a single nucleotide polymorphism (SNP)-based method, (ii) extended MLST using different numbers of genes, (iii) determination of gene presence or absence, and (iv) a kmer-based method. L. pneumophila serogroup 1 isolates (n = 106) from the standard "typing panel," previously used by the European Society for Clinical Microbiology Study Group on Legionella Infections (ESGLI), were tested together with another 229 isolates. Over 98% of isolates were considered typeable using the SNP- and kmer-based methods. Percentages of isolates with complete extended MLST profiles ranged from 99.1% (50 genes) to 86.8% (1,455 genes), while only 41.5% produced a full profile with the gene presence/absence scheme. Replicates demonstrated that all methods offer 100% reproducibility. Indices of discrimination range from 0.972 (ribosomal MLST) to 0.999 (SNP based), and all values were higher than that achieved with SBT (0.940). Epidemiological concordance is generally inversely related to discriminatory power. We propose that an extended MLST scheme with ∼50 genes provides optimal epidemiological concordance while substantially improving the discrimination offered by SBT and can be used as part of a hierarchical typing scheme that should maintain backwards compatibility and increase discrimination where necessary. This analysis will be useful for the ESGLI to design a scheme that has the potential to become the new gold standard typing method for L. pneumophila.
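
    The index of discrimination compared across typing methods above can be sketched as follows. The abstract does not state the formula, so this assumes the widely used Simpson's index of diversity in its Hunter-Gaston form, D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)), where n_j is the number of isolates of type j and N is the total; the function name and the type labels are hypothetical.

```python
# Sketch of a typing-scheme discrimination index.
# Assumption: Simpson's index of diversity (Hunter-Gaston form), i.e. the
# probability that two isolates drawn without replacement have different types.
from collections import Counter


def discrimination_index(type_labels):
    """Return D for a list of per-isolate type assignments."""
    n_total = len(type_labels)
    if n_total < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


# Hypothetical panel of six isolates typed by a coarse and a fine scheme.
coarse_scheme = ["ST1", "ST1", "ST1", "ST47", "ST47", "ST23"]
fine_scheme = ["a", "b", "c", "d", "e", "f"]

d_coarse = discrimination_index(coarse_scheme)
d_fine = discrimination_index(fine_scheme)
print(f"coarse: {d_coarse:.3f}, fine: {d_fine:.3f}")
```

    A scheme that splits every isolate into its own type reaches D = 1, which is why higher discrimination tends to come at the cost of epidemiological concordance, as noted above.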

    Funded by: Wellcome Trust

    Journal of clinical microbiology 2016;54;8;2135-48

  • Multiple major disease-associated clones of Legionella pneumophila have emerged recently and independently.

    David S, Rusniok C, Mentasti M, Gomez-Valero L, Harris SR, Lechat P, Lees J, Ginevra C, Glaser P, Ma L, Bouchier C, Underwood A, Jarraud S, Harrison TG, Parkhill J and Buchrieser C

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA Cambridge, United Kingdom.

    Legionella pneumophila is an environmental bacterium and the leading cause of Legionnaires' disease. Just five sequence types (ST), from more than 2000 currently described, cause nearly half of disease cases in northwest Europe. Here, we report the sequence and analyses of 364 L. pneumophila genomes, including 337 from the five disease-associated STs and 27 representative of the species diversity. Phylogenetic analyses revealed that the five STs have independent origins within a highly diverse species. The number of de novo mutations is extremely low with maximum pairwise single-nucleotide polymorphisms (SNPs) ranging from 19 (ST47) to 127 (ST1), which suggests emergences within the last century. Isolates sampled geographically far apart differ by only a few SNPs, demonstrating rapid dissemination. These five STs have been recombining recently, leading to a shared pool of allelic variants potentially contributing to their increased disease propensity. The oldest clone, ST1, has spread globally; between 1940 and 2000, four new clones have emerged in Europe, which show long-distance, rapid dispersal. That a large proportion of clinical cases is caused by recently emerged and internationally dispersed clones, linked by convergent evolution, is surprising for an environmental bacterium traditionally considered to be an opportunistic pathogen. To simultaneously explain recent emergence, rapid spread and increased disease association, we hypothesize that these STs have adapted to new man-made environmental niches, which may be linked by human infection and transmission.

    Funded by: Wellcome Trust: 098051

    Genome research 2016;26;11;1555-1564

  • Formin Is Associated with Left-Right Asymmetry in the Pond Snail and the Frog.

    Davison A, McDowell GS, Holden JM, Johnson HF, Koutsovoulos GD, Liu MM, Hulpiau P, Van Roy F, Wade CM, Banerjee R, Yang F, Chiba S, Davey JW, Jackson DJ, Levin M and Blaxter ML

    School of Life Sciences, University of Nottingham, Nottingham NG7 2RD, UK.

    While components of the pathway that establishes left-right asymmetry have been identified in diverse animals, from vertebrates to flies, it is striking that the genes involved in the first symmetry-breaking step remain wholly unknown in the most obviously chiral animals, the gastropod snails. Previously, research on snails was used to show that left-right signaling of Nodal, downstream of symmetry breaking, may be an ancestral feature of the Bilateria [1 and 2]. Here, we report that a disabling mutation in one copy of a tandemly duplicated, diaphanous-related formin is perfectly associated with symmetry breaking in the pond snail. This is supported by the observation that an anti-formin drug treatment converts dextral snail embryos to a sinistral phenocopy, and in frogs, drug inhibition or overexpression by microinjection of formin has a chirality-randomizing effect in early (pre-cilia) embryos. Contrary to expectations based on existing models [3, 4 and 5], we discovered asymmetric gene expression in 2- and 4-cell snail embryos, preceding morphological asymmetry. As the formin-actin filament has been shown to be part of an asymmetry-breaking switch in vitro [6 and 7], together these results are consistent with the view that animals with diverse body plans may derive their asymmetries from the same intracellular chiral elements [8].

    Funded by: Biotechnology and Biological Sciences Research Council: BB/F018940/1, BB/F021135/1, BB/G00661X/1, F021135, G00661X; Medical Research Council: G0900740, MR/K001744/1; NCI NIH HHS: U54 CA143876, U54CA143876; Wellcome Trust: WT098051

    Current biology : CB 2016;26;5;654-60

  • Chimpanzee genomic diversity reveals ancient admixture with bonobos.

    de Manuel M, Kuhlwilm M, Frandsen P, Sousa VC, Desai T, Prado-Martinez J, Hernandez-Rodriguez J, Dupanloup I, Lao O, Hallast P, Schmidt JM, Heredia-Genestar JM, Benazzo A, Barbujani G, Peter BM, Kuderna LF, Casals F, Angedakin S, Arandjelovic M, Boesch C, Kühl H, Vigilant L, Langergraber K, Novembre J, Gut M, Gut I, Navarro A, Carlsen F, Andrés AM, Siegismund HR, Scally A, Excoffier L, Tyler-Smith C, Castellano S, Xue Y, Hvilsom C and Marques-Bonet T

    Institut de Biologia Evolutiva (Consejo Superior de Investigaciones Científicas-Universitat Pompeu Fabra), Barcelona Biomedical Research Park, Doctor Aiguader 88, Barcelona, Catalonia 08003, Spain.

    Our closest living relatives, chimpanzees and bonobos, have a complex demographic history. We analyzed the high-coverage whole genomes of 75 wild-born chimpanzees and bonobos from 10 countries in Africa. We found that chimpanzee population substructure makes genetic information a good predictor of geographic origin at country and regional scales. Multiple lines of evidence suggest that gene flow occurred from bonobos into the ancestors of central and eastern chimpanzees between 200,000 and 550,000 years ago, probably with subsequent spread into Nigeria-Cameroon chimpanzees. Together with another, possibly more recent contact (after 200,000 years ago), bonobos contributed less than 1% to the central chimpanzee genomes. Admixture thus appears to have been widespread during hominid evolution.

    Funded by: NCI NIH HHS: U01 CA198933; NIMH NIH HHS: U01 MH106874; Wellcome Trust

    Science (New York, N.Y.) 2016;354;6311;477-481

  • A meta-analysis of 120 246 individuals identifies 18 new loci for fibrinogen concentration.

    de Vries PS, Chasman DI, Sabater-Lleal M, Chen MH, Huffman JE, Steri M, Tang W, Teumer A, Marioni RE, Grossmann V, Hottenga JJ, Trompet S, Müller-Nurasyid M, Zhao JH, Brody JA, Kleber ME, Guo X, Wang JJ, Auer PL, Attia JR, Yanek LR, Ahluwalia TS, Lahti J, Venturini C, Tanaka T, Bielak LF, Joshi PK, Rocanin-Arjo A, Kolcic I, Navarro P, Rose LM, Oldmeadow C, Riess H, Mazur J, Basu S, Goel A, Yang Q, Ghanbari M, Willemsen G, Rumley A, Fiorillo E, de Craen AJ, Grotevendt A, Scott R, Taylor KD, Delgado GE, Yao J, Kifley A, Kooperberg C, Qayyum R, Lopez LM, Berentzen TL, Räikkönen K, Mangino M, Bandinelli S, Peyser PA, Wild S, Trégouët DA, Wright AF, Marten J, Zemunik T, Morrison AC, Sennblad B, Tofler G, de Maat MP, de Geus EJ, Lowe GD, Zoledziewska M, Sattar N, Binder H, Völker U, Waldenberger M, Khaw KT, Mcknight B, Huang J, Jenny NS, Holliday EG, Qi L, Mcevoy MG, Becker DM, Starr JM, Sarin AP, Hysi PG, Hernandez DG, Jhun MA, Campbell H, Hamsten A, Rivadeneira F, Mcardle WL, Slagboom PE, Zeller T, Koenig W, Psaty BM, Haritunians T, Liu J, Palotie A, Uitterlinden AG, Stott DJ, Hofman A, Franco OH, Polasek O, Rudan I, Morange PE, Wilson JF, Kardia SL, Ferrucci L, Spector TD, Eriksson JG, Hansen T, Deary IJ, Becker LC, Scott RJ, Mitchell P, März W, Wareham NJ, Peters A, Greinacher A, Wild PS, Jukema JW, Boomsma DI, Hayward C, Cucca F, Tracy R, Watkins H, Reiner AP, Folsom AR, Ridker PM, O'Donnell CJ, Smith NL, Strachan DP and Dehghan A

    Department of Epidemiology.

    Genome-wide association studies have previously identified 23 genetic loci associated with circulating fibrinogen concentration. These studies used HapMap imputation and did not examine the X-chromosome. 1000 Genomes imputation provides better coverage of uncommon variants, and includes indels. We conducted a genome-wide association analysis of 34 studies imputed to the 1000 Genomes Project reference panel and including ∼120 000 participants of European ancestry (95 806 participants with data on the X-chromosome). Approximately 10.7 million single-nucleotide polymorphisms and 1.2 million indels were examined. We identified 41 genome-wide significant fibrinogen loci, of which 18 were newly identified. There were no genome-wide significant signals on the X-chromosome. The lead variants of five significant loci were indels. We further identified six additional independent signals, including three rare variants, at two previously characterized loci: FGB and IRF1. Together the 41 loci explain 3% of the variance in plasma fibrinogen concentration.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/F019394/1; Chief Scientist Office: CZB/4/505, ETM/55; Medical Research Council: G1000143, G1001799, MC_PC_U127561128, MR/K026992/1; NCATS NIH HHS: UL1 TR000124; NHLBI NIH HHS: R01 HL059367; NIDDK NIH HHS: P30 DK063491

    Human molecular genetics 2016;25;2;358-70

  • CD4-Transgenic Zebrafish Reveal Tissue-Resident Th2- and Regulatory T Cell-like Populations and Diverse Mononuclear Phagocytes.

    Dee CT, Nagaraju RT, Athanasiadis EI, Gray C, Fernandez Del Ama L, Johnston SA, Secombes CJ, Cvejic A and Hurlstone AF

    Faculty of Life Sciences, The University of Manchester, Manchester M13 9PT, United Kingdom.

    CD4<sup>+</sup> T cells are at the nexus of the innate and adaptive arms of the immune system. However, little is known about the evolutionary history of CD4<sup>+</sup> T cells, and it is unclear whether their differentiation into specialized subsets is conserved in early vertebrates. In this study, we have created transgenic zebrafish with vibrantly labeled CD4<sup>+</sup> cells allowing us to scrutinize the development and specialization of teleost CD4<sup>+</sup> leukocytes in vivo. We provide further evidence that CD4<sup>+</sup> macrophages have an ancient origin and had already emerged in bony fish. We demonstrate the utility of this zebrafish resource for interrogating the complex behavior of immune cells at cellular resolution by the imaging of intimate contacts between teleost CD4<sup>+</sup> T cells and mononuclear phagocytes. Most importantly, we reveal the conserved subspecialization of teleost CD4<sup>+</sup> T cells in vivo. We demonstrate that the ancient and specialized tissues of the gills contain a resident population of il-4/13b-expressing Th2-like cells, which do not coexpress il-4/13a. Additionally, we identify a contrasting population of regulatory T cell-like cells resident in the zebrafish gut mucosa, in marked similarity to that found in the intestine of mammals. Finally, we show that, as in mammals, zebrafish CD4<sup>+</sup> T cells will infiltrate melanoma tumors and obtain a phenotype consistent with a type 2 immune microenvironment. We anticipate that this unique resource will prove invaluable for future investigation of T cell function in biomedical research, the development of vaccination and health management in aquaculture, and for further research into the evolution of adaptive immunity.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/L007401/1; Cancer Research UK: A14953; European Research Council: 282059; Medical Research Council: MC_PC_12009, MR/J009156/1, MR/L011840/1; Wellcome Trust

    Journal of immunology (Baltimore, Md. : 1950) 2016;197;9;3520-3530

  • Discrete distributional differential expression (D3E)--a tool for gene expression analysis of single-cell RNA-seq data.

    Delmans M and Hemberg M

    Department of Plant Sciences, University of Cambridge, Downing Street, Cambridge, CB2 3EA, UK.

    Background: The advent of high throughput RNA-seq at the single-cell level has opened up new opportunities to elucidate the heterogeneity of gene expression. One of the most widespread applications of RNA-seq is to identify genes which are differentially expressed between two experimental conditions.

    Results: We present a discrete, distributional method for differential gene expression (D(3)E), a novel algorithm specifically designed for single-cell RNA-seq data. We use synthetic data to evaluate D(3)E, demonstrating that it can detect changes in expression, even when the mean level remains unchanged. Since D(3)E is based on an analytically tractable stochastic model, it provides additional biological insights by quantifying biologically meaningful properties, such as the average burst size and frequency. We use D(3)E to investigate experimental data, and with the help of the underlying model, we directly test hypotheses about the driving mechanism behind changes in gene expression.

    Conclusion: Evaluation using synthetic data shows that D(3)E performs better than other methods for identifying differentially expressed genes since it is designed to take full advantage of the information available from single-cell RNA-seq experiments. Moreover, the analytical model underlying D(3)E makes it possible to gain additional biological insights.

    Funded by: Biotechnology and Biological Sciences Research Council; Wellcome Trust

    BMC bioinformatics 2016;17;110

  • Tracing the origin of disseminated tumor cells in breast cancer using single-cell sequencing.

    Demeulemeester J, Kumar P, Møller EK, Nord S, Wedge DC, Peterson A, Mathiesen RR, Fjelldal R, Zamani Esteki M, Theunis K, Fernandez Gallardo E, Grundstad AJ, Borgen E, Baumbusch LO, Børresen-Dale AL, White KP, Kristensen VN, Van Loo P, Voet T and Naume B

    The Francis Crick Institute, London, UK.

    Background: Single-cell micro-metastases of solid tumors often occur in the bone marrow. These disseminated tumor cells (DTCs) may resist therapy and lie dormant or progress to cause overt bone and visceral metastases. The molecular nature of DTCs remains elusive, as well as when and from where in the tumor they originate. Here, we apply single-cell sequencing to identify and trace the origin of DTCs in breast cancer.

    Results: We sequence the genomes of 63 single cells isolated from six non-metastatic breast cancer patients. By comparing the cells' DNA copy number aberration (CNA) landscapes with those of the primary tumors and lymph node metastasis, we establish that 53% of the single cells morphologically classified as tumor cells are DTCs disseminating from the observed tumor. The remaining cells represent either non-aberrant "normal" cells or "aberrant cells of unknown origin" that have CNA landscapes discordant from the tumor. Further analyses suggest that the prevalence of aberrant cells of unknown origin is age-dependent and that at least a subset is hematopoietic in origin. Evolutionary reconstruction analysis of bulk tumor and DTC genomes enables ordering of CNA events in molecular pseudo-time and traces the origin of the DTCs to either the main tumor clone, primary tumor subclones, or subclones in an axillary lymph node metastasis.

    Conclusions: Single-cell sequencing of bone marrow epithelial-like cells, in parallel with intra-tumor genetic heterogeneity profiling from bulk DNA, is a powerful approach to identify and study DTCs, yielding insight into metastatic processes. A heterogeneous population of CNA-positive cells is present in the bone marrow of non-metastatic breast cancer patients, only part of which are derived from the observed tumor lineages.

    Funded by: Cancer Research UK: FC001202; Medical Research Council: FC001202; Wellcome Trust: FC001202

    Genome biology 2016;17;1;250

  • Somatic, positive and negative domains of the Center for Epidemiological Studies Depression (CES-D) scale: a meta-analysis of genome-wide association studies.

    Demirkan A, Lahti J, Direk N, Viktorin A, Lunetta KL, Terracciano A, Nalls MA, Tanaka T, Hek K, Fornage M, Wellmann J, Cornelis MC, Ollila HM, Yu L, Smith JA, Pilling LC, Isaacs A, Palotie A, Zhuang WV, Zonderman A, Faul JD, Sutin A, Meirelles O, Mulas A, Hofman A, Uitterlinden A, Rivadeneira F, Perola M, Zhao W, Salomaa V, Yaffe K, Luik AI, NABEC, Liu Y, Ding J, Lichtenstein P, Landén M, Widen E, Weir DR, Llewellyn DJ, Murray A, Kardia SL, Eriksson JG, Koenen K, Magnusson PK, Ferrucci L, Mosley TH, Cucca F, Oostra BA, Bennett DA, Paunio T, Berger K, Harris TB, Pedersen NL, Murabito JM, Tiemeier H, van Duijn CM and Räikkönen K

    Genetic Epidemiology Unit, Departments of Epidemiology and Clinical Genetics, Erasmus MC, Rotterdam, The Netherlands.

    Background: Major depressive disorder (MDD) is moderately heritable; however, genome-wide association studies (GWAS) for MDD, as well as for related continuous outcomes, have not shown consistent results. Attempts to elucidate the genetic basis of MDD may be hindered by heterogeneity in diagnosis. The Center for Epidemiological Studies Depression (CES-D) scale provides a widely used tool for measuring depressive symptoms clustered in four different domains, which can be combined into a total score but can also be analysed as separate symptom domains.

    Method: We performed a meta-analysis of GWAS of the CES-D symptom clusters. We recruited 12 cohorts with the 20- or 10-item CES-D scale (32 528 persons).

    Results: One single nucleotide polymorphism (SNP), rs713224, located near the brain-expressed melatonin receptor (MTNR1A) gene, was associated with the somatic complaints domain of depression symptoms, with borderline genome-wide significance (p<sub>discovery</sub> = 3.82 × 10<sup>-8</sup>). The SNP was analysed in an additional five cohorts comprising the replication sample (6813 persons). However, the association was not consistent among the replication sample (p<sub>discovery+replication</sub> = 1.10 × 10<sup>-6</sup>) with evidence of heterogeneity.

    Conclusions: Despite the effort to harmonize the phenotypes across cohorts and participants, our study is still underpowered to detect consistent association for depression, even by means of symptom classification. On the contrary, the SNP-based heritability and co-heritability estimation results suggest that only a very minor part of the variation could be captured by GWAS, which explains the sparse findings.

    Funded by: Department of Health; Intramural NIH HHS: Z99 AG999999; Medical Research Council: G0802462, G0901254; NCRR NIH HHS: UL1 RR025005; NHGRI NIH HHS: HHSN268200782096C, U01 HG004402; NHLBI NIH HHS: N01HC25195, N01HC55015, N01HC55016, N01HC55018, N01HC55019, N01HC55020, N01HC55021, N01HC55022, N02HL64278, R01 HL070825, R01 HL087641, R01 HL093029; NIA NIH HHS: N01 AG062101, N01 AG062103, N01 AG062106, P30 AG010161, R01 AG015819, R01 AG017917, R01 AG029451, R01 AG032098, RC2 AG036495, U01 AG009740, ZIA AG000932; Wellcome Trust: WT089062

    Psychological medicine 2016;46;8;1613-23

  • Catalog of genetic progression of human cancers: breast cancer.

    Desmedt C, Yates L and Kulka J

    J.-C. Heuson Breast Cancer Translational Research Laboratory, Institut Jules Bordet, Université Libre de Bruxelles, Boulevard de Waterloo 121, 1000, Brussels, Belgium.

    With the rapid development of next-generation sequencing, deeper insights are being gained into the molecular evolution that underlies the development and clinical progression of breast cancer. It is apparent that during evolution, breast cancers acquire thousands of mutations including single base pair substitutions, insertions, deletions, copy number aberrations, and structural rearrangements. As a consequence, at the whole genome level, no two cancers are identical and few cancers even share the same complement of "driver" mutations. Indeed, two samples from the same cancer may also exhibit extensive differences due to constant remodeling of the genome over time. In this review, we summarize recent studies that extend our understanding of the genomic basis of cancer progression. Key biological insights include the following: subclonal diversification begins early in cancer evolution, being detectable even in in situ lesions; geographical stratification of subclonal structure is frequent in primary tumors and can include therapeutically targetable alterations; multiple distant metastases typically arise from a common metastatic ancestor following a "metastatic cascade" model; systemic therapy can unmask preexisting resistant subclones or influence further treatment sensitivity and disease progression. We conclude the review by describing novel approaches such as the analysis of circulating DNA and patient-derived xenografts that promise to further our understanding of the genomic changes occurring during cancer evolution and guide treatment decision making.

    Cancer metastasis reviews 2016;35;1;49-62

  • Genomic Characterization of Primary Invasive Lobular Breast Cancer.

    Desmedt C, Zoppoli G, Gundem G, Pruneri G, Larsimont D, Fornili M, Fumagalli D, Brown D, Rothé F, Vincent D, Kheddoumi N, Rouas G, Majjaj S, Brohée S, Van Loo P, Maisonneuve P, Salgado R, Van Brussel T, Lambrechts D, Bose R, Metzger O, Galant C, Bertucci F, Piccart-Gebhart M, Viale G, Biganzoli E, Campbell PJ and Sotiriou C

    Christine Desmedt, Gabriele Zoppoli, Denis Larsimont, Debora Fumagalli, David Brown, Françoise Rothé, Delphine Vincent, Naima Kheddoumi, Ghizlane Rouas, Samira Majjaj, Sylvain Brohée, Roberto Salgado, Martine Piccart-Gebhart, and Christos Sotiriou, Institut Jules Bordet; Christine Galant, Cliniques Universitaires Saint Luc, Brussels; Peter Van Loo, University of Leuven; Thomas Van Brussel and Diether Lambrechts, VIB Vesalius Research Center, Leuven, Belgium; Gabriele Zoppoli, University of Genoa and Istituto di Ricerca a Carattere Clinico-Scientifico San Martino-National Cancer Institute, Genoa; Giancarlo Pruneri, Patrick Maisonneuve, and Giuseppe Viale, European Institute of Oncology; Marco Fornili and Elia Biganzoli, University of Milan, Fondazione Istituto di Ricovero e Cura a Carattere Scientifico Istituto Nazionale Tumori, Milan, Italy; Gunes Gundem and Peter J. Campbell, Wellcome Trust Sanger Institute, Cambridgeshire; Peter Van Loo, The Francis Crick Institute, London, United Kingdom; Ron Bose, Washington University School of Medicine, St Louis, MO; Otto Metzger, Dana-Farber Cancer Institute, Boston, MA; and François Bertucci, Institut Paoli-Calmettes, Marseille, France.

    Purpose: Invasive lobular breast cancer (ILBC) is the second most common histologic subtype after invasive ductal breast cancer (IDBC). Despite clinical and pathologic differences, ILBC is still treated as IDBC. We aimed to identify genomic alterations in ILBC with potential clinical implications.

    Methods: From an initial 630 ILBC primary tumors, we interrogated oncogenic substitutions and insertions and deletions of 360 cancer genes and genome-wide copy number aberrations in 413 and 170 ILBC samples, respectively, and correlated those findings with clinicopathologic and outcome features.

    Results: Besides the high mutation frequency of CDH1 in 65% of tumors, alterations in one of the three key genes of the phosphatidylinositol 3-kinase pathway, PIK3CA, PTEN, and AKT1, were present in more than one-half of the cases. HER2 and HER3 were mutated in 5.1% and 3.6% of the tumors, with most of these mutations having a proven role in activating the human epidermal growth factor receptor/ERBB pathway. Mutations in FOXA1 and ESR1 copy number gains were detected in 9% and 25% of the samples. All these alterations were more frequent in ILBC than in IDBC. The histologic diversity of ILBC was associated with specific alterations, such as enrichment for HER2 mutations in the mixed, nonclassic, and ESR1 gains in the solid subtype. Survival analyses revealed that chromosome 1q and 11p gains showed independent prognostic value in ILBC and that HER2 and AKT1 mutations were associated with increased risk of early relapse.

    Conclusion: This study demonstrates that we can now begin to individualize the treatment of ILBC, with HER2, HER3, and AKT1 mutations representing high-prevalence therapeutic targets and FOXA1 mutations and ESR1 gains deserving urgent dedicated clinical investigation, especially in the context of endocrine treatment.

    Funded by: Wellcome Trust

    Journal of clinical oncology : official journal of the American Society of Clinical Oncology 2016;34;16;1872-81

  • Zygotes segregate entire parental genomes in distinct blastomere lineages causing cleavage-stage chimerism and mixoploidy.

    Destouni A, Zamani Esteki M, Catteeuw M, Tšuiko O, Dimitriadou E, Smits K, Kurg A, Salumets A, Van Soom A, Voet T and Vermeesch JR

    Laboratory of Cytogenetics and Genome Research, Center of Human Genetics, KU Leuven, Leuven, 3000, Belgium.

    Dramatic genome dynamics, such as chromosome instability, contribute to the remarkable genomic heterogeneity among the blastomeres comprising a single embryo during human preimplantation development. This heterogeneity, when compatible with life, manifests as constitutional mosaicism, chimerism, and mixoploidy in live-born individuals. Chimerism and mixoploidy are defined by the presence of cell lineages with different parental genomes or different ploidy states in a single individual, respectively. Our knowledge of their mechanistic origin results from indirect observations, often when the cell lineages have been subject to rigorous selective pressure during development. Here, we applied haplarithmisis to infer the haplotypes and the copy number of parental genomes in 116 single blastomeres comprising entire preimplantation bovine embryos (n = 23) following in vitro fertilization. We not only demonstrate that chromosome instability is conserved between bovine and human cleavage embryos, but we also discovered that zygotes can spontaneously segregate entire parental genomes into different cell lineages during the first post-zygotic cleavage division. Parental genome segregation was not exclusively triggered by abnormal fertilizations leading to triploid zygotes, but also normally fertilized zygotes can spontaneously segregate entire parental genomes into different cell lineages during cleavage of the zygote. We coin the term "heterogoneic division" to indicate the events leading to noncanonical zygotic cytokinesis, segregating the parental genomes into distinct cell lineages. Persistence of those cell lines during development is a likely cause of chimerism and mixoploidy in mammals.

    Genome research 2016;26;5;567-78

  • The Role of Folate Transport in Antifolate Drug Action in Trypanosoma brucei.

    Dewar S, Sienkiewicz N, Ong HB, Wall RJ, Horn D and Fairlamb AH

    From the Division of Biological Chemistry and Drug Discovery, Wellcome Trust Building, College of Life Sciences, University of Dundee, Dundee DD1 5EH, Scotland, United Kingdom.

    The aim of this study was to identify and characterize mechanisms of resistance to antifolate drugs in African trypanosomes. Genome-wide RNAi library screens were undertaken in bloodstream form Trypanosoma brucei exposed to the antifolates methotrexate and raltitrexed. In conjunction with drug susceptibility and folate transport studies, RNAi knockdown was used to validate the functions of the putative folate transporters. The transport kinetics of folate and methotrexate were further characterized in whole cells. RNA interference target sequencing experiments identified a tandem array of genes encoding a folate transporter family, TbFT1-3, as major contributors to antifolate drug uptake. RNAi knockdown of TbFT1-3 substantially reduced folate transport into trypanosomes and reduced the parasite's susceptibility to the classical antifolates methotrexate and raltitrexed. In contrast, knockdown of TbFT1-3 increased susceptibility to the non-classical antifolates pyrimethamine and nolatrexed. Both folate and methotrexate transport were inhibited by classical antifolates but not by non-classical antifolates or biopterin. Thus, TbFT1-3 mediates the uptake of folate and classical antifolates in trypanosomes, and TbFT1-3 loss-of-function is a mechanism of antifolate drug resistance.

    The Journal of biological chemistry 2016;291;47;24768-24778

  • Complete Genome Sequence of the First KPC-Type Carbapenemase-Positive Proteus mirabilis Strain from a Bloodstream Infection.

    Di Pilato V, Chiarelli A, Boinett CJ, Riccobono E, Harris SR, D'Andrea MM, Thomson NR, Rossolini GM and Giani T

    Department of Surgery and Translational Medicine, University of Florence, Florence, Italy.

    Sequencing of the blaKPC-positive strain Proteus mirabilis AOUC-001 was performed using both the MiSeq and PacBio RS II platforms and yielded a single molecule of 4,272,433 bp, representing the complete chromosome. Genome analysis showed the presence of several acquired resistance determinants, including two copies of blaKPC-2 carried on a fragment of a KPC-producing plasmid previously described in Klebsiella pneumoniae.

    Funded by: Medical Research Council: G1100100

    Genome announcements 2016;4;3

  • Coalescent Inference Using Serially Sampled, High-Throughput Sequencing Data from Intra-Host HIV Infection.

    Dialdestoro K, Sibbesen JA, Maretty L, Raghwani J, Gall A, Kellam P, Pybus OG, Hein J and Jenkins PA

    University of Oxford.

    Human immunodeficiency virus (HIV) is a rapidly evolving pathogen that causes chronic infections, so genetic diversity within a single infection can be very high. High-throughput "deep" sequencing can now measure this diversity in unprecedented detail, particularly since it can be performed at different timepoints during an infection, and this offers a potentially powerful way to infer the evolutionary dynamics of the intra-host viral population. However, population genomic inference from HIV sequence data is challenging because of high rates of mutation and recombination, rapid demographic changes, and ongoing selective pressures. In this paper we develop a new method for inference using HIV deep sequencing data using an approach based on importance sampling of ancestral recombination graphs under a multi-locus coalescent model. The approach further extends recent progress in the approximation of so-called conditional sampling distributions, a quantity of key interest when approximating coalescent likelihoods. The chief novelties of our method are that it is able to infer rates of recombination and mutation, as well as the effective population size, while handling sampling over different timepoints and missing data without extra computational difficulty. We apply our method to a dataset of HIV-1, in which several hundred sequences were obtained from an infected individual at seven timepoints over two years. We find mutation rate and effective population size estimates to be comparable to those produced by the software BEAST. Additionally, our method is able to produce local recombination rate estimates. The software underlying our method, Coalescenator, is freely available.

    Genetics 2016

  • BCL11A Haploinsufficiency Causes an Intellectual Disability Syndrome and Dysregulates Transcription.

    Dias C, Estruch SB, Graham SA, McRae J, Sawiak SJ, Hurst JA, Joss SK, Holder SE, Morton JE, Turner C, Thevenon J, Mellul K, Sánchez-Andrade G, Ibarra-Soria X, Deriziotis P, Santos RF, Lee SC, Faivre L, Kleefstra T, Liu P, Hurles ME, DDD Study, Fisher SE and Logan DW

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.

    Intellectual disability (ID) is a common condition with considerable genetic heterogeneity. Next-generation sequencing of large cohorts has identified an increasing number of genes implicated in ID, but their roles in neurodevelopment remain largely unexplored. Here we report an ID syndrome caused by de novo heterozygous missense, nonsense, and frameshift mutations in BCL11A, encoding a transcription factor that is a putative member of the BAF swi/snf chromatin-remodeling complex. Using a comprehensive integrated approach to ID disease modeling, involving human cellular analyses coupled to mouse behavioral, neuroanatomical, and molecular phenotyping, we provide multiple lines of functional evidence for phenotypic effects. The etiological missense variants cluster in the amino-terminal region of human BCL11A, and we demonstrate that they all disrupt its localization, dimerization, and transcriptional regulatory activity, consistent with a loss of function. We show that Bcl11a haploinsufficiency in mice causes impaired cognition, abnormal social behavior, and microcephaly in accordance with the human phenotype. Furthermore, we identify shared aberrant transcriptional profiles in the cortex and hippocampus of these mouse models. Thus, our work implicates BCL11A haploinsufficiency in neurodevelopmental disorders and defines additional targets regulated by this gene, with broad relevance for our understanding of ID and related syndromes.

    Funded by: Department of Health; Wellcome Trust: WT098051

    American journal of human genetics 2016;99;2;253-74

  • High-throughput discovery of novel developmental phenotypes.

    Dickinson ME, Flenniken AM, Ji X, Teboul L, Wong MD, White JK, Meehan TF, Weninger WJ, Westerberg H, Adissu H, Baker CN, Bower L, Brown JM, Caddle LB, Chiani F, Clary D, Cleak J, Daly MJ, Denegre JM, Doe B, Dolan ME, Edie SM, Fuchs H, Gailus-Durner V, Galli A, Gambadoro A, Gallegos J, Guo S, Horner NR, Hsu CW, Johnson SJ, Kalaga S, Keith LC, Lanoue L, Lawson TN, Lek M, Mark M, Marschall S, Mason J, McElwee ML, Newbigging S, Nutter LM, Peterson KA, Ramirez-Solis R, Rowland DJ, Ryder E, Samocha KE, Seavitt JR, Selloum M, Szoke-Kovacs Z, Tamura M, Trainor AG, Tudose I, Wakana S, Warren J, Wendling O, West DB, Wong L, Yoshiki A, International Mouse Phenotyping Consortium, Jackson Laboratory, Infrastructure Nationale PHENOMIN, Institut Clinique de la Souris (ICS), Charles River Laboratories, MRC Harwell, Toronto Centre for Phenogenomics, Wellcome Trust Sanger Institute, RIKEN BioResource Center, MacArthur DG, Tocchini-Valentini GP, Gao X, Flicek P, Bradley A, Skarnes WC, Justice MJ, Parkinson HE, Moore M, Wells S, Braun RE, Svenson KL, de Angelis MH, Herault Y, Mohun T, Mallon AM, Henkelman RM, Brown SD, Adams DJ, Lloyd KC, McKerlie C, Beaudet AL, Bućan M and Murray SA

    Department of Molecular Physiology and Biophysics, Houston, Texas 77030, USA.

    Approximately one-third of all mammalian genes are essential for life. Phenotypes resulting from knockouts of these genes in mice have provided tremendous insight into gene function and congenital disorders. As part of the International Mouse Phenotyping Consortium effort to generate and phenotypically characterize 5,000 knockout mouse lines, here we identify 410 lethal genes during the production of the first 1,751 unique gene knockouts. Using a standardized phenotyping platform that incorporates high-resolution 3D imaging, we identify phenotypes at multiple time points for previously uncharacterized genes and additional phenotypes for genes with previously reported mutant phenotypes. Unexpectedly, our analysis reveals that incomplete penetrance and variable expressivity are common even on a defined genetic background. In addition, we show that human disease genes are enriched for essential genes, thus providing a dataset that facilitates the prioritization and validation of mutations identified in clinical sequencing efforts.

    Funded by: Cancer Research UK: 13031; Medical Research Council: MC_U142684171, MC_U142684172; NCI NIH HHS: P30 CA034196, P30 CA093373; NEI NIH HHS: P30 EY002520; NHGRI NIH HHS: U54 HG006332, U54 HG006348, U54 HG006364, U54 HG006370, UM1 HG006348, UM1 HG006370; NIDDK NIH HHS: U2C DK092993; NIH HHS: U42 OD011174, U42 OD011175, U42 OD011185, U42 OD012210, UM1 OD023221, UM1 OD023222; Wellcome Trust

    Nature 2016;537;7621;508-514

  • Genomic Analysis and Comparison of Two Gonorrhea Outbreaks.

    Didelot X, Dordel J, Whittles LK, Collins C, Bilek N, Bishop CJ, White PJ, Aanensen DM, Parkhill J, Bentley SD, Spratt BG and Harris SR

    Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom

    Unlabelled: Gonorrhea is a sexually transmitted disease causing growing concern, with a substantial increase in reported incidence over the past few years in the United Kingdom and rising levels of resistance to a wide range of antibiotics. Understanding its epidemiology is therefore of major biomedical importance, not only on a population scale but also at the level of direct transmission. However, the molecular typing techniques traditionally used for gonorrhea infections do not provide sufficient resolution to investigate such fine-scale patterns. Here we sequenced the genomes of 237 isolates from two local collections of isolates from Sheffield and London, each of which was resolved into a single type using traditional methods. The two data sets were selected to have different epidemiological properties: the Sheffield data were collected over 6 years from a predominantly heterosexual population, whereas the London data were gathered within half a year and strongly associated with men who have sex with men. Based on contact tracing information between individuals in Sheffield, we found that transmission is associated with a median time to most recent common ancestor of 3.4 months, with an upper bound of 8 months, which we used as a criterion to identify likely transmission links in both data sets. In London, we found that transmission happened predominantly between individuals of similar age, sexual orientation, and location and also with the same HIV serostatus, which may reflect serosorting and associated risk behaviors. Comparison of the two data sets suggests that the London epidemic involved about ten times more cases than the Sheffield outbreak.

    Importance: The recent increases in gonorrhea incidence and antibiotic resistance are cause for public health concern. Successful intervention requires a better understanding of transmission patterns, which is not uncovered by traditional molecular epidemiology techniques. Here we studied two outbreaks that took place in Sheffield and London, United Kingdom. We show that whole-genome sequencing provides the resolution to investigate direct gonorrhea transmission between infected individuals. Combining genome sequencing with rich epidemiological information about infected individuals reveals the importance of several transmission routes and risk factors, which can be used to design better control measures.

    Funded by: Medical Research Council: MR/K010174/1, MR/N010760/1

    mBio 2016;7;3

  • Perturbed hematopoietic stem and progenitor cell hierarchy in myelodysplastic syndromes patients with monosomy 7 as the sole cytogenetic abnormality.

    Dimitriou M, Woll PS, Mortera-Blanco T, Karimi M, Wedge DC, Doolittle H, Douagi I, Papaemmanuil E, Jacobsen SE and Hellström-Lindberg E

    Center for Hematology and Regenerative Medicine, Karolinska Institutet, Department of Medicine, Karolinska University Hospital Huddinge, Stockholm, Sweden.

    The stem and progenitor cell compartments in low- and intermediate-risk myelodysplastic syndromes (MDS) have recently been described, and shown to be highly conserved when compared to those in acute myeloid leukemia (AML). Much less is known about the characteristics of the hematopoietic hierarchy of subgroups of MDS with a high risk of transforming to AML. Immunophenotypic analysis of immature stem and progenitor cell compartments from patients with an isolated loss of the entire chromosome 7 (isolated -7), an independent high-risk genetic event in MDS, showed expansion and dominance of the malignant -7 clone in the granulocyte and macrophage progenitors (GMP), and other CD45RA+ progenitor compartments, and a significant reduction of the LIN-CD34+CD38low/-CD90+CD45RA- hematopoietic stem cell (HSC) compartment, highly reminiscent of what is typically seen in AML, and distinct from low-risk MDS. Established functional in vitro and in vivo stem cell assays showed a poor readout for -7 MDS patients irrespective of marrow blast counts. Moreover, while the -7 clone dominated at all stages of GM differentiation, the -7 clone had a competitive disadvantage in erythroid differentiation. In azacitidine-treated -7 MDS patients with a clinical response, the decreased clonal involvement in mononuclear bone marrow cells was not accompanied by a parallel reduced clonal involvement in the dominant CD45RA+ progenitor populations, suggesting a selective azacitidine-resistance of these distinct -7 progenitor compartments. Our data demonstrate, in a subgroup of high-risk MDS with monosomy 7, that the perturbed stem and progenitor cell compartments resemble those of AML more than those of low-risk MDS.

    Funded by: Medical Research Council: MC_PC_12020, MC_UU_12009/5; NCI NIH HHS: P30 CA008748

    Oncotarget 2016;7;45;72685-72698

  • Pitfalls in genetic testing: the story of missed SCN1A mutations.

    Djémié T, Weckhuysen S, von Spiczak S, Carvill GL, Jaehn J, Anttonen AK, Brilstra E, Caglayan HS, de Kovel CG, Depienne C, Gaily E, Gennaro E, Giraldez BG, Gormley P, Guerrero-López R, Guerrini R, Hämäläinen E, Hartmann C, Hernandez-Hernandez L, Hjalgrim H, Koeleman BP, Leguern E, Lehesjoki AE, Lemke JR, Leu C, Marini C, McMahon JM, Mei D, Møller RS, Muhle H, Myers CT, Nava C, Serratosa JM, Sisodiya SM, Stephani U, Striano P, van Kempen MJ, Verbeek NE, Usluer S, Zara F, Palotie A, Mefford HC, Scheffer IE, De Jonghe P, Helbig I, Suls A and EuroEPINOMICS‐RES Dravet working group

    Neurogenetics Group, Department of Molecular Genetics, VIB, Antwerp, Belgium; Laboratory of Neurogenetics, Institute Born-Bunge, University of Antwerp, Antwerp, Belgium.

    Background: Sanger sequencing, still the standard technique for genetic testing in most diagnostic laboratories and until recently widely used in research, is gradually being complemented by next-generation sequencing (NGS). No single mutation detection technique is however perfect in identifying all mutations. Therefore, we wondered to what extent inconsistencies between Sanger sequencing and NGS affect the molecular diagnosis of patients. Since mutations in SCN1A, the major gene implicated in epilepsy, are found in the majority of Dravet syndrome (DS) patients, we focused on missed SCN1A mutations.

    Methods: We sent out a survey to 16 genetic centers performing SCN1A testing.

    Results: We collected data on 28 mutations initially missed using Sanger sequencing. All patients were falsely reported as SCN1A mutation-negative, due to both technical limitations and human errors.

    Conclusion: We illustrate the pitfalls of Sanger sequencing and most importantly provide evidence that SCN1A mutations are an even more frequent cause of DS than already anticipated.

    Funded by: NINDS NIH HHS: R01 NS069605

    Molecular genetics & genomic medicine 2016;4;4;457-64

  • Using reference-free compressed data structures to analyze sequencing reads from thousands of human genomes.

    Dolle DD, Liu Z, Cotten M, Simpson JT, Iqbal Z, Durbin R, McCarthy SA and Keane TM

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom.

    We are rapidly approaching the point where we have sequenced millions of human genomes. There is a pressing need for new data structures to store raw sequencing data and efficient algorithms for population scale analysis. Current reference-based data formats do not fully exploit the redundancy in population sequencing nor take advantage of shared genetic variation. In recent years, the Burrows-Wheeler transform (BWT) and FM-index have been widely employed as a full-text searchable index for read alignment and de novo assembly. We introduce the concept of a population BWT and use it to store and index the sequencing reads of 2705 samples from the 1000 Genomes Project. A key feature is that, as more genomes are added, identical read sequences are increasingly observed, and compression becomes more efficient. We assess the support in the 1000 Genomes read data for every base position of two human reference assembly versions, identifying that 3.2 Mbp with population support was lost in the transition from GRCh37 with 13.7 Mbp added to GRCh38. We show that the vast majority of variant alleles can be uniquely described by overlapping 31-mers and show how rapid and accurate SNP and indel genotyping can be carried out across the genomes in the population BWT. We use the population BWT to carry out nonreference queries to search for the presence of all known viral genomes and discover human T-lymphotropic virus 1 integrations in six samples in a recognized epidemiological distribution.

    Funded by: Wellcome Trust

    Genome research 2016;27;2;300-309
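
    The population BWT described in the entry above indexes reads from many samples in one searchable structure. The following is a minimal sketch of the general idea only, not the authors' implementation: it builds a BWT over a small read collection by naive suffix sorting and counts k-mer occurrences with FM-index backward search. The names `build_bwt` and `FMIndex`, the `$` read terminator, and the toy reads are assumptions made for illustration; a real population-scale index would need a far more memory-efficient construction.

```python
from collections import Counter


def build_bwt(reads):
    """Return the BWT of a read collection (naive O(n^2 log n) construction).

    Each read is terminated with '$', which sorts before A, C, G and T.
    """
    text = "".join(read + "$" for read in reads)
    # Sort suffix start positions by the suffix they begin.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT character is the one preceding each sorted suffix
    # (index -1 wraps around to the final '$').
    return "".join(text[i - 1] for i in suffix_array)


class FMIndex:
    """Tiny FM-index supporting occurrence counts for '$'-free patterns."""

    def __init__(self, bwt):
        self.bwt = bwt
        counts = Counter(bwt)
        # first[c] = number of characters in the text smaller than c.
        self.first = {}
        total = 0
        for c in sorted(counts):
            self.first[c] = total
            total += counts[c]
        # occ[c][i] = number of occurrences of c in bwt[:i].
        self.occ = {c: [0] * (len(bwt) + 1) for c in counts}
        for i, ch in enumerate(bwt):
            for c in counts:
                self.occ[c][i + 1] = self.occ[c][i] + (ch == c)

    def count(self, pattern):
        """Count occurrences of pattern across all indexed reads."""
        lo, hi = 0, len(self.bwt)
        for c in reversed(pattern):  # backward search, last base first
            if c not in self.first:
                return 0
            lo = self.first[c] + self.occ[c][lo]
            hi = self.first[c] + self.occ[c][hi]
            if lo >= hi:
                return 0
        return hi - lo


# Toy read set (hypothetical data); the duplicated read mimics shared sequence.
reads = ["ACGT", "ACGT", "CGTA", "TTAC"]
index = FMIndex(build_bwt(reads))
print(index.count("CGT"))
```

    Because identical reads sort next to each other, their preceding characters form runs in the BWT, which is the property that makes run-length style compression improve as more genomes are added.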

  • Identification, Validation, and Application of Molecular Diagnostics for Insecticide Resistance in Malaria Vectors.

    Donnelly MJ, Isaacs AT and Weetman D

    Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool L3 5QA, UK; Malaria Programme, Wellcome Trust Sanger Institute, Cambridge, UK.

    Insecticide resistance is a major obstacle to control of Anopheles malaria mosquitoes in sub-Saharan Africa and requires an improved understanding of the underlying mechanisms. Efforts to discover resistance genes and DNA markers have been dominated by candidate gene and quantitative trait locus studies of laboratory strains, but with greater availability of genome sequences a shift toward field-based agnostic discovery is anticipated. Mechanisms evolve continually to produce elevated resistance yielding multiplicative diagnostic markers, co-screening of which can give high predictive value. With a shift toward prospective analyses, identification and screening of resistance marker panels will boost monitoring and programmatic decision making.

    Trends in parasitology 2016;32;3;197-206

  • Deep genome sequencing and variation analysis of 13 inbred mouse strains defines candidate phenotypic alleles, private variation and homozygous truncating mutations.

    Doran AG, Wong K, Flint J, Adams DJ, Hunter KW and Keane TM

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1HH, UK.

    Background: The Mouse Genomes Project is an ongoing collaborative effort to sequence the genomes of the common laboratory mouse strains. In 2011, the initial analysis of sequence variation across 17 strains found 56.7 M unique single nucleotide polymorphisms (SNPs) and 8.8 M indels. We carry out deep sequencing of 13 additional inbred strains (BUB/BnJ, C57BL/10J, C57BR/cdJ, C58/J, DBA/1J, I/LnJ, KK/HiJ, MOLF/EiJ, NZB/B1NJ, NZW/LacJ, RF/J, SEA/GnJ and ST/bJ), cataloguing molecular variation within and across the strains. These strains include important models for immune response, leukaemia, age-related hearing loss and rheumatoid arthritis. We now have several examples of fully sequenced closely related strains that are divergent for several disease phenotypes.

    Results: Approximately 27.4 M unique SNPs and 5 M indels are identified across these strains compared to the C57BL/6J reference genome (GRCm38). The amount of variation found in the inbred laboratory mouse genome has increased to 71 M SNPs and 12 M indels. We investigate the genetic basis of highly penetrant cancer susceptibility in RF/J finding private novel missense mutations in DNA damage repair and highly cancer associated genes. We use two highly related strains (DBA/1J and DBA/2J) to investigate the genetic basis of collagen-induced arthritis susceptibility.

    Conclusions: This paper significantly expands the catalogue of fully sequenced laboratory mouse strains and now contains several examples of highly genetically similar strains with divergent phenotypes. We show how studying private missense mutations can lead to insights into the genetic mechanism for a highly penetrant phenotype.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/M000281/1; Cancer Research UK: 13031; Medical Research Council: MR/L007428/1; Wellcome Trust

    Genome biology 2016;17;1;167

  • DNA supercoiling is a fundamental regulatory principle in the control of bacterial gene expression.

    Dorman CJ and Dorman MJ

    Department of Microbiology, Moyne Institute of Preventive Medicine, Trinity College Dublin, Dublin 2, Ireland.

    Although it has become routine to consider DNA in terms of its role as a carrier of genetic information, it is also an important contributor to the control of gene expression. This regulatory principle arises from its structural properties. DNA is maintained in an underwound state in most bacterial cells and this has important implications both for DNA storage in the nucleoid and for the expression of genetic information. Underwinding of the DNA through reduction in its linking number potentially imparts energy to the duplex that is available to drive DNA transactions, such as transcription, replication and recombination. The topological state of DNA also influences its affinity for some DNA binding proteins, especially in DNA sequences that have a high A + T base content. The underwinding of DNA by the ATP-dependent topoisomerase DNA gyrase creates a continuum between metabolic flux, DNA topology and gene expression that underpins the global response of the genome to changes in the intracellular and external environments. These connections describe a fundamental and generalised mechanism affecting global gene expression that underlies the specific control of transcription operating through conventional transcription factors. This mechanism also provides a basal level of control for genes acquired by horizontal DNA transfer, assisting microbial evolution, including the evolution of pathogenic bacteria.

    Biophysical reviews 2016;8;3;209-220

  • DNAH11 Localization in the Proximal Region of Respiratory Cilia Defines Distinct Outer Dynein Arm Complexes.

    Dougherty GW, Loges NT, Klinkenbusch JA, Olbrich H, Pennekamp P, Menchen T, Raidt J, Wallmeier J, Werner C, Westermann C, Ruckert C, Mirra V, Hjeij R, Memari Y, Durbin R, Kolb-Kokocinski A, Praveen K, Kashef MA, Kashef S, Eghtedari F, Häffner K, Valmari P, Baktai G, Aviram M, Bentur L, Amirav I, Davis EE, Katsanis N, Brueckner M, Shaposhnykov A, Pigino G, Dworniczak B and Omran H

    1 Department of General Pediatrics and.

    Primary ciliary dyskinesia (PCD) is a recessively inherited disease that leads to chronic respiratory disorders owing to impaired mucociliary clearance. Conventional transmission electron microscopy (TEM) is a diagnostic standard to identify ultrastructural defects in respiratory cilia but is not useful in approximately 30% of PCD cases, which have normal ciliary ultrastructure. DNAH11 mutations are a common cause of PCD with normal ciliary ultrastructure and hyperkinetic ciliary beating, but its pathophysiology remains poorly understood. We therefore characterized DNAH11 in human respiratory cilia by immunofluorescence microscopy (IFM) in the context of PCD. We used whole-exome and targeted next-generation sequence analysis as well as Sanger sequencing to identify and confirm eight novel loss-of-function DNAH11 mutations. We designed and validated a monoclonal antibody specific to DNAH11 and performed high-resolution IFM of both control and PCD-affected human respiratory cells, as well as samples from green fluorescent protein (GFP)-left-right dynein mice, to determine the ciliary localization of DNAH11. IFM analysis demonstrated native DNAH11 localization in only the proximal region of wild-type human respiratory cilia and loss of DNAH11 in individuals with PCD with certain loss-of-function DNAH11 mutations. GFP-left-right dynein mice confirmed proximal DNAH11 localization in tracheal cilia. DNAH11 retained proximal localization in respiratory cilia of individuals with PCD with distinct ultrastructural defects, such as the absence of outer dynein arms (ODAs). TEM tomography detected a partial reduction of ODAs in DNAH11-deficient cilia. DNAH11 mutations result in a subtle ODA defect in only the proximal region of respiratory cilia, which is detectable by IFM and TEM tomography.

    Funded by: NCATS NIH HHS: UL1 TR001863; NHLBI NIH HHS: R01 HL093280; NIDDK NIH HHS: R01 DK072301

    American journal of respiratory cell and molecular biology 2016;55;2;213-24

  • RESEARCH ETHICS. Ethics review for international data-intensive research.

    Dove ES, Townend D, Meslin EM, Bobrow M, Littler K, Nicol D, de Vries J, Junker A, Garattini C, Bovenberg J, Shabani M, Lévesque E and Knoppers BM

    J. Kenyon Mason Institute for Medicine, Life Sciences and the Law, School of Law, University of Edinburgh, UK.

    Funded by: Wellcome Trust: 099313, 103360

    Science (New York, N.Y.) 2016;351;6280;1399-400

  • Identification of a germline F692L drug resistance variant in cis with Flt3-internal tandem duplication in knock-in mice.

    Dovey OM, Chen B, Mupo A, Friedrich M, Grove CS, Cooper JL, Lee B, Varela I, Huang Y and Vassiliou GS

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK

    Funded by: Medical Research Council: MC_PC_12009; Wellcome Trust: WT095663MA

    Haematologica 2016;101;8;e328-31

  • Phylogenetic Analysis of Invasive Serotype 1 Pneumococcus in South Africa, 1989-2013.

    du Plessis M, Allam M, Tempia S, Wolter N, de Gouveia L, Mollendorf CV, Jolley KA, Mbelle N, Wadula J, Cornick JE, Everett DB, McGee L, Breiman RF, Gladstone RA, Bentley SD, Klugman KP and von Gottberg A

    Centre for Respiratory Diseases and Meningitis, National Institute for Communicable Diseases, National Health Laboratory Service, Johannesburg, South Africa; School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa

    Background: Serotype 1 is an important cause of invasive pneumococcal disease in South Africa and has declined following introduction of the 13-valent pneumococcal conjugate vaccine in 2011.

    Methods: We genetically characterized 912 invasive serotype 1 isolates from 1989-2013. Simpson's diversity index and recombination ratios were calculated. Factors associated with sequence types (ST) were assessed.

    Results: Clonal complex 217 represented 96% (872/912) of sampled isolates. Post PCV13, ST diversity increased in children <5 years (0.39 to 0.63, p=0.002) and individuals >14 years (0.35 to 0.54, p<0.001): ST-217 declined proportionately in children <5 years [153/203 (75%) vs. 21/37 (57%), p=0.027], and individuals >14 years [242/305 (79%) vs. 96/148 (65%), p=0.001], whereas ST-9067 increased [4/684 (0.6%) vs. 24/228 (11%), p<0.001]. Three sub-clades were identified within ST-217: ST-217C1 (353/382, 92%), ST-217C2 (15/382, 4%) and ST-217C3 (14/382, 4%). ST-217C2, ST-217C3 and single-locus variant (SLV) ST-8314 (20/912, 2%) were associated with non-susceptibility to chloramphenicol, tetracycline and co-trimoxazole. ST-8314 (20/912, 2%) was also associated with increased non-susceptibility to penicillin (p<0.001). ST-217C3 and newly reported ST-9067 had higher recombination ratios compared to ST-217C1 (4.344 vs. 0.091, p<0.001 and 0.086 vs. 0.013, p<0.001, respectively).

    Conclusions: Increases in genetic diversity were noted post PCV13, and lineages associated with antimicrobial non-susceptibility were identified.

    Journal of clinical microbiology 2016
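
    The entry above reports Simpson's diversity index for sequence types before and after PCV13. As a hedged illustration, the sketch below computes one common form of the index, 1 - sum(n_i(n_i - 1)) / (N(N - 1)), from a list of sequence-type labels. The abstract does not say which variant of the index the authors used, and the function name `simpson_diversity` and the example labels are assumptions, not data from the study.

```python
from collections import Counter


def simpson_diversity(labels):
    """Simpson's index of diversity for a list of category labels.

    Returns the probability that two isolates drawn without replacement
    belong to different categories: 1 - sum(n_i * (n_i - 1)) / (N * (N - 1)).
    """
    counts = list(Counter(labels).values())
    total = sum(counts)
    if total < 2:
        raise ValueError("need at least two isolates")
    same_pairs = sum(n * (n - 1) for n in counts)
    return 1.0 - same_pairs / (total * (total - 1))


# Hypothetical sequence-type labels, not taken from the paper.
isolates = ["ST-217"] * 6 + ["ST-9067"] * 3 + ["ST-8314"]
print(round(simpson_diversity(isolates), 3))
```

    A value of 0 means every isolate shares one sequence type, and values approaching 1 indicate that isolates are spread evenly over many types.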

  • Bacterial pathogenesis: Getting all tangled up.

    Du Toit A

    Nature reviews. Microbiology 2016;14;9;545

  • Wheat bran promotes enrichment within the human colonic microbiota of butyrate-producing bacteria that release ferulic acid.

    Duncan SH, Russell WR, Quartieri A, Rossi M, Parkhill J, Walker AW and Flint HJ

    Rowett Institute of Nutrition and Health, University of Aberdeen, Aberdeen, UK.

    Cereal fibres such as wheat bran are considered to offer human health benefits via their impact on the intestinal microbiota. We show here by 16S rRNA gene-based community analysis that providing amylase-pretreated wheat bran as the sole added energy source to human intestinal microbial communities in anaerobic fermentors leads to the selective and progressive enrichment of a small number of bacterial species. In particular, OTUs corresponding to uncultured Lachnospiraceae (Firmicutes) related to Eubacterium xylanophilum and Butyrivibrio spp. were strongly enriched (by five to 160 fold) over 48 h in four independent experiments performed with different faecal inocula, while nine other Firmicutes OTUs showed > 5-fold enrichment in at least one experiment. Ferulic acid was released from the wheat bran during degradation but was rapidly converted to phenylpropionic acid derivatives via hydrogenation, demethylation and dehydroxylation to give metabolites that are detected in human faecal samples. Pure culture work using bacterial isolates related to the enriched OTUs, including several butyrate-producers, demonstrated that the strains caused substrate weight loss and released ferulic acid, but with limited further conversion. We conclude that breakdown of wheat bran involves specialist primary degraders while the conversion of released ferulic acid is likely to involve a multi-species pathway.

    Environmental microbiology 2016;18;7;2214-25

  • Consent Codes: Upholding Standard Data Use Conditions.

    Dyke SO, Philippakis AA, Rambla De Argila J, Paltoo DN, Luetkemeier ES, Knoppers BM, Brookes AJ, Spalding JD, Thompson M, Roos M, Boycott KM, Brudno M, Hurles M, Rehm HL, Matern A, Fiume M and Sherry ST

    Centre of Genomics and Policy, Faculty of Medicine, McGill University, Montreal, Quebec, Canada.

    A systematic way of recording data use conditions that are based on consent permissions as found in the datasets of the main public genome archives (NCBI dbGaP and EMBL-EBI/CRG EGA).

    Funded by: Canadian Institutes of Health Research: EP1-120608, EP2-120609

    PLoS genetics 2016;12;1;e1005772

  • Alternative Splice Forms Influence Functions of Whirlin in Mechanosensory Hair Cell Stereocilia.

    Ebrahim S, Ingham NJ, Lewis MA, Rogers MJ, Cui R, Kachar B, Pass JC and Steel KP

    Wolfson Centre for Age-Related Diseases, King's College London, Guy's Campus, London SE1 1UL, UK.

    WHRN (DFNB31) mutations cause diverse hearing disorders: profound deafness (DFNB31) or variable hearing loss in Usher syndrome type II. The known role of WHRN in stereocilia elongation does not explain these different pathophysiologies. Using spontaneous and targeted Whrn mutants, we show that the major long (WHRN-L) and short (WHRN-S) isoforms of WHRN have distinct localizations within stereocilia and also across hair cell types. Lack of both isoforms causes abnormally short stereocilia and profound deafness and vestibular dysfunction. WHRN-S expression, however, is sufficient to maintain stereocilia bundle morphology and function in a subset of hair cells, resulting in some auditory response and no overt vestibular dysfunction. WHRN-S interacts with EPS8, and both are required at stereocilia tips for normal length regulation. WHRN-L localizes midway along the shorter stereocilia, at the level of inter-stereociliary links. We propose that differential isoform expression underlies the variable auditory and vestibular phenotypes associated with WHRN mutations.

    Cell reports 2016;15;5;935-43

  • MERVL/Zscan4 Network Activation Results in Transient Genome-wide DNA Demethylation of mESCs.

    Eckersley-Maslin MA, Svensson V, Krueger C, Stubbs TM, Giehr P, Krueger F, Miragaia RJ, Kyriakopoulos C, Berrens RV, Milagre I, Walter J, Teichmann SA and Reik W

    Epigenetics Programme, Babraham Institute, Cambridge CB22 3AT, UK.

    Mouse embryonic stem cells are dynamic and heterogeneous. For example, rare cells cycle through a state characterized by decondensed chromatin and expression of transcripts, including the Zscan4 cluster and MERVL endogenous retrovirus, which are usually restricted to preimplantation embryos. Here, we further characterize the dynamics and consequences of this transient cell state. Single-cell transcriptomics identified the earliest upregulated transcripts as cells enter the MERVL/Zscan4 state. The MERVL/Zscan4 transcriptional network was also upregulated during induced pluripotent stem cell reprogramming. Genome-wide DNA methylation and chromatin analyses revealed global DNA hypomethylation accompanying increased chromatin accessibility. This transient DNA demethylation was driven by a loss of DNA methyltransferase proteins in the cells and occurred genome-wide. While methylation levels were restored once cells exit this state, genomic imprints remained hypomethylated, demonstrating a potential global and enduring influence of endogenous retroviral activation on the epigenome.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/K010867/1; Wellcome Trust: 095645/Z/11/Z

    Cell reports 2016;17;1;179-192

  • The genetics of blood pressure regulation and its target organs from association studies in 342,415 individuals.

    Ehret GB, Ferreira T, Chasman DI, Jackson AU, Schmidt EM, Johnson T, Thorleifsson G, Luan J, Donnelly LA, Kanoni S, Petersen AK, Pihur V, Strawbridge RJ, Shungin D, Hughes MF, Meirelles O, Kaakinen M, Bouatia-Naji N, Kristiansson K, Shah S, Kleber ME, Guo X, Lyytikäinen LP, Fava C, Eriksson N, Nolte IM, Magnusson PK, Salfati EL, Rallidis LS, Theusch E, Smith AJP, Folkersen L, Witkowska K, Pers TH, Joehanes R, Kim SK, Lataniotis L, Jansen R, Johnson AD, Warren H, Kim YJ, Zhao W, Wu Y, Tayo BO, Bochud M, CHARGE-EchoGen consortium, CHARGE-HF consortium, Wellcome Trust Case Control Consortium, Absher D, Adair LS, Amin N, Arking DE, Axelsson T, Baldassarre D, Balkau B, Bandinelli S, Barnes MR, Barroso I, Bevan S, Bis JC, Bjornsdottir G, Boehnke M, Boerwinkle E, Bonnycastle LL, Boomsma DI, Bornstein SR, Brown MJ, Burnier M, Cabrera CP, Chambers JC, Chang IS, Cheng CY, Chines PS, Chung RH, Collins FS, Connell JM, Döring A, Dallongeville J, Danesh J, de Faire U, Delgado G, Dominiczak AF, Doney ASF, Drenos F, Edkins S, Eicher JD, Elosua R, Enroth S, Erdmann J, Eriksson P, Esko T, Evangelou E, Evans A, Fall T, Farrall M, Felix JF, Ferrières J, Ferrucci L, Fornage M, Forrester T, Franceschini N, Duran OHF, Franco-Cereceda A, Fraser RM, Ganesh SK, Gao H, Gertow K, Gianfagna F, Gigante B, Giulianini F, Goel A, Goodall AH, Goodarzi MO, Gorski M, Gräßler J, Groves C, Gudnason V, Gyllensten U, Hallmans G, Hartikainen AL, Hassinen M, Havulinna AS, Hayward C, Hercberg S, Herzig KH, Hicks AA, Hingorani AD, Hirschhorn JN, Hofman A, Holmen J, Holmen OL, Hottenga JJ, Howard P, Hsiung CA, Hunt SC, Ikram MA, Illig T, Iribarren C, Jensen RA, Kähönen M, Kang H, Kathiresan S, Keating BJ, Khaw KT, Kim YK, Kim E, Kivimaki M, Klopp N, Kolovou G, Komulainen P, Kooner JS, Kosova G, Krauss RM, Kuh D, Kutalik Z, Kuusisto J, Kvaløy K, Lakka TA, Lee NR, Lee IT, Lee WJ, Levy D, Li X, Liang KW, Lin H, Lin L, Lindström J, Lobbens S, Männistö S, Müller G, Müller-Nurasyid M, Mach F, Markus HS, Marouli 
E, McCarthy MI, McKenzie CA, Meneton P, Menni C, Metspalu A, Mijatovic V, Moilanen L, Montasser ME, Morris AD, Morrison AC, Mulas A, Nagaraja R, Narisu N, Nikus K, O'Donnell CJ, O'Reilly PF, Ong KK, Paccaud F, Palmer CD, Parsa A, Pedersen NL, Penninx BW, Perola M, Peters A, Poulter N, Pramstaller PP, Psaty BM, Quertermous T, Rao DC, Rasheed A, Rayner NWNWR, Renström F, Rettig R, Rice KM, Roberts R, Rose LM, Rossouw J, Samani NJ, Sanna S, Saramies J, Schunkert H, Sebert S, Sheu WH, Shin YA, Sim X, Smit JH, Smith AV, Sosa MX, Spector TD, Stančáková A, Stanton A, Stirrups KE, Stringham HM, Sundstrom J, Swift AJ, Syvänen AC, Tai ES, Tanaka T, Tarasov KV, Teumer A, Thorsteinsdottir U, Tobin MD, Tremoli E, Uitterlinden AG, Uusitupa M, Vaez A, Vaidya D, van Duijn CM, van Iperen EPA, Vasan RS, Verwoert GC, Virtamo J, Vitart V, Voight BF, Vollenweider P, Wagner A, Wain LV, Wareham NJ, Watkins H, Weder AB, Westra HJ, Wilks R, Wilsgaard T, Wilson JF, Wong TY, Yang TP, Yao J, Yengo L, Zhang W, Zhao JH, Zhu X, Bovet P, Cooper RS, Mohlke KL, Saleheen D, Lee JY, Elliott P, Gierman HJ, Willer CJ, Franke L, Hovingh GK, Taylor KD, Dedoussis G, Sever P, Wong A, Lind L, Assimes TL, Njølstad I, Schwarz PE, Langenberg C, Snieder H, Caulfield MJ, Melander O, Laakso M, Saltevo J, Rauramaa R, Tuomilehto J, Ingelsson E, Lehtimäki T, Hveem K, Palmas W, März W, Kumari M, Salomaa V, Chen YI, Rotter JI, Froguel P, Jarvelin MR, Lakatta EG, Kuulasmaa K, Franks PW, Hamsten A, Wichmann HE, Palmer CNA, Stefansson K, Ridker PM, Loos RJF, Chakravarti A, Deloukas P, Morris AP, Newton-Cheh C and Munroe PB

    Center for Complex Disease Genomics, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.

    To dissect the genetic architecture of blood pressure and assess effects on target organ damage, we analyzed 128,272 SNPs from targeted and genome-wide arrays in 201,529 individuals of European ancestry, and genotypes from an additional 140,886 individuals were used for validation. We identified 66 blood pressure-associated loci, of which 17 were new; 15 harbored multiple distinct association signals. The 66 index SNPs were enriched for cis-regulatory elements, particularly in vascular endothelial cells, consistent with a primary role in blood pressure control through modulation of vascular tone across multiple tissues. The 66 index SNPs combined in a risk score showed comparable effects in 64,421 individuals of non-European descent. The 66-SNP blood pressure risk score was significantly associated with target organ damage in multiple tissues but with minor effects in the kidney. Our findings expand current knowledge of blood pressure-related pathways and highlight tissues beyond the classical renal system in blood pressure regulation.

    Funded by: Arthritis Research UK; British Heart Foundation: FS/13/6/29977, PG/02/128, RG/07/005/23633, RG/10/12/28456, RG/13/2/30098, RG/14/5/30893, RG08/008, RG2008/014, SP/04/002, SP/08/005/25115; Chief Scientist Office; FIC NIH HHS: R01 TW005596, R01 TW008288, RC1 TW008485; Medical Research Council: 85374, G0000934, G0401527, G0500539, G0600237, G0600705, G0601261, G0601966, G0700931, G1000143, G1002319, G9521010D, MC_PC_U127561128, MC_UU_12013/5, MC_UU_12015/1, MC_UU_12019/1, MR/K006584/1, MR/K013351/1, MR/L003120/1, MR/L01341X/1; NCATS NIH HHS: UL1 TR000124; NCI NIH HHS: UM1 CA182913; NCRR NIH HHS: M01 RR000052, M01 RR000425, M01 RR010284, M01 RR016500, P20 RR020649, U54 RR020278, UL1 RR024156, UL1 RR024975, UL1 RR025005, UL1 RR025774, UL1 RR033176; NEI NIH HHS: R01 EY014684, R01 EY018246, ZIA EY000401; NHGRI NIH HHS: HHSN268200782096C, N01HG65403, U01 HG004402, U01 HG007416, Z01 HG000024, Z01 HG200362; NHLBI NIH HHS: HHSN268201100005C, HHSN268201100006C, HHSN268201100007C, HHSN268201100008C, HHSN268201100009C, HHSN268201100010C, HHSN268201100011C, HHSN268201100012C, HHSN268201200036C, K23 HL080025, K24 HL004334, K99 HL094535, N01 HC005187, N01 HC015103, N01 HC035129, N01 HC045133, N01 HC045134, N01 HC045204, N01 HC045205, N01 HC048047, N01 HC048048, N01 HC048049, N01 HC048050, N01 HC055015, N01 HC055018, N01 HC055019, N01 HC085084, N01 HC085085, N01 HC095095, N01 WH42114, N01HC25195, N01HC55016, N01HC55020, N01HC55021, N01HC55222, N01HC65226, N01HC75150, N01HC85079, N01HC85080, N01HC85081, N01HC85082, N01HC85083, N01HC85086, N01HC95159, N01HC95160, N01HC95161, N01HC95162, N01HC95163, N01HC95164, N01HC95165, N01HC95166, N01HC95167, N01HC95168, N01HC95169, N01WH32102, N02HL64278, R01 HL036310, R01 HL043851, R01 HL046380, R01 HL053353, R01 HL055673, R01 HL059367, R01 HL059684, R01 HL071025, R01 HL074166, R01 HL075366, R01 HL077477, R01 HL080295, R01 HL080467, R01 HL085144, R01 HL085251, R01 HL086694, R01 HL087641, R01 HL087647, R01 HL087652, R01 HL087660, 
R01 HL087679, R01 HL087698, R01 HL088119, R01 HL093029, R01 HL093328, R01 HL098283, R01 HL103612, R01 HL105756, R01 HL109512, R01 HL109946, R01 HL113933, R01 HL120393, R01 HL122684, R21 HL123677, RC1 HL100245, RC2 HL101834, T32 HL007208, T32 HL007902, T32 HL098049, U01 HL054471, U01 HL054472, U01 HL054473, U01 HL054495, U01 HL054496, U01 HL054497, U01 HL054509, U01 HL069757, U01 HL072515, U01 HL072518, U01 HL080295, U01 HL084756, U01 HL096917; NIA NIH HHS: HHSN271201200022C, N01 AG012100, N01 AG062101, N01 AG062103, N01 AG062106, N01AG12109, R01 AG010175, R01 AG013196, R01 AG015928, R01 AG016592, R01 AG018728, R01 AG020098, R01 AG023629, R01 AG025941, R01 AG027058, R01 AG028555, R01 AG032098, T32 AG000219, Z01 AG000513, ZIA AG007380; NICHD NIH HHS: P2C HD050924, R03 HD061437, R03 HD062783; NIDDK NIH HHS: P30 DK020572, P30 DK056350, P30 DK063491, P30 DK072488, P60 DK079637, R01 DK054261, R01 DK062370, R01 DK072193, R01 DK078150, R01 DK079888, R01 DK084350, R01 DK093757, R01 DK101478, U01 DK062370; NIEHS NIH HHS: P30 ES007033, P30 ES010126; NIGMS NIH HHS: S06 GM008016, U01 GM074518; NIH HHS: N01 WH42124; NIMH NIH HHS: R01 MH063706, RC2 MH089951, RL1 MH083268; NINDS NIH HHS: R21 NS064908, U01 NS041588; WHI NIH HHS: N01 WH032105, N01 WH032118, N01 WH032119, N01 WH032122, N01 WH042107, N01 WH042109, N01 WH042110, N01 WH042111, N01 WH042112, N01 WH042115, N01 WH042116, N01 WH042117, N01 WH042118, N01 WH042119, N01 WH042120, N01 WH042121, N01 WH042122, N01 WH042123, N01 WH042125, N01 WH042126, N01 WH042129, N01 WH042130, N01 WH042131, N01 WH042132, N01 WH044221, N01WH22110, N01WH24152, N01WH32100, N01WH32101, N01WH32106, N01WH32108, N01WH32109, N01WH32111, N01WH32112, N01WH32113, N01WH32115, N01WH42108, N01WH42113; Wellcome Trust: 068545/Z/02, 081917/Z/07/Z, 084723/Z/08/Z, 085475/B/08/Z, 090532/Z/09/Z, 098051, WT098017

    Nature genetics 2016;48;10;1171-1184

  • Community dynamics and the lower airway microbiota in stable chronic obstructive pulmonary disease, smokers and healthy non-smokers.

    Einarsson GG, Comer DM, McIlreavey L, Parkhill J, Ennis M, Tunney MM and Elborn JS

    Halo, Queen's University Belfast, Belfast, UK; Centre for Infection and Immunity, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, Belfast, UK.

    Rationale: The role bacteria play in the progression of COPD has increasingly been highlighted in recent years. However, the microbial community complexity in the lower airways of patients with COPD is poorly characterised.

    Objectives: To compare the lower airway microbiota in patients with COPD, smokers and non-smokers.

    Methods: Bronchial wash samples from adults with COPD (n=18), smokers with no airways disease (n=8) and healthy individuals (n=11) were analysed by extended-culture and culture-independent Illumina MiSeq sequencing. We determined aerobic and anaerobic microbiota load and evaluated differences in bacteria associated with the three cohorts. Culture-independent analysis was used to determine differences in microbiota between comparison groups including taxonomic richness, diversity, relative abundance, 'core' microbiota and co-occurrence.

    Measurement and main results: Extended-culture showed no difference in total load of aerobic and anaerobic bacteria between the three cohorts. Culture-independent analysis revealed that the prevalence of members of Pseudomonas spp. was greater in the lower airways of patients with COPD; however, the majority of the sequence reads for this taxon were attributed to three patients. Furthermore, members of Bacteroidetes, such as Prevotella spp., were observed to be greater in the 'healthy' comparison groups. Community diversity (α and β) was significantly less in COPD compared with healthy groups. Co-occurrence of bacterial taxa and the observation of a putative 'core' community within the lower airways were also observed.

    Conclusions: Microbial community composition in the lower airways of patients with COPD is significantly different to that found in smokers and non-smokers, indicating that a component of the disease is associated with changes in microbiological status.

    Funded by: Wellcome Trust

    Thorax 2016;71;9;795-803

  • Involvement of astrocyte and oligodendrocyte gene sets in migraine.

    Eising E, de Leeuw C, Min JL, Anttila V, Verheijen MH, Terwindt GM, Dichgans M, Freilinger T, Kubisch C, International Headache Genetics Consortium, Ferrari MD, Smit AB, de Vries B, Palotie A, van den Maagdenberg AM and Posthuma D

    Department of Human Genetics, Leiden University Medical Centre, The Netherlands.

    Background: Migraine is a common episodic brain disorder characterized by recurrent attacks of severe unilateral headache and additional neurological symptoms. Two main migraine types can be distinguished based on the presence of aura symptoms that can accompany the headache: migraine with aura and migraine without aura. Multiple genetic and environmental factors confer disease susceptibility. Recent genome-wide association studies (GWAS) indicate that migraine susceptibility genes are involved in various pathways, including neurotransmission, which have already been implicated in genetic studies of monogenic familial hemiplegic migraine, a subtype of migraine with aura.

    Methods: To further explore the genetic background of migraine, we performed a gene set analysis of migraine GWAS data of 4954 clinic-based patients with migraine, as well as 13,390 controls. Curated sets of synaptic genes and sets of genes predominantly expressed in three glial cell types (astrocytes, microglia and oligodendrocytes) were investigated.

    Discussion: Our results show that gene sets containing astrocyte- and oligodendrocyte-related genes are associated with migraine, which is especially true for gene sets involved in protein modification and signal transduction. Observed differences between migraine with aura and migraine without aura indicate that both migraine types, at least in part, seem to have a different genetic background.

    Cephalalgia : an international journal of headache 2016;36;7;640-7

  • H3Africa multi-centre study of the prevalence and environmental and genetic determinants of type 2 diabetes in sub-Saharan Africa: study protocol.

    Ekoru K, Young EH, Adebamowo C, Balde N, Hennig BJ, Kaleebu P, Kapiga S, Levitt NS, Mayige M, Mbanya JC, McCarthy MI, Nyan O, Nyirenda M, Oli J, Ramaiya K, Smeeth L, Sobngwi E, Rotimi CN, Sandhu MS and Motala AA

    Department of Medicine, University of Cambridge, Cambridge, UK.

    The burden and aetiology of type 2 diabetes (T2D) and its microvascular complications may be influenced by varying behavioural and lifestyle environments as well as by genetic susceptibility. These aspects of the epidemiology of T2D have not been reliably clarified in sub-Saharan Africa (SSA), highlighting the need for context-specific epidemiological studies with the statistical resolution to inform potential preventative and therapeutic strategies. Therefore, as part of the Human Heredity and Health in Africa (H3Africa) initiative, we designed a multi-site study comprising case collections and population-based surveys at 11 sites in eight countries across SSA. The goal is to recruit up to 6000 T2D participants and 6000 control participants. We will collect questionnaire data, biophysical measurements and biological samples for chronic disease traits, risk factors and genetic data on all study participants. Through integrating epidemiological and genomic techniques, the study provides a framework for assessing the burden, spectrum and environmental and genetic risk factors for T2D and its complications across SSA. With established mechanisms for fieldwork, data and sample collection and management, data-sharing and consent for re-approaching participants, the study will be a resource for future research studies, including longitudinal studies, prospective case ascertainment of incident disease and interventional studies.

    Funded by: Medical Research Council: G0901756, MR/K013491/1; Wellcome Trust

    Global health, epidemiology and genomics 2016;1;e5

  • The role of hepatocyte nuclear factor 1β in disease and development.

    El-Khairi R and Vallier L

    Wellcome Trust-Medical Research Council Stem Cell Institute, Anne McLaren Laboratory, Department of Surgery, University of Cambridge, Cambridge, UK.

    Heterozygous mutations in the gene that encodes the transcription factor hepatocyte nuclear factor 1β (HNF1B) result in a multi-system disorder. HNF1B was initially discovered as a monogenic diabetes gene; however, renal cysts are the most frequently detected feature. Other clinical features include pancreatic hypoplasia and exocrine insufficiency, genital tract malformations, abnormal liver function, cholestasis and early-onset gout. Heterozygous mutations and complete gene deletions in HNF1B each account for approximately 50% of all cases of HNF1B-associated disease and may show autosomal dominant inheritance or arise spontaneously. There is no clear genotype-phenotype correlation, indicating that haploinsufficiency is the main disease mechanism. Data from animal models suggest that HNF1B is essential for several stages of pancreas and liver development. However, mice with heterozygous mutations in HNF1B show no phenotype, in contrast to the phenotype seen in humans. This suggests that mouse models do not fully replicate the features of human disease and complementary studies in human systems are necessary to determine the molecular mechanisms underlying HNF1B-associated disease. This review discusses the role of HNF1B in human and murine pancreas and liver development, summarizes the disease phenotypes and identifies areas for future investigations in HNF1B-associated diabetes and liver disease.

    Funded by: Medical Research Council: MC_PC_12009; Wellcome Trust

    Diabetes, obesity & metabolism 2016;18 Suppl 1;23-32

  • Analysis of five chronic inflammatory diseases identifies 27 new associations and highlights disease-specific patterns at shared loci.

    Ellinghaus D, Jostins L, Spain SL, Cortes A, Bethune J, Han B, Park YR, Raychaudhuri S, Pouget JG, Hübenthal M, Folseraas T, Wang Y, Esko T, Metspalu A, Westra HJ, Franke L, Pers TH, Weersma RK, Collij V, D'Amato M, Halfvarson J, Jensen AB, Lieb W, Degenhardt F, Forstner AJ, Hofmann A, International IBD Genetics Consortium (IIBDGC), International Genetics of Ankylosing Spondylitis Consortium (IGAS), International PSC Study Group (IPSCSG), Genetic Analysis of Psoriasis Consortium (GAPC), Psoriasis Association Genetics Extension (PAGE), Schreiber S, Mrowietz U, Juran BD, Lazaridis KN, Brunak S, Dale AM, Trembath RC, Weidinger S, Weichenthal M, Ellinghaus E, Elder JT, Barker JN, Andreassen OA, McGovern DP, Karlsen TH, Barrett JC, Parkes M, Brown MA and Franke A

    Institute of Clinical Molecular Biology, Christian Albrechts University of Kiel, Kiel, Germany.

    We simultaneously investigated the genetic landscape of ankylosing spondylitis, Crohn's disease, psoriasis, primary sclerosing cholangitis and ulcerative colitis to investigate pleiotropy and the relationship between these clinically related diseases. Using high-density genotype data from more than 86,000 individuals of European ancestry, we identified 244 independent multidisease signals, including 27 new genome-wide significant susceptibility loci and 3 unreported shared risk loci. Complex pleiotropy was supported when contrasting multidisease signals with expression data sets from human, rat and mouse together with epigenetic and expressed enhancer profiles. The comorbidities among the five immune diseases were best explained by biological pleiotropy rather than heterogeneity (a subgroup of cases genetically identical to those with another disease, possibly owing to diagnostic misclassification, molecular subtypes or excessive comorbidity). In particular, the strong comorbidity between primary sclerosing cholangitis and inflammatory bowel disease is likely the result of a unique disease, which is genetically distinct from classical inflammatory bowel disease phenotypes.

    Nature genetics 2016

  • Beegle: from literature mining to disease-gene discovery.

    ElShal S, Tranchevent LC, Sifrim A, Ardeshirdavani A, Davis J and Moreau Y

    Department of Electrical Engineering (ESAT) STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics Department, KU Leuven, Leuven 3001, Belgium; iMinds Future Health Department, KU Leuven, Leuven 3001, Belgium.

    Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at

    Nucleic acids research 2016;44;2;e18

  • Phenotypic Characterization of Genetically Lowered Human Lipoprotein(a) Levels.

    Emdin CA, Khera AV, Natarajan P, Klarin D, Won HH, Peloso GM, Stitziel NO, Nomura A, Zekavat SM, Bick AG, Gupta N, Asselta R, Duga S, Merlini PA, Correa A, Kessler T, Wilson JG, Bown MJ, Hall AS, Braund PS, Samani NJ, Schunkert H, Marrugat J, Elosua R, McPherson R, Farrall M, Watkins H, Willer C, Abecasis GR, Felix JF, Vasan RS, Lander E, Rader DJ, Danesh J, Ardissino D, Gabriel S, Saleheen D, Kathiresan S, CHARGE–Heart Failure Consortium and CARDIoGRAM Exome Consortium

    Center for Human Genetic Research, Cardiovascular Research Center and Cardiology Division, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts; Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts.

    Background: Genomic analyses have suggested that the LPA gene and its associated plasma biomarker, lipoprotein(a) (Lp[a]), represent a causal risk factor for coronary heart disease (CHD). As such, lowering Lp(a) levels has emerged as a therapeutic strategy. Beyond target identification, human genetics may contribute to the development of new therapies by defining the full spectrum of beneficial and adverse consequences and by developing a dose-response curve of target perturbation.

    Objectives: The goal of this study was to establish the full phenotypic impact of LPA gene variation and to estimate a dose-response curve between genetically altered plasma Lp(a) and risk for CHD.

    Methods: We leveraged genetic variants at the LPA gene from 3 data sources: individual-level data from 112,338 participants in the U.K. Biobank; summary association results from large-scale genome-wide association studies; and LPA gene sequencing results from case subjects with CHD and control subjects free of CHD.

    Results: One SD genetically lowered Lp(a) level was associated with a 29% lower risk of CHD (odds ratio [OR]: 0.71; 95% confidence interval [CI]: 0.69 to 0.73), a 31% lower risk of peripheral vascular disease (OR: 0.69; 95% CI: 0.59 to 0.80), a 13% lower risk of stroke (OR: 0.87; 95% CI: 0.79 to 0.96), a 17% lower risk of heart failure (OR: 0.83; 95% CI: 0.73 to 0.94), and a 37% lower risk of aortic stenosis (OR: 0.63; 95% CI: 0.47 to 0.83). We observed no association with 31 other disorders, including type 2 diabetes and cancer. Variants that led to gain of LPA gene function increased the risk for CHD, whereas those that led to loss of gene function reduced the CHD risk.

    Conclusions: Beyond CHD, genetically lowered Lp(a) levels are associated with a lower risk of peripheral vascular disease, stroke, heart failure, and aortic stenosis. As such, pharmacological lowering of plasma Lp(a) may influence a range of atherosclerosis-related diseases.

    Funded by: Medical Research Council: MC_QA137853, MR/L003120/1; NCATS NIH HHS: KL2 TR001100; NHGRI NIH HHS: U54 HG003067; NHLBI NIH HHS: K01 HL125751, K08 HL114642, R01 HL127564, R01 HL131961, RC2 HL102923, RC2 HL102924, RC2 HL102925, RC2 HL102926, RC2 HL103010, T32 HL007734

    Journal of the American College of Cardiology 2016;68;25;2761-2772

  • Generation and Characterisation of a Pax8-CreERT2 Transgenic Line and a Slc22a6-CreERT2 Knock-In Line for Inducible and Specific Genetic Manipulation of Renal Tubular Epithelial Cells.

    Espana-Agusti J, Zou X, Wong K, Fu B, Yang F, Tuveson DA, Adams DJ and Matakidou A

    Department of Oncology, University of Cambridge, CRUK Cambridge Institute, Cambridge, United Kingdom.

    Genetically relevant mouse models need to recapitulate the hallmarks of human disease by permitting spatiotemporal gene targeting. This is especially important for replicating the biology of complex diseases like cancer, where genetic events occur in a sporadic fashion within developed somatic tissues. Though a number of renal tubule targeting mouse lines have been developed, their utility for the study of renal disease is limited by lack of inducibility and specificity. In this study we describe the generation and characterisation of two novel mouse lines directing CreERT2 expression to renal tubular epithelia. The Pax8-CreERT2 transgenic line uses the mouse Pax8 promoter to direct expression of CreERT2 to all renal tubular compartments (proximal and distal tubules as well as collecting ducts) whilst the Slc22a6-CreERT2 knock-in line utilises the endogenous mouse Slc22a6 locus to specifically target the epithelium of proximal renal tubules. Both lines show high organ and tissue specificity with no extrarenal activity detected. To establish the utility of these lines for the study of renal cancer biology, Pax8-CreERT2 and Slc22a6-CreERT2 mice were crossed to conditional Vhl knockout mice to induce long-term renal tubule specific Vhl deletion. These models exhibited renal specific activation of the hypoxia inducible factor pathway (a VHL target). Our results establish Pax8-CreERT2 and Slc22a6-CreERT2 mice as valuable tools for the investigation and modelling of complex renal biology and disease.

    Funded by: Cancer Research UK: 13031, C37839/A12177

    PloS one 2016;11;2;e0148055

  • Genomic variations leading to alterations in cell morphology of Campylobacter spp.

    Esson D, Mather AE, Scanlan E, Gupta S, de Vries SP, Bailey D, Harris SR, McKinley TJ, Méric G, Berry SK, Mastroeni P, Sheppard SK, Christie G, Thomson NR, Parkhill J, Maskell DJ and Grant AJ

    Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge, UK.

    Campylobacter jejuni, the most common cause of bacterial diarrhoeal disease, is normally helical. However, it can also adopt straight rod, elongated helical and coccoid forms. Studying how helical morphology is generated, and how it switches between its different forms, is an important objective for understanding this pathogen. Here, we aimed to determine the genetic factors involved in generating the helical shape of Campylobacter. A C. jejuni transposon (Tn) mutant library was screened for non-helical mutants with inconsistent results. Whole genome sequence variation and morphological trends within this Tn library, and in various C. jejuni wild type strains, were compared and correlated to detect genomic elements associated with helical and rod morphologies. All rod-shaped C. jejuni Tn mutants and all rod-shaped laboratory, clinical and environmental C. jejuni and Campylobacter coli contained genetic changes within the pgp1 or pgp2 genes, which encode peptidoglycan modifying enzymes. We therefore confirm the importance of Pgp1 and Pgp2 in the maintenance of helical shape and extend this to a wide range of C. jejuni and C. coli isolates. Genome sequence analysis revealed variation in the sequence and length of homopolymeric tracts found within these genes, providing a potential mechanism of phase variation of cell shape.

    Scientific reports 2016;6;38303

  • The genome of the yellow potato cyst nematode, Globodera rostochiensis, reveals insights into the basis of parasitism and virulence.

    Eves-van den Akker S, Laetsch DR, Thorpe P, Lilley CJ, Danchin EG, Da Rocha M, Rancurel C, Holroyd NE, Cotton JA, Szitenberg A, Grenier E, Montarry J, Mimee B, Duceppe MO, Boyes I, Marvin JM, Jones LM, Yusup HB, Lafond-Lapalme J, Esquibet M, Sabeh M, Rott M, Overmars H, Finkers-Tomczak A, Smant G, Koutsovoulos G, Blok V, Mantelin S, Cock PJ, Phillips W, Henrissat B, Urwin PE, Blaxter M and Jones JT

    Division of Plant Sciences, College of Life Sciences, University of Dundee, Dundee, DD1 5EH, UK.

    Background: The yellow potato cyst nematode, Globodera rostochiensis, is a devastating plant pathogen of global economic importance. This biotrophic parasite secretes effectors from pharyngeal glands, some of which were acquired by horizontal gene transfer, to manipulate host processes and promote parasitism. G. rostochiensis is classified into pathotypes with different plant resistance-breaking phenotypes.

    Results: We generate a high quality genome assembly for G. rostochiensis pathotype Ro1, identify putative effectors and horizontal gene transfer events, map gene expression through the life cycle focusing on key parasitic transitions and sequence the genomes of eight populations including four additional pathotypes to identify variation. Horizontal gene transfer contributes 3.5 % of the predicted genes, of which approximately 8.5 % are deployed as effectors. Over one-third of all effector genes are clustered in 21 putative 'effector islands' in the genome. We identify a dorsal gland promoter element motif (termed DOG Box) present upstream in representatives from 26 out of 28 dorsal gland effector families, and predict a putative effector superset associated with this motif. We validate gland cell expression in two novel genes by in situ hybridisation and catalogue dorsal gland promoter element-containing effectors from available cyst nematode genomes. Comparison of effector diversity between pathotypes highlights correlation with plant resistance-breaking.

    Conclusions: These G. rostochiensis genome resources will facilitate major advances in understanding nematode plant-parasitism. Dorsal gland promoter element-containing effectors are at the front line of the evolutionary arms race between plant and parasite and the ability to predict gland cell expression a priori promises rapid advances in understanding their roles and mechanisms of action.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/F000642/1, BB/F00334X/1, BB/G007071/1; Wellcome Trust: 098051

    Genome biology 2016;17;1;124

  • DNA Methylation Dynamics of Human Hematopoietic Stem Cell Differentiation.

    Farlik M, Halbritter F, Müller F, Choudry FA, Ebert P, Klughammer J, Farrow S, Santoro A, Ciaurro V, Mathur A, Uppal R, Stunnenberg HG, Ouwehand WH, Laurenti E, Lengauer T, Frontini M and Bock C

    CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090 Vienna, Austria.

    Hematopoietic stem cells give rise to all blood cells in a differentiation process that involves widespread epigenome remodeling. Here we present genome-wide reference maps of the associated DNA methylation dynamics. We used a meta-epigenomic approach that combines DNA methylation profiles across many small pools of cells and performed single-cell methylome sequencing to assess cell-to-cell heterogeneity. The resulting dataset identified characteristic differences between HSCs derived from fetal liver, cord blood, bone marrow, and peripheral blood. We also observed lineage-specific DNA methylation between myeloid and lymphoid progenitors, characterized immature multi-lymphoid progenitors, and detected progressive DNA methylation differences in maturing megakaryocytes. We linked these patterns to gene expression, histone modifications, and chromatin accessibility, and we used machine learning to derive a model of human hematopoietic differentiation directly from DNA methylation data. Our results contribute to a better understanding of human hematopoietic stem cell differentiation and provide a framework for studying blood-linked diseases.

    Funded by: British Heart Foundation: RG/09/012/28096; Department of Health: RP-PG-0310-1002; Medical Research Council: MC_PC_12009, MR/K024043/1; Wellcome Trust: 107630/Z/15/Z

    Cell stem cell 2016;19;6;808-822

  • The mountainous Cretan dietary patterns and their relationship with cardiovascular risk factors: the Hellenic Isolated Cohorts MANOLIS study.

    Farmaki AE, Rayner NW, Matchan A, Spiliopoulou P, Gilly A, Kariakli V, Kiagiadaki C, Tsafantakis E, Zeggini E and Dedoussis G

    Department of Nutrition and Dietetics, School of Health Science and Education, Harokopio University, 70 El Venizelou Avenue, 17671 Athens, Greece.

    Objective: We carried out de novo recruitment of a population-based cohort (MANOLIS study) and describe the specific population, which displays interesting characteristics in terms of diet and health in old age, through deep phenotyping.

    Design: Cross-sectional study where anthropometric, biochemical and clinical measurements were taken in addition to interview-based completion of an extensive questionnaire on health and lifestyle parameters. Dietary patterns were derived through principal component analysis based on a validated FFQ.

    Setting: Geographically isolated Mylopotamos villages on Mount Idi, Crete, Greece.

    Subjects: Adults (n 1553).

    Results: Mean age of the participants was 61·6 years and 55·8 % were women. Of the population, 82·7 % were overweight or obese with a significantly different prevalence between overweight men and women (43·4 v. 34·7 %, P=0·002). The majority (70·6 %) of participants were married, while a larger proportion of women were widowed than men (27·8 v. 3·5 %, P<0·001). Smoking was more prevalent in men (38·7 v. 8·2 %, P<0·001), as 88·8 % of women had never smoked. Four dietary patterns emerged as characteristic of the population; these were termed 'local', 'high fat and sugar', 'Greek café/tavern' and 'olive oil, fruits and vegetables'. Individuals more adherent to the local dietary pattern presented higher blood glucose (β=4·026, P<0·001). Similarly, individuals with higher compliance with the Greek café/tavern pattern had higher waist-to-hip ratio (β=0·012, P<0·001), blood pressure (β=1·015, P=0·005) and cholesterol (β=5·398, P<0·001).

    Conclusions: Profiling of the MANOLIS elderly population identifies unique unhealthy dietary patterns that are associated with cardiometabolic indices.

    Public health nutrition 2016;20;6;1063-1074

  • Complete Whole-Genome Sequence of Salmonella enterica subsp. enterica Serovar Java NCTC5706.

    Fazal MA, Alexander S, Burnett E, Deheer-Graham A, Oliver K, Holroyd N, Parkhill J and Russell JE

    Culture Collections, Public Health England, London, United Kingdom.

    Salmonellae are a significant cause of morbidity and mortality globally. Here, we report the first complete genome sequence for Salmonella enterica subsp. enterica serovar Java strain NCTC5706. This strain is of historical significance, having been isolated in the pre-antibiotic era and deposited into the National Collection of Type Cultures in 1939.

    Genome announcements 2016;4;6

  • Distinct Salmonella Enteritidis lineages associated with enterocolitis in high-income settings and invasive disease in low-income settings.

    Feasey NA, Hadfield J, Keddy KH, Dallman TJ, Jacobs J, Deng X, Wigley P, Barquist L, Langridge GC, Feltwell T, Harris SR, Mather AE, Fookes M, Aslett M, Msefula C, Kariuki S, Maclennan CA, Onsare RS, Weill FX, Le Hello S, Smith AM, McClelland M, Desai P, Parry CM, Cheesbrough J, French N, Campos J, Chabalgoity JA, Betancor L, Hopkins KL, Nair S, Humphrey TJ, Lunguya O, Cogan TA, Tapia MD, Sow SO, Tennant SM, Bornstein K, Levine MM, Lacharme-Lora L, Everett DB, Kingsley RA, Parkhill J, Heyderman RS, Dougan G, Gordon MA and Thomson NR

    Liverpool School of Tropical Medicine, Liverpool, UK.

    An epidemiological paradox surrounds Salmonella enterica serovar Enteritidis. In high-income settings, it has been responsible for an epidemic of poultry-associated, self-limiting enterocolitis, whereas in sub-Saharan Africa it is a major cause of invasive nontyphoidal Salmonella disease, associated with high case fatality. By whole-genome sequence analysis of 675 isolates of S. Enteritidis from 45 countries, we show the existence of a global epidemic clade and two new clades of S. Enteritidis that are geographically restricted to distinct regions of Africa. The African isolates display genomic degradation, a novel prophage repertoire, and an expanded multidrug resistance plasmid. S. Enteritidis is a further example of a Salmonella serotype that displays niche plasticity, with distinct clades that enable it to become a prominent cause of gastroenteritis in association with the industrial production of eggs and of multidrug-resistant, bloodstream-invasive infection in Africa.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/M014088/1; NIAID NIH HHS: R01 AI099525; Wellcome Trust: 092152, 098051, 100891, 101113/Z/13/Z

    Nature genetics 2016;48;10;1211-1217

  • Genome-wide association analysis identifies three new susceptibility loci for childhood body mass index.

    Felix JF, Bradfield JP, Monnereau C, van der Valk RJ, Stergiakouli E, Chesi A, Gaillard R, Feenstra B, Thiering E, Kreiner-Møller E, Mahajan A, Pitkänen N, Joro R, Cavadino A, Huikari V, Franks S, Groen-Blokhuis MM, Cousminer DL, Marsh JA, Lehtimäki T, Curtin JA, Vioque J, Ahluwalia TS, Myhre R, Price TS, Vilor-Tejedor N, Yengo L, Grarup N, Ntalla I, Ang W, Atalay M, Bisgaard H, Blakemore AI, Bonnefond A, Carstensen L, Bone Mineral Density in Childhood Study (BMDCS), Early Genetics and Lifecourse Epidemiology (EAGLE) consortium, Eriksson J, Flexeder C, Franke L, Geller F, Geserick M, Hartikainen AL, Haworth CM, Hirschhorn JN, Hofman A, Holm JC, Horikoshi M, Hottenga JJ, Huang J, Kadarmideen HN, Kähönen M, Kiess W, Lakka HM, Lakka TA, Lewin AM, Liang L, Lyytikäinen LP, Ma B, Magnus P, McCormack SE, McMahon G, Mentch FD, Middeldorp CM, Murray CS, Pahkala K, Pers TH, Pfäffle R, Postma DS, Power C, Simpson A, Sengpiel V, Tiesler CM, Torrent M, Uitterlinden AG, van Meurs JB, Vinding R, Waage J, Wardle J, Zeggini E, Zemel BS, Dedoussis GV, Pedersen O, Froguel P, Sunyer J, Plomin R, Jacobsson B, Hansen T, Gonzalez JR, Custovic A, Raitakari OT, Pennell CE, Widén E, Boomsma DI, Koppelman GH, Sebert S, Järvelin MR, Hyppönen E, McCarthy MI, Lindi V, Harri N, Körner A, Bønnelykke K, Heinrich J, Melbye M, Rivadeneira F, Hakonarson H, Ring SM, Smith GD, Sørensen TI, Timpson NJ, Grant SF, Jaddoe VW, Early Growth Genetics (EGG) Consortium and Bone Mineral Density in Childhood Study BMDCS

    The Generation R Study Group, Department of Pediatrics, Department of Epidemiology,

    A large number of genetic loci are associated with adult body mass index. However, the genetics of childhood body mass index are largely unknown. We performed a meta-analysis of genome-wide association studies of childhood body mass index, using sex- and age-adjusted standard deviation scores. We included 35 668 children from 20 studies in the discovery phase and 11 873 children from 13 studies in the replication phase. In total, 15 loci reached genome-wide significance (P-value < 5 × 10⁻⁸) in the joint discovery and replication analysis, of which 12 are previously identified loci in or close to ADCY3, GNPDA2, TMEM18, SEC16B, FAIM2, FTO, TFAP2B, TNNI3K, MC4R, GPR61, LMX1B and OLFM4 associated with adult body mass index or childhood obesity. We identified three novel loci: rs13253111 near ELP3, rs8092503 near RAB27B and rs13387838 near ADAM23. Per additional risk allele, body mass index increased 0.04 Standard Deviation Score (SDS) [Standard Error (SE) 0.007], 0.05 SDS (SE 0.008) and 0.14 SDS (SE 0.025), for rs13253111, rs8092503 and rs13387838, respectively. A genetic risk score combining all 15 SNPs showed that each additional average risk allele was associated with a 0.073 SDS (SE 0.011, P-value = 3.12 × 10⁻¹⁰) increase in childhood body mass index in a population of 1955 children. This risk score explained 2% of the variance in childhood body mass index. This study highlights the shared genetic background between childhood and adult body mass index and adds three novel loci. These loci likely represent age-related differences in strength of the associations with body mass index.

    Funded by: Wellcome Trust: 098381

    Human molecular genetics 2016;25;2;389-403

  • A whole-genome sequence and transcriptome perspective on HER2-positive breast cancers.

    Ferrari A, Vincent-Salomon A, Pivot X, Sertier AS, Thomas E, Tonon L, Boyault S, Mulugeta E, Treilleux I, MacGrogan G, Arnould L, Kielbassa J, Le Texier V, Blanché H, Deleuze JF, Jacquemier J, Mathieu MC, Penault-Llorca F, Bibeau F, Mariani O, Mannina C, Pierga JY, Trédan O, Bachelot T, Bonnefoi H, Romieu G, Fumoleau P, Delaloge S, Rios M, Ferrero JM, Tarpin C, Bouteille C, Calvo F, Gut IG, Gut M, Martin S, Nik-Zainal S, Stratton MR, Pauporté I, Saintigny P, Birnbaum D, Viari A and Thomas G

    Synergie Lyon Cancer, Plateforme de bioinformatique 'Gilles Thomas' Centre Léon Bérard, 28 rue Laënnec, 69008 Lyon, France.

    HER2-positive breast cancer has long proven to be a clinically distinct class of breast cancers for which several targeted therapies are now available. However, resistance to the treatment associated with specific gene expressions or mutations has been observed, revealing the underlying diversity of these cancers. Therefore, understanding the full extent of the HER2-positive disease heterogeneity still remains challenging. Here we carry out an in-depth genomic characterization of 64 HER2-positive breast tumour genomes that exhibit four subgroups, based on the expression data, with distinctive genomic features in terms of somatic mutations, copy-number changes or structural variations. The results suggest that, despite being clinically defined by a specific gene amplification, HER2-positive tumours melt into the whole luminal-basal breast cancer spectrum rather than standing apart. The results also lead to a refined ERBB2 amplicon of 106 kb and show that several cases of amplifications are compatible with a breakage-fusion-bridge mechanism.

    Nature communications 2016;7;12222

  • Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing.

    Ferreira PG, Oti M, Barann M, Wieland T, Ezquina S, Friedländer MR, Rivas MA, Esteve-Codina A, GEUVADIS Consortium, Rosenstiel P, Strom TM, Lappalainen T, Guigó R and Sammeth M

    Bioinformatics and Genomics, Center for Genomic Regulation (CRG), 08003 Barcelona, Catalonia, Spain.

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing (alternative splice sites, introns, and cleavage sites), which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts.

    Scientific reports 2016;6;32406

  • The Pfam protein families database: towards a more sustainable future.

    Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, Potter SC, Punta M, Qureshi M, Sangrador-Vegas A, Salazar GA, Tate J and Bateman A

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK

    In the last two years the Pfam database ( has undergone a substantial reorganisation to reduce the effort involved in making a release, thereby permitting more frequent releases. Arguably the most significant of these changes is that Pfam is now primarily based on the UniProtKB reference proteomes, with the counts of matched sequences and species reported on the website restricted to this smaller set. Building families on reference proteomes sequences brings greater stability, which decreases the amount of manual curation required to maintain them. It also reduces the number of sequences displayed on the website, whilst still providing access to many important model organisms. Matches to the full UniProtKB database are, however, still available and Pfam annotations for individual UniProtKB sequences can still be retrieved. Some Pfam entries (1.6%) which have no matches to reference proteomes remain; we are working with UniProt to see if sequences from them can be incorporated into reference proteomes. Pfam-B, the automatically-generated supplement to Pfam, has been removed. The current release (Pfam 29.0) includes 16 295 entries and 559 clans. The facility to view the relationship between families within a clan has been improved by the introduction of a new tool.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/L024136/1; Howard Hughes Medical Institute; Wellcome Trust: 108433/Z/15/Z

    Nucleic acids research 2016;44;D1;D279-85

  • The diversity of Klebsiella pneumoniae surface polysaccharides.

    Follador R, Heinz E, Wyres KL, Ellington MJ, Kowarik M, Holt KE and Thomson NR

    LimmaTech Biologics AG, Schlieren, Switzerland.

    Klebsiella pneumoniae is considered an urgent health concern due to the emergence of multi-drug-resistant strains for which vaccination offers a potential remedy. Vaccines based on surface polysaccharides are highly promising but need to address the high diversity of surface-exposed polysaccharides, synthesized as O-antigens (lipopolysaccharide, LPS) and K-antigens (capsule polysaccharide, CPS), present in K. pneumoniae. We present a comprehensive and clinically relevant study of the diversity of O- and K-antigen biosynthesis gene clusters across a global collection of over 500 K. pneumoniae whole-genome sequences and the seroepidemiology of human isolates from different infection types. Our study defines the genetic diversity of O- and K-antigen biosynthesis cluster sequences across this collection, identifying sequences for known serotypes as well as identifying novel LPS and CPS gene clusters found in circulating contemporary isolates. Serotypes O1, O2 and O3 were most prevalent in our sample set, accounting for approximately 80% of all infections. In contrast, K serotypes showed an order of magnitude higher diversity and differ among infection types. In addition we investigated a potential association of O or K serotypes with phylogenetic lineage, infection type and the presence of known virulence genes. K1 and K2 serotypes, which are associated with hypervirulent K. pneumoniae, were associated with a higher abundance of virulence genes and more diverse O serotypes compared to other common K serotypes.

    Microbial genomics 2016;2;8;e000073

  • COSMIC: High-Resolution Cancer Genetics Using the Catalogue of Somatic Mutations in Cancer.

    Forbes SA, Beare D, Bindal N, Bamford S, Ward S, Cole CG, Jia M, Kok C, Boutselakis H, De T, Sondka Z, Ponting L, Stefancsik R, Harsha B, Tate J, Dawson E, Thompson S, Jubb H and Campbell PJ

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom.

    COSMIC ( is an expert-curated database of somatic mutations in human cancer. Broad and comprehensive in scope, recent releases in 2016 describe over 4 million coding mutations across all human cancer disease types. Mutations are annotated across the entire genome, but expert curation is focused on over 400 key cancer genes. Now encompassing the majority of molecular mutation mechanisms in oncogenetics, COSMIC additionally describes 10 million non-coding mutations, 1 million copy-number aberrations, 9 million gene-expression variants, and almost 8 million differentially methylated CpGs. This information combines a consistent interpretation of the data from the major cancer genome consortia and cancer genome literature with exhaustive hand curation of over 22,000 gene-specific literature publications. This unit describes the graphical Web site in detail; alternative protocols overview other ways the entire database can be accessed, analyzed, and downloaded. © 2016 by John Wiley & Sons, Inc.

    Current protocols in human genetics 2016;91;10.11.1-10.11.37

  • COSMIC: somatic cancer genetics at high-resolution.

    Forbes SA, Beare D, Boutselakis H, Bamford S, Bindal N, Tate J, Cole CG, Ward S, Dawson E, Ponting L, Stefancsik R, Harsha B, Kok CY, Jia M, Jubb H, Sondka Z, Thompson S, De T and Campbell PJ

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK

    COSMIC, the Catalogue of Somatic Mutations in Cancer ( is a high-resolution resource for exploring targets and trends in the genetics of human cancer. Currently the broadest database of mutations in cancer, the information in COSMIC is curated by expert scientists, primarily by scrutinizing large numbers of scientific publications. Over 4 million coding mutations are described in v78 (September 2016), combining genome-wide sequencing results from 28 366 tumours with complete manual curation of 23 489 individual publications focused on 186 key genes and 286 key fusion pairs across all cancers. Molecular profiling of large tumour numbers has also allowed the annotation of more than 13 million non-coding mutations, 18 029 gene fusions, 187 429 genome rearrangements, 1 271 436 abnormal copy number segments, 9 175 462 abnormal expression variants and 7 879 142 differentially methylated CpG dinucleotides. COSMIC now details the genetics of drug resistance, novel somatic gene mutations which allow a tumour to evade therapeutic cancer drugs. Focusing initially on highly characterized drugs and genes, COSMIC v78 contains wide resistance mutation profiles across 20 drugs, detailing the recurrence of 301 unique resistance alleles across 1934 drug-resistant tumours. All information from the COSMIC database is available freely on the COSMIC website.

    Funded by: Wellcome Trust: 077012/Z/05/Z

    Nucleic acids research 2016;45;D1;D777-D783

  • Resistance of Transmitted Founder HIV-1 to IFITM-Mediated Restriction.

    Foster TL, Wilson H, Iyer SS, Coss K, Doores K, Smith S, Kellam P, Finzi A, Borrow P, Hahn BH and Neil SJD

    Department of Infectious Diseases, King's College London Faculty of Life Sciences and Medicine, Guy's Hospital, London SE1 9RT, UK.

    Interferon-induced transmembrane proteins (IFITMs) restrict the entry of diverse enveloped viruses through incompletely understood mechanisms. While IFITMs are reported to inhibit HIV-1, their in vivo relevance is unclear. We show that IFITM sensitivity of HIV-1 strains is determined by the co-receptor usage of the viral envelope glycoproteins as well as IFITM subcellular localization within the target cell. Importantly, we find that transmitted founder HIV-1, which establishes de novo infections, is uniquely resistant to the antiviral activity of IFITMs. However, viral sensitivity to IFITMs, particularly IFITM2 and IFITM3, increases over the first 6 months of infection, primarily as a result of neutralizing antibody escape mutations. Additionally, the ability to evade IFITM restriction contributes to the different interferon sensitivities of transmitted founder and chronic viruses. Together, these data indicate that IFITMs constitute an important barrier to HIV-1 transmission and that escape from adaptive immune responses exposes the virus to antiviral restriction.

    Funded by: NIAID NIH HHS: R01 AI111789, R01 AI114266, T32 AI007632, UM1 AI100645

    Cell host & microbe 2016;20;4;429-442

  • Variant Exported Blood-Stage Proteins Encoded by Plasmodium Multigene Families Are Expressed in Liver Stages Where They Are Exported into the Parasitophorous Vacuole.

    Fougère A, Jackson AP, Bechtsi DP, Braks JA, Annoura T, Fonager J, Spaccapelo R, Ramesar J, Chevalley-Maurel S, Klop O, van der Laan AM, Tanke HJ, Kocken CH, Pasini EM, Khan SM, Böhme U, van Ooij C, Otto TD, Janse CJ and Franke-Fayard B

    Leiden Malaria Research Group, Parasitology, Center of infectious Diseases, Leiden University Medical Center (LUMC), Leiden, The Netherlands.

    Many variant proteins encoded by Plasmodium-specific multigene families are exported into red blood cells (RBC). P. falciparum-specific variant proteins encoded by the var, stevor and rifin multigene families are exported onto the surface of infected red blood cells (iRBC) and mediate interactions between iRBC and host cells resulting in tissue sequestration and rosetting. However, the precise function of most other Plasmodium multigene families encoding exported proteins is unknown. To understand the role of RBC-exported proteins of rodent malaria parasites (RMP) we analysed the expression and cellular location by fluorescent-tagging of members of the pir, fam-a and fam-b multigene families. Furthermore, we performed phylogenetic analyses of the fam-a and fam-b multigene families, which indicate that both families have a history of functional differentiation unique to RMP. We demonstrate for all three families that expression of family members in iRBC is not mutually exclusive. Most tagged proteins were transported into the iRBC cytoplasm but not onto the iRBC plasma membrane, indicating that they are unlikely to play a direct role in iRBC-host cell interactions. Unexpectedly, most family members are also expressed during the liver stage, where they are transported into the parasitophorous vacuole. This suggests that these protein families promote parasite development in both the liver and blood, either by supporting parasite development within hepatocytes and erythrocytes and/or by manipulating the host immune response. Indeed, in the case of Fam-A, which have a steroidogenic acute regulatory-related lipid transfer (START) domain, we found that several family members can transfer phosphatidylcholine in vitro. These observations indicate that these proteins may transport (host) phosphatidylcholine for membrane synthesis. This is the first demonstration of a biological function of any exported variant protein family of rodent malaria parasites.

    Funded by: Wellcome Trust: 095836/Z/1/Z

    PLoS pathogens 2016;12;11;e1005917

  • An Antibody Screen of a Plasmodium vivax Antigen Library Identifies Novel Merozoite Proteins Associated with Clinical Protection.

    França CT, Hostetler JB, Sharma S, White MT, Lin E, Kiniboro B, Waltmann A, Darcy AW, Li Wai Suen CS, Siba P, King CL, Rayner JC, Fairhurst RM and Mueller I

    Population Health and Immunity Division, Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia.

    Background: Elimination of Plasmodium vivax malaria would be greatly facilitated by the development of an effective vaccine. A comprehensive and systematic characterization of antibodies to P. vivax antigens in exposed populations is useful in guiding rational vaccine design.

    Methodology/principal findings: In this study, we investigated antibodies to a large library of P. vivax entire ectodomain merozoite proteins in 2 Asia-Pacific populations, analysing the relationship of antibody levels with markers of current and cumulative malaria exposure, and socioeconomic and clinical indicators. 29 antigenic targets of natural immunity were identified. Of these, 12 highly-immunogenic proteins were strongly associated with age and thus cumulative lifetime exposure in Solomon Islanders (P<0.001-0.027). A subset of 6 proteins, selected on the basis of immunogenicity and expression levels, were used to examine antibody levels in plasma samples from a population of young Papua New Guinean children with well-characterized individual differences in exposure. This analysis identified a strong association between reduced risk of clinical disease and antibody levels to P12, P41, and a novel hypothetical protein that has not previously been studied, PVX_081550 (IRR 0.46-0.74; P<0.001-0.041).

    Conclusion/significance: These data emphasize the benefits of an unbiased screening approach in identifying novel vaccine candidate antigens. Functional studies are now required to establish whether PVX_081550 is a key component of the naturally-acquired protective immune response, a biomarker of immune status, or both.

    Funded by: Medical Research Council: MR/J002283/1, MR/L012170/1; NIAID NIH HHS: U19 AI089686

    PLoS neglected tropical diseases 2016;10;5;e0004639

  • A single dividing cell population with imbalanced fate drives oesophageal tumour growth.

    Frede J, Greulich P, Nagy T, Simons BD and Jones PH

    Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    Understanding the cellular mechanisms of tumour growth is key for designing rational anticancer treatment. Here we used genetic lineage tracing to quantify cell behaviour during neoplastic transformation in a model of oesophageal carcinogenesis. We found that cell behaviour was convergent across premalignant tumours, which contained a single proliferating cell population. The rate of cell division was not significantly different in the lesions and the surrounding epithelium. However, dividing tumour cells had a uniform, small bias in cell fate so that, on average, slightly more dividing than non-dividing daughter cells were generated at each round of cell division. In invasive cancers induced by Kras(G12D) expression, dividing cell fate became more strongly biased towards producing dividing over non-dividing cells in a subset of clones. These observations argue that agents that restore the balance of cell fate may prove effective in checking tumour growth, whereas those targeting cycling cells may show little selectivity.

    Funded by: Cancer Research UK: C609/A17257; Medical Research Council: MC_PC_12009; Wellcome Trust: 098357/Z/12/Z

    Nature cell biology 2016;18;9;967-78

  • Rapid phenotyping of knockout mice to identify genetic determinants of bone strength.

    Freudenthal B, Logan J, Sanger Institute Mouse Pipelines, Croucher PI, Williams GR and Bassett JH

    Molecular Endocrinology Laboratory, Department of Medicine, Imperial College London, London, UK.

    The genetic determinants of osteoporosis remain poorly understood, and there is a large unmet need for new treatments in our ageing society. Thus, new approaches for gene discovery in skeletal disease are required to complement the current genome-wide association studies in human populations. The International Knockout Mouse Consortium (IKMC) and the International Mouse Phenotyping Consortium (IMPC) provide such an opportunity. The IKMC generates knockout mice representing each of the known protein-coding genes in C57BL/6 mice and, as part of the IMPC initiative, the Origins of Bone and Cartilage Disease project identifies mutants with significant outlier skeletal phenotypes. This initiative will add value to data from large human cohorts and provide a new understanding of bone and cartilage pathophysiology, ultimately leading to the identification of novel drug targets for the treatment of skeletal disease.

    Funded by: Wellcome Trust: 094134, 101123

    The Journal of endocrinology 2016;231;1;R31-46

  • HUWE1 mutations in Juberg-Marsidi and Brooks syndromes: the results of an X-chromosome exome sequencing study.

    Friez MJ, Brooks SS, Stevenson RE, Field M, Basehore MJ, Adès LC, Sebold C, McGee S, Saxon S, Skinner C, Craig ME, Murray L, Simensen RJ, Yap YY, Shaw MA, Gardner A, Corbett M, Kumar R, Bosshard M, van Loon B, Tarpey PS, Abidi F, Gecz J and Schwartz CE

    Greenwood Genetic Center, Greenwood, South Carolina, USA.

    Background: X linked intellectual disability (XLID) syndromes account for a substantial number of males with ID. Much progress has been made in identifying the genetic cause in many of the syndromes described 20-40 years ago. Next generation sequencing (NGS) has contributed to the rapid discovery of XLID genes and identifying novel mutations in known XLID genes for many of these syndromes.

    Methods: Two NGS approaches were employed to identify mutations in X linked genes in families with XLID disorders. The first involved exome sequencing of genes on the X chromosome using the Agilent SureSelect Human X Chromosome Kit. The second approach was to conduct targeted NGS sequencing of 90 known XLID genes.

    Results: We identified the same mutation, a c.12928 G>C transversion in the HUWE1 gene, which gives rise to a p.G4310R missense mutation in 2 XLID disorders: Juberg-Marsidi syndrome (JMS) and Brooks syndrome. Although the original families with these disorders were considered separate entities, they indeed overlap clinically. A third family was also found to have a novel HUWE1 mutation.

    Conclusions: As we identified a HUWE1 mutation in an affected male from the original family reported by Juberg and Marsidi, it is evident the syndrome does not result from a mutation in ATRX as reported in the literature. Additionally, our data indicate that JMS and Brooks syndromes are allelic having the same HUWE1 mutation.

    Funded by: NICHD NIH HHS: 2R01HD026202, R01 HD026202; NINDS NIH HHS: 1R01NS73854, R01 NS073854

    BMJ open 2016;6;4;e009537

  • Tyrosine kinase 2 is not limiting human antiviral type III interferon responses.

    Fuchs S, Kaiser-Labusch P, Bank J, Ammann S, Kolb-Kokocinski A, Edelbusch C, Omran H and Ehl S

    Center for Chronic Immunodeficiency, Faculty of Medicine, Medical Center-University of Freiburg, Freiburg, Germany.

    Tyrosine kinase 2 (TYK2) associates with interferon (IFN) alpha receptor, IL-10 receptor (IL-10R) beta and other cytokine receptor subunits for signal transduction, in response to various cytokines, including type-I and type-III IFNs, IL-6, IL-10, IL-12 and IL-23. Data on TYK2 dependence on cytokine responses and in vivo consequences of TYK2 deficiency are inconsistent. We investigated a TYK2 deficient patient, presenting with eczema, skin abscesses, respiratory infections and IgE levels >1000 U/mL, without viral or mycobacterial infections and a corresponding cellular model to analyze the role of TYK2 in type-III IFN mediated responses and NK-cell function. We established a novel simple diagnostic monocyte assay to show that the mutation completely abolishes the IFN-α mediated antiviral response. It also partly reduces IL-10 but not IL-6 mediated signaling associated with reduced IL-10Rβ expression. However, we found almost normal type-III IFN signaling associated with minimal impairment of virus control in a TYK2 deficient human cell line. Contrary to observations in TYK2 deficient mice, NK-cell phenotype and function, including IL-12/IL-18 mediated responses, were normal in the patient. Thus, preserved type-III IFN responses and normal NK-cell function may contribute to antiviral protection in TYK2 deficiency leading to a surprisingly mild human phenotype.

    European journal of immunology 2016;46;11;2639-2649

  • The genetic architecture of type 2 diabetes.

    Fuchsberger C, Flannick J, Teslovich TM, Mahajan A, Agarwala V, Gaulton KJ, Ma C, Fontanillas P, Moutsianas L, McCarthy DJ, Rivas MA, Perry JRB, Sim X, Blackwell TW, Robertson NR, Rayner NW, Cingolani P, Locke AE, Tajes JF, Highland HM, Dupuis J, Chines PS, Lindgren CM, Hartl C, Jackson AU, Chen H, Huyghe JR, van de Bunt M, Pearson RD, Kumar A, Müller-Nurasyid M, Grarup N, Stringham HM, Gamazon ER, Lee J, Chen Y, Scott RA, Below JE, Chen P, Huang J, Go MJ, Stitzel ML, Pasko D, Parker SCJ, Varga TV, Green T, Beer NL, Day-Williams AG, Ferreira T, Fingerlin T, Horikoshi M, Hu C, Huh I, Ikram MK, Kim BJ, Kim Y, Kim YJ, Kwon MS, Lee J, Lee S, Lin KH, Maxwell TJ, Nagai Y, Wang X, Welch RP, Yoon J, Zhang W, Barzilai N, Voight BF, Han BG, Jenkinson CP, Kuulasmaa T, Kuusisto J, Manning A, Ng MCY, Palmer ND, Balkau B, Stančáková A, Abboud HE, Boeing H, Giedraitis V, Prabhakaran D, Gottesman O, Scott J, Carey J, Kwan P, Grant G, Smith JD, Neale BM, Purcell S, Butterworth AS, Howson JMM, Lee HM, Lu Y, Kwak SH, Zhao W, Danesh J, Lam VKL, Park KS, Saleheen D, So WY, Tam CHT, Afzal U, Aguilar D, Arya R, Aung T, Chan E, Navarro C, Cheng CY, Palli D, Correa A, Curran JE, Rybin D, Farook VS, Fowler SP, Freedman BI, Griswold M, Hale DE, Hicks PJ, Khor CC, Kumar S, Lehne B, Thuillier D, Lim WY, Liu J, van der Schouw YT, Loh M, Musani SK, Puppala S, Scott WR, Yengo L, Tan ST, Taylor HA, Thameem F, Wilson G, Wong TY, Njølstad PR, Levy JC, Mangino M, Bonnycastle LL, Schwarzmayr T, Fadista J, Surdulescu GL, Herder C, Groves CJ, Wieland T, Bork-Jensen J, Brandslund I, Christensen C, Koistinen HA, Doney ASF, Kinnunen L, Esko T, Farmer AJ, Hakaste L, Hodgkiss D, Kravic J, Lyssenko V, Hollensted M, Jørgensen ME, Jørgensen T, Ladenvall C, Justesen JM, Käräjämäki A, Kriebel J, Rathmann W, Lannfelt L, Lauritzen T, Narisu N, Linneberg A, Melander O, Milani L, Neville M, Orho-Melander M, Qi L, Qi Q, Roden M, Rolandsson O, Swift A, Rosengren AH, Stirrups K, Wood AR, Mihailov E, Blancher C, 
Carneiro MO, Maguire J, Poplin R, Shakir K, Fennell T, DePristo M, de Angelis MH, Deloukas P, Gjesing AP, Jun G, Nilsson P, Murphy J, Onofrio R, Thorand B, Hansen T, Meisinger C, Hu FB, Isomaa B, Karpe F, Liang L, Peters A, Huth C, O'Rahilly SP, Palmer CNA, Pedersen O, Rauramaa R, Tuomilehto J, Salomaa V, Watanabe RM, Syvänen AC, Bergman RN, Bharadwaj D, Bottinger EP, Cho YS, Chandak GR, Chan JCN, Chia KS, Daly MJ, Ebrahim SB, Langenberg C, Elliott P, Jablonski KA, Lehman DM, Jia W, Ma RCW, Pollin TI, Sandhu M, Tandon N, Froguel P, Barroso I, Teo YY, Zeggini E, Loos RJF, Small KS, Ried JS, DeFronzo RA, Grallert H, Glaser B, Metspalu A, Wareham NJ, Walker M, Banks E, Gieger C, Ingelsson E, Im HK, Illig T, Franks PW, Buck G, Trakalo J, Buck D, Prokopenko I, Mägi R, Lind L, Farjoun Y, Owen KR, Gloyn AL, Strauch K, Tuomi T, Kooner JS, Lee JY, Park T, Donnelly P, Morris AD, Hattersley AT, Bowden DW, Collins FS, Atzmon G, Chambers JC, Spector TD, Laakso M, Strom TM, Bell GI, Blangero J, Duggirala R, Tai ES, McVean G, Hanis CL, Wilson JG, Seielstad M, Frayling TM, Meigs JB, Cox NJ, Sladek R, Lander ES, Gabriel S, Burtt NP, Mohlke KL, Meitinger T, Groop L, Abecasis G, Florez JC, Scott LJ, Morris AP, Kang HM, Boehnke M, Altshuler D and McCarthy MI

    Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, USA.

    The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.

    Funded by: British Heart Foundation: RG/14/5/30893, SP/04/002, SP/09/002; CIHR; Medical Research Council: G0601261, G0601966, G0700931, G0800270, G0900747-91070, MC_UU_12012/5, MC_UU_12015/1, MR/K002414/1, MR/L01341X/1; NCI NIH HHS: K12CA139160; NHGRI NIH HHS: R01 HG000376, R56 HG000376, U01 HG005773, U01HG005773, U54 HG003067, U54HG003067; NHLBI NIH HHS: HHSN268201300046C, HHSN268201300047C, HHSN268201300048C, HHSN268201300049C, HHSN268201300050C, R01 HL102830, R01HL102830, T32 HL007055; NIA NIH HHS: 1R01AG042188, P01 AG027734, P01AG027734, P30 AG038072, P30AG038072, R01 AG042188, R01 AG046949, R01AG046949; NIDDK NIH HHS: 1RC2DK088389, DK072193, DK085501, DK085524, DK085526, DK085545, DK085584, DK088389, DK093757, DK098032, K24 DK080140, K24 DK110550, K24DK080140, P30 DK020572, P30 DK020595, P30DK020595, P60 DK020595, P60DK20595, R00 DK092251, R00 DK099240, R00DK092251, R01 DK066358, R01 DK072193, R01 DK073541, R01 DK093757, R01 DK098032, R01 DK101478, R01 DK106236, R01DK062370, R01DK066358, R01DK073541, R01DK098032, RC2 DK088389, RC2-DK088389, RC2DK088389, U01 DK062370, U01 DK078616, U01 DK085501, U01 DK085524, U01 DK085526, U01 DK085545, U01 DK085584, U01DK085501, U01DK085526; NIGMS NIH HHS: T32 GM007753, T32GM007753; NIH HHS: S10 OD018522; NIMH NIH HHS: R01 MH090937, R01 MH101820, R01MH090937, R01MH101820; NIMHD NIH HHS: U54 MD007588; Wellcome Trust: 064890, 083948, 084723, 085475, 086596, 090367, 090532, 092447, 095101, 095552, 098017, 098051, 098381, 100956

    Nature 2016;536;7614;41-47

  • Lymphoid-Tissue-Resident Commensal Bacteria Promote Members of the IL-10 Cytokine Family to Establish Mutualism.

    Fung TC, Bessman NJ, Hepworth MR, Kumar N, Shibata N, Kobuley D, Wang K, Ziegler CGK, Goc J, Shima T, Umesaki Y, Sartor RB, Sullivan KV, Lawley TD, Kunisawa J, Kiyono H and Sonnenberg GF

    Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Joan and Sanford I. Weill Department of Medicine, Division of Gastroenterology, Weill Cornell Medicine, New York, NY 10021 USA; Department of Microbiology and Immunology, Weill Cornell Medicine, New York, NY 10021 USA; Jill Roberts Institute for Research in Inflammatory Bowel Disease, Weill Cornell Medicine, New York, NY 10021, USA.

    Physical separation between the mammalian immune system and commensal bacteria is necessary to limit chronic inflammation. However, selective species of commensal bacteria can reside within intestinal lymphoid tissues of healthy mammals. Here, we demonstrate that lymphoid-tissue-resident commensal bacteria (LRC) colonized murine dendritic cells and modulated their cytokine production. In germ-free and antibiotic-treated mice, LRCs colonized intestinal lymphoid tissues and induced multiple members of the IL-10 cytokine family, including dendritic-cell-derived IL-10 and group 3 innate lymphoid cell (ILC3)-derived IL-22. Notably, IL-10 limited the development of pro-inflammatory Th17 cell responses, and IL-22 production enhanced LRC colonization in the steady state. Furthermore, LRC colonization protected mice from lethal intestinal damage in an IL-10-IL-10R-dependent manner. Collectively, our data reveal a unique host-commensal-bacteria dialog whereby selective subsets of commensal bacteria interact with dendritic cells to facilitate tissue-specific responses that are mutually beneficial for both the host and the microbe.

    Funded by: Medical Research Council: PF451; NCI NIH HHS: P30 CA008748; NIAID NIH HHS: R01 AI123368, R01AI123368, R56 AI114724, R56AI114724; NIDDK NIH HHS: P01 DK094779, P30 DK034987, P30-DK034987; NIH HHS: 5-P40-OD010995, DP5 OD012116, DP5OD012116, P40 OD010995; Wellcome Trust: 098051, 105644

    Immunity 2016;44;3;634-646

  • RUNX1 mutations in acute myeloid leukemia are associated with distinct clinico-pathologic and genetic features.

    Gaidzik VI, Teleanu V, Papaemmanuil E, Weber D, Paschka P, Hahn J, Wallrabenstein T, Kolbinger B, Köhne CH, Horst HA, Brossart P, Held G, Kündgen A, Ringhoffer M, Götze K, Rummel M, Gerstung M, Campbell P, Kraus JM, Kestler HA, Thol F, Heuser M, Schlegelberger B, Ganser A, Bullinger L, Schlenk RF, Döhner K and Döhner H

    Klinik für Innere Medizin III, Universitätsklinikum Ulm, Ulm, Germany.

    We evaluated the frequency, genetic architecture, clinico-pathologic features and prognostic impact of RUNX1 mutations in 2439 adult patients with newly-diagnosed acute myeloid leukemia (AML). RUNX1 mutations were found in 245 of 2439 (10%) patients; were almost mutually exclusive of AML with recurrent genetic abnormalities; and they co-occurred with a complex pattern of gene mutations, frequently involving mutations in epigenetic modifiers (ASXL1, IDH2, KMT2A, EZH2), components of the spliceosome complex (SRSF2, SF3B1) and STAG2, PHF6, BCOR. RUNX1 mutations were associated with older age (16-59 years: 8.5%; ≥60 years: 15.1%), male gender, more immature morphology and secondary AML evolving from myelodysplastic syndrome. In univariable analyses, RUNX1 mutations were associated with inferior event-free (EFS, P<0.0001), relapse-free (RFS, P=0.0007) and overall survival (OS, P<0.0001) in all patients, remaining significant when age was considered. In multivariable analysis, RUNX1 mutations predicted for inferior EFS (P=0.01). The effect of co-mutation varied by partner gene, where patients with the secondary genotypes RUNX1mut/ASXL1mut (OS, P=0.004), RUNX1mut/SRSF2mut (OS, P=0.007) and RUNX1mut/PHF6mut (OS, P=0.03) did significantly worse, whereas patients with the genotype RUNX1mut/IDH2mut (OS, P=0.04) had a better outcome. In conclusion, RUNX1-mutated AML is associated with a complex mutation cluster and is correlated with distinct clinico-pathologic features and inferior prognosis.

    Leukemia 2016;30;11;2160-2168

  • tRNA fragments: novel players in intergenerational inheritance.

    Gapp K and Miska EA

    The Gurdon Institute and Department of Genetics, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK.

    Non-genetic inheritance is an evocative topic; in the past few years, the debate around potential inheritance of life-time experiences independent of social factors in mammals has become highly prominent due to increasing evidence for phenotypes in the offspring after paternal environmental exposures. Strikingly, two independent studies published in Science newly implicate a special class of RNA, transfer RNA fragments, in the intergenerational effects of paternal dietary intervention.

    Funded by: Cancer Research UK: 11832

    Cell research 2016;26;4;395-6

  • Interleukin-13 Activates Distinct Cellular Pathways Leading to Ductular Reaction, Steatosis, and Fibrosis.

    Gieseck RL, Ramalingam TR, Hart KM, Vannella KM, Cantu DA, Lu WY, Ferreira-González S, Forbes SJ, Vallier L and Wynn TA

    Immunopathogenesis Section, Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD 20852, USA; Wellcome Trust-Medical Research Council Stem Cell Institute, Anne McLaren Laboratory, Department of Surgery, University of Cambridge, Cambridge CB2 0SZ, UK.

    Fibroproliferative diseases are driven by dysregulated tissue repair responses and are a major cause of morbidity and mortality because they affect nearly every organ system. Type 2 cytokine responses are critically involved in tissue repair; however, the mechanisms that regulate beneficial regeneration versus pathological fibrosis are not well understood. Here, we have shown that the type 2 effector cytokine interleukin-13 simultaneously, yet independently, directed hepatic fibrosis and the compensatory proliferation of hepatocytes and biliary cells in progressive models of liver disease induced by interleukin-13 overexpression or after infection with Schistosoma mansoni. Using transgenic mice with interleukin-13 signaling genetically disrupted in hepatocytes, cholangiocytes, or resident tissue fibroblasts, we have revealed direct and distinct roles for interleukin-13 in fibrosis, steatosis, cholestasis, and ductular reaction. Together, these studies show that these mechanisms are simultaneously controlled but distinctly regulated by interleukin-13 signaling. Thus, it may be possible to promote interleukin-13-dependent hepatobiliary expansion without generating pathological fibrosis.

    Funded by: Department of Health; Intramural NIH HHS: Z01 AI000829-11, Z01 AI001019-01; Medical Research Council: MC_PC_12009, MR/K017047/1, MR/K026666/1

    Immunity 2016;45;1;145-58

  • Cytokine Profiles during Invasive Nontyphoidal Salmonella Disease Predict Outcome in African Children.

    Gilchrist JJ, Heath JN, Msefula CL, Gondwe EN, Naranbhai V, Mandala W, MacLennan JM, Molyneux EM, Graham SM, Drayson MT, Molyneux ME and MacLennan CA

    Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom Department of Paediatrics, University of Oxford, Oxford, United Kingdom.

    Nontyphoidal Salmonella is a leading cause of sepsis in African children. Cytokine responses are central to the pathophysiology of sepsis and predict sepsis outcome in other settings. In this study, we investigated cytokine responses to invasive nontyphoidal Salmonella (iNTS) disease in Malawian children. We determined serum concentrations of 48 cytokines with multiplexed immunoassays in Malawian children during acute iNTS disease (n = 111) and in convalescence (n = 77). Principal component analysis and logistic regression were used to identify cytokine signatures of acute iNTS disease. We further investigated whether these responses are altered by HIV coinfection or severe malnutrition and whether cytokine responses predict inpatient mortality. Cytokine changes in acute iNTS disease were associated with two distinct cytokine signatures. The first is characterized by increased concentrations of mediators known to be associated with macrophage function, and the second is characterized by raised pro- and anti-inflammatory cytokines typical of responses reported in sepsis secondary to diverse pathogens. These cytokine responses were largely unaltered by either severe malnutrition or HIV coinfection. Children with fatal disease had a distinctive cytokine profile, characterized by raised mediators known to be associated with neutrophil function. In conclusion, cytokine responses to acute iNTS infection in Malawian children are reflective of both the cytokine storm typical of sepsis secondary to diverse pathogens and the intramacrophage replicative niche of NTS. The cytokine profile predictive of fatal disease supports a key role of neutrophils in the pathogenesis of NTS sepsis.

    Funded by: Wellcome Trust

    Clinical and vaccine immunology : CVI 2016;23;7;601-9

  • Very low-depth sequencing in a founder population identifies a cardioprotective APOC3 signal missed by genome-wide imputation.

    Gilly A, Ritchie GR, Southam L, Farmaki AE, Tsafantakis E, Dedoussis G and Zeggini E

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    Cohort-wide very low-depth whole-genome sequencing (WGS) can comprehensively capture low-frequency sequence variation for the cost of a dense genome-wide genotyping array. Here, we analyse 1x sequence data across the APOC3 gene in a founder population from the island of Crete in Greece (n = 1239) and find significant evidence for association with blood triglyceride levels with the previously reported R19X cardioprotective null mutation (β = -1.09, σ = 0.163, P = 8.2 × 10(-11)) and a second loss of function mutation, rs138326449 (β = -1.17, σ = 0.188, P = 1.14 × 10(-9)). The signal cannot be recapitulated by imputing genome-wide genotype data on a large reference panel of 5122 individuals including 249 with 4x WGS data from the same population. Gene-level meta-analysis with other studies reporting burden signals at APOC3 provides robust evidence for a replicable cardioprotective rare variant aggregation (P = 3.2 × 10(-31), n = 13 480).

    Funded by: Medical Research Council: MC_PC_15018; Wellcome Trust: 098051, 102215, WT091310

    Human molecular genetics 2016;25;11;2360-2365

  • Rapid Karyotype Evolution in Lasiopodomys Involved at Least Two Autosome - Sex Chromosome Translocations.

    Gladkikh OL, Romanenko SA, Lemskaya NA, Serdyukova NA, O'Brien PC, Kovalskaya JM, Smorkatcheva AV, Golenishchev FN, Perelman PL, Trifonov VA, Ferguson-Smith MA, Yang F and Graphodatsky AS

    Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia.

    The generic status of Lasiopodomys and its division into subgenera Lasiopodomys (L. mandarinus, L. brandtii) and Stenocranius (L. gregalis, L. raddei) are not generally accepted because of contradictions between the morphological and molecular data. To obtain cytogenetic evidence for the Lasiopodomys genus and its subgenera and to test the autosome to sex chromosome translocation hypothesis of sex chromosome complex origin in L. mandarinus proposed previously, we hybridized chromosome painting probes from the field vole (Microtus agrestis, MAG) and the Arctic lemming (Dicrostonyx torquatus, DTO) onto the metaphases of a female Mandarin vole (L. mandarinus, 2n = 47) and a male Brandt's vole (L. brandtii, 2n = 34). In addition, we hybridized Arctic lemming painting probes onto chromosomes of a female narrow-headed vole (L. gregalis, 2n = 36). Cross-species painting revealed three cytogenetic signatures (MAG12/18, 17a/19, and 22/24) that could validate the genus Lasiopodomys and indicate the evolutionary affinity of L. gregalis to the genus. Moreover, all three species retained the associations MAG1bc/17b and 2/8a detected previously in karyotypes of all arvicolins studied. The associations MAG2a/8a/19b, 8b/21, 9b/23, 11/13b, 12b/18, 17a/19a, and 5 fissions of ancestral segments appear to be characteristic for the subgenus Lasiopodomys. We also validated the autosome to sex chromosome translocation hypothesis on the origin of complex sex chromosomes in L. mandarinus. Two translocations of autosomes onto the ancestral X chromosome in L. mandarinus led to a complex of neo-X1, neo-X2, and neo-X3 elements. Our results demonstrate that genus Lasiopodomys represents a striking example of rapid chromosome evolution involving both autosomes and sex chromosomes. Multiple reshuffling events including Robertsonian fusions, chromosomal fissions, inversions and heterochromatin expansion have led to the formation of modern species karyotypes in a very short time, about 2.4 MY.

    PloS one 2016;11;12;e0167653

  • GENOMICS. A federated ecosystem for sharing genomic, clinical data.

    Global Alliance for Genomics and Health

    Science (New York, N.Y.) 2016;352;6291;1278-80

  • Chromosomal phylogeny of Vampyressine bats (Chiroptera, Phyllostomidae) with description of two new sex chromosome systems.

    Gomes AJ, Nagamachi CY, Rodrigues LR, Benathar TC, Ribas TF, O'Brien PC, Yang F, Ferguson-Smith MA and Pieczarka JC

    Laboratório de Citogenética, CEABIO, ICB, Universidade Federal do Pará, Belém, Brazil.

    Background: The subtribe Vampyressina (sensu Baker et al. 2003) encompasses approximately 43 species and seven genera and is a recent and diversified group of New World leaf-nosed bats specialized in fruit eating. The systematics of this group continues to be debated mainly because of the lack of congruence between topologies generated by molecular and morphological data. We analyzed seven species of all genera of vampyressine bats by multidirectional chromosome painting, using whole-chromosome-painting probes from Carollia brevicauda and Phyllostomus hastatus. Phylogenetic analyses were performed using shared discrete chromosomal segments as characters and the Phylogenetic Analysis Using Parsimony (PAUP) software package, using Desmodontinae as outgroup. We also used the Tree Analysis Using New Technology (TNT) software.

    Results: The result showed a well-supported phylogeny congruent with molecular topologies regarding the sister taxa relationship of Vampyressa and Mesophylla genera, as well as the close relationship between the genus Chiroderma and Vampyriscus.

    Conclusions: Our results supported the hypothesis that all genera of this subtribe have compound sex chromosome systems that originated from an X-autosome translocation, an ancestral condition observed in the Stenodermatinae. Additional rearrangements occurred independently in the genus Vampyressa and Mesophylla yielding the X1X1X2X2/X1X2Y sex chromosome system. This work presents additional data supporting the hypothesis based on molecular studies regarding the polyphyly of the genus Vampyressa and its sister relationship to Mesophylla.

    BMC evolutionary biology 2016;16;1;119

  • Commitment issues in Plasmodium.

    Gomes AR and Talman AM

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Nature reviews. Microbiology 2016;14;1;4

  • Standardized Welfare Terms for the Zebrafish Community.

    Goodwin N, Karp NA, Blackledge S, Clark B, Keeble R, Kovacs C, Murray KN, Price M, Thompson P and Bussell J

    Research Support Facility, Wellcome Trust Sanger Institute, Cambridge, United Kingdom.

    Managing the welfare of laboratory animals is critical to animal health, vital in the understanding of phenotypes created by treatment or genetic alteration, and ensures compliance with regulations. Part of an animal welfare assessment is the requirement to record observations, ensuring all those responsible for the animals are aware of their health status and can act accordingly. Although the use of zebrafish in research continues to increase, guidelines for conducting welfare assessments and the reporting of observations are considered unclear compared to mammalian species. To support the movement of zebrafish between facilities, significant improvement would be achieved through the use of standardized terms to ensure clarity and consistency between facilities. Improving the clarity of terminology around welfare not only addresses our ethical obligation but also supports the research goals and provides a searchable description of the phenotypes. A collaboration between the Wellcome Trust Sanger Institute and Cambridge University (Department of Medicine-Laboratory of Molecular Biology) has led to the creation of the zebrafish welfare terms from which standardization of terminology can be achieved.

    Zebrafish 2016

  • Evaluating and Optimizing Fish Health and Welfare During Experimental Procedures.

    Goodwin N, Westall L, Karp NA, Hazlehurst D, Kovacs C, Keeble R, Thompson P, Collins R and Bussell J

    Research Support Facility, Wellcome Trust Sanger Institute, Cambridge, United Kingdom.

    Many facilities house fish in separate static containers post-procedure, for example, while awaiting genotyping results. This ensures fish can be easily identified, but it does not allow for provision of continuous filtered water or diet. At the Wellcome Trust Sanger Institute, concern over the housing conditions led to the development of an individual housing system (GeneS) enabling feeding and water filtration. Trials to compare the water quality measures between the various systems found that fish housed in static containers experienced rapid deterioration in water quality. By day 1, measures of ammonia were outside the Institute's prescribed values and continued to rise until it was 25-fold higher than recommended levels. Nitrite levels were also outside recommended levels for all fish by day 9 and were twofold higher by the end of the trial. The water quality measures for tanks held on the recirculating system were stable even though food was provided. These results indicate that for housing zebrafish, running water or appropriately timed water changes are a critical component to ensure that the ethical obligations are met.

    Zebrafish 2016

  • Heterogeneity in Oct4 and Sox2 Targets Biases Cell Fate in 4-Cell Mouse Embryos.

    Goolam M, Scialdone A, Graham SJL, Macaulay IC, Jedrusik A, Hupalowska A, Voet T, Marioni JC and Zernicka-Goetz M

    Department of Physiology, Development & Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3EG, UK.

    The major and essential objective of pre-implantation development is to establish embryonic and extra-embryonic cell fates. To address when and how this fundamental process is initiated in mammals, we characterize transcriptomes of all individual cells throughout mouse pre-implantation development. This identifies targets of master pluripotency regulators Oct4 and Sox2 as being highly heterogeneously expressed between blastomeres of the 4-cell embryo, with Sox21 showing one of the most heterogeneous expression profiles. Live-cell tracking demonstrates that cells with decreased Sox21 yield more extra-embryonic than pluripotent progeny. Consistently, decreasing Sox21 results in premature upregulation of the differentiation regulator Cdx2, suggesting that Sox21 helps safeguard pluripotency. Furthermore, Sox21 is elevated following increased expression of the histone H3R26-methylase CARM1 and is lowered following CARM1 inhibition, indicating the importance of epigenetic regulation. Therefore, our results indicate that heterogeneous gene expression, as early as the 4-cell stage, initiates cell-fate decisions by modulating the balance of pluripotency and differentiation.

    Funded by: Wellcome Trust

    Cell 2016;165;1;61-74

  • Meta-analysis of 375,000 individuals identifies 38 susceptibility loci for migraine.

    Gormley P, Anttila V, Winsvold BS, Palta P, Esko T, Pers TH, Farh KH, Cuenca-Leon E, Muona M, Furlotte NA, Kurth T, Ingason A, McMahon G, Ligthart L, Terwindt GM, Kallela M, Freilinger TM, Ran C, Gordon SG, Stam AH, Steinberg S, Borck G, Koiranen M, Quaye L, Adams HH, Lehtimäki T, Sarin AP, Wedenoja J, Hinds DA, Buring JE, Schürks M, Ridker PM, Hrafnsdottir MG, Stefansson H, Ring SM, Hottenga JJ, Penninx BW, Färkkilä M, Artto V, Kaunisto M, Vepsäläinen S, Malik R, Heath AC, Madden PA, Martin NG, Montgomery GW, Kurki MI, Kals M, Mägi R, Pärn K, Hämäläinen E, Huang H, Byrnes AE, Franke L, Huang J, Stergiakouli E, Lee PH, Sandor C, Webber C, Cader Z, Muller-Myhsok B, Schreiber S, Meitinger T, Eriksson JG, Salomaa V, Heikkilä K, Loehrer E, Uitterlinden AG, Hofman A, van Duijn CM, Cherkas L, Pedersen LM, Stubhaug A, Nielsen CS, Männikkö M, Mihailov E, Milani L, Göbel H, Esserlind AL, Christensen AF, Hansen TF, Werge T, International Headache Genetics Consortium, Kaprio J, Aromaa AJ, Raitakari O, Ikram MA, Spector T, Järvelin MR, Metspalu A, Kubisch C, Strachan DP, Ferrari MD, Belin AC, Dichgans M, Wessman M, van den Maagdenberg AM, Zwart JA, Boomsma DI, Smith GD, Stefansson K, Eriksson N, Daly MJ, Neale BM, Olesen J, Chasman DI, Nyholt DR and Palotie A

    Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA.

    Migraine is a debilitating neurological disorder affecting around one in seven people worldwide, but its molecular mechanisms remain poorly understood. There is some debate about whether migraine is a disease of vascular dysfunction or a result of neuronal dysfunction with secondary vascular changes. Genome-wide association (GWA) studies have thus far identified 13 independent loci associated with migraine. To identify new susceptibility loci, we carried out a genetic study of migraine on 59,674 affected subjects and 316,078 controls from 22 GWA studies. We identified 44 independent single-nucleotide polymorphisms (SNPs) significantly associated with migraine risk (P < 5 × 10(-8)) that mapped to 38 distinct genomic loci, including 28 loci not previously reported and a locus that to our knowledge is the first to be identified on chromosome X. In subsequent computational analyses, the identified loci showed enrichment for genes expressed in vascular and smooth muscle tissues, consistent with a predominant theory of migraine that highlights vascular etiologies.

    Funded by: Medical Research Council: G0000934, G0500539, G0600705, G1002319, MC_PC_15018, MC_UP_A320_1004, MC_UU_00002/10, MC_UU_12013/1, MC_UU_12013/3, MC_UU_12021/4; NCI NIH HHS: R01 CA047988, UM1 CA182913; NHGRI NIH HHS: R44 HG006981; NHLBI NIH HHS: R01 HL043851, R01 HL080467, R01 HL087679; NIAAA NIH HHS: K05 AA017688, P50 AA011998, R01 AA007535, R01 AA007728, R01 AA010249, R01 AA013320, R01 AA013321, R01 AA014041; NICHD NIH HHS: R01 HD042157; NIDA NIH HHS: K08 DA019951, R01 DA012854, R25 DA027995, R37 DA018673; NIDCR NIH HHS: R01 DE022905; NIDDK NIH HHS: U01 DK062418; NIGMS NIH HHS: T32 GM007748; NIMH NIH HHS: K99 MH101367, R01 MH063706, R01 MH081802, RC2 MH089951, RC2 MH089995, RL1 MH083268, U24 MH068457; NINDS NIH HHS: R21 NS092963; Wellcome Trust: 068545/Z/02, 076113/B/04/Z, 079895, 098051, 102215, WT089062

    Nature genetics 2016;48;8;856-66

  • Invasion of hepatocytes by Plasmodium sporozoites requires cGMP-dependent protein kinase and calcium dependent protein kinase 4.

    Govindasamy K, Jebiwott S, Jaijyan DK, Davidow A, Ojo KK, Van Voorhis WC, Brochet M, Billker O and Bhanot P

    Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers - New Jersey Medical School, Newark, NJ, USA.

    Invasion of hepatocytes by sporozoites is essential for Plasmodium to initiate infection of the mammalian host. The parasite's subsequent intracellular differentiation in the liver is the first developmental step of its mammalian cycle. Despite their biological significance, surprisingly little is known of the signalling pathways required for sporozoite invasion. We report that sporozoite invasion of hepatocytes requires signalling through two second-messengers - cGMP mediated by the parasite's cGMP-dependent protein kinase (PKG), and Ca(2+), mediated by the parasite's calcium-dependent protein kinase 4 (CDPK4). Sporozoites expressing a mutated form of Plasmodium berghei PKG or carrying a deletion of the CDPK4 gene are defective in invasion of hepatocytes. Using specific and potent inhibitors of Plasmodium PKG and CDPK4, we demonstrate that PKG and CDPK4 are required for sporozoite motility, and that PKG regulates the secretion of TRAP, an adhesin that is essential for motility. Chemical inhibition of PKG decreases parasite egress from hepatocytes by inhibiting either the formation or release of merosomes. In contrast, genetic inhibition of CDPK4 does not significantly decrease the number of merosomes. By revealing the requirement for PKG and CDPK4 in Plasmodium sporozoite invasion, our work enables a better understanding of kinase pathways that act in different Plasmodium stages.

    Funded by: NIAID NIH HHS: R21 AI094167; Wellcome Trust: WT098051

    Molecular microbiology 2016;102;2;349-363

  • Genomic Epidemiology of Gonococcal Resistance to Extended-Spectrum Cephalosporins, Macrolides, and Fluoroquinolones in the United States, 2000-2013.

    Grad YH, Harris SR, Kirkcaldy RD, Green AG, Marks DS, Bentley SD, Trees D and Lipsitch M

    Department of Immunology and Infectious Diseases.

    Background: Treatment of Neisseria gonorrhoeae infection is empirical and based on population-wide susceptibilities. Increasing antimicrobial resistance underscores the potential importance of rapid diagnostic tests, including sequence-based tests, to guide therapy. However, the usefulness of sequence-based diagnostic tests depends on the prevalence and dynamics of the resistance mechanisms.

    Methods: We define the prevalence and dynamics of resistance markers to extended-spectrum cephalosporins, macrolides, and fluoroquinolones in 1102 resistant and susceptible clinical N. gonorrhoeae isolates collected from 2000 to 2013 via the Centers for Disease Control and Prevention's Gonococcal Isolate Surveillance Project.

    Results: Reduced extended-spectrum cephalosporin susceptibility is predominantly clonal and associated with the mosaic penA XXXIV allele and derivatives (sensitivity 98% for cefixime and 91% for ceftriaxone), but alternative resistance mechanisms have sporadically emerged. Reduced azithromycin susceptibility has arisen through multiple mechanisms and shows limited clonal spread; the basis for resistance in 36% of isolates with reduced azithromycin susceptibility is unclear. Quinolone-resistant N. gonorrhoeae has arisen multiple times, with extensive clonal spread.

    Conclusions: Quinolone-resistant N. gonorrhoeae and reduced cefixime susceptibility appear amenable to development of sequence-based diagnostic tests, whereas the undefined mechanisms of resistance to ceftriaxone and azithromycin underscore the importance of phenotypic surveillance. The identification of multidrug-resistant isolates highlights the need for additional measures to respond to the threat of untreatable gonorrhea.

    Funded by: NIAID NIH HHS: K08 AI104767, T32 AI007061; NIGMS NIH HHS: R01 GM106303, U54 GM088558; Wellcome Trust: 098051

    The Journal of infectious diseases 2016;214;10;1579-1587

  • Genes Required for the Fitness of Salmonella enterica Serovar Typhimurium during Infection of Immunodeficient gp91-/- phox Mice.

    Grant AJ, Oshota O, Chaudhuri RR, Mayho M, Peters SE, Clare S, Maskell DJ and Mastroeni P

    Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom

    Salmonella enterica causes systemic diseases (typhoid and paratyphoid fever), nontyphoidal septicemia (NTS), and gastroenteritis in humans and other animals worldwide. An important but underrecognized emerging infectious disease problem in sub-Saharan Africa is NTS in children and immunocompromised adults. A current goal is to identify Salmonella mutants that are not pathogenic in the absence of key components of the immune system such as might be found in immunocompromised hosts. Such attenuated strains have the potential to be used as live vaccines. We have used transposon-directed insertion site sequencing (TraDIS) to screen mutants of Salmonella enterica serovar Typhimurium for their ability to infect and grow in the tissues of wild-type and immunodeficient mice. This was to identify bacterial genes that might be deleted for the development of live attenuated vaccines that would be safer to use in situations and/or geographical areas where immunodeficiencies are prevalent. The relative fitness of each of 9,356 transposon mutants, representing mutations in 3,139 different genes, was determined in gp91(-/-) phox mice. Mutations in certain genes led to reduced fitness in both wild-type and mutant mice. To validate these results, these genes were mutated by allelic replacement, and resultant mutants were retested for fitness in the mice. A defined deletion mutant of cysE was attenuated in C57BL/6 wild-type mice and immunodeficient gp91(-/-) phox mice and was effective as a live vaccine in wild-type mice.

    Funded by: Biotechnology and Biological Sciences Research Council: APG19115; Medical Research Council: G1100102; Wellcome Trust: WT098051

    Infection and immunity 2016;84;4;989-97

  • Modeling the evolution space of breakage fusion bridge cycles with a stochastic folding process.

    Greenman CD, Cooke SL, Marshall J, Stratton MR and Campbell PJ

    School of Computing Sciences, University of East Anglia, Norwich, UK.

    Breakage-fusion-bridge cycles in cancer arise when a broken segment of DNA is duplicated and an end from each copy joined together. This structure then 'unfolds' into a new piece of palindromic DNA. This is one mechanism responsible for the localised amplicons observed in cancer genome data. Here we study the evolution space of breakage-fusion-bridge structures in detail. We firstly consider discrete representations of this space with 2-d trees to demonstrate that there are [Formula: see text] qualitatively distinct evolutions involving [Formula: see text] breakage-fusion-bridge cycles. Secondly we consider the stochastic nature of the process to show these evolutions are not equally likely, and also describe how amplicons become localized. Finally we highlight these methods by inferring the evolution of breakage-fusion-bridge cycles with data from primary tissue cancer samples.

    Journal of mathematical biology 2016;72;1-2;47-86

  • Rapid parallel acquisition of somatic mutations after NPM1 in acute myeloid leukaemia evolution.

    Grove CS, Bolli N, Manes N, Varela I, Van't Veer M, Bench A, Eldaly H, Wedge D, Van Loo P and Vassiliou GS

    Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, UK.

    British journal of haematology 2016

  • Occurrence of carbapenemase-producing Klebsiella pneumoniae and Escherichia coli in the European survey of carbapenemase-producing Enterobacteriaceae (EuSCAPE): a prospective, multinational study.

    Grundmann H, Glasner C, Albiger B, Aanensen DM, Tomlinson CT, Andrasević AT, Cantón R, Carmeli Y, Friedrich AW, Giske CG, Glupczynski Y, Gniadkowski M, Livermore DM, Nordmann P, Poirel L, Rossolini GM, Seifert H, Vatopoulos A, Walsh T, Woodford N, Monnet DL and European Survey of Carbapenemase-Producing Enterobacteriaceae (EuSCAPE) Working Group

    Department of Infection Prevention and Hospital Hygiene, Faculty of Medicine, University of Freiburg, Freiburg, Germany; Department of Medical Microbiology, University of Groningen, University Medical Center Groningen, Netherlands.

    Background: Gaps in the diagnostic capacity and heterogeneity of national surveillance and reporting standards in Europe make it difficult to contain carbapenemase-producing Enterobacteriaceae. We report the development of a consistent sampling framework and the results of the first structured survey on the occurrence of carbapenemase-producing Klebsiella pneumoniae and Escherichia coli in European hospitals.

    Methods: National expert laboratories recruited hospitals with diagnostic capacities, who collected the first ten carbapenem non-susceptible clinical isolates of K pneumoniae or E coli and ten susceptible same-species comparator isolates and pertinent patient and hospital information. Isolates and data were relayed back to national expert laboratories, which made laboratory-substantiated information available for central analysis.

    Findings: Between Nov 1, 2013, and April 30, 2014, 455 sentinel hospitals in 36 countries submitted 2703 clinical isolates (2301 [85%] K pneumoniae and 402 [15%] E coli). 850 (37%) of 2301 K pneumoniae samples and 77 (19%) of 402 E coli samples were carbapenemase (KPC, NDM, OXA-48-like, or VIM) producers. The ratio of K pneumoniae to E coli was 11:1. 1·3 patients per 10 000 hospital admissions had positive clinical specimens. Prevalence differed greatly, with the highest rates in Mediterranean and Balkan countries. Carbapenemase-producing K pneumoniae isolates showed high resistance to last-line antibiotics.

    Interpretation: This initiative shows an encouraging commitment by all participants, and suggests that challenges in the establishment of a continent-wide enhanced sentinel surveillance for carbapenemase-producing Enterobacteriaceae can be overcome. Strengthening infection control efforts in hospitals is crucial for controlling spread through local and national health care networks.

    Funding: European Centre for Disease Prevention and Control.

    Funded by: Medical Research Council: MR/P007295/1

    The Lancet. Infectious diseases 2016;17;2;153-163

  • Role of Plasmodium vivax Duffy-binding protein 1 in invasion of Duffy-null Africans.

    Gunalan K, Lo E, Hostetler JB, Yewhalaw D, Mu J, Neafsey DE, Yan G and Miller LH

    Laboratory of Malaria and Vector Research, National Institutes of Allergy and Infectious Diseases, National Institutes of Health, Rockville, MD 20852;

    The ability of the malaria parasite Plasmodium vivax to invade erythrocytes is dependent on the expression of the Duffy blood group antigen on erythrocytes. Consequently, Africans who are null for the Duffy antigen are not susceptible to P. vivax infections. Recently, P. vivax infections in Duffy-null Africans have been documented, raising the possibility that P. vivax, a virulent pathogen in other parts of the world, may expand malarial disease in Africa. P. vivax binds the Duffy blood group antigen through its Duffy-binding protein 1 (DBP1). To determine if mutations in DBP1 resulted in the ability of P. vivax to bind Duffy-null erythrocytes, we analyzed P. vivax parasites obtained from two Duffy-null individuals living in Ethiopia where Duffy-null and -positive Africans live side-by-side. We determined that, although the DBP1s from these parasites contained unique sequences, they failed to bind Duffy-null erythrocytes, indicating that mutations in DBP1 did not account for the ability of P. vivax to infect Duffy-null Africans. However, an unusual DNA expansion of DBP1 (three and eight copies) in the two Duffy-null P. vivax infections suggests that an expansion of DBP1 may have been selected to allow low-affinity binding to another receptor on Duffy-null erythrocytes. Indeed, we show that Salvador (Sal) I P. vivax infects Squirrel monkeys independently of DBP1 binding to Squirrel monkey erythrocytes. We conclude that P. vivax Sal I and perhaps P. vivax in Duffy-null patients may have adapted to use new ligand-receptor pairs for invasion.

    Funded by: NIAID NIH HHS: R21 AI101802; Wellcome Trust

    Proceedings of the National Academy of Sciences of the United States of America 2016;113;22;6271-6

  • Naive Pluripotent Stem Cells Derived Directly from Isolated Cells of the Human Inner Cell Mass.

    Guo G, von Meyenn F, Santos F, Chen Y, Reik W, Bertone P, Smith A and Nichols J

    Wellcome Trust - Medical Research Council Stem Cell Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK.

    Conventional generation of stem cells from human blastocysts produces a developmentally advanced, or primed, stage of pluripotency. In vitro resetting to a more naive phenotype has been reported. However, whether the reset culture conditions of selective kinase inhibition can enable capture of naive epiblast cells directly from the embryo has not been determined. Here, we show that in these specific conditions individual inner cell mass cells grow into colonies that may then be expanded over multiple passages while retaining a diploid karyotype and naive properties. The cells express hallmark naive pluripotency factors and additionally display features of mitochondrial respiration, global gene expression, and genome-wide hypomethylation distinct from primed cells. They transition through primed pluripotency into somatic lineage differentiation. Collectively these attributes suggest classification as human naive embryonic stem cells. Human counterparts of canonical mouse embryonic stem cells would argue for conservation in the phased progression of pluripotency in mammals.

    Stem cell reports 2016;6;4;437-46

  • Functional analysis of an unusual type IV pilus in the Gram-positive Streptococcus sanguinis.

    Gurung I, Spielman I, Davies MR, Lala R, Gaustad P, Biais N and Pelicic V

    MRC Centre for Molecular Bacteriology and Infection, Imperial College London, London, UK.

    Type IV pili (Tfp), which have been studied extensively in a few Gram-negative species, are the paradigm of a group of widespread and functionally versatile nano-machines. Here, we performed the most detailed molecular characterisation of Tfp in a Gram-positive bacterium. We demonstrate that the naturally competent Streptococcus sanguinis produces retractable Tfp, which like their Gram-negative counterparts can generate hundreds of piconewton of tensile force and promote intense surface-associated motility. Tfp power 'train-like' directional motion parallel to the long axis of chains of cells, leading to spreading zones around bacteria grown on plates. However, S. sanguinis Tfp are not involved in DNA uptake, which is mediated by a related but distinct nano-machine, and are unusual because they are composed of two pilins in comparable amounts, rather than one as normally seen. Whole genome sequencing identified a locus encoding all the genes involved in Tfp biology in S. sanguinis. A systematic mutational analysis revealed that Tfp biogenesis in S. sanguinis relies on a more basic machinery (only 10 components) than in Gram-negative species and that a small subset of four proteins dispensable for pilus biogenesis are essential for motility. Intriguingly, one of the piliated mutants that does not exhibit spreading retains microscopic motility but moves sideways, which suggests that the corresponding protein controls motion directionality. Besides establishing S. sanguinis as a useful new model for studying Tfp biology, these findings have important implications for our understanding of these widespread filamentous nano-machines.

    Molecular microbiology 2016;99;2;380-92

  • Atlas of prostate cancer heritability in European and African-American men pinpoints tissue-specific regulation.

    Gusev A, Shi H, Kichaev G, Pomerantz M, Li F, Long HW, Ingles SA, Kittles RA, Strom SS, Rybicki BA, Nemesure B, Isaacs WB, Zheng W, Pettaway CA, Yeboah ED, Tettey Y, Biritwum RB, Adjei AA, Tay E, Truelove A, Niwa S, Chokkalingam AP, John EM, Murphy AB, Signorello LB, Carpten J, Leske MC, Wu SY, Hennis AJ, Neslund-Dudas C, Hsing AW, Chu L, Goodman PJ, Klein EA, Witte JS, Casey G, Kaggwa S, Cook MB, Stram DO, Blot WJ, Eeles RA, Easton D, Kote-Jarai Z, Al Olama AA, Benlloch S, Muir K, Giles GG, Southey MC, Fitzgerald LM, Gronberg H, Wiklund F, Aly M, Henderson BE, Schleutker J, Wahlfors T, Tammela TL, Nordestgaard BG, Key TJ, Travis RC, Neal DE, Donovan JL, Hamdy FC, Pharoah P, Pashayan N, Khaw KT, Stanford JL, Thibodeau SN, McDonnell SK, Schaid DJ, Maier C, Vogel W, Luedeke M, Herkommer K, Kibel AS, Cybulski C, Wokolorczyk D, Kluzniak W, Cannon-Albright L, Teerlink C, Brenner H, Dieffenbach AK, Arndt V, Park JY, Sellers TA, Lin HY, Slavov C, Kaneva R, Mitev V, Batra J, Spurdle A, Clements JA, Teixeira MR, Pandha H, Michael A, Paulo P, Maia S, Kierzek A, PRACTICAL consortium, Conti DV, Albanes D, Berg C, Berndt SI, Campa D, Crawford ED, Diver WR, Gapstur SM, Gaziano JM, Giovannucci E, Hoover R, Hunter DJ, Johansson M, Kraft P, Le Marchand L, Lindström S, Navarro C, Overvad K, Riboli E, Siddiq A, Stevens VL, Trichopoulos D, Vineis P, Yeager M, Trynka G, Raychaudhuri S, Schumacher FR, Price AL, Freedman ML, Haiman CA and Pasaniuc B

    Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA.

    Although genome-wide association studies have identified over 100 risk loci that explain ∼33% of familial risk for prostate cancer (PrCa), their functional effects on risk remain largely unknown. Here we use genotype data from 59,089 men of European and African American ancestries combined with cell-type-specific epigenetic data to build a genomic atlas of single-nucleotide polymorphism (SNP) heritability in PrCa. We find significant differences in heritability between variants in prostate-relevant epigenetic marks defined in normal versus tumour tissue as well as between tissue and cell lines. The majority of SNP heritability lies in regions marked by H3K27 acetylation in a prostate adenocarcinoma cell line (LNCaP) or by DNaseI hypersensitive sites in cancer cell lines. We find a high degree of similarity between European and African American ancestries suggesting a similar genetic architecture from common variation underlying PrCa risk. Our findings showcase the power of integrating functional annotation with genetic data to understand the genetic basis of PrCa.

    Funded by: Cancer Research UK: 10119, 10124, 13065, 14136, C12292/A11174, C1281/A12014, C1287/A10118, C1287/A10710, C16913/A6135, C490/A10124, C5047/A10692, C5047/A15007, C5047/A3354, C5047/A7357, C5047/A8384, C522/A8649, C8197/A10123, C8197/A10865, G0500966/75466; Medical Research Council: 75466, G0401527, G0500966, MC_PC_15018; NCI NIH HHS: 1 U19 CA148537-01, 1U19 CA148065, 1U19 CA148112, 1U19 CA148537, CA098758, CA128978, CA54281, CA63464, P30 CA016672, P30 CA042014, P30 CA060553, P30 CA068485, P30 CA68485, P30CA042014, R01 CA054281, R01 CA056678, R01 CA063464, R01 CA072818, R01 CA082664, R01 CA092447, R01 CA092579, R01 CA128813, R01 CA128978, R01 CA188392, R01 CA193910, R01CA056678, R01CA082664, R01CA092579, R01CA128813, R01CA72818, R35 CA197449, R37 CA054281, U01 CA063464, U01 CA098758, U01 CA164973, U01 CA188392, U01 CA194393, U10 CA037429, U19 CA148065, U19 CA148112, U19 CA148537, UG1 CA189974, UM1 CA182883; NIAID NIH HHS: U19 AI111224; NIAMS NIH HHS: R01 AR063759; NIEHS NIH HHS: R01 ES011126; NIGMS NIH HHS: F32 GM106584, R01 GM105857, R01 GM107427; NIMH NIH HHS: R01 MH101244; Wellcome Trust: 076113, 102215

    Nature communications 2016;7;10979

  • Functional implications of disease-specific variants in loci jointly associated with coeliac disease and rheumatoid arthritis.

    Gutierrez-Achury J, Zorro MM, Ricaño-Ponce I, Zhernakova DV, Coeliac Disease Immunochip Consortium, RACI Consortium, Diogo D, Raychaudhuri S, Franke L, Trynka G, Wijmenga C and Zhernakova A

    Department of Genetics, University Medical Centre Groningen, University of Groningen, Groningen, The Netherlands.

    Hundreds of genomic loci have been associated with a significant number of immune-mediated diseases, and a large proportion of these associated loci are shared among traits. Both the molecular mechanisms by which these loci confer disease susceptibility and the extent to which shared loci are implicated in a common pathogenesis are unknown. We therefore sought to dissect the functional components at loci shared between two autoimmune diseases: coeliac disease (CeD) and rheumatoid arthritis (RA). We used a cohort of 12 381 CeD cases and 7827 controls, and another cohort of 13 819 RA cases and 12 897 controls, all genotyped with the Immunochip platform. In the joint analysis, we replicated 19 previously identified loci shared by CeD and RA and discovered five new non-HLA loci shared by CeD and RA. Our fine-mapping results indicate that in nine of 24 shared loci the associated variants are distinct in the two diseases. Using cell-type-specific histone markers, we observed that loci which pointed to the same variants in both diseases were enriched for marks of promoters active in CD14+ and CD34+ immune cells (P < 0.001), while loci pointing to distinct variants in one of the two diseases showed enrichment for marks of more specialized cell types, like CD4+ regulatory T cells in CeD (P < 0.0001) compared with Th17 and CD15+ in RA (P = 0.0029).

    Funded by: Wellcome Trust: WT098051

    Human molecular genetics 2016;25;1;180-90

  • Chad Genetic Diversity Reveals an African History Marked by Multiple Holocene Eurasian Migrations.

    Haber M, Mezzavilla M, Bergström A, Prado-Martinez J, Hallast P, Saif-Ali R, Al-Habori M, Dedoussis G, Zeggini E, Blue-Smith J, Wells RS, Xue Y, Zalloua PA and Tyler-Smith C

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Understanding human genetic diversity in Africa is important for interpreting the evolution of all humans, yet vast regions in Africa, such as Chad, remain genetically poorly investigated. Here, we use genotype data from 480 samples from Chad, the Near East, and southern Europe, as well as whole-genome sequencing from 19 of them, to show that many populations today derive their genomes from ancient African-Eurasian admixtures. We found evidence of early Eurasian backflow to Africa in people speaking the unclassified isolate Laal language in southern Chad and estimate from linkage-disequilibrium decay that this occurred 4,750-7,200 years ago. It brought to Africa a Y chromosome lineage (R1b-V88) whose closest relatives are widespread in present-day Eurasia; we estimate from sequence data that the Chad R1b-V88 Y chromosomes coalesced 5,700-7,300 years ago. This migration could thus have originated among Near Eastern farmers during the African Humid Period. We also found that the previously documented Eurasian backflow into Africa, which occurred ∼3,000 years ago and was thought to be mostly limited to East Africa, had a more westward impact affecting populations in northern Chad, such as the Toubou, who have 20%-30% Eurasian ancestry today. We observed a decline in heterozygosity in admixed Africans and found that the Eurasian admixture can bias inferences on their coalescent history and confound genetic signals from adaptation and archaic introgression.

    Funded by: Wellcome Trust

    American journal of human genetics 2016;99;6;1316-1324

  • Ancient DNA and the rewriting of human history: be sparing with Occam's razor.

    Haber M, Mezzavilla M, Xue Y and Tyler-Smith C

    The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.

    Ancient DNA research is revealing a human history far more complex than that inferred from parsimonious models based on modern DNA. Here, we review some of the key events in the peopling of the world in the light of the findings of work on ancient DNA.

    Funded by: Wellcome Trust: 098051

    Genome biology 2016;17;1

  • Wide distribution and altitude correlation of an archaic high-altitude-adaptive EPAS1 haplotype in the Himalayas.

    Hackinger S, Kraaijenbrink T, Xue Y, Mezzavilla M, Asan, van Driem G, Jobling MA, de Knijff P, Tyler-Smith C and Ayub Q

    The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.

    High-altitude adaptation in Tibetans is influenced by introgression of a 32.7-kb haplotype from the Denisovans, an extinct branch of archaic humans, lying within the endothelial PAS domain protein 1 (EPAS1), and has also been reported in Sherpa. We genotyped 19 variants in this genomic region in 1507 Eurasian individuals, including 1188 from Bhutan and Nepal residing at altitudes between 86 and 4550 m above sea level. Derived alleles for five SNPs characterizing the core Denisovan haplotype (AGGAA) were present at high frequency not only in Tibetans and Sherpa, but also among many populations from the Himalayas, showing a significant correlation with altitude (Spearman's correlation coefficient = 0.75, p value = 3.9 × 10⁻¹¹). Seven East- and South-Asian 1000 Genomes Project individuals shared the Denisovan haplotype extending beyond the 32-kb region, enabling us to refine the haplotype structure and identify a candidate regulatory variant (rs370299814) that might be interacting in an additive manner with the derived G allele of rs150877473, the variant previously associated with high-altitude adaptation in Tibetans. Denisovan-derived alleles were also observed at frequencies of 3-14% in the 1000 Genomes Project African samples. The closest African haplotype is, however, separated from the Asian high-altitude haplotype by 22 mutations whereas only three mutations, including rs150877473, separate the Asians from the Denisovan, consistent with distant shared ancestry for African and Asian haplotypes and Denisovan adaptive introgression.

    Funded by: Wellcome Trust: 087576, 098051

    Human genetics 2016;135;4;393-402

  • A bit of a mouthful.

    Hadfield J and David S

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    This month's Genome Watch explores recent advances in the identification of species-level and strain-level diversity in microbiome studies, and highlights how these have provided insights into the tropism and persistence of Neisseria spp. in the human oral cavity.

    Nature reviews. Microbiology 2016;14;9;548

  • Great ape Y Chromosome and mitochondrial DNA phylogenies reflect subspecies structure and patterns of mating and dispersal.

    Hallast P, Maisano Delser P, Batini C, Zadik D, Rocchi M, Schempp W, Tyler-Smith C and Jobling MA

    Department of Genetics, University of Leicester, Leicester LE1 7RH, United Kingdom; Institute of Molecular and Cell Biology, University of Tartu, Tartu 51010, Estonia.

    The distribution of genetic diversity in great ape species is likely to have been affected by patterns of dispersal and mating. This has previously been investigated by sequencing autosomal and mitochondrial DNA (mtDNA), but large-scale sequence analysis of the male-specific region of the Y Chromosome (MSY) has not yet been undertaken. Here, we use the human MSY reference sequence as a basis for sequence capture and read mapping in 19 great ape males, combining the data with sequences extracted from the published whole genomes of 24 additional males to yield a total sample of 19 chimpanzees, four bonobos, 14 gorillas, and six orangutans, in which interpretable MSY sequence ranges from 2.61 to 3.80 Mb. This analysis reveals thousands of novel MSY variants and defines unbiased phylogenies. We compare these with mtDNA-based trees in the same individuals, estimating time-to-most-recent common ancestor (TMRCA) for key nodes in both cases. The two loci show high topological concordance and are consistent with accepted (sub)species definitions, but time depths differ enormously between loci and (sub)species, likely reflecting different dispersal and mating patterns. Gorillas and chimpanzees/bonobos present generally low and high MSY diversity, respectively, reflecting polygyny versus multimale-multifemale mating. However, particularly marked differences exist among chimpanzee subspecies: The western chimpanzee MSY phylogeny has a TMRCA of only 13.2 (10.8-15.8) thousand years, but that for central chimpanzees exceeds 1 million years. Cross-species comparison within a single MSY phylogeny emphasizes the low human diversity, and reveals species-specific branch length variation that may reflect differences in long-term generation times.

    Genome research 2016;26;4;427-39

  • Powerful decomposition of complex traits in a diploid model.

    Hallin J, Märtens K, Young AI, Zackrisson M, Salinas F, Parts L, Warringer J and Liti G

    Institute for Research on Cancer and Aging, Nice (IRCAN), CNRS UMR7284, INSERM U1081, University of Nice Sophia Antipolis, 06107 Nice, France.

    Explaining trait differences between individuals is a core and challenging aim of life sciences. Here, we introduce a powerful framework for complete decomposition of trait variation into its underlying genetic causes in diploid model organisms. We sequence and systematically pair the recombinant gametes of two intercrossed natural genomes into an array of diploid hybrids with fully assembled and phased genomes, termed Phased Outbred Lines (POLs). We demonstrate the capacity of this approach by partitioning fitness traits of 6,642 Saccharomyces cerevisiae POLs across many environments, achieving near complete trait heritability and precisely estimating additive (73%), dominance (10%), second (7%) and third (1.7%) order epistasis components. We map quantitative trait loci (QTLs) and find nonadditive QTLs to outnumber (3:1) additive loci, dominant contributions to heterosis to outnumber overdominant, and extensive pleiotropy. The POL framework offers the most complete decomposition of diploid traits to date and can be adapted to most model organisms.

    Funded by: Wellcome Trust

    Nature communications 2016;7;13311

  • Exploitation of the Apoptosis-Primed State of MYCN-Amplified Neuroblastoma to Develop a Potent and Specific Targeted Therapy Combination.

    Ham J, Costa C, Sano R, Lochmann TL, Sennott EM, Patel NU, Dastur A, Gomez-Caraballo M, Krytska K, Hata AN, Floros KV, Hughes MT, Jakubik CT, Heisey DA, Ferrell JT, Bristol ML, March RJ, Yates C, Hicks MA, Nakajima W, Gowda M, Windle BE, Dozmorov MG, Garnett MJ, McDermott U, Harada H, Taylor SM, Morgan IM, Benes CH, Engelman JA, Mossé YP and Faber AC

    Philips Institute for Oral Health Research, VCU School of Dentistry and Massey Cancer Center, Virginia Commonwealth University, Perkinson Building, Richmond, VA 23298, USA.

    Fewer than half of children with high-risk neuroblastoma survive. Many of these tumors harbor high-level amplification of MYCN, which correlates with poor disease outcome. Using data from our large drug screen we predicted, and subsequently demonstrated, that MYCN-amplified neuroblastomas are sensitive to the BCL-2 inhibitor ABT-199. This sensitivity occurs in part through low anti-apoptotic BCL-xL expression, high pro-apoptotic NOXA expression, and paradoxical, MYCN-driven upregulation of NOXA. Screening for enhancers of ABT-199 sensitivity in MYCN-amplified neuroblastomas, we demonstrate that the Aurora Kinase A inhibitor MLN8237 combines with ABT-199 to induce widespread apoptosis. In diverse models of MYCN-amplified neuroblastoma, including a patient-derived xenograft model, this combination uniformly induced tumor shrinkage, and in multiple instances led to complete tumor regression.

    Cancer cell 2016;29;2;159-72

  • Association of breast cancer risk in BRCA1 and BRCA2 mutation carriers with genetic variants showing differential allelic expression: identification of a modifier of breast cancer risk at locus 11q22.3.

    Hamdi Y, Soucy P, Kuchenbaeker KB, Pastinen T, Droit A, Lemaçon A, Adlard J, Aittomäki K, Andrulis IL, Arason A, Arnold N, Arun BK, Azzollini J, Bane A, Barjhoux L, Barrowdale D, Benitez J, Berthet P, Blok MJ, Bobolis K, Bonadona V, Bonanni B, Bradbury AR, Brewer C, Buecher B, Buys SS, Caligo MA, Chiquette J, Chung WK, Claes KB, Daly MB, Damiola F, Davidson R, De la Hoya M, De Leeneer K, Diez O, Ding YC, Dolcetti R, Domchek SM, Dorfling CM, Eccles D, Eeles R, Einbeigi Z, Ejlertsen B, EMBRACE, Engel C, Gareth Evans D, Feliubadalo L, Foretova L, Fostira F, Foulkes WD, Fountzilas G, Friedman E, Frost D, Ganschow P, Ganz PA, Garber J, Gayther SA, GEMO Study Collaborators, Gerdes AM, Glendon G, Godwin AK, Goldgar DE, Greene MH, Gronwald J, Hahnen E, Hamann U, Hansen TV, Hart S, Hays JL, HEBON, Hogervorst FB, Hulick PJ, Imyanitov EN, Isaacs C, Izatt L, Jakubowska A, James P, Janavicius R, Jensen UB, John EM, Joseph V, Just W, Kaczmarek K, Karlan BY, KConFab Investigators, Kets CM, Kirk J, Kriege M, Laitman Y, Laurent M, Lazaro C, Leslie G, Lester J, Lesueur F, Liljegren A, Loman N, Loud JT, Manoukian S, Mariani M, Mazoyer S, McGuffog L, Meijers-Heijboer HE, Meindl A, Miller A, Montagna M, Mulligan AM, Nathanson KL, Neuhausen SL, Nevanlinna H, Nussbaum RL, Olah E, Olopade OI, Ong KR, Oosterwijk JC, Osorio A, Papi L, Park SK, Pedersen IS, Peissel B, Segura PP, Peterlongo P, Phelan CM, Radice P, Rantala J, Rappaport-Fuerhauser C, Rennert G, Richardson A, Robson M, Rodriguez GC, Rookus MA, Schmutzler RK, Sevenet N, Shah PD, Singer CF, Slavin TP, Snape K, Sokolowska J, Sønderstrup IM, Southey M, Spurdle AB, Stadler Z, Stoppa-Lyonnet D, Sukiennicki G, Sutter C, Tan Y, Tea MK, Teixeira MR, Teulé A, Teo SH, Terry MB, Thomassen M, Tihomirova L, Tischkowitz M, Tognazzo S, Toland AE, Tung N, van den Ouweland AM, van der Luijt RB, van Engelen K, van Rensburg EJ, Varon-Mateeva R, Wappenschmidt B, Wijnen JT, Rebbeck T, Chenevix-Trench G, Offit K, Couch FJ, Nord S, Easton DF, Antoniou AC and Simard J

    Genomics Center, Centre Hospitalier Universitaire de Québec Research Center and Laval University, 2705 Laurier Boulevard, Quebec, QC, G1V 4G2, Canada.

    Purpose: Cis-acting regulatory SNPs resulting in differential allelic expression (DAE) may, in part, explain the underlying phenotypic variation associated with many complex diseases. To investigate whether common variants associated with DAE were involved in breast cancer susceptibility among BRCA1 and BRCA2 mutation carriers, a list of 175 genes was developed based on their involvement in cancer-related pathways.

    Methods: Using data from a genome-wide map of SNPs associated with allelic expression, we assessed the association of ~320 SNPs located in the vicinity of these genes with breast and ovarian cancer risks in 15,252 BRCA1 and 8211 BRCA2 mutation carriers ascertained from 54 studies participating in the Consortium of Investigators of Modifiers of BRCA1/2.

    Results: We identified a region on 11q22.3 that is significantly associated with breast cancer risk in BRCA1 mutation carriers (most significant SNP rs228595, p = 7 × 10⁻⁶). This association was absent in BRCA2 carriers (p = 0.57). The 11q22.3 region notably encompasses genes such as ACAT1, NPAT, and ATM. Expression quantitative trait loci associations were observed in both normal breast and tumors across this region, namely for ACAT1, ATM, and other genes. In silico analysis revealed some overlap between top risk-associated SNPs and relevant biological features in mammary cell data, which suggests potential functional significance.

    Conclusion: We identified 11q22.3 as a new modifier locus in BRCA1 carriers. Replication in larger studies using estrogen receptor (ER)-negative or triple-negative (i.e., ER-, progesterone receptor-, and HER2-negative) cases could therefore be helpful to confirm the association of this locus with breast cancer risk.

    Breast cancer research and treatment 2016

  • A small Acinetobacter plasmid carrying the tet39 tetracycline resistance determinant.

    Hamidian M, Holt KE, Pickard D and Hall RM

    School of Molecular Bioscience, The University of Sydney, NSW 2006, Australia

    The Journal of antimicrobial chemotherapy 2016;71;1;269-71

  • Rubinstein-Taybi syndrome type 2: report of nine new cases that extend the phenotypic and genotypic spectrum.

    Hamilton MJ, Newbury-Ecob R, Holder-Espinasse M, Yau S, Lillis S, Hurst JA, Clement E, Reardon W, Joss S, Hobson E, Blyth M, Al-Shehhi M, Lynch SA, Suri M and DDD Study

    Department of Clinical Genetics, Nottingham City Hospital, Nottingham; Department of Clinical Genetics, University Hospitals Bristol, Bristol; Clinical Genetics Service; Viapath Analytics LLP, Guy's and St Thomas' Hospital; Clinical Genetics Unit, Great Ormond Street Hospital for Children, London; West of Scotland Clinical Genetics Service, Queen Elizabeth University Hospital, Glasgow; Yorkshire Regional Genetics Service, Chapel Allerton Hospital, Leeds; Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK; Department of Clinical Genetics, Our Lady's Hospital for Children; ACoRD, University College Dublin, Dublin, Ireland.

    Rubinstein-Taybi syndrome (RTS) is an autosomal dominant neurodevelopmental disorder characterized by growth deficiency, broad thumbs and great toes, intellectual disability and characteristic craniofacial appearance. Mutations in CREBBP account for around 55% of cases, with a further 8% attributed to the paralogous gene EP300. Comparatively few reports exist describing the phenotype of Rubinstein-Taybi because of EP300 mutations. Clinical and genetic data were obtained from nine patients from the UK and Ireland with pathogenic EP300 mutations, identified either by targeted testing or by exome sequencing. All patients had mild or moderate intellectual impairment. Behavioural or social difficulties were noted in eight patients, including three with autistic spectrum disorders. Typical dysmorphic features of Rubinstein-Taybi were only variably present. Additional observations include maternal pre-eclampsia (2/9), syndactyly (3/9), feeding or swallowing issues (3/9), delayed bone age (2/9) and scoliosis (2/9). Six patients had truncating mutations in EP300, with pathogenic missense mutations identified in the remaining three. The findings support previous observations that microcephaly, maternal pre-eclampsia, mild growth restriction and a mild to moderate intellectual disability are key pointers to the diagnosis of EP300-related RTS. Variability in the presence of typical facial features of Rubinstein-Taybi further highlights clinical heterogeneity, particularly among patients identified by exome sequencing. Features that overlap with Floating-Harbor syndrome, including craniofacial dysmorphism and delayed osseous maturation, were observed in three patients. Previous reports have only described mutations predicted to cause haploinsufficiency of EP300, whereas this cohort includes the first described pathogenic missense mutations in EP300.

    Clinical dysmorphology 2016;25;4;135-45

  • Public health interventions to protect against falsified medicines: a systematic review of international, national and local policies.

    Hamilton WL, Doyle C, Halliwell-Ewen M and Lambert G

    University of Cambridge School of Clinical Medicine, Addenbrooke's Hospital, Hills Road, Cambridge CB2 0SP, UK

    Background: Falsified medicines are deliberately fraudulent drugs that pose a direct risk to patient health and undermine healthcare systems, causing global morbidity and mortality.

    Objective: To produce an overview of anti-falsifying public health interventions deployed at international, national and local scales in low and middle income countries (LMIC).

    Data sources: We conducted a systematic search of the PubMed, Web of Science, Embase and Cochrane Central Register of Controlled Trials databases for healthcare or pharmaceutical policies relevant to reducing the burden of falsified medicines in LMIC.

    Results: Our initial search identified 660 unique studies, of which 203 met title/abstract inclusion criteria and were categorised according to their primary focus: international; national; local pharmacy; internet pharmacy; drug analysis tools. Eighty-four were included in the qualitative synthesis, along with 108 articles and website links retrieved through secondary searches.

    Discussion: On the international stage, we discuss the need for accessible pharmacovigilance (PV) global reporting systems, international leadership and funding incorporating multiple stakeholders (healthcare, pharmaceutical, law enforcement) and multilateral trade agreements that emphasise public health. On the national level, we explore the importance of establishing adequate medicine regulatory authorities and PV capacity, with drug screening along the supply chain. This requires interdepartmental coordination, drug certification and criminal justice legislation and enforcement that recognise the severity of medicine falsification. Local healthcare professionals can receive training on medicine quality assessments, drug registration and pharmacological testing equipment. Finally, we discuss novel technologies for drug analysis which allow rapid identification of fake medicines in low-resource settings. Innovative point-of-purchase systems like mobile phone verification allow consumers to check the authenticity of their medicines.

    Conclusions: Combining anti-falsifying strategies targeting different levels of the pharmaceutical supply chain provides multiple barriers of protection from falsified medicines. This requires the political will to drive policy implementation; otherwise, people around the world remain at risk.

    Health policy and planning 2016;31;10;1448-1466

  • Divergent evolution of vitamin B9 binding underlies Juno-mediated adhesion of mammalian gametes.

    Han L, Nishimura K, Sadat Al Hosseini H, Bianchi E, Wright GJ and Jovine L

    Department of Biosciences and Nutrition & Center for Innovative Medicine, Karolinska Institutet, Huddinge, SE-141 83, Sweden.

    The interaction between egg and sperm is the first necessary step of fertilization in all sexually reproducing organisms. A decade-long search for a protein pair mediating this event in mammals culminated in the identification of the glycosylphosphatidylinositol (GPI)-anchored glycoprotein Juno as the egg plasma membrane receptor of sperm Izumo1 [1,2]. The Juno-Izumo1 interaction was shown to be essential for fertilization since mice lacking either gene exhibit sex-specific sterility, making these proteins promising non-hormonal contraceptive targets [1,3]. No structural information is available on how gamete membranes interact at fertilization, and it is unclear how Juno - which was previously named folate receptor (FR) 4, based on sequence similarity considerations - triggers membrane adhesion by binding Izumo1. Here, we report the crystal structure of Juno and find that the overall fold is similar to that of FRα and FRβ but with significant flexibility within the area that corresponds to the rigid ligand-binding site of these bona fide folate receptors. This explains both the inability of Juno to bind vitamin B9/folic acid [1], and why mutations within the flexible region can either abolish or change the species specificity of this interaction. Furthermore, structural similarity between Juno and the cholesterol-binding Niemann-Pick disease type C1 protein (NPC1) suggests how the modified binding surface of Juno may recognize the helical structure of the amino-terminal domain of Izumo1. As Juno appears to be a mammalian innovation, our study indicates that a key evolutionary event in mammalian reproduction originated from the neofunctionalization of the vitamin B9-binding pocket of an ancestral folate receptor molecule.

    Funded by: European Research Council: 260759; Medical Research Council: MR/M012468/1; Wellcome Trust: 098051

    Current biology : CB 2016;26;3;R100-1

  • Fast, Accurate and Automatic Ancient Nucleosome and Methylation Maps with epiPALEOMIX.

    Hanghøj K, Seguin-Orlando A, Schubert M, Madsen T, Pedersen JS, Willerslev E and Orlando L

    Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark.

    The first epigenomes from archaic hominins (AH) and ancient anatomically modern humans (AMH) have recently been characterized, based, however, on a limited number of samples. The extent to which ancient genome-wide epigenetic landscapes can be reconstructed thus remains contentious. Here, we present epiPALEOMIX, an open-source and user-friendly pipeline that exploits post-mortem DNA degradation patterns to reconstruct ancient methylomes and nucleosome maps from shotgun and/or capture-enrichment data. Applying epiPALEOMIX to the sequence data underlying 35 ancient genomes including AMH, AH, equids and aurochs, we investigate the temporal, geographical and preservation range of ancient epigenetic signatures. We first assess the quality of inferred ancient epigenetic signatures within well-characterized genomic regions. We find that tissue-specific methylation signatures can be obtained across a wider range of DNA preparation types than previously thought, including when no particular experimental procedures have been used to remove deaminated cytosines prior to sequencing. We identify a large subset of samples for which DNA associated with nucleosomes is protected from post-mortem degradation, and nucleosome positioning patterns can be reconstructed. Finally, we describe parameters and conditions such as DNA damage levels and sequencing depth that limit the preservation of epigenetic signatures in ancient samples. When such conditions are met, we propose that epigenetic profiles of CTCF binding regions can be used to help data authentication. Our work, including epiPALEOMIX, opens the way for further investigations of ancient epigenomes through time, especially those aimed at tracking possible epigenetic changes during major evolutionary, environmental, socioeconomic, and cultural shifts.

    Molecular biology and evolution 2016;33;12;3284-3298

  • Genomes

    Harb OS, Boehme U, Crouch K, Ifeonu OO, Roos DS, Silva JC, Silva-Franco F, Svärd S, Tretina K and Weedall G

    In the last decade, the rise of affordable high-throughput sequencing technologies has led to rapid advances across the biological sciences. At the time of writing, annotated reference genomes are available within most clades of eukaryotic pathogens and, including un-annotated sequences, over 550 genomes are available in total. This has greatly facilitated studies in many areas of parasitology. In addition, the volume of functional genomics data, including analysis of differential transcription and DNA-protein interactions, has increased exponentially. With this unprecedented increase in publicly available data, tools to search and compare datasets are also becoming ever more important. A number of database resources are available, and access to these has become fundamental for a majority of research groups. This chapter discusses the current state of genomics research for a number of eukaryotic parasites, covering the genome and functional genomics resources available at the time of writing and highlighting functionally important or unique aspects of the genome for each group. Publicly accessible database resources pertaining to eukaryotic parasites are also discussed.

    Molecular Parasitology: Protozoan Parasites and their Molecules 2016;Springer

  • Germline TERT promoter mutations are rare in familial melanoma.

    Harland M, Petljak M, Robles-Espinoza CD, Ding Z, Gruis NA, van Doorn R, Pooley KA, Dunning AM, Aoude LG, Wadt KA, Gerdes AM, Brown KM, Hayward NK, Newton-Bishop JA, Adams DJ and Bishop DT

    Section of Epidemiology and Biostatistics, Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, LS9 7TF, UK.

    Germline CDKN2A mutations occur in 40 % of 3-or-more case melanoma families while mutations of CDK4, BAP1, and genes involved in telomere function (ACD, TERF2IP, POT1), have also been implicated in melanomagenesis. Mutation of the promoter of the telomerase reverse transcriptase (TERT) gene (c.-57 T>G variant) has been reported in one family. We tested for the TERT promoter variant in 675 multicase families wild-type for the known high penetrance familial melanoma genes, 1863 UK population-based melanoma cases and 529 controls. Germline lymphocyte telomere length was estimated in carriers. The c.-57 T>G TERT promoter variant was identified in one 7-case family with multiple primaries and early age of onset (earliest, 15 years) but not among population cases or controls. One family member had multiple primary melanomas, basal cell carcinomas and a bladder tumour. The blood leukocyte telomere length of a carrier was similar to wild-type cases. We provide evidence confirming that a rare promoter variant of TERT (c.-57 T>G) is associated with high penetrance, early onset melanoma and potentially other cancers, and explains <1 % of UK melanoma multicase families. The identification of POT1 and TERT germline mutations highlights the importance of telomere integrity in melanoma biology.

    Funded by: Cancer Research UK: 13031, C588/A19167, C8197/A16565, C8216/A6129; Intramural NIH HHS; NCI NIH HHS: CA83115, R01 CA083115

    Familial cancer 2016;15;1;139-44

  • TRAIP promotes DNA damage response during genome replication and is mutated in primordial dwarfism.

    Harley ME, Murina O, Leitch A, Higgs MR, Bicknell LS, Yigit G, Blackford AN, Zlatanou A, Mackenzie KJ, Reddy K, Halachev M, McGlasson S, Reijns MAM, Fluteau A, Martin CA, Sabbioneda S, Elcioglu NH, Altmüller J, Thiele H, Greenhalgh L, Chessa L, Maghnie M, Salim M, Bober MB, Nürnberg P, Jackson SP, Hurles ME, Wollnik B, Stewart GS and Jackson AP

    MRC Human Genetics Unit, IGMM, University of Edinburgh, Edinburgh, EH4 2XU, UK.

    DNA lesions encountered by replicative polymerases threaten genome stability and cell cycle progression. Here we report the identification of mutations in TRAIP, encoding an E3 RING ubiquitin ligase, in patients with microcephalic primordial dwarfism. We establish that TRAIP relocalizes to sites of DNA damage, where it is required for optimal phosphorylation of H2AX and RPA2 during S-phase in response to ultraviolet (UV) irradiation, as well as fork progression through UV-induced DNA lesions. TRAIP is necessary for efficient cell cycle progression and mutations in TRAIP therefore limit cellular proliferation, providing a potential mechanism for microcephaly and dwarfism phenotypes. Human genetics thus identifies TRAIP as a component of the DNA damage response to replication-blocking DNA lesions.

    Funded by: Cancer Research UK: 11224, 13030, C17183/A13030, C6/A11224; European Research Council: 281847; Medical Research Council: MC_PC_U127580972

    Nature genetics 2016;48;1;36-43

  • The Human Gut Microbiota.

    Harmsen HJ and de Goffau MC

    Department of Medical Microbiology, University of Groningen, University Medical Center Groningen, 30001, 9700, Groningen, The Netherlands.

    The microbiota in our gut performs many different essential functions that help us to stay healthy. These functions include vitamin production, regulation of lipid metabolism and short chain fatty acid production as fuel for epithelial cells and regulation of gene expression. There is a very numerous and diverse microbial community present in the gut, especially in the colon, with reported numbers of species that vary between 400 and 1500, for some of which we do not yet even have culture representatives. A healthy gut microbiota is important for maintaining a healthy host. An aberrant microbiota can cause diseases of different nature and at different ages ranging from allergies at early age to IBD in young adults. This shows that our gut microbiota needs to be treated well to stay healthy. In this chapter we describe what we consider a healthy microbiota and discuss what the role of the microbiota is in various diseases. Research into these described dysbiosis conditions could lead to new strategies for treatment and/or management of our microbiota to improve health.

    Advances in experimental medicine and biology 2016;902;95-108

  • PBP2a substitutions linked to ceftaroline resistance in MRSA isolates from the UK.

    Harrison EM, Ba X, Blane B, Ellington MJ, Loeffler A, Hill RL, Holmes MA and Peacock SJ

    Department of Medicine, University of Cambridge, Cambridge, UK

    Funded by: Medical Research Council: G1000803, G1001787, G1001787/1; Wellcome Trust

    The Journal of antimicrobial chemotherapy 2016;71;1;268-9

  • Validation of self-administered nasal swabs and postage for the isolation of Staphylococcus aureus.

    Harrison EM, Gleadall NS, Ba X, Danesh J, Peacock SJ and Holmes M

    Department of Veterinary Medicine, University of Cambridge, Cambridge, UK.

    Staphylococcus aureus carriers are at higher risk of S. aureus infection and are a reservoir for transmission to others. Detection of nasal S. aureus carriage is important for both targeted decolonization and epidemiological studies. Self-administered nasal swabbing has been reported previously, but the effects of posting swabs prior to culture on S. aureus yield have not been investigated. A longitudinal cohort study was performed in which healthy volunteers were recruited, trained in the swabbing procedure and asked to take weekly nasal swabs for 6 weeks (median: 3 weeks, range 1-6 weeks). Two swabs were taken at each sampling episode and randomly assigned for immediate processing on arrival to the laboratory (Swab A) or second class postage prior to processing (Swab B). S. aureus was isolated using standard methods. A total of 95 participants were recruited, who took 944 swabs (472 pairs) over a median of 5 weeks. Of these, 459 swabs were positive for S. aureus. We found no significant difference (P=0.25) between 472 pairs of nasal self-swabs processed immediately or following standard postage from 95 study participants (51.4 % vs. 48.6 %, respectively). We also provide further evidence that persistent carriers can be detected by two weekly swabs with high degrees of sensitivity [92.3 % (95 % CI 74.8-98.8 %)] and specificity [95.6 % (95 % CI 84.8-99.3 %)] compared with a gold standard of five weekly swabs. Self-swabbing and postage of nasal swabs prior to processing has no effect on yield of S. aureus, and could facilitate large community-based carriage studies.

    Funded by: Medical Research Council: G0800270, G1000803, G1001787

    Journal of medical microbiology 2016;65;12;1434-1437

  • Transmission of methicillin-resistant Staphylococcus aureus in long-term care facilities and their related healthcare networks.

    Harrison EM, Ludden C, Brodrick HJ, Blane B, Brennan G, Morris D, Coll F, Reuter S, Brown NM, Holmes MA, O'Connell B, Parkhill J, Török ME, Cormican M and Peacock SJ

    Department of Medicine, University of Cambridge, Addenbrooke's Hospital, Box 157, Hills Road, Cambridge, CB2 0QQ, UK.

    Background: Long-term care facilities (LTCF) are potential reservoirs for methicillin-resistant Staphylococcus aureus (MRSA), control of which may reduce MRSA transmission and infection elsewhere in the healthcare system. Whole-genome sequencing (WGS) has been used successfully to understand MRSA epidemiology and transmission in hospitals and has the potential to identify transmission between these and LTCF.

    Methods: Two prospective observational studies of MRSA carriage were conducted in LTCF in England and Ireland. MRSA isolates were whole-genome sequenced and analyzed using established methods. Genomic data were available for MRSA isolated in the local healthcare systems (isolates submitted by hospitals and general practitioners).

    Results: We sequenced a total of 181 MRSA isolates from the two study sites. The majority of MRSA were multilocus sequence type (ST)22. WGS identified one likely transmission event between residents in the English LTCF and three putative transmission events in the Irish LTCF. WGS also identified closely related isolates present in colonized Irish residents and their immediate environment. Based on phylogenetic reconstruction, closely related MRSA clades were identified between the LTCF and their healthcare referral network, together with putative MRSA acquisition by LTCF residents during hospital admission.

    Conclusions: These data confirm that MRSA is transmitted between residents of LTCF and is both acquired and transmitted to others in referral hospitals and beyond. Our data present compelling evidence for the importance of environmental contamination in MRSA transmission, reinforcing the importance of environmental cleaning. The use of WGS in this study highlights the need to consider infection control in hospitals and community healthcare facilities as a continuum.

    Funded by: Medical Research Council: G1000803, G1001787; Wellcome Trust

    Genome medicine 2016;8;1;102

  • Differential Killing of Salmonella enterica Serovar Typhi by Antibodies Targeting Vi and Lipopolysaccharide O:9 Antigen.

    Hart PJ, O'Shaughnessy CM, Siggins MK, Bobat S, Kingsley RA, Goulding DA, Crump JA, Reyburn H, Micoli F, Dougan G, Cunningham AF and MacLennan CA

    School of Immunity and Infection, College of Medicine and Dental Sciences, University of Birmingham, Birmingham, United Kingdom.

    Salmonella enterica serovar Typhi expresses a capsule of Vi polysaccharide, while most Salmonella serovars, including S. Enteritidis and S. Typhimurium, do not. Both S. Typhi and S. Enteritidis express the lipopolysaccharide O:9 antigen, yet there is little evidence of cross-protection from anti-O:9 antibodies. Vaccines based on Vi polysaccharide have efficacy against typhoid fever, indicating that antibodies against Vi confer protection. Here we investigate the role of Vi capsule and antibodies against Vi and O:9 in antibody-dependent complement- and phagocyte-mediated killing of Salmonella. Using isogenic Vi-expressing and non-Vi-expressing derivatives of S. Typhi and S. Typhimurium, we show that S. Typhi is inherently more sensitive to serum and blood than S. Typhimurium. Vi expression confers increased resistance to both complement- and phagocyte-mediated modalities of antibody-dependent killing in human blood. The Vi capsule is associated with reduced C3 and C5b-9 deposition, and decreased overall antibody binding to S. Typhi. However, purified human anti-Vi antibodies in the presence of complement are able to kill Vi-expressing Salmonella, while killing by anti-O:9 antibodies is inversely related to Vi expression. Human serum depleted of antibodies to antigens other than Vi retains the ability to kill Vi-expressing bacteria. Our findings support a protective role for Vi capsule in preventing complement and phagocyte killing of Salmonella that can be overcome by specific anti-Vi antibodies, but only to a limited extent by anti-O:9 antibodies.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/F022778/1; Medical Research Council: G0701275, G9818340

    PloS one 2016;11;1;e0145945

  • Fluorescence-Based Flow Sorting in Parallel with Transposon Insertion Site Sequencing Identifies Multidrug Efflux Systems in Acinetobacter baumannii.

    Hassan KA, Cain AK, Huang T, Liu Q, Elbourne LD, Boinett CJ, Brzoska AJ, Li L, Ostrowski M, Nhu NT, Nhu Tdo H, Baker S, Parkhill J and Paulsen IT

    Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, NSW, Australia

    Unlabelled: Multidrug efflux pumps provide clinically significant levels of drug resistance in a number of Gram-negative hospital-acquired pathogens. These pathogens frequently carry dozens of genes encoding putative multidrug efflux pumps. However, it can be difficult to determine how many of these pumps actually mediate antimicrobial efflux, and it can be even more challenging to identify the regulatory proteins that control expression of these pumps. In this study, we developed an innovative high-throughput screening method, combining transposon insertion sequencing and cell sorting methods (TraDISort), to identify the genes encoding major multidrug efflux pumps, regulators, and other factors that may affect the permeation of antimicrobials, using the nosocomial pathogen Acinetobacter baumannii. A dense library of more than 100,000 unique transposon insertion mutants was treated with ethidium bromide, a common substrate of multidrug efflux pumps that is differentially fluorescent inside and outside the bacterial cytoplasm. Populations of cells displaying aberrant accumulations of ethidium were physically enriched using fluorescence-activated cell sorting, and the genomic locations of transposon insertions within these strains were determined using transposon-directed insertion sequencing. The relative abundance of mutants in the input pool compared to the selected mutant pools indicated that the AdeABC, AdeIJK, and AmvA efflux pumps are the major ethidium efflux systems in A. baumannii. Furthermore, the method identified a new transcriptional regulator that controls expression of amvA. In addition to the identification of efflux pumps and their regulators, TraDISort identified genes that are likely to control cell division, cell morphology, or aggregation in A. baumannii.

    Importance: Transposon-directed insertion sequencing (TraDIS) and related technologies have emerged as powerful methods to identify genes required for bacterial survival or competitive fitness under various selective conditions. We applied fluorescence-activated cell sorting (FACS) to physically enrich for phenotypes of interest within a mutant population prior to TraDIS. To our knowledge, this is the first time that a physical selection method has been applied in parallel with TraDIS rather than a fitness-induced selection. The results demonstrate the feasibility of this combined approach to generate significant results and highlight the major multidrug efflux pumps encoded in an important pathogen. This FACS-based approach, TraDISort, could have a range of future applications, including the characterization of efflux pump inhibitors, the identification of regulatory factors controlling gene or protein expression using fluorescent reporters, and the identification of genes involved in cell replication, morphology, and aggregation.

    Funded by: Medical Research Council: G1100100; Wellcome Trust: 098051, 100087/Z/12/Z

    mBio 2016;7;5

  • Genome-wide time-to-event analysis on smoking progression stages in a family-based study.

    He L, Pitkäniemi J, Heikkilä K, Chou YL, Madden PA, Korhonen T, Sarin AP, Ripatti S, Kaprio J and Loukola A

    Department of Public Health, University of Helsinki, Helsinki, Finland.

    Background: Various pivotal stages in smoking behavior can be identified, including initiation, conversion from experimenting to established use, development of tolerance, and cessation. Previous studies have shown high heritability for age of smoking initiation and cessation; however, time-to-event genome-wide association studies aiming to identify underpinning genes that accelerate or delay these transitions are missing to date.

    Methods: We investigated which single nucleotide polymorphisms (SNPs) across the whole genome contribute to the hazard ratio of transition between different stages of smoking behavior by performing time-to-event analyses within a large Finnish twin family cohort (N = 1962), and further conducted mediation analyses of plausible intermediate traits for significant SNPs.

    Results: Genome-wide significant signals were detected for three of the four transitions: (1) for smoking cessation on 10p14 (P = 4.47e-08 for rs72779075 flanked by RP11-575N15 and GATA3), (2) for tolerance on 11p13 (P = 1.29e-08 for rs11031684 in RP1-65P5.1), mediated by smoking quantity, and on 9q34.12 (P = 3.81e-08 for rs2304808 in FUBP3), independent of smoking quantity, and (3) for smoking initiation on 19q13.33 (P = 3.37e-08 for rs73050610 flanked by TRPM4 and SLC6A16) in analysis adjusted for first time sensations. Although our top SNPs did not replicate, another SNP in the TRPM4-SLC6A16 gene region showed statistically significant association after region-based multiple testing correction in an independent Australian twin family sample.

    Conclusion: Our results suggest that the functional effect of the TRPM4-SLC6A16 gene region deserves further investigation, and that complex neurotransmitter networks including dopamine and glutamate may play a critical role in smoking initiation. Moreover, comparison of these results implies that genetic contributions to the complex smoking behavioral phenotypes vary among the transitions.

    Funded by: NIAAA NIH HHS: P50 AA011998, R01 AA013320, R01 AA013321, R01 AA013326; NIDA NIH HHS: R01 DA012854, R56 DA012854; Wellcome Trust

    Brain and behavior 2016;6;5;e00462

  • Linear mixed model for heritability estimation that explicitly addresses environmental variation.

    Heckerman D, Gurdasani D, Kadie C, Pomilla C, Carstensen T, Martin H, Ekoru K, Nsubuga RN, Ssenyomo G, Kamali A, Kaleebu P, Widmer C and Sandhu MS

    Microsoft Research, Los Angeles, CA 90024;

    The linear mixed model (LMM) is now routinely used to estimate heritability. Unfortunately, as we demonstrate, LMM estimates of heritability can be inflated when using a standard model. To help reduce this inflation, we used a more general LMM with two random effects: one based on genomic variants and one based on easily measured spatial location as a proxy for environmental effects. We investigated this approach with simulated data and with data from a Uganda cohort of 4,778 individuals for 34 phenotypes including anthropometric indices, blood factors, glycemic control, blood pressure, lipid tests, and liver function tests. For the genomic random effect, we used identity-by-descent estimates from accurately phased genome-wide data. For the environmental random effect, we constructed a covariance matrix based on a Gaussian radial basis function. Across the simulated and Ugandan data, narrow-sense heritability estimates were lower using the more general model. Thus, our approach addresses, in part, the issue of "missing heritability" in the sense that much of the heritability previously thought to be missing was fictional. Software is available at

    Funded by: Medical Research Council: G0801566, G0901213, MR/K013491/1; Wellcome Trust

    Proceedings of the National Academy of Sciences of the United States of America 2016;113;27;7377-82
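
    The abstract above describes an environmental random effect whose covariance matrix is built from spatial location with a Gaussian radial basis function. The following is a minimal sketch of that idea, not the authors' software: the kernel parameterisation, the `length_scale` parameter, the toy coordinates, the kinship matrix `G` and the variance components are all assumptions made for illustration.

```python
import math


def rbf_covariance(coords, length_scale):
    """Gaussian RBF matrix: K[i][j] = exp(-||x_i - x_j||^2 / (2 * length_scale^2)).

    `coords` is a list of equal-length coordinate tuples (e.g. x, y locations).
    The exact parameterisation used in the paper is not given in the abstract;
    this is one common form.
    """
    n = len(coords)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            K[i][j] = math.exp(-sq_dist / (2.0 * length_scale ** 2))
    return K


def combined_covariance(G, K, var_g, var_e, var_r):
    """Phenotype covariance of a two-random-effect LMM:
    V = var_g * G + var_e * K + var_r * I (residual variance on the diagonal)."""
    n = len(G)
    return [
        [var_g * G[i][j] + var_e * K[i][j] + (var_r if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy inputs: three individuals with made-up 2-D locations
# and an identity matrix standing in for the genomic relationship matrix.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
G = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
K = rbf_covariance(coords, length_scale=1.0)
V = combined_covariance(G, K, var_g=0.5, var_e=0.3, var_r=0.2)
```

    In such a model, individuals who live close together share more of the environmental variance, so similarity due to shared location is less likely to be attributed to the genomic component.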

  • Conserved Features in the Structure, Mechanism, and Biogenesis of the Inverse Autotransporter Protein Family.

    Heinz E, Stubenrauch CJ, Grinter R, Croft NP, Purcell AW, Strugnell RA, Dougan G and Lithgow T

    Department of Microbiology, Infection & Immunity Program, Biomedicine Discovery Institute, Monash University, Clayton, Australia; Wellcome Trust Sanger Institute, Hinxton, United Kingdom.

    The bacterial cell surface proteins intimin and invasin are virulence factors that share a common domain structure and bind selectively to host cell receptors in the course of bacterial pathogenesis. The β-barrel domains of intimin and invasin show significant sequence and structural similarities. Conversely, a variety of proteins with sometimes limited sequence similarity have also been annotated as "intimin-like" and "invasin" in genome datasets, while other recent work on apparently unrelated virulence-associated proteins ultimately revealed similarities to intimin and invasin. Here we characterize the sequence and structural relationships across this complex protein family. Surprisingly, intimins and invasins represent a very small minority of the sequence diversity in what has been previously the "intimin/invasin protein family". Analysis of the assembly pathway for expression of the classic intimin, EaeA, and a characteristic example of the most prevalent members of the group, FdeC, revealed a dependence on the translocation and assembly module as a common feature for both these proteins. While the majority of the sequences in the grouping are most similar to FdeC, a further and widespread group is two-partner secretion systems that use the β-barrel domain as the delivery device for secretion of a variety of virulence factors. This comprehensive analysis supports the adoption of the "inverse autotransporter protein family" as the most accurate nomenclature for the family and, in turn, has important consequences for our overall understanding of the Type V secretion systems of bacterial pathogens.

    Genome biology and evolution 2016;8;6;1690-705

  • Ensembl comparative genomics resources.

    Herrero J, Muffato M, Beal K, Fitzgerald S, Gordon L, Pignatelli M, Vilella AJ, Searle SM, Amode R, Brent S, Spooner W, Kulesha E, Yates A and Flicek P

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, Bill Lyons Informatics Centre, UCL Cancer Institute, University College London, London WC1E 6DD,

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL:

    Database : the journal of biological databases and curation 2016;2016

  • Burden of Diabetes and First Evidence for the Utility of HbA1c for Diagnosis and Detection of Diabetes in Urban Black South Africans: The Durban Diabetes Study.

    Hird TR, Pirie FJ, Esterhuizen TM, O'Leary B, McCarthy MI, Young EH, Sandhu MS and Motala AA

    Department of Medicine, University of Cambridge, Cambridge, United Kingdom.

    Objective: Glycated haemoglobin (HbA1c) is recommended as an additional tool to glucose-based measures (fasting plasma glucose [FPG] and 2-hour plasma glucose [2PG] during oral glucose tolerance test [OGTT]) for the diagnosis of diabetes; however, its use in sub-Saharan African populations is not established. We assessed prevalence estimates and the diagnosis and detection of diabetes based on OGTT, FPG, and HbA1c in an urban black South African population.

    Research design and methods: We conducted a population-based cross-sectional survey using multistage cluster sampling of adults aged ≥18 years in Durban (eThekwini municipality), KwaZulu-Natal. All participants had a 75-g OGTT and HbA1c measurements. Receiver operating characteristic (ROC) analysis was used to assess the overall diagnostic accuracy of HbA1c, using OGTT as the reference, and to determine optimal HbA1c cut-offs.

    Results: Among 1190 participants (851 women, 92.6% response rate), the age-standardised prevalence of diabetes was 12.9% based on OGTT, 11.9% based on FPG, and 13.1% based on HbA1c. In participants without a previous history of diabetes (n = 1077), using OGTT as the reference, an HbA1c ≥48 mmol/mol (6.5%) detected diabetes with 70.3% sensitivity (95%CI 52.7-87.8) and 98.7% specificity (95%CI 97.9-99.4) (AUC 0.94 [95%CI 0.89-1.00]). Additional analyses suggested the optimal HbA1c cut-off for detection of diabetes in this population was 42 mmol/mol (6.0%) (sensitivity 89.2% [95%CI 78.6-99.8], specificity 92.0% [95%CI: 90.3-93.7]).

    Conclusions: In an urban black South African population, we found a high prevalence of diabetes and provide the first evidence for the utility of HbA1c for the diagnosis and detection of diabetes in black Africans in sub-Saharan Africa.

    Funded by: Medical Research Council: MR/K013491/1

    PloS one 2016;11;8;e0161966
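
    The entry above evaluates HbA1c cut-offs against OGTT as the reference by reporting sensitivity and specificity. As a rough sketch of how those two quantities change with the cut-off, the example below uses a small made-up set of records; the values are hypothetical and are not data from the Durban Diabetes Study, and the function name is an illustrative choice.

```python
def sensitivity_specificity(records, cutoff):
    """Return (sensitivity, specificity) for the rule `value >= cutoff`.

    `records` is a list of (hba1c_mmol_per_mol, has_diabetes_by_reference) pairs,
    where the boolean is the reference-test result (OGTT in the abstract).
    """
    tp = sum(1 for value, diseased in records if diseased and value >= cutoff)
    fn = sum(1 for value, diseased in records if diseased and value < cutoff)
    tn = sum(1 for value, diseased in records if not diseased and value < cutoff)
    fp = sum(1 for value, diseased in records if not diseased and value >= cutoff)
    sensitivity = tp / (tp + fn)  # fraction of true cases that are detected
    specificity = tn / (tn + fp)  # fraction of non-cases correctly ruled out
    return sensitivity, specificity


# Hypothetical toy records, invented purely for illustration.
toy_records = [
    (50, True), (44, True), (41, True),
    (49, False), (43, False), (38, False), (35, False),
]

at_48 = sensitivity_specificity(toy_records, cutoff=48)
at_42 = sensitivity_specificity(toy_records, cutoff=42)
```

    Lowering the cut-off from 48 to 42 mmol/mol in this toy set detects more true cases at the cost of more false positives, which is the trade-off a ROC analysis summarises across all possible cut-offs.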

  • Study profile: the Durban Diabetes Study (DDS): a platform for chronic disease research.

    Hird TR, Young EH, Pirie FJ, Riha J, Esterhuizen TM, O'Leary B, McCarthy MI, Sandhu MS and Motala AA

    Department of Medicine, University of Cambridge, Cambridge, UK.

    The Durban Diabetes Study (DDS) is a population-based cross-sectional survey of an urban black population in the eThekwini Municipality (city of Durban) in South Africa. The survey combines health, lifestyle and socioeconomic questionnaire data with standardised biophysical measurements, biomarkers for non-communicable and infectious diseases, and genetic data. Data collection for the study is currently underway and the target sample size is 10 000 participants. The DDS has an established infrastructure for survey fieldwork, data collection and management, sample processing and storage, managed data sharing and consent for re-approaching participants, which can be utilised for further research studies. As such, the DDS represents a rich platform for investigating the distribution, interrelation and aetiology of chronic diseases and their risk factors, which is critical for developing health care policies for disease management and prevention. For data access enquiries please contact the African Partnership for Chronic Disease Research (APCDR) at or the corresponding author.

    Funded by: Medical Research Council: MR/K013491/1; Wellcome Trust

    Global health, epidemiology and genomics 2016;1;e2

  • Genomic Analysis of Companion Rabbit Staphylococcus aureus.

    Holmes MA, Harrison EM, Fisher EA, Graham EM, Parkhill J, Foster G and Paterson GK

    Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom.

    In addition to being an important human pathogen, Staphylococcus aureus is able to cause a variety of infections in numerous other host species. While the S. aureus strains causing infection in several of these hosts have been well characterised, this is not the case for companion rabbits (Oryctolagus cuniculus), where little data are available on S. aureus strains from this host. To address this deficiency we have performed antimicrobial susceptibility testing and genome sequencing on a collection of S. aureus isolates from companion rabbits. The findings show a diverse S. aureus population is able to cause infection in this host, and while antimicrobial resistance was uncommon, the isolates possess a range of known and putative virulence factors consistent with a diverse clinical presentation in companion rabbits including severe abscesses. We additionally show that companion rabbit isolates carry polymorphisms within dltB as described as underlying host-adaption of S. aureus to farmed rabbits. The availability of S. aureus genome sequences from companion rabbits provides an important aid to understanding the pathogenesis of disease in this host and in the clinical management and surveillance of these infections.

    PloS one 2016;11;3;e0151458

  • Five decades of genome evolution in the globally distributed, extensively antibiotic-resistant Acinetobacter baumannii global clone 1.

    Holt K, Kenyon JJ, Hamidian M, Schultz MB, Pickard DJ, Dougan G and Hall R

    Department of Biochemistry & Molecular Biology, The University of Melbourne, Royal Parade, Parkville, Victoria, Australia.

    The majority of Acinetobacter baumannii isolates that are multiply, extensively and pan-antibiotic resistant belong to two globally disseminated clones, GC1 and GC2, that were first noticed in the 1970s. Here, we investigated microevolution and phylodynamics within GC1 via analysis of 45 whole-genome sequences, including 23 sequenced for this study. The most recent common ancestor of GC1 arose around 1960 and later diverged into two phylogenetically distinct lineages. In the 1970s, the main lineage acquired the AbaR resistance island, conferring resistance to older antibiotics, via a horizontal gene transfer event. We estimate a mutation rate of ∼5 SNPs genome⁻¹ year⁻¹ and detected extensive recombination within GC1 genomes, introducing nucleotide diversity into the population at >20 times the substitution rate (the ratio of SNPs introduced by recombination compared with mutation was 22). The recombination events were non-randomly distributed in the genome and created significant diversity within loci encoding outer surface molecules (including the capsular polysaccharide, the outer core lipooligosaccharide and the outer membrane protein CarO), and spread antimicrobial resistance-conferring mutations affecting the gyrA and parC genes and insertion sequence insertions activating the ampC gene. Both GC1 lineages accumulated resistance to newer antibiotics through various genetic mechanisms, including the acquisition of plasmids and transposons or mutations in chromosomal genes. Our data show that GC1 has diversified into multiple successful extensively antibiotic-resistant subclones that differ in their surface structures. This has important implications for all avenues of control, including epidemiological tracking, antimicrobial therapy and vaccination.

    Funded by: Wellcome Trust: 098051

    Microbial genomics 2016;2;2;e000052

  • Palmitoyl Transferases have Critical Roles in the Development of Mosquito and Liver Stages of Plasmodium.

    Hopp CS, Balaban AE, Bushell E, Billker O, Rayner JC and Sinnis P

    Department of Molecular Microbiology & Immunology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA.

    As the Plasmodium parasite transitions between mammalian and mosquito host, it has to adjust quickly to new environments. Palmitoylation, a reversible and dynamic lipid posttranslational modification, plays a central role in regulating this process and has been implicated in functions for parasite morphology, motility and host cell invasion. While proteins associated with the gliding motility machinery have been described to be palmitoylated, no palmitoyl transferase responsible for regulating gliding motility has previously been identified. Here, we characterize two palmitoyl transferases with gene tagging and gene deletion approaches. We identify DHHC3, a palmitoyl transferase, as a mediator of ookinete development, with a crucial role for gliding motility in ookinetes and sporozoites, and we co-localize the protein with a marker for the inner membrane complex in the ookinete stage. Ookinetes and sporozoites lacking DHHC3 are impaired in gliding motility and exhibit a strong phenotype in vivo, with ookinetes being significantly less infectious to their mosquito host and sporozoites being non-infectious to mice. Importantly, genetic complementation of the DHHC3-ko parasite completely restored virulence. We generated parasites lacking both DHHC3 and the palmitoyl transferase DHHC9, and found an enhanced phenotype for these double knockout parasites, allowing insights into the functional overlap and compensational nature of the large family of PbDHHCs. These findings contribute to our understanding of the organization and mechanism of the gliding motility machinery, which, as is becoming increasingly clear, is mediated by palmitoylation.

    Cellular microbiology 2016

  • Retinol and ascorbate drive erasure of epigenetic memory and enhance reprogramming to naïve pluripotency by complementary mechanisms.

    Hore TA, von Meyenn F, Ravichandran M, Bachman M, Ficz G, Oxley D, Santos F, Balasubramanian S, Jurkowski TP and Reik W

    Epigenetics Programme, Babraham Institute, Cambridge CB22 3AT, United Kingdom; Department of Anatomy, University of Otago, Dunedin 9016, New Zealand;

    Epigenetic memory, in particular DNA methylation, is established during development in differentiating cells and must be erased to create naïve (induced) pluripotent stem cells. The ten-eleven translocation (TET) enzymes can catalyze the oxidation of 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) and further oxidized derivatives, thereby actively removing this memory. Nevertheless, the mechanism by which the TET enzymes are regulated, and the extent to which they can be manipulated, are poorly understood. Here we report that retinoic acid (RA) or retinol (vitamin A) and ascorbate (vitamin C) act as modulators of TET levels and activity. RA or retinol enhances 5hmC production in naïve embryonic stem cells by activation of TET2 and TET3 transcription, whereas ascorbate potentiates TET activity and 5hmC production through enhanced Fe2+ recycling, and not as a cofactor as reported previously. We find that both ascorbate and RA or retinol promote the derivation of induced pluripotent stem cells synergistically and enhance the erasure of epigenetic memory. This mechanistic insight has significance for the development of cell treatments for regenerative medicine, and enhances our understanding of how intrinsic and extrinsic signals shape the epigenome.

    Funded by: Wellcome Trust

    Proceedings of the National Academy of Sciences of the United States of America 2016;113;43;12202-12207

  • Genome-wide associations for birth weight and correlations with adult disease.

    Horikoshi M, Beaumont RN, Day FR, Warrington NM, Kooijman MN, Fernandez-Tajes J, Feenstra B, van Zuydam NR, Gaulton KJ, Grarup N, Bradfield JP, Strachan DP, Li-Gao R, Ahluwalia TS, Kreiner E, Rueedi R, Lyytikäinen LP, Cousminer DL, Wu Y, Thiering E, Wang CA, Have CT, Hottenga JJ, Vilor-Tejedor N, Joshi PK, Boh ETH, Ntalla I, Pitkänen N, Mahajan A, van Leeuwen EM, Joro R, Lagou V, Nodzenski M, Diver LA, Zondervan KT, Bustamante M, Marques-Vidal P, Mercader JM, Bennett AJ, Rahmioglu N, Nyholt DR, Ma RCW, Tam CHT, Tam WH, CHARGE Consortium Hematology Working Group, Ganesh SK, van Rooij FJ, Jones SE, Loh PR, Ruth KS, Tuke MA, Tyrrell J, Wood AR, Yaghootkar H, Scholtens DM, Paternoster L, Prokopenko I, Kovacs P, Atalay M, Willems SM, Panoutsopoulou K, Wang X, Carstensen L, Geller F, Schraut KE, Murcia M, van Beijsterveldt CE, Willemsen G, Appel EVR, Fonvig CE, Trier C, Tiesler CM, Standl M, Kutalik Z, Bonas-Guarch S, Hougaard DM, Sánchez F, Torrents D, Waage J, Hollegaard MV, de Haan HG, Rosendaal FR, Medina-Gomez C, Ring SM, Hemani G, McMahon G, Robertson NR, Groves CJ, Langenberg C, Luan J, Scott RA, Zhao JH, Mentch FD, MacKenzie SM, Reynolds RM, Early Growth Genetics (EGG) Consortium, Lowe WL, Tönjes A, Stumvoll M, Lindi V, Lakka TA, van Duijn CM, Kiess W, Körner A, Sørensen TI, Niinikoski H, Pahkala K, Raitakari OT, Zeggini E, Dedoussis GV, Teo YY, Saw SM, Melbye M, Campbell H, Wilson JF, Vrijheid M, de Geus EJ, Boomsma DI, Kadarmideen HN, Holm JC, Hansen T, Sebert S, Hattersley AT, Beilin LJ, Newnham JP, Pennell CE, Heinrich J, Adair LS, Borja JB, Mohlke KL, Eriksson JG, Widén EE, Kähönen M, Viikari JS, Lehtimäki T, Vollenweider P, Bønnelykke K, Bisgaard H, Mook-Kanamori DO, Hofman A, Rivadeneira F, Uitterlinden AG, Pisinger C, Pedersen O, Power C, Hyppönen E, Wareham NJ, Hakonarson H, Davies E, Walker BR, Jaddoe VW, Jarvelin MR, Grant SF, Vaag AA, Lawlor DA, Frayling TM, Davey Smith G, Morris AP, Ong KK, Felix JF, Timpson NJ, Perry JR, Evans DM, McCarthy MI and Freathy RM

    Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK.

    Birth weight (BW) has been shown to be influenced by both fetal and maternal factors and in observational studies is reproducibly associated with future risk of adult metabolic diseases including type 2 diabetes (T2D) and cardiovascular disease. These life-course associations have often been attributed to the impact of an adverse early life environment. Here, we performed a multi-ancestry genome-wide association study (GWAS) meta-analysis of BW in 153,781 individuals, identifying 60 loci where fetal genotype was associated with BW (P < 5 × 10(-8)). Overall, approximately 15% of variance in BW was captured by assays of fetal genetic variation. Using genetic association alone, we found strong inverse genetic correlations between BW and systolic blood pressure (Rg = -0.22, P = 5.5 × 10(-13)), T2D (Rg = -0.27, P = 1.1 × 10(-6)) and coronary artery disease (Rg = -0.30, P = 6.5 × 10(-9)). In addition, using large-cohort datasets, we demonstrated that genetic factors were the major contributor to the negative covariance between BW and future cardiometabolic risk. Pathway analyses indicated that the protein products of genes within BW-associated regions were enriched for diverse processes including insulin signalling, glucose homeostasis, glycogen biosynthesis and chromatin remodelling. There was also enrichment of associations with BW in known imprinted regions (P = 1.9 × 10(-4)). We demonstrate that life-course associations between early growth phenotypes and adult cardiometabolic disease are in part the result of shared genetic effects and identify some of the pathways through which these causal genetic effects are mediated.

    Funded by: British Heart Foundation: RG/11/4/28734; Chief Scientist Office: CZB/4/733; Medical Research Council: G0500070, G0601261, G1001799, MC_PC_15018, MC_PC_U127561128, MC_QA137853, MC_UU_12013/1, MC_UU_12013/3, MC_UU_12013/4, MC_UU_12013/5, MC_UU_12015/1, MR/J012165/1; NHLBI NIH HHS: R01 HL122684; NICHD NIH HHS: P2C HD050924, R01 HD056465

    Nature 2016;538;7624;248-252

  • Transancestral fine-mapping of four type 2 diabetes susceptibility loci highlights potential causal regulatory mechanisms.

    Horikoshi M, Pasquali L, Wiltshire S, Huyghe JR, Mahajan A, Asimit JL, Ferreira T, Locke AE, Robertson NR, Wang X, Sim X, Fujita H, Hara K, Young R, Zhang W, Choi S, Chen H, Kaur I, Takeuchi F, Fontanillas P, Thuillier D, Yengo L, Below JE, Tam CH, Wu Y, Abecasis G, Altshuler D, Bell GI, Blangero J, Burtt NP, Duggirala R, Florez JC, Hanis CL, Seielstad M, Atzmon G, Chan JC, Ma RC, Froguel P, Wilson JG, Bharadwaj D, Dupuis J, Meigs JB, Cho YS, Park T, Kooner JS, Chambers JC, Saleheen D, Kadowaki T, Tai ES, Mohlke KL, Cox NJ, Ferrer J, Zeggini E, Kato N, Teo YY, Boehnke M, McCarthy MI, Morris AP and T2D-GENES Consortium

    Wellcome Trust Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK, Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, UK.

    To gain insight into potential regulatory mechanisms through which the effects of variants at four established type 2 diabetes (T2D) susceptibility loci (CDKAL1, CDKN2A-B, IGF2BP2 and KCNQ1) are mediated, we undertook transancestral fine-mapping in 22 086 cases and 42 539 controls of East Asian, European, South Asian, African American and Mexican American descent. Through high-density imputation and conditional analyses, we identified seven distinct association signals at these four loci, each with allelic effects on T2D susceptibility that were homogenous across ancestry groups. By leveraging differences in the structure of linkage disequilibrium between diverse populations, and increased sample size, we localised the variants most likely to drive each distinct association signal. We demonstrated that integration of these genetic fine-mapping data with genomic annotation can highlight potential causal regulatory elements in T2D-relevant tissues. These analyses provide insight into the mechanisms through which T2D association signals are mediated, and suggest future routes to understanding the biology of specific disease susceptibility loci.

    Funded by: NIDDK NIH HHS: R01 DK072193, R01 DK078616, U01 DK078616, U01 DK105535

    Human molecular genetics 2016

  • Independent Origin and Global Distribution of Distinct Plasmodium vivax Duffy Binding Protein Gene Duplications.

    Hostetler JB, Lo E, Kanjee U, Amaratunga C, Suon S, Sreng S, Mao S, Yewhalaw D, Mascarenhas A, Kwiatkowski DP, Ferreira MU, Rathod PK, Yan G, Fairhurst RM, Duraisingh MT and Rayner JC

    Malaria Programme, Wellcome Trust Sanger Institute, Hinxton, United Kingdom.

    Background: Plasmodium vivax causes the majority of malaria episodes outside Africa, but remains a relatively understudied pathogen. The pathology of P. vivax infection depends critically on the parasite's ability to recognize and invade human erythrocytes. This invasion process involves an interaction between P. vivax Duffy Binding Protein (PvDBP) in merozoites and the Duffy antigen receptor for chemokines (DARC) on the erythrocyte surface. Whole-genome sequencing of clinical isolates recently established that some P. vivax genomes contain two copies of the PvDBP gene. The frequency of this duplication is particularly high in Madagascar, where there is also evidence for P. vivax infection in DARC-negative individuals. The functional significance and global prevalence of this duplication, and whether there are other copy number variations at the PvDBP locus, is unknown.

    Methodology/principal findings: Using whole-genome sequencing and PCR to study the PvDBP locus in P. vivax clinical isolates, we found that PvDBP duplication is widespread in Cambodia. The boundaries of the Cambodian PvDBP duplication differ from those previously identified in Madagascar, meaning that current molecular assays were unable to detect it. The Cambodian PvDBP duplication did not associate with parasite density or DARC genotype, and ranged in prevalence from 20% to 38% over four annual transmission seasons in Cambodia. This duplication was also present in P. vivax isolates from Brazil and Ethiopia, but not India.

    Conclusions/significance: PvDBP duplications are much more widespread and complex than previously thought, and at least two distinct duplications are circulating globally. The same duplication boundaries were identified in parasites from three continents, and were found at high prevalence in human populations where DARC-negativity is essentially absent. It is therefore unlikely that PvDBP duplication is associated with infection of DARC-negative individuals, but functional tests will be required to confirm this hypothesis.

    Funded by: NCATS NIH HHS: UL1 TR001414; NIAID NIH HHS: U19 AI089688

    PLoS neglected tropical diseases 2016;10;10;e0005091

  • Structure and evolutionary history of a large family of NLR proteins in the zebrafish.

    Howe K, Schiffer PH, Zielinski J, Wiehe T, Laird GK, Marioni JC, Soylemez O, Kondrashov F and Leptin M

    Wellcome Trust Sanger Institute, Cambridge, UK.

    Multicellular eukaryotes have evolved a range of mechanisms for immune recognition. A widespread family involved in innate immunity are the NACHT-domain and leucine-rich-repeat-containing (NLR) proteins. Mammals have small numbers of NLR proteins, whereas in some species, mostly those without adaptive immune systems, NLRs have expanded into very large families. We describe a family of nearly 400 NLR proteins encoded in the zebrafish genome. The proteins share a defining overall structure, which arose in fishes after a fusion of the core NLR domains with a B30.2 domain, but can be subdivided into four groups based on their NACHT domains. Gene conversion acting differentially on the NACHT and B30.2 domains has shaped the family and created the groups. Evidence of positive selection in the B30.2 domain indicates that this domain rather than the leucine-rich repeats acts as the pathogen recognition module. In an unusual chromosomal organization, the majority of the genes are located on one chromosome arm, interspersed with other large multigene families, including a new family encoding zinc-finger proteins. The NLR-B30.2 proteins represent a new family with diversity in the specific recognition module that is present in fishes in spite of the parallel existence of an adaptive immune system.

    Funded by: European Research Council: 335980; Howard Hughes Medical Institute: 55007424; NHGRI NIH HHS: HG002659; Wellcome Trust

    Open biology 2016;6;4;160009

  • WormBase 2016: expanding to enable helminth genomic research.

    Howe KL, Bolt BJ, Cain S, Chan J, Chen WJ, Davis P, Done J, Down T, Gao S, Grove C, Harris TW, Kishore R, Lee R, Lomax J, Li Y, Muller HM, Nakamura C, Nuin P, Paulini M, Raciti D, Schindelman G, Stanley E, Tuli MA, Van Auken K, Wang D, Wang X, Williams G, Wright A, Yook K, Berriman M, Kersey P, Schedl T, Stein L and Sternberg PW

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK

    WormBase is a central repository for research data on the biology, genetics and genomics of Caenorhabditis elegans and other nematodes. The project has evolved from its original remit to collect and integrate all data for a single species, and now extends to numerous nematodes, ranging from evolutionary comparators of C. elegans to parasitic species that threaten plant, animal and human health. Research activity using C. elegans as a model system is as vibrant as ever, and we have created new tools for community curation in response to the ever-increasing volume and complexity of data. To better allow users to navigate their way through these data, we have made a number of improvements to our main website, including new tools for browsing genomic features and ontology annotations. Finally, we have developed a new portal for parasitic worm genomes. WormBase ParaSite contains all publicly available nematode and platyhelminth annotated genome sequences, and is designed specifically to support helminth genomic research.

    Nucleic acids research 2016;44;D1;D774-80

  • Insulin resistance uncoupled from dyslipidemia due to C-terminal PIK3R1 mutations.

    Huang-Doran I, Tomlinson P, Payne F, Gast A, Sleigh A, Bottomley W, Harris J, Daly A, Rocha N, Rudge S, Clark J, Kwok A, Romeo S, McCann E, Müksch B, Dattani M, Zucchini S, Wakelam M, Foukas LC, Savage DB, Murphy R, O'Rahilly S, Barroso I and Semple RK

    The University of Cambridge Metabolic Research Laboratories, Wellcome Trust-MRC Institute of Metabolic Science, Cambridge, United Kingdom.

    Obesity-related insulin resistance is associated with fatty liver, dyslipidemia, and low plasma adiponectin. Insulin resistance due to insulin receptor (INSR) dysfunction is associated with none of these, but when due to dysfunction of the downstream kinase AKT2 phenocopies obesity-related insulin resistance. We report 5 patients with SHORT syndrome and C-terminal mutations in PIK3R1, encoding the p85α/p55α/p50α subunits of PI3K, which act between INSR and AKT in insulin signaling. Four of 5 patients had extreme insulin resistance without dyslipidemia or hepatic steatosis. In 3 of these 4, plasma adiponectin was preserved, as in insulin receptor dysfunction. The fourth patient and her healthy mother had low plasma adiponectin associated with a potentially novel mutation, p.Asp231Ala, in adiponectin itself. Cells studied from one patient with the p.Tyr657X PIK3R1 mutation expressed abundant truncated PIK3R1 products and showed severely reduced insulin-stimulated association of mutant but not WT p85α with IRS1, but normal downstream signaling. In 3T3-L1 preadipocytes, mutant p85α overexpression attenuated insulin-induced AKT phosphorylation and adipocyte differentiation. Thus, PIK3R1 C-terminal mutations impair insulin signaling only in some cellular contexts and produce a subphenotype of insulin resistance resembling INSR dysfunction but unlike AKT2 dysfunction, implicating PI3K in the pathogenesis of key components of the metabolic syndrome.

    Funded by: Biotechnology and Biological Sciences Research Council: BBS/E/B/0000H213, BBS/E/B/0000S227, BBS/E/B/000C0413; Medical Research Council: MC_PC_15018, MC_UU_12012/5, MRC_MC_UU_12012/5; NHLBI NIH HHS: RC2 HL102923, RC2 HL102924, RC2 HL102925, RC2 HL102926, RC2 HL103010, UC2 HL102923, UC2 HL102924, UC2 HL102925, UC2 HL102926, UC2 HL103010; Wellcome Trust: WT091310, WT095515, WT098051, WT098498, WT107064

    JCI insight 2016;1;17;e88766

  • The genomic basis of parasitism in the Strongyloides clade of nematodes.

    Hunt VL, Tsai IJ, Coghlan A, Reid AJ, Holroyd N, Foth BJ, Tracey A, Cotton JA, Stanley EJ, Beasley H, Bennett HM, Brooks K, Harsha B, Kajitani R, Kulkarni A, Harbecke D, Nagayasu E, Nichol S, Ogura Y, Quail MA, Randle N, Xia D, Brattig NW, Soblik H, Ribeiro DM, Sanchez-Flores A, Hayashi T, Itoh T, Denver DR, Grant W, Stoltzfus JD, Lok JB, Murayama H, Wastling J, Streit A, Kikuchi T, Viney M and Berriman M

    School of Biological Sciences, University of Bristol, Bristol, UK.

    Soil-transmitted nematodes, including the Strongyloides genus, cause one of the most prevalent neglected tropical diseases. Here we compare the genomes of four Strongyloides species, including the human pathogen Strongyloides stercoralis, and their close relatives that are facultatively parasitic (Parastrongyloides trichosuri) and free-living (Rhabditophanes sp. KR3021). A significant paralogous expansion of key gene families--families encoding astacin-like and SCP/TAPS proteins--is associated with the evolution of parasitism in this clade. Exploiting the unique Strongyloides life cycle, we compare the transcriptomes of the parasitic and free-living stages and find that these same gene families are upregulated in the parasitic stages, underscoring their role in nematode parasitism.

    Funded by: NCRR NIH HHS: P40 RR002512, RR02512; NIAID NIH HHS: AI050668, AI060516, AI105856, R01 AI050668, R21 AI105856, R33 AI105856, T32 AI060516; Wellcome Trust: 094462/Z/10/Z, 098051

    Nature genetics 2016;48;3;299-307

  • GWAS for executive function and processing speed suggests involvement of the CADM2 gene.

    Ibrahim-Verbaas CA, Bressler J, Debette S, Schuur M, Smith AV, Bis JC, Davies G, Trompet S, Smith JA, Wolf C, Chibnik LB, Liu Y, Vitart V, Kirin M, Petrovic K, Polasek O, Zgaga L, Fawns-Ritchie C, Hoffmann P, Karjalainen J, Lahti J, Llewellyn DJ, Schmidt CO, Mather KA, Chouraki V, Sun Q, Resnick SM, Rose LM, Oldmeadow C, Stewart M, Smith BH, Gudnason V, Yang Q, Mirza SS, Jukema JW, deJager PL, Harris TB, Liewald DC, Amin N, Coker LH, Stegle O, Lopez OL, Schmidt R, Teumer A, Ford I, Karbalai N, Becker JT, Jonsdottir MK, Au R, Fehrmann RS, Herms S, Nalls M, Zhao W, Turner ST, Yaffe K, Lohman K, van Swieten JC, Kardia SL, Knopman DS, Meeks WM, Heiss G, Holliday EG, Schofield PW, Tanaka T, Stott DJ, Wang J, Ridker P, Gow AJ, Pattie A, Starr JM, Hocking LJ, Armstrong NJ, McLachlan S, Shulman JM, Pilling LC, Eiriksdottir G, Scott RJ, Kochan NA, Palotie A, Hsieh YC, Eriksson JG, Penman A, Gottesman RF, Oostra BA, Yu L, DeStefano AL, Beiser A, Garcia M, Rotter JI, Nöthen MM, Hofman A, Slagboom PE, Westendorp RG, Buckley BM, Wolf PA, Uitterlinden AG, Psaty BM, Grabe HJ, Bandinelli S, Chasman DI, Grodstein F, Räikkönen K, Lambert JC, Porteous DJ, Generation Scotland, Price JF, Sachdev PS, Ferrucci L, Attia JR, Rudan I, Hayward C, Wright AF, Wilson JF, Cichon S, Franke L, Schmidt H, Ding J, de Craen AJ, Fornage M, Bennett DA, Deary IJ, Ikram MA, Launer LJ, Fitzpatrick AL, Seshadri S, van Duijn CM and Mosley TH

    Genetic Epidemiology Unit, Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands.

    To identify common variants contributing to normal variation in two specific domains of cognitive functioning, we conducted a genome-wide association study (GWAS) of executive functioning and information processing speed in non-demented older adults from the CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology) consortium. Neuropsychological testing was available for 5429-32 070 subjects of European ancestry aged 45 years or older, free of dementia and clinical stroke at the time of cognitive testing from 20 cohorts in the discovery phase. We analyzed performance on the Trail Making Test parts A and B, the Letter Digit Substitution Test (LDST), the Digit Symbol Substitution Task (DSST), semantic and phonemic fluency tests, and the Stroop Color and Word Test. Replication was sought in 1311-21860 subjects from 20 independent cohorts. A significant association was observed in the discovery cohorts for the single-nucleotide polymorphism (SNP) rs17518584 (discovery P-value=3.12 × 10(-8)) and in the joint discovery and replication meta-analysis (P-value=3.28 × 10(-9) after adjustment for age, gender and education) in an intron of the gene cell adhesion molecule 2 (CADM2) for performance on the LDST/DSST. Rs17518584 is located about 170 kb upstream of the transcription start site of the major transcript for the CADM2 gene, but is within an intron of a variant transcript that includes an alternative first exon. The variant is associated with expression of CADM2 in the cingulate cortex (P-value=4 × 10(-4)). The protein encoded by CADM2 is involved in glutamate signaling (P-value=7.22 × 10(-15)), gamma-aminobutyric acid (GABA) transport (P-value=1.36 × 10(-11)) and neuron cell-cell adhesion (P-value=1.48 × 10(-13)). Our findings suggest that genetic variation in the CADM2 gene is associated with individual differences in information processing speed.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/F019394/1; Medical Research Council: G0700704, MR/K026992/1; NCATS NIH HHS: UL1 TR000124; NCI NIH HHS: P01 CA055075, P01 CA087969, R01 CA047988, R01 CA049449, R01 CA050385, R01 CA065725, R01 CA067262, R01 CA134958, U01 CA067262, U01 CA098233; NEI NIH HHS: R01 EY009611, R01 EY015473; NHGRI NIH HHS: U01 HG004399, U01 HG004402, U01 HG004728; NHLBI NIH HHS: HHSN268200900020C, HHSN268201100005C, HHSN268201100006C, HHSN268201100007C, HHSN268201100008C, HHSN268201100009C, HHSN268201100010C, HHSN268201100011C, HHSN268201100012C, HHSN268201200036C, N01 HC015103, N01 HC025195, N01 HC035129, N01 HC045133, N01 HC075150, N01 HC085082, N01 HC085084, N01 HC085085, N01 HC085086, N01HC55222, N01HC85079, N01HC85080, N01HC85081, N01HC85082, N01HC85083, R01 HL034594, R01 HL035464, R01 HL043851, R01 HL059367, R01 HL070825, R01 HL071917, R01 HL080295, R01 HL080467, R01 HL086694, R01 HL087641, R01 HL087652, R01 HL087660, R01 HL093029, R01 HL105756, U01 HL054457, U01 HL054463, U01 HL054464, U01 HL054481, U01 HL096917; NIA NIH HHS: K08 AG034290, K25 AG041906, N01 AG012100, N01 AG062101, N01 AG062103, N01 AG062106, N01 AG821336, N01 AG916413, P30 AG010161, P50 AG005133, R01 AG008122, R01 AG015819, R01 AG015928, R01 AG016495, R01 AG017917, R01 AG020098, R01 AG023629, R01 AG027058, R01 AG030146, R01 AG032098, R01 AG033193, U01 AG049505; NIDDK NIH HHS: P01 DK070756, P30 DK063491, R01 DK058845; NIMHD NIH HHS: 263 MD821336, 263 MD9164 13; NINDS NIH HHS: R01 NS017950, R01 NS041558

    Molecular psychiatry 2016;21;2;189-97

  • Classification of low quality cells from single-cell RNA-seq data.

    Ilicic T, Kim JK, Kolodziejczyk AA, Bagger FO, McCarthy DJ, Marioni JC and Teichmann SA

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

    Single-cell RNA sequencing (scRNA-seq) has broad applications across biomedical research. One of the key challenges is to ensure that only single, live cells are included in downstream analysis, as the inclusion of compromised cells inevitably affects data interpretation. Here, we present a generic approach for processing scRNA-seq data and detecting low quality cells, using a curated set of over 20 biological and technical features. Our approach improves classification accuracy by over 30% compared to traditional methods when tested on over 5,000 cells, including CD4+ T cells, bone marrow dendritic cells, and mouse embryonic stem cells.

    Funded by: Biotechnology and Biological Sciences Research Council

    Genome biology 2016;17;29

  • Evolutionary genomics of epidemic visceral leishmaniasis in the Indian subcontinent.

    Imamura H, Downing T, Van den Broeck F, Sanders MJ, Rijal S, Sundar S, Mannaert A, Vanaerschot M, Berg M, De Muylder G, Dumetz F, Cuypers B, Maes I, Domagalska M, Decuypere S, Rai K, Uranw S, Bhattarai NR, Khanal B, Prajapati VK, Sharma S, Stark O, Schönian G, De Koning HP, Settimo L, Vanhollebeke B, Roy S, Ostyn B, Boelaert M, Maes L, Berriman M, Dujardin JC and Cotton JA

    Department of Biomedical Sciences, Institute of Tropical Medicine, Antwerp, Belgium.

    Leishmania donovani causes visceral leishmaniasis (VL), the second most deadly vector-borne parasitic disease. A recent epidemic in the Indian subcontinent (ISC) caused up to 80% of global VL and over 30,000 deaths per year. Resistance against antimonial drugs has probably been a contributing factor in the persistence of this epidemic. Here we use whole genome sequences from 204 clinical isolates to track the evolution and epidemiology of L. donovani from the ISC. We identify independent radiations that have emerged since a bottleneck coincident with 1960s DDT spraying campaigns. A genetically distinct population frequently resistant to antimonials has a two base-pair insertion in the aquaglyceroporin gene LdAQP1 that prevents the transport of trivalent antimonials. We find evidence of genetic exchange between ISC populations, and show that the mutation in LdAQP1 has spread by recombination. Our results reveal the complexity of L. donovani evolution in the ISC in response to drug treatment.

    eLife 2016;5

  • Genome-wide association studies in the Japanese population identify seven novel loci for type 2 diabetes.

    Imamura M, Takahashi A, Yamauchi T, Hara K, Yasuda K, Grarup N, Zhao W, Wang X, Huerta-Chagoya A, Hu C, Moon S, Long J, Kwak SH, Rasheed A, Saxena R, Ma RC, Okada Y, Iwata M, Hosoe J, Shojima N, Iwasaki M, Fujita H, Suzuki K, Danesh J, Jørgensen T, Jørgensen ME, Witte DR, Brandslund I, Christensen C, Hansen T, Mercader JM, Flannick J, Moreno-Macías H, Burtt NP, Zhang R, Kim YJ, Zheng W, Singh JR, Tam CH, Hirose H, Maegawa H, Ito C, Kaku K, Watada H, Tanaka Y, Tobe K, Kawamori R, Kubo M, Cho YS, Chan JC, Sanghera D, Frossard P, Park KS, Shu XO, Kim BJ, Florez JC, Tusié-Luna T, Jia W, Tai ES, Pedersen O, Saleheen D, Maeda S and Kadowaki T

    Laboratory for Endocrinology, Metabolism and Kidney Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan.

    Genome-wide association studies (GWAS) have identified more than 80 susceptibility loci for type 2 diabetes (T2D), but most of its heritability still remains to be elucidated. In this study, we conducted a meta-analysis of GWAS for T2D in the Japanese population. Combined data from discovery and subsequent validation analyses (23,399 T2D cases and 31,722 controls) identify 7 new loci with genome-wide significance (P<5 × 10(-8)), rs1116357 near CCDC85A, rs147538848 in FAM60A, rs1575972 near DMRTA1, rs9309245 near ASB3, rs67156297 near ATP8B2, rs7107784 near MIR4686 and rs67839313 near INAFM2. Of these, the association of 4 loci with T2D is replicated in multi-ethnic populations other than Japanese (up to 65,936 T2Ds and 158,030 controls, P<0.007). These results indicate that expansion of single ethnic GWAS is still useful to identify novel susceptibility loci to complex traits not only for ethnicity-specific loci but also for common loci across different ethnicities.

    Funded by: British Heart Foundation: SP/09/002; European Research Council: 268834; FIC NIH HHS: K01 TW006087, KO1TW006087; Medical Research Council: G0800270; NCI NIH HHS: BC050791, R01 CA064277, R01 CA124558, R01CA124558, R01CA64277, R37 CA070867, R37CA070867, UM1 CA182910; NCRR NIH HHS: UL1 RR024975; NIDDK NIH HHS: R01 DK082766, R01DK082766

    Nature communications 2016;7;10531

  • Comparative Antibody Responses Against three Antimalarial Vaccine Candidate Antigens from Urban and Rural Exposed Individuals in Gabon.

    Imboumy-Limoukou RK, Oyegue-Liabagui SL, Ndidi S, Pegha-Moukandja I, Kouna CL, Galaway F, Florent I and Lekana-Douki JB

    Unité de Parasitologie Médicale (UPARAM), Centre International de Recherches Médicales de Franceville (CIRMF), BP 769 Franceville, Gabon; Molécules de Communication et Adaptation des Microorganismes (MCAM, UMR 7245), Sorbonne Universités, Muséum National d'Histoire Naturelle, CNRS, CP52, 57 rue Cuvier 75005 Paris, France; Ecole Doctorale Régionale en Infectiologie Tropicale d'Afrique Centrale (ECODRAC), BP 876 Franceville, Gabon.

    The analysis of immune responses in diverse malaria endemic regions provides more information to understand the host's immune response to Plasmodium falciparum. Several plasmodial antigens have been reported as targets of human immunity. PfAMA1 is one of the most studied vaccine candidates; PfRH5 and Pf113 are promising new vaccine candidates. The aim of this study was to evaluate the humoral response against these three antigens among children of Lastourville (rural area) and Franceville (urban area). Malaria was diagnosed using rapid diagnostic tests. Plasma samples were tested against these antigens by enzyme-linked immunosorbent assay (ELISA). We found that malaria prevalence was five times higher in the rural area than in the urban area (p < 0.0001). The anti-PfAMA1 and anti-PfRH5 response levels were significantly higher in Lastourville than in Franceville (p < 0.0001; p = 0.005). The anti-PfAMA1 response was higher than the anti-Pf113 response, which in turn was higher than the anti-PfRH5 response in both sites. Anti-PfAMA1 levels were significantly higher in infected children than those in uninfected children (p = 0.001) in Franceville. Anti-Pf113 and anti-PfRH5 antibody levels were lowest in children presenting severe malarial anemia. These three antigens are targets of immunity in Gabon. Further studies on the role of Pf113 in antimalarial protection against severe anemia are needed.

    European journal of microbiology & immunology 2016;6;4;287-297

  • S1PR2 variants associated with auditory function in humans and endocochlear potential decline in mouse.

    Ingham NJ, Carlisle F, Pearson S, Lewis MA, Buniello A, Chen J, Isaacson RL, Pass J, White JK, Dawson SJ and Steel KP

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Progressive hearing loss is very common in the population but we still know little about the underlying pathology. A new spontaneous mouse mutation (stonedeaf, stdf) leading to recessive, early-onset progressive hearing loss was detected and exome sequencing revealed a Thr289Arg substitution in Sphingosine-1-Phosphate Receptor-2 (S1pr2). Mutants aged 2 weeks had normal hearing sensitivity, but at 4 weeks most showed variable degrees of hearing impairment, which became severe or profound in all mutants by 14 weeks. Endocochlear potential (EP) was normal at 2 weeks old but was reduced by 4 and 8 weeks old in mutants, and the stria vascularis, which generates the EP, showed degenerative changes. Three independent mouse knockout alleles of S1pr2 have been described previously, but this is the first time that a reduced EP has been reported. Genomic markers close to the human S1PR2 gene were significantly associated with auditory thresholds in the 1958 British Birth Cohort (n = 6099), suggesting involvement of S1P signalling in human hearing loss. The finding of early onset loss of EP gives new mechanistic insight into the disease process and suggests that therapies for humans with hearing loss due to S1P signalling defects need to target strial function.

    Funded by: Medical Research Council: G0000934, G0300212, MC_QA137918; NIDDK NIH HHS: U01 DK062418; Wellcome Trust: 068545/Z/02, 076113/B/04/Z, 079895, 089622AIA, 098051, 100699

    Scientific reports 2016;6;28964

  • Evolution of atypical enteropathogenic E. coli by repeated acquisition of LEE pathogenicity island variants.

    Ingle DJ, Tauschek M, Edwards DJ, Hocking DM, Pickard DJ, Azzopardi KI, Amarasena T, Bennett-Wood V, Pearson JS, Tamboura B, Antonio M, Ochieng JB, Oundo J, Mandomando I, Qureshi S, Ramamurthy T, Hossain A, Kotloff KL, Nataro JP, Dougan G, Levine MM, Robins-Browne RM and Holt KE

    Department of Microbiology and Immunology, The University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Victoria 3010, Australia.

    Atypical enteropathogenic Escherichia coli (aEPEC) is an umbrella term given to E. coli that possess a type III secretion system encoded in the locus of enterocyte effacement (LEE), but lack the virulence factors (stx, bfpA) that characterize enterohaemorrhagic E. coli and typical EPEC, respectively. The burden of disease caused by aEPEC has recently increased in industrialized and developing nations, yet the population structure and virulence profile of this emerging pathogen are poorly understood. Here, we generated whole-genome sequences of 185 aEPEC isolates collected during the Global Enteric Multicenter Study from seven study sites in Asia and Africa, and compared them with publicly available E. coli genomes. Phylogenomic analysis revealed ten distinct widely distributed aEPEC clones. Analysis of genetic variation in the LEE pathogenicity island identified 30 distinct LEE subtypes divided into three major lineages. Each LEE lineage demonstrated a preferred chromosomal insertion site and different complements of non-LEE encoded effector genes, indicating distinct patterns of evolution of these lineages. This study provides the first detailed genomic framework for aEPEC in the context of the EPEC pathotype and will facilitate further studies into the epidemiology and pathogenicity of EPEC by enabling the detection and tracking of specific clones and LEE variants.

    Funded by: Medical Research Council: MC_U190074190, MC_U190081991, MC_UP_A900_1122

    Nature microbiology 2016;1;15010

  • Molecular Surveillance Identifies Multiple Transmissions of Typhoid in West Africa.

    International Typhoid Consortium, Wong VK, Holt KE, Okoro C, Baker S, Pickard DJ, Marks F, Page AJ, Olanipekun G, Munir H, Alter R, Fey PD, Feasey NA, Weill FX, Le Hello S, Hart PJ, Kariuki S, Breiman RF, Gordon MA, Heyderman RS, Jacobs J, Lunguya O, Msefula C, MacLennan CA, Keddy KH, Smith AM, Onsare RS, De Pinna E, Nair S, Amos B, Dougan G and Obaro S

    The Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom.

    Background: The burden of typhoid in sub-Saharan African (SSA) countries has been difficult to estimate, in part, due to suboptimal laboratory diagnostics. However, surveillance blood cultures at two sites in Nigeria have identified typhoid associated with Salmonella enterica serovar Typhi (S. Typhi) as an important cause of bacteremia in children.

    Methods: A total of 128 S. Typhi isolates from these studies in Nigeria were whole-genome sequenced, and the resulting data was used to place these Nigerian isolates into a worldwide context based on their phylogeny and carriage of molecular determinants of antibiotic resistance.

    Results: Several distinct S. Typhi genotypes were identified in Nigeria that were related to other clusters of S. Typhi isolates from north, west and central regions of Africa. The rapidly expanding S. Typhi clade 4.3.1 (H58) previously associated with multiple antimicrobial resistances in Asia and in east, central and southern Africa, was not detected in this study. However, antimicrobial resistance was common amongst the Nigerian isolates and was associated with several plasmids, including the IncHI1 plasmid commonly associated with S. Typhi.

    Conclusions: These data indicate that typhoid in Nigeria was established through multiple independent introductions into the country, with evidence of regional spread. MDR typhoid appears to be evolving independently of the haplotype H58 found in other typhoid endemic countries. This study highlights an urgent need for routine surveillance to monitor the epidemiology of typhoid and evolution of antimicrobial resistance within the bacterial population as a means to facilitate public health interventions to reduce the substantial morbidity and mortality of typhoid.

    Funded by: Medical Research Council: G9818340; NIAID NIH HHS: R01 AI097493; Wellcome Trust

    PLoS neglected tropical diseases 2016;10;9;e0004781

  • A Landscape of Pharmacogenomic Interactions in Cancer.

    Iorio F, Knijnenburg TA, Vis DJ, Bignell GR, Menden MP, Schubert M, Aben N, Gonçalves E, Barthorpe S, Lightfoot H, Cokelaer T, Greninger P, van Dyk E, Chang H, de Silva H, Heyn H, Deng X, Egan RK, Liu Q, Mironenko T, Mitropoulos X, Richardson L, Wang J, Zhang T, Moran S, Sayols S, Soleimani M, Tamborero D, Lopez-Bigas N, Ross-Macdonald P, Esteller M, Gray NS, Haber DA, Stratton MR, Benes CH, Wessels LFA, Saez-Rodriguez J, McDermott U and Garnett MJ

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SA, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Cambridge CB10 1SA, UK.

    Systematic studies of cancer genomes have provided unprecedented insights into the molecular nature of cancer. Using this information to guide the development and application of therapies in the clinic is challenging. Here, we report how cancer-driven alterations identified in 11,289 tumors from 29 tissues (integrating somatic mutations, copy number alterations, DNA methylation, and gene expression) can be mapped onto 1,001 molecularly annotated human cancer cell lines and correlated with sensitivity to 265 drugs. We find that cell lines faithfully recapitulate oncogenic alterations identified in tumors, find that many of these associate with drug sensitivity/resistance, and highlight the importance of tissue lineage in mediating drug response. Logic-based modeling uncovers combinations of alterations that sensitize to drugs, while machine learning demonstrates the relative importance of different data types in predicting drug response. Our analysis and datasets are rich resources to link genotypes with cellular phenotypes and to identify therapeutic options for selected cancer sub-populations.

    Funded by: Cancer Research UK; European Research Council: 268626; Marie Curie; NCI NIH HHS: U24 CA143835; Wellcome Trust: 086375, 102696

    Cell 2016;166;3;740-754

  • Discovery and refinement of genetic loci associated with cardiometabolic risk using dense imputation maps.

    Iotchkova V, Huang J, Morris JA, Jain D, Barbieri C, Walter K, Min JL, Chen L, Astle W, Cocca M, Deelen P, Elding H, Farmaki AE, Franklin CS, Franberg M, Gaunt TR, Hofman A, Jiang T, Kleber ME, Lachance G, Luan J, Malerba G, Matchan A, Mead D, Memari Y, Ntalla I, Panoutsopoulou K, Pazoki R, Perry JRB, Rivadeneira F, Sabater-Lleal M, Sennblad B, Shin SY, Southam L, Traglia M, van Dijk F, van Leeuwen EM, Zaza G, Zhang W, UK10K Consortium, Amin N, Butterworth A, Chambers JC, Dedoussis G, Dehghan A, Franco OH, Franke L, Frontini M, Gambaro G, Gasparini P, Hamsten A, Issacs A, Kooner JS, Kooperberg C, Langenberg C, Marz W, Scott RA, Swertz MA, Toniolo D, Uitterlinden AG, van Duijn CM, Watkins H, Zeggini E, Maurano MT, Timpson NJ, Reiner AP, Auer PL and Soranzo N

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Large-scale whole-genome sequence data sets offer novel opportunities to identify genetic variation underlying human traits. Here we apply genotype imputation based on whole-genome sequence data from the UK10K and 1000 Genomes Project into 35,981 study participants of European ancestry, followed by association analysis with 20 quantitative cardiometabolic and hematological traits. We describe 17 new associations, including 6 rare (minor allele frequency (MAF) < 1%) or low-frequency (1% < MAF < 5%) variants with platelet count (PLT), red blood cell indices (MCH and MCV) and HDL cholesterol. Applying fine-mapping analysis to 233 known and new loci associated with the 20 traits, we resolve the associations of 59 loci to credible sets of 20 or fewer variants and describe trait enrichments within regions of predicted regulatory function. These findings improve understanding of the allelic architecture of risk factors for cardiometabolic and hematological diseases and provide additional functional insights with the identification of potentially novel biological targets.

    Funded by: British Heart Foundation: SP/04/002; Medical Research Council: G0601966, G0700931, G0800270, MC_PC_15018, MC_U106179471, MC_UU_12013/1-9, MC_UU_12015/1, MC_UU_12015/2; NHLBI NIH HHS: HHSN268201100046C, R21 HL121422; NIA NIH HHS: HHSN271201100004C; NIH HHS: S10 OD020069; WHI NIH HHS: HHSN268201100001C, HHSN268201100002C, HHSN268201100003C, HHSN268201100004C; Wellcome Trust: 084723/Z/08/Z, 091310, 092731, WT091310, WT092447/B/10/Z, WT098051

    Nature genetics 2016;48;11;1303-1312

  • In vivo genome-wide profiling reveals a tissue-specific role for 5-formylcytosine.

    Iurlaro M, McInroy GR, Burgess HE, Dean W, Raiber EA, Bachman M, Beraldi D, Balasubramanian S and Reik W

    The Babraham Institute, Epigenetics Programme, Cambridge, CB22 3AT, UK.

    Background: Genome-wide methylation of cytosine can be modulated in the presence of TET and thymine DNA glycosylase (TDG) enzymes. TET is able to oxidise 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). TDG can excise the oxidative products 5fC and 5caC, initiating base excision repair. These modified bases are stable and detectable in the genome, suggesting that they could have epigenetic functions in their own right. However, functional investigation of the genome-wide distribution of 5fC has been restricted to cell culture-based systems, while its in vivo profile remains unknown.

    Results: Here, we describe the first analysis of the in vivo genome-wide profile of 5fC across a range of tissues from both wild-type and Tdg-deficient E11.5 mouse embryos. Changes in the formylation profile of cytosine upon depletion of TDG suggest TET/TDG-mediated active demethylation occurs preferentially at intron-exon boundaries and reveals a major role for TDG in shaping 5fC distribution at CpG islands. Moreover, we find that active enhancer regions specifically exhibit high levels of 5fC, resulting in characteristic tissue-diagnostic patterns, which suggest a role in embryonic development.

    Conclusions: The tissue-specific distribution of 5fC can be regulated by the collective contribution of TET-mediated oxidation and excision by TDG. The in vivo profile of 5fC during embryonic development resembles that of embryonic stem cells, sharing key features including enrichment of 5fC in enhancer and intragenic regions. Additionally, by investigating mouse embryo 5fC profiles in a tissue-specific manner, we identify targeted enrichment at active enhancers involved in tissue development.

    Funded by: Biotechnology and Biological Sciences Research Council: BBS/E/B/0000H112; Cancer Research UK; Medical Research Council; Wellcome Trust

    Genome biology 2016;17;1;141

  • Kinetoplastid Phylogenomics Reveals the Evolutionary Innovations Associated with the Origins of Parasitism.

    Jackson AP, Otto TD, Aslett M, Armstrong SD, Bringaud F, Schlacht A, Hartley C, Sanders M, Wastling JM, Dacks JB, Acosta-Serrano A, Field MC, Ginger ML and Berriman M

    Department of Infection Biology, Institute of Infection and Global Health, University of Liverpool, Liverpool Science Park Ic2, 146 Brownlow Hill, Liverpool L3 5RF, UK.

    The evolution of parasitism is a recurrent event in the history of life and a core problem in evolutionary biology. Trypanosomatids are important parasites and include the human pathogens Trypanosoma brucei, Trypanosoma cruzi, and Leishmania spp., which in humans cause African trypanosomiasis, Chagas disease, and leishmaniasis, respectively. Genome comparison between trypanosomatids reveals that these parasites have evolved specialized cell-surface protein families, overlaid on a well-conserved cell template. Understanding how these features evolved and which ones are specifically associated with parasitism requires comparison with related non-parasites. We have produced genome sequences for Bodo saltans, the closest known non-parasitic relative of trypanosomatids, and a second bodonid, Trypanoplasma borreli. Here we show how genomic reduction and innovation contributed to the character of trypanosomatid genomes. We show that gene loss has "streamlined" trypanosomatid genomes, particularly with respect to macromolecular degradation and ion transport, but consistent with a widespread loss of functional redundancy, while adaptive radiations of gene families involved in membrane function provide the principal innovations in trypanosomatid evolution. Gene gain and loss continued during trypanosomatid diversification, resulting in the asymmetric assortment of ancestral characters such as peptidases between Trypanosoma and Leishmania, genomic differences that were subsequently amplified by lineage-specific innovations after divergence. Finally, we show how species-specific, cell-surface gene families (DGF-1 and PSA) with no apparent structural similarity are independent derivations of a common ancestral form, which we call "bodonin." This new evidence defines the parasitic innovations of trypanosomatid genomes, revealing how a free-living phagotroph became adapted to exploiting hostile host environments.

    Current biology : CB 2016;26;2;161-172

  • DNA REPAIR. Drugging DNA repair.

    Jackson SP and Helleday T

    The Wellcome Trust/Cancer Research UK Gurdon Institute and Department of Biochemistry, University of Cambridge, Cambridge CB2 1QN, UK. The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

    Science (New York, N.Y.) 2016;352;6290;1178-9

  • WGS analysis and molecular resistance mechanisms of azithromycin-resistant (MIC >2 mg/L) Neisseria gonorrhoeae isolates in Europe from 2009 to 2014.

    Jacobsson S, Golparian D, Cole M, Spiteri G, Martin I, Bergheim T, Borrego MJ, Crowley B, Crucitti T, Van Dam AP, Hoffmann S, Jeverica S, Kohl P, Mlynarczyk-Bonikowska B, Pakarna G, Stary A, Stefanelli P, Pavlik P, Tzelepi E, Abad R, Harris SR and Unemo M

    Örebro University, Örebro, Sweden.

    Objectives: To elucidate the genome-based epidemiology and phylogenomics of azithromycin-resistant (MIC >2 mg/L) Neisseria gonorrhoeae strains collected in 2009-14 in Europe and clarify the azithromycin resistance mechanisms.

    Methods: Seventy-five azithromycin-resistant (MIC 4 to >256 mg/L) N. gonorrhoeae isolates collected in 17 European countries during 2009-14 were examined using antimicrobial susceptibility testing and WGS.

    Results: Thirty-six N. gonorrhoeae multi-antigen sequence typing STs and five phylogenomic clades, including 4-22 isolates from several countries per clade, were identified. The azithromycin target mutation A2059G (Escherichia coli numbering) was found in all four alleles of the 23S rRNA gene in all isolates with high-level azithromycin resistance (n = 4; MIC ≥256 mg/L). The C2611T mutation was identified in two to four alleles of the 23S rRNA gene in the remaining 71 isolates. Mutations in mtrR and its promoter were identified in 43 isolates, comprising isolates within the whole azithromycin MIC range. No mutations associated with azithromycin resistance were found in the rplD gene or the rplV gene and none of the macrolide resistance-associated genes [mef(A/E), ere(A), ere(B), erm(A), erm(B), erm(C) and erm(F)] were identified in any isolate.

    Conclusions: Clonal spread of relatively few N. gonorrhoeae strains accounts for the majority of the azithromycin resistance (MIC >2 mg/L) in Europe. The four isolates with high-level resistance to azithromycin (MIC ≥256 mg/L) were widely separated in the phylogenomic tree and did not belong to any of the main clades. The main azithromycin resistance mechanisms were the A2059G mutation (high-level resistance) and the C2611T mutation (low- and moderate-level resistance) in the 23S rRNA gene.

    The Journal of antimicrobial chemotherapy 2016;71;11;3109-3116

  • Pan-genomic perspective on the evolution of the Staphylococcus aureus USA300 epidemic.

    Jamrozy DM, Harris SR, Mohamed N, Peacock SJ, Tan CY, Parkhill J, Anderson AS and Holden MTG

    The Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK.

    <i>Staphylococcus aureus</i> USA300 represents the dominant community-associated methicillin-resistant <i>S. aureus</i> lineage in the USA, where it is a major cause of skin and soft tissue infections. Previous comparative genomic studies have described the population structure and evolution of USA300 based on geographically restricted isolate collections. Here, we investigated the USA300 population by sequencing genomes of a geographically distributed panel of 191 clinical <i>S. aureus</i> isolates belonging to clonal complex 8 (CC8), derived from the Tigecycline Evaluation and Surveillance Trial program. Isolates were collected at 12 healthcare centres across nine USA states in 2004, 2009 or 2010. Reconstruction of evolutionary relationships revealed that CC8 was dominated by USA300 isolates (154/191, 81 %), which were heterogeneous and demonstrated limited phylogeographic clustering. Analysis of the USA300 core genomes revealed an increase in median pairwise SNP distance from 62 to 98 between 2004 and 2010, with a stable pattern of above average d<i>N</i>/d<i>S</i> ratios. The phylogeny of the USA300 population indicated that early diversification events led to the formation of nested clades, which arose through cumulative acquisition of predominantly non-synonymous SNPs in various coding sequences. The accessory genome of USA300 was largely homogenous and consisted of elements previously associated with this lineage. We observed an emergence of SCC<i>mec</i> negative and ACME negative USA300 isolates amongst more recent samples, and an increase in the prevalence of ϕSa5 prophage. Together, the analysed <i>S. aureus</i> USA300 collection revealed an evolving pan-genome through increased core genome heterogeneity and temporal variation in the frequency of certain accessory elements.

    Funded by: Wellcome Trust: 098051

    Microbial genomics 2016;2;5;e000058

  • Lineage-Specific Genome Architecture Links Enhancers and Non-coding Disease Variants to Target Gene Promoters.

    Javierre BM, Burren OS, Wilder SP, Kreuzhuber R, Hill SM, Sewitz S, Cairns J, Wingett SW, Várnai C, Thiecke MJ, Burden F, Farrow S, Cutler AJ, Rehnström K, Downes K, Grassi L, Kostadima M, Freire-Pritchett P, Wang F, BLUEPRINT Consortium, Stunnenberg HG, Todd JA, Zerbino DR, Stegle O, Ouwehand WH, Frontini M, Wallace C, Spivakov M and Fraser P

    Nuclear Dynamics Programme, The Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK.

    Long-range interactions between regulatory elements and gene promoters play key roles in transcriptional regulation. The vast majority of interactions are uncharted, constituting a major missing link in understanding genome control. Here, we use promoter capture Hi-C to identify interacting regions of 31,253 promoters in 17 human primary hematopoietic cell types. We show that promoter interactions are highly cell type specific and enriched for links between active promoters and epigenetically marked enhancers. Promoter interactomes reflect lineage relationships of the hematopoietic tree, consistent with dynamic remodeling of nuclear architecture during differentiation. Interacting regions are enriched in genetic variants linked with altered expression of genes they contact, highlighting their functional role. We exploit this rich resource to connect non-coding disease variants to putative target promoters, prioritizing thousands of disease-candidate genes and implicating disease pathways. Our results demonstrate the power of primary cell promoter interactomes to reveal insights into genomic regulatory mechanisms underlying common diseases.

    Cell 2016;167;5;1369-1384.e19

  • Molecular characterisation of the Chlamydia pecorum plasmid from porcine, ovine, bovine, and koala strains indicates plasmid-strain co-evolution.

    Jelocnik M, Bachmann NL, Seth-Smith H, Thomson NR, Timms P and Polkinghorne AM

    Centre for Animal Health Innovation, University of the Sunshine Coast, Sippy Downs, Queensland, Australia.

    Background. Highly stable, evolutionarily conserved, small, non-integrative plasmids are commonly found in members of the Chlamydiaceae and, in some species, these plasmids have been strongly linked to virulence. To date, evidence for such a plasmid in Chlamydia pecorum has been ambiguous. In a recent comparative genomic study of porcine, ovine, bovine, and koala C. pecorum isolates, we identified plasmids (pCpec) in a pig and three koala strains, respectively. Screening of further porcine, ovine, bovine, and koala C. pecorum isolates for pCpec showed that pCpec is common, but not ubiquitous in C. pecorum from all of the infected hosts. Methods. We used a combination of (i) bioinformatic mining of previously sequenced C. pecorum genome data sets and (ii) pCpec PCR-amplicon sequencing to characterise a further 17 novel pCpecs in C. pecorum isolates obtained from livestock, including pigs, sheep, and cattle, as well as those from koala. Results and Discussion. This analysis revealed that pCpec is conserved with all eight coding domain sequences (CDSs) present in isolates from each of the hosts studied. Sequence alignments revealed that the 21 pCpecs show 99% nucleotide sequence identity, with 83 single nucleotide polymorphisms (SNPs) shown to differentiate all of the plasmids analysed in this study. SNPs were found to be mostly synonymous and were distributed evenly across all eight pCpec CDSs as well as in the intergenic regions. Although conserved, analyses of the 21 pCpec sequences resolved plasmids into 12 distinct genotypes, with five shared between pCpecs from different isolates, and the remaining seven genotypes being unique to a single pCpec. Phylogenetic analysis revealed congruency and co-evolution of pCpecs with their cognate chromosome, further supporting polyphyletic origin of the koala C. pecorum. This study provides further understanding of the complex epidemiology of this pathogen in livestock and koala hosts and paves the way for studies to evaluate the function of this putative C. pecorum virulence factor.

    PeerJ 2016;4;e1661

  • Whole-exome sequencing in an isolated population from the Dalmatian island of Vis.

    Jeroncic A, Memari Y, Ritchie GR, Hendricks AE, Kolb-Kokocinski A, Matchan A, Vitart V, Hayward C, Kolcic I, Glodzik D, Wright AF, Rudan I, Campbell H, Durbin R, Polašek O, Zeggini E and Boraska Perica V

    Department of Research in Biomedicine and Health, University of Split School of Medicine, Split, Croatia.

    We have whole-exome sequenced 176 individuals from the isolated population of the island of Vis in Croatia in order to describe exonic variation architecture. We found 290 577 single nucleotide variants (SNVs), 65% of which are singletons, low frequency or rare variants. A total of 25 430 (9%) SNVs are novel, previously not catalogued in NHLBI GO Exome Sequencing Project, UK10K-Generation Scotland, 1000Genomes Project, ExAC or NCBI Reference Assembly dbSNP. The majority of these variants (76%) are singletons. Comparable to data obtained from UK10K-Generation Scotland that were sequenced and analysed using the same protocols, we detected an enrichment of potentially damaging variants (non-synonymous and loss-of-function) in the low frequency and common variant categories. On average 115 (range 93-140) genotypes with loss-of-function variants, 23 (15-34) of which were homozygous, were identified per person. The landscape of loss-of-function variants across an exome revealed that variants mainly accumulated in genes on the xenobiotic-related pathways, of which majority coded for enzymes. The frequency of loss-of-function variants was additionally increased in Vis runs of homozygosity regions where variants mainly affected signalling pathways. This work confirms the isolate status of Vis population by means of whole-exome sequence and reveals the pattern of loss-of-function mutations, which resembles the trails of adaptive evolution that were found in other species. By cataloguing the exomic variants and describing the allelic structure of the Vis population, this study will serve as a valuable resource for future genetic studies of human diseases, population genetics and evolution in this population.

    Funded by: Medical Research Council: MC_PC_U127561128; Wellcome Trust: 098051

    European journal of human genetics : EJHG 2016;24;10;1479-87

  • Genome-wide association study of primary sclerosing cholangitis identifies new risk loci and quantifies the genetic relationship with inflammatory bowel disease.

    Ji SG, Juran BD, Mucha S, Folseraas T, Jostins L, Melum E, Kumasaka N, Atkinson EJ, Schlicht EM, Liu JZ, Shah T, Gutierrez-Achury J, Boberg KM, Bergquist A, Vermeire S, Eksteen B, Durie PR, Farkkila M, Müller T, Schramm C, Sterneck M, Weismüller TJ, Gotthardt DN, Ellinghaus D, Braun F, Teufel A, Laudes M, Lieb W, Jacobs G, Beuers U, Weersma RK, Wijmenga C, Marschall HU, Milkiewicz P, Pares A, Kontula K, Chazouillères O, Invernizzi P, Goode E, Spiess K, Moore C, Sambrook J, Ouwehand WH, Roberts DJ, Danesh J, Floreani A, Gulamhusein AF, Eaton JE, Schreiber S, Coltescu C, Bowlus CL, Luketic VA, Odin JA, Chopra KB, Kowdley KV, Chalasani N, Manns MP, Srivastava B, Mells G, Sandford RN, Alexander G, Gaffney DJ, Chapman RW, Hirschfield GM, de Andrade M, UK-PSC Consortium, International IBD Genetics Consortium, International PSC Study Group, Rushbrook SM, Franke A, Karlsen TH, Lazaridis KN and Anderson CA

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK.

    Primary sclerosing cholangitis (PSC) is a rare progressive disorder leading to bile duct destruction; ∼75% of patients have comorbid inflammatory bowel disease (IBD). We undertook the largest genome-wide association study of PSC (4,796 cases and 19,955 population controls) and identified four new genome-wide significant loci. The most associated SNP at one locus affects splicing and expression of UBASH3A, with the protective allele (C) predicted to cause nonstop-mediated mRNA decay and lower expression of UBASH3A. Further analyses based on common variants suggested that the genome-wide genetic correlation (r<sub>G</sub>) between PSC and ulcerative colitis (UC) (r<sub>G</sub> = 0.29) was significantly greater than that between PSC and Crohn's disease (CD) (r<sub>G</sub> = 0.04) (P = 2.55 × 10<sup>-15</sup>). UC and CD were genetically more similar to each other (r<sub>G</sub> = 0.56) than either was to PSC (P < 1.0 × 10<sup>-15</sup>). Our study represents a substantial advance in understanding of the genetics of PSC.

    Funded by: British Heart Foundation: RG/08/014/24067, RG/09/012/28096; Department of Health: RP-PG-0310-1002, RP-PG-0310-1004; Medical Research Council: MC_PC_15018, MR/L003120/1; NIA NIH HHS: RC2 AG036495, RC4 AG039029, U01 AG009740; NIDDK NIH HHS: R01 DK084960; Wellcome Trust

    Nature genetics 2016;49;2;269-273

  • Heterogeneity of CD34 and CD38 expression in acute B lymphoblastic leukemia cells is reversible and not hierarchically organized.

    Jiang Z, Deng M, Wei X, Ye W, Xiao Y, Lin S, Wang S, Li B, Liu X, Zhang G, Lai P, Weng J, Wu D, Chen H, Wei W, Ma Y, Li Y, Liu P, Du X, Pei D, Yao Y, Xu B and Li P

    State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, 190 Kaiyuan Avenue, Science Park, Guangzhou, Guangdong, 510530, China.

    The existence and identification of leukemia-initiating cells in adult acute B lymphoblastic leukemia (B-ALL) remain controversial. We examined whether adult B-ALL is hierarchically organized into phenotypically distinct subpopulations of leukemogenic and non-leukemogenic cells or whether most B-ALL cells retain leukemogenic capacity, irrespective of their immunophenotype profiles. Our results suggest that adult B-ALL follows the stochastic stem cell model and that the expression of CD34 and CD38 in B-ALL is reversible and not hierarchically organized.

    Journal of hematology & oncology 2016;9;1;94

  • Anti-GPC3-CAR T Cells Suppress the Growth of Tumor Cells in Patient-Derived Xenografts of Hepatocellular Carcinoma.

    Jiang Z, Jiang X, Chen S, Lai Y, Wei X, Li B, Lin S, Wang S, Wu Q, Liang Q, Liu Q, Peng M, Yu F, Weng J, Du X, Pei D, Liu P, Yao Y, Xue P and Li P

    State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China; Key Laboratory of Regenerative Biology, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China; Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China.

    Background: The lack of a general clinic-relevant model for human cancer is a major impediment to the acceleration of novel therapeutic approaches for clinical use. We propose to establish and characterize primary human hepatocellular carcinoma (HCC) xenografts that can be used to evaluate the cytotoxicity of adoptive chimeric antigen receptor (CAR) T cells and accelerate the clinical translation of CAR T cells used in HCC.

    Methods: Primary HCCs were used to establish the xenografts. The morphology, immunological markers, and gene expression characteristics of xenografts were detected and compared to those of the corresponding primary tumors. CAR T cells were adoptively transplanted into patient-derived xenograft (PDX) models of HCC. The cytotoxicity of CAR T cells <i>in vivo</i> was evaluated.

    Results: PDX1, PDX2, and PDX3 were established using primary tumors from three individual HCC patients. All three PDXs maintained original tumor characteristics in their morphology, immunological markers, and gene expression. Tumors in PDX1 grew more slowly than those in PDX2 and PDX3. Glypican 3 (GPC3)-CAR T cells efficiently suppressed tumor growth in PDX3 and impressively eradicated tumor cells from PDX1 and PDX2, in which GPC3 proteins were highly expressed.

    Conclusion: GPC3-CAR T cells were capable of effectively eliminating tumors in PDX model of HCC. Therefore, GPC3-CAR T cell therapy is a promising candidate for HCC treatment.

    Frontiers in immunology 2016;7;690

  • Identification of new heat-stable (STa) enterotoxin allele variants produced by human enterotoxigenic Escherichia coli (ETEC).

    Joffré E, von Mentzer A, Svennerholm AM and Sjöling Å

    Department of Microbiology and Immunology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden; Institute of Molecular Biology and Biotechnology, Universidad Mayor de San Andrés, La Paz, Bolivia.

    We describe natural variants of the heat-stable toxin (STa) produced by enterotoxigenic Escherichia coli (ETEC) isolates collected worldwide. Previous studies of ETEC isolated from human diarrheal cases have reported the existence of three natural STa gene variants, estA1, estA2 and estA3/4, where the first variant encodes STp (porcine, bovine, and human origin) and the two latter ones encode STh (human origin). We identified STa sequences by BLASTn and profiled ST amino acid polymorphisms in a collection of 118 clinical ETEC isolates from children and adults from Asia, Africa, and Latin America that were characterized by whole genome sequencing. Three novel variants of STp and STh were found and designated STa5 and STa6, and STa7, respectively. Presence of glucose significantly decreased the production of STh and STp toxin variants (p<0.05) as well as downregulated the gene expression (STh: p<0.001, STp: p<0.05). We found that the ETEC isolates producing the most common STp variant, STa5, co-expressed coli surface antigen CS6 and were significantly associated with disease in adults in this data set (p<0.001). Expression of mature STa5 peptide as well as gene expression of tolC, involved in ST secretion, increased in response to bile (p<0.05). ETEC expressing the common STh variant STa3/4 was associated with disease in children (p<0.05). The crp gene, which positively regulates estA3/4 encoding STa3/4, and estA3/4 itself had decreased transcriptional levels in the presence of bile. Since bile levels in the intestine are lower in children than adults, these results may suggest differences in pathogenicity of ETEC in children and adult populations.

    International journal of medical microbiology : IJMM 2016;306;7;586-594

  • Targeting the RB-E2F pathway in breast cancer.

    Johnson J, Thijssen B, McDermott U, Garnett M, Wessels LF and Bernards R

    Division of Molecular Carcinogenesis and Cancer Genomics Netherlands, The Netherlands Cancer Institute, Amsterdam, The Netherlands.

    Mutations of the retinoblastoma tumor-suppressor gene (RB1) or components regulating the CDK-RB-E2F pathway have been identified in nearly every human malignancy. Re-establishing cell cycle control through cyclin-dependent kinase (CDK) inhibition has therefore emerged as an attractive option in the development of targeted cancer therapy. The most successful example of this today is the use of the CDK4/6 inhibitor palbociclib combined with aromatase inhibitors for the treatment of estrogen receptor-positive breast cancers. Multiple studies have demonstrated that the CDK-RB-E2F pathway is critical for the control of cell proliferation. More recently, studies have highlighted additional roles of this pathway, especially E2F transcription factors themselves, in tumor progression, angiogenesis and metastasis. Specific E2Fs also have prognostic value in breast cancer, independent of clinical parameters. We discuss here recent advances in understanding of the RB-E2F pathway in breast cancer. We also discuss the application of genome-wide genetic screening efforts to gain insight into synthetic lethal interactions of CDK4/6 inhibitors in breast cancer for the development of more effective combination therapies.

    Funded by: Wellcome Trust: 102696STRATTON, 102696Stratton

    Oncogene 2016;35;37;4829-35

  • cgpCaVEManWrapper: Simple Execution of CaVEMan in Order to Detect Somatic Single Nucleotide Variants in NGS Data.

    Jones D, Raine KM, Davies H, Tarpey PS, Butler AP, Teague JW, Nik-Zainal S and Campbell PJ

    Cancer Genome Project, Wellcome Trust Sanger Institute, Cambridge, United Kingdom.

    CaVEMan is an expectation maximization-based somatic substitution-detection algorithm that is written in C. The algorithm analyzes sequence data from a test sample, such as a tumor, relative to a reference normal sample from the same patient and the reference genome. It performs a comparative analysis of the tumor and normal sample to derive a probabilistic estimate for putative somatic substitutions. When combined with a set of validated post-hoc filters, CaVEMan generates a set of somatic substitution calls with high recall and positive predictive value. Here we provide instructions for using a wrapper script called cgpCaVEManWrapper, which runs the CaVEMan algorithm and additional downstream post-hoc filters. We describe both a simple one-shot run of cgpCaVEManWrapper and a more in-depth implementation suited to large-scale compute farms. © 2016 by John Wiley & Sons, Inc.

    Funded by: Wellcome Trust: 098051

    Current protocols in bioinformatics 2016;56;15.10.1-15.10.18

  • Salmonella Enteritidis Isolate Harboring Multiple Efflux Pumps and Pathogenicity Factors, Shows Absence of O Antigen Polymerase Gene.

    Jones-Dias D, Clemente L, Egas C, Froufe H, Sampaio DA, Vieira L, Fookes M, Thomson NR, Manageiro V and Caniça M

    National Reference Laboratory of Antibiotic Resistances and Healthcare Associated Infections, Department of Infectious Diseases, National Health Institute Doutor Ricardo Jorge (INSA), Lisbon, Portugal; Centre for the Studies of Animal Science, Institute of Agrarian and Agri-Food Sciences and Technologies, University of Porto, Porto, Portugal.

    Frontiers in microbiology 2016;7;1130

  • Heterozygous KIDINS220/ARMS nonsense variants cause spastic paraplegia, intellectual disability, nystagmus, and obesity.

    Josifova DJ, Monroe GR, Tessadori F, de Graaff E, van der Zwaag B, Mehta SG, DDD Study, Harakalova M, Duran KJ, Savelberg SM, Nijman IJ, Jungbluth H, Hoogenraad CC, Bakkers J, Knoers NV, Firth HV, Beales PL, van Haaften G and van Haelst MM

    Department of Clinical Genetics, Guy's and St. Thomas' Hospital, London SE1 7EH, UK.

    We identified de novo nonsense variants in KIDINS220/ARMS in three unrelated patients with spastic paraplegia, intellectual disability, nystagmus, and obesity (SINO). KIDINS220 is an essential scaffold protein coordinating neurotrophin signal pathways in neurites and is spatially and temporally regulated in the brain. Molecular analysis of patients' variants confirmed expression and translation of truncated transcripts similar to recently characterized alternative terminal exon splice isoforms of KIDINS220. KIDINS220 undergoes extensive alternative splicing in specific neuronal populations and developmental time points, reflecting its complex role in neuronal maturation. In mice and humans, KIDINS220 is alternatively spliced in the middle region as well as in the last exon. These full-length and KIDINS220 splice variants occur at precise moments in cortical, hippocampal, and motor neuron development, with splice variants similar to the variants seen in our patients and lacking the last exon of KIDINS220 occurring in adult rather than in embryonic brain. We conducted tissue-specific expression studies in zebrafish that resulted in spasms, confirming a functional link with disruption of the KIDINS220 levels in developing neurites. This work reveals a crucial physiological role of KIDINS220 in development and provides insight into how perturbation of the complex interplay of KIDINS220 isoforms and their relative expression can affect neuron control and human metabolism. Altogether, we here show that de novo protein-truncating KIDINS220 variants cause a new syndrome, SINO. This is the first report of KIDINS220 variants causing a human disease.

    Funded by: Wellcome Trust: WT098051

    Human molecular genetics 2016;25;11;2158-2167

  • New native South American Y chromosome lineages.

    Jota MS, Lacerda DR, Sandoval JR, Vieira PP, Ohasi D, Santos-Júnior JE, Acosta O, Cuellar C, Revollo S, Paz-Y-Miño C, Fujita R, Vallejo GA, Schurr TG, Tarazona-Santos EM, Pena SDj, Ayub Q, Tyler-Smith C, Santos FR and Genographic Consortium

    Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil.

    Many single-nucleotide polymorphisms (SNPs) in the non-recombining region of the human Y chromosome have been described in the last decade. High-coverage sequencing has helped to characterize new SNPs, which has in turn increased the level of detail in paternal phylogenies. However, these paternal lineages still provide insufficient information on population history and demography, especially for Native Americans. The present study aimed to identify informative paternal sublineages derived from the main founder lineage of the Americas (haplogroup Q-L54) in a sample of 1841 native South Americans. For this purpose, we used a Y-chromosomal genotyping multiplex platform and conventional genotyping methods to validate 34 new SNPs that were identified in the present study by sequencing, together with many Y-SNPs previously described in the literature. We updated the haplogroup Q phylogeny and identified two new Q-M3 and three new Q-L54*(xM3) sublineages defined by five informative SNPs, designated SA04, SA05, SA02, SA03 and SA29. Within the Q-M3, sublineage Q-SA04 was mostly found in individuals from ethnic groups belonging to the Tukanoan linguistic family in the northwest Amazon, whereas sublineage Q-SA05 was found in Peruvian and Bolivian Amazon ethnic groups. Within Q-L54*, the derived sublineages Q-SA03 and Q-SA02 were exclusively found among Coyaima individuals (Cariban linguistic family) from Colombia, while Q-SA29 was found only in Maxacali individuals (Jean linguistic family) from southeast Brazil. Furthermore, we validated the usefulness of several published SNPs among indigenous South Americans. This new Y chromosome haplogroup Q phylogeny offers an informative paternal genealogy to investigate the pre-Columbian history of South America. Journal of Human Genetics advance online publication, 31 March 2016; doi:10.1038/jhg.2016.26.

    Journal of human genetics 2016;61;7;593-603

  • Deficiency of the zinc finger protein ZFP106 causes motor and sensory neurodegeneration.

    Joyce PI, Fratta P, Landman AS, Mcgoldrick P, Wackerhage H, Groves M, Busam BS, Galino J, Corrochano S, Beskina OA, Esapa C, Ryder E, Carter S, Stewart M, Codner G, Hilton H, Teboul L, Tucker J, Lionikas A, Estabel J, Ramirez-Solis R, White JK, Brandner S, Plagnol V, Bennet DL, Abramov AY, Greensmith L, Fisher EM and Acevedo-Arozena A

    MRC Mammalian Genetics Unit, Harwell, Oxfordshire OX11 0RD, UK.

    Zinc finger motifs are distributed amongst many eukaryotic protein families, directing nucleic acid-protein and protein-protein interactions. Zinc finger protein 106 (ZFP106) has previously been associated with roles in immune response, muscle differentiation, testes development and DNA damage, although little is known about its specific function. To further investigate the function of ZFP106, we performed an in-depth characterization of Zfp106-deficient mice (Zfp106(-/-)), and we report a novel role for ZFP106 in motor and sensory neuronal maintenance and survival. Zfp106(-/-) mice develop severe motor abnormalities, major deficits in muscle strength and histopathological changes in muscle. Intriguingly, despite Zfp106 being highly expressed throughout the central nervous system, Zfp106(-/-) mice undergo selective motor and sensory neuronal and axonal degeneration specific to the spinal cord and peripheral nervous system. Neurodegeneration does not occur during development of Zfp106(-/-) mice, suggesting that ZFP106 is likely required for the maintenance of mature peripheral motor and sensory neurons. Analysis of embryonic Zfp106(-/-) motor neurons revealed deficits in mitochondrial function, with an inhibition of Complex I within the mitochondrial electron transport chain. Our results highlight a vital role for ZFP106 in sensory and motor neuron maintenance and reveal a novel player in mitochondrial dysfunction and neurodegeneration.

    Human molecular genetics 2016;25;2;291-307

  • Mutations at protein-protein interfaces: Small changes over big surfaces have large impacts on human health.

    Jubb HC, Pandurangan AP, Turner MA, Ochoa-Montaño B, Blundell TL and Ascher DB

    Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Many essential biological processes including cell regulation and signalling are mediated through the assembly of protein complexes. Changes to protein-protein interaction (PPI) interfaces can affect the formation of multiprotein complexes, and consequently lead to disruptions in interconnected networks of PPIs within and between cells, further leading to phenotypic changes as functional interactions are created or disrupted. Mutations altering PPIs have been linked to the development of genetic diseases including cancer and rare Mendelian diseases, and to the development of drug resistance. The importance of these protein mutations has led to the development of many resources for understanding and predicting their effects. We propose that a better understanding of how these mutations affect the structure, function, and formation of multiprotein complexes provides novel opportunities for tackling them, including the development of small-molecule drugs targeted specifically to mutated PPIs.

    Progress in biophysics and molecular biology 2016

  • Comparison of bacterial genome assembly software for MinION data and their applicability to medical microbiology.

    Judge K, Hunt M, Reuter S, Tracey A, Quail MA, Parkhill J and Peacock SJ

    Department of Medicine, University of Cambridge, Level 5, Addenbrooke's Hospital, CB2 0QQ Cambridge, UK.

    Translating the Oxford Nanopore MinION sequencing technology into medical microbiology requires on-going analysis that keeps pace with technological improvements to the instrument and release of associated analysis software. Here, we use a multidrug-resistant Enterobacter kobei isolate as a model organism to compare open source software for the assembly of genome data, and relate this to the time taken to generate actionable information. Three software tools (PBcR, Canu and miniasm) were used to assemble MinION data and a fourth (SPAdes) was used to combine MinION and Illumina data to produce a hybrid assembly. All four had a similar number of contigs and were more contiguous than the assembly using Illumina data alone, with SPAdes producing a single chromosomal contig. Evaluation of the four assemblies to represent the genome structure revealed a single large inversion in the SPAdes assembly, which also incorrectly integrated a plasmid into the chromosomal contig. Almost 50%, 80% and 90% of MinION pass reads were generated in the first 6, 9 and 12 h, respectively. Using data from the first 6 h alone led to a less accurate, fragmented assembly, but data from the first 9 or 12 h generated similar assemblies to that from 48 h sequencing. Assemblies were generated in 2 h using Canu, indicating that going from isolate to assembled data is possible in less than 48 h. MinION data identified that genes responsible for resistance were carried by two plasmids encoding resistance to carbapenem and to sulphonamides, rifampicin and aminoglycosides, respectively.

    Funded by: Department of Health; Wellcome Trust: WT098600

    Microbial genomics 2016;2;9;e000085

  • Targeting Chromatin Regulators Inhibits Leukemogenic Gene Expression in NPM1 Mutant Leukemia.

    Kühn MW, Song E, Feng Z, Sinha A, Chen CW, Deshpande AJ, Cusan M, Farnoud N, Mupo A, Grove C, Koche R, Bradner JE, de Stanchina E, Vassiliou GS, Hoshii T and Armstrong SA

    Cancer Biology and Genetics Program, Memorial Sloan Kettering Cancer Center, New York, New York. Department of Medicine III, University Medical Center, Johannes Gutenberg University, Mainz, Germany.

    Homeobox (HOX) proteins and the receptor tyrosine kinase FLT3 are frequently highly expressed and mutated in acute myeloid leukemia (AML). Aberrant HOX expression is found in nearly all AMLs that harbor a mutation in the Nucleophosmin (NPM1) gene, and FLT3 is concomitantly mutated in approximately 60% of these cases. Little is known about how mutant NPM1 (NPM1(mut)) cells maintain aberrant gene expression. Here, we demonstrate that the histone modifiers MLL1 and DOT1L control HOX and FLT3 expression and differentiation in NPM1(mut) AML. Using a CRISPR/Cas9 genome editing domain screen, we show NPM1(mut) AML to be exceptionally dependent on the menin binding site in MLL1. Pharmacologic small-molecule inhibition of the menin-MLL1 protein interaction had profound antileukemic activity in human and murine models of NPM1(mut) AML. Combined pharmacologic inhibition of menin-MLL1 and DOT1L resulted in dramatic suppression of HOX and FLT3 expression, induction of differentiation, and superior activity against NPM1(mut) leukemia.

    Significance: MLL1 and DOT1L are chromatin regulators that control HOX, MEIS1, and FLT3 expression and are therapeutic targets in NPM1(mut) AML. Combinatorial small-molecule inhibition has synergistic on-target activity and constitutes a novel therapeutic concept for this common AML subtype. Cancer Discov; 6(10); 1166-81. ©2016 AACR. See related commentary by Hourigan and Aplan, p. 1087. This article is highlighted in the In This Issue feature, p. 1069.

    Funded by: Medical Research Council: MC_PC_12009; NCI NIH HHS: K99 CA197498, P01 CA066996, P30 CA008748, R00 CA197498, R01 CA140575, R01 CA176745

    Cancer discovery 2016;6;10;1166-1181

  • Epstein-Barr virus nuclear protein EBNA3C directly induces expression of AID and somatic mutations in B cells.

    Kalchschmidt JS, Bashford-Rogers R, Paschos K, Gillman AC, Styles CT, Kellam P and Allday MJ

    Molecular Virology, Department of Medicine, Imperial College London, London W2 1PG, England, UK.

    Activation-induced cytidine deaminase (AID), the enzyme responsible for induction of sequence variation in immunoglobulins (Igs) during the process of somatic hypermutation (SHM) and also Ig class switching, can have a potent mutator phenotype in the development of lymphoma. Using various Epstein-Barr virus (EBV) recombinants, we provide definitive evidence that the viral nuclear protein EBNA3C is essential in EBV-infected primary B cells for the induction of AID mRNA and protein. Using lymphoblastoid cell lines (LCLs) established with EBV recombinants conditional for EBNA3C function, this was confirmed, and it was shown that transactivation of the AID gene (AICDA) is associated with EBNA3C binding to highly conserved regulatory elements located proximal to and upstream of the AICDA transcription start site. EBNA3C binding initiated epigenetic changes to chromatin at specific sites across the AICDA locus. Deep sequencing of cDNA corresponding to the IgH V-D-J region from the conditional LCL was used to formally show that SHM is activated by functional EBNA3C and induction of AID. These data, showing the direct targeting and induction of functional AID by EBNA3C, suggest a novel role for EBV in the etiology of B cell cancers, including endemic Burkitt lymphoma.

    Funded by: Wellcome Trust: 097005, 099273/Z/12/Z

    The Journal of experimental medicine 2016;213;6;921-8

  • EPEC: a cocktail of virulence.

    Kallonen T and Boinett CJ

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Genomics studies are prompting a re-evaluation of the diversity of Escherichia coli pathovars and how this diversity corresponds to virulence.

    Funded by: Medical Research Council: G1100100

    Nature reviews. Microbiology 2016;14;4;196

  • Genome wide conditional mouse knockout resources.

    Kaloff C, Anastassiadis K, Ayadi A, Baldock R, Beig J, Birling MC, Bradley A, Brown SDM, Bürger A, Bushell W et al.

    Drug Discovery Today: Disease Models 2016;20;3;12

  • Analysis with the exome array identifies multiple new independent variants in lipid loci.

    Kanoni S, Masca NG, Stirrups KE, Varga TV, Warren HR, Scott RA, Southam L, Zhang W, Yaghootkar H, Müller-Nurasyid M, Couto Alves A, Strawbridge RJ, Lataniotis L, An Hashim N, Besse C, Boland A, Braund PS, Connell JM, Dominiczak A, Farmaki AE, Franks S, Grallert H, Jansson JH, Karaleftheri M, Keinänen-Kiukaanniemi S, Matchan A, Pasko D, Peters A, Poulter N, Rayner NW, Renström F, Rolandsson O, Sabater-Lleal M, Sennblad B, Sever P, Shields D, Silveira A, Stanton AV, Strauch K, Tomaszewski M, Tsafantakis E, Waldenberger M, Blakemore AI, Dedoussis G, Escher SA, Kooner JS, McCarthy MI, Palmer CN, Wellcome Trust Case Control Consortium, Hamsten A, Caulfield MJ, Frayling TM, Tobin MD, Jarvelin MR, Zeggini E, Gieger C, Chambers JC, Wareham NJ, Munroe PB, Franks PW, Samani NJ and Deloukas P

    William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK.

    It has been hypothesized that low frequency (1-5% minor allele frequency (MAF)) and rare (<1% MAF) variants with large effect sizes may contribute to the missing heritability in complex traits. Here, we report an association analysis of lipid traits (total cholesterol, LDL-cholesterol, HDL-cholesterol, triglycerides) in up to 27 312 individuals with a comprehensive set of low frequency coding variants (ExomeChip), combined with conditional analysis in the known lipid loci. No new locus reached genome-wide significance. However, we found a new lead variant in 26 known lipid association regions of which 16 were >1000-fold more significant than the previous sentinel variant and not in close LD (six had MAF <5%). Furthermore, conditional analysis revealed multiple independent signals (ranging from 1 to 5) in a third of the 98 lipid loci tested, including rare variants. Addition of our novel associations resulted in between 1.5- and 2.5-fold increase in the proportion of heritability explained for the different lipid traits. Our findings suggest that rare coding variants contribute to the genetic architecture of lipid traits.

    Funded by: British Heart Foundation: RG/14/5/30893; Medical Research Council: G0600237, G0601261, G0601966, G0700931, G0802782, MR/K006584/1; NHLBI NIH HHS: R01 HL087679; NIMH NIH HHS: R01 MH063706, RL1 MH083268; Wellcome Trust

    Human molecular genetics 2016;25;18;4094-4106

  • The Ecological Dynamics of Fecal Contamination and Salmonella Typhi and Salmonella Paratyphi A in Municipal Kathmandu Drinking Water.

    Karkey A, Jombart T, Walker AW, Thompson CN, Torres A, Dongol S, Tran Vu Thieu N, Pham Thanh D, Tran Thi Ngoc D, Voong Vinh P, Singer AC, Parkhill J, Thwaites G, Basnyat B, Ferguson N and Baker S

    Oxford University Clinical Research Unit, Patan Academy of Health Sciences, Kathmandu, Nepal.

    One of the UN sustainable development goals is to achieve universal access to safe and affordable drinking water by 2030. It is locations like Kathmandu, Nepal, a densely populated city in South Asia with endemic typhoid fever, where this goal is most pertinent. Aiming to understand the public health implications of water quality in Kathmandu we subjected weekly water samples from 10 sources for one year to a range of chemical and bacteriological analyses. We additionally aimed to detect the etiological agents of typhoid fever and longitudinally assess microbial diversity by 16S rRNA gene surveying. We found that the majority of water sources exhibited chemical and bacterial contamination exceeding WHO guidelines. Further analysis of the chemical and bacterial data indicated site-specific pollution, symptomatic of highly localized fecal contamination. Rainfall was found to be a key driver of this fecal contamination, correlating with nitrates and evidence of S. Typhi and S. Paratyphi A, for which DNA was detectable in 333 (77%) and 303 (70%) of 432 water samples, respectively. 16S rRNA gene surveying outlined a spectrum of fecal bacteria in the contaminated water, forming complex communities again displaying location-specific temporal signatures. Our data signify that the municipal water in Kathmandu is a predominant vehicle for the transmission of S. Typhi and S. Paratyphi A. This study represents the first extensive spatiotemporal investigation of water pollution in an endemic typhoid fever setting and implicates highly localized human waste as the major contributor to poor water quality in the Kathmandu Valley.

    Funded by: Medical Research Council: G0902420, MR/K010174/1; NIGMS NIH HHS: U01 GM110721; Wellcome Trust: 098051, 100087, 100087/Z/12/Z

    PLoS neglected tropical diseases 2016;10;1;e0004346

  • Retrospective Analysis of Serotype Switching of Vibrio cholerae O1 in a Cholera Endemic Region Shows It Is a Non-random Process.

    Karlsson SL, Thomson N, Mutreja A, Connor T, Sur D, Ali M, Clemens J, Dougan G, Holmgren J and Lebens M

    Department of Microbiology and Immunology, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden.

    Genomic data generated from clinical Vibrio cholerae O1 isolates collected over a five-year period in an area of Kolkata, India, with seasonal cholera outbreaks allowed a detailed genetic analysis of serotype switching that occurred from Ogawa to Inaba and back to Ogawa. The change from Ogawa to Inaba resulted from mutational disruption of the methyltransferase encoded by the wbeT gene. Re-emergence of the Ogawa serotype was found to result either from expansion of an already existing Ogawa clade or reversion of the mutation in an Inaba clade. Our data suggest that such transitions are not random events but are rather driven by as yet unidentified selection mechanisms based on differences in the structure of the O1 antigen or in the serotype-determining wbeT gene.

    PLoS neglected tropical diseases 2016;10;10;e0005044

  • BRAF(V600E) Kinase Domain Duplication Identified in Therapy-Refractory Melanoma Patient-Derived Xenografts.

    Kemper K, Krijgsman O, Kong X, Cornelissen-Steijger P, Shahrabi A, Weeber F, van der Velden DL, Bleijerveld OB, Kuilman T, Kluin RJC, Sun C, Voest EE, Ju YS, Schumacher TNM, Altelaar AFM, McDermott U, Adams DJ, Blank CU, Haanen JB and Peeper DS

    Division of Molecular Oncology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands.

    The therapeutic landscape of melanoma is improving rapidly. Targeted inhibitors show promising results, but drug resistance often limits durable clinical responses. There is a need for in vivo systems that allow for mechanistic drug resistance studies and (combinatorial) treatment optimization. Therefore, we established a large collection of patient-derived xenografts (PDXs), derived from BRAF(V600E), NRAS(Q61), or BRAF(WT)/NRAS(WT) melanoma metastases prior to treatment with BRAF inhibitor and after resistance had occurred. Taking advantage of PDXs as a limitless source, we screened tumor lysates for resistance mechanisms. We identified a BRAF(V600E) protein harboring a kinase domain duplication (BRAF(V600E/DK)) in ∼10% of the cases, both in PDXs and in an independent patient cohort. While BRAF(V600E/DK) depletion restored sensitivity to BRAF inhibition, a pan-RAF dimerization inhibitor effectively eliminated BRAF(V600E/DK)-expressing cells. These results illustrate the utility of this PDX platform and warrant clinical validation of BRAF dimerization inhibitors for this group of melanoma patients.

    Funded by: Cancer Research UK: 13031; Wellcome Trust: WT098051

    Cell reports 2016;16;1;263-277

  • Polymorphism in a lincRNA Associates with a Doubled Risk of Pneumococcal Bacteremia in Kenyan Children.

    Kenyan Bacteraemia Study Group, Wellcome Trust Case Control Consortium 2 (WTCCC2), Rautanen A, Pirinen M, Mills TC, Rockett KA, Strange A, Ndungu AW, Naranbhai V, Gilchrist JJ, Bellenguez C, Freeman C, Band G, Bumpstead SJ, Edkins S, Giannoulatou E, Gray E, Dronov S, Hunt SE, Langford C, Pearson RD, Su Z, Vukcevic D, Macharia AW, Uyoga S, Ndila C, Mturi N, Njuguna P, Mohammed S, Berkley JA, Mwangi I, Mwarumba S, Kitsao BS, Lowe BS, Morpeth SC, Khandwalla I, Kilifi Bacteraemia Surveillance Group, Blackwell JM, Bramon E, Brown MA, Casas JP, Corvin A, Duncanson A, Jankowski J, Markus HS, Mathew CG, Palmer CN, Plomin R, Sawcer SJ, Trembath RC, Viswanathan AC, Wood NW, Deloukas P, Peltonen L, Williams TN, Scott JA, Chapman SJ, Donnelly P, Hill AV and Spencer CC

    Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK.

    Bacteremia (bacterial bloodstream infection) is a major cause of illness and death in sub-Saharan Africa but little is known about the role of human genetics in susceptibility. We conducted a genome-wide association study of bacteremia susceptibility in more than 5,000 Kenyan children as part of the Wellcome Trust Case Control Consortium 2 (WTCCC2). Both the blood-culture-proven bacteremia case subjects and healthy infants as controls were recruited from Kilifi, on the east coast of Kenya. Streptococcus pneumoniae is the most common cause of bacteremia in Kilifi and was thus the focus of this study. We identified an association between polymorphisms in a long intergenic non-coding RNA (lincRNA) gene (AC011288.2) and pneumococcal bacteremia and replicated the results in the same population (p combined = 1.69 × 10(-9); OR = 2.47, 95% CI = 1.84-3.31). The susceptibility allele is African specific, derived rather than ancestral, and occurs at low frequency (2.7% in control subjects and 6.4% in case subjects). Our further studies showed AC011288.2 expression only in neutrophils, a cell type that is known to play a major role in pneumococcal clearance. Identification of this novel association will further focus research on the role of lincRNAs in human infectious disease.

    American journal of human genetics 2016;98;6;1092-100

  • Genome-wide study for circulating metabolites identifies 62 loci and reveals novel systemic effects of LPA.

    Kettunen J, Demirkan A, Würtz P, Draisma HH, Haller T, Rawal R, Vaarhorst A, Kangas AJ, Lyytikäinen LP, Pirinen M, Pool R, Sarin AP, Soininen P, Tukiainen T, Wang Q, Tiainen M, Tynkkynen T, Amin N, Zeller T, Beekman M, Deelen J, van Dijk KW, Esko T, Hottenga JJ, van Leeuwen EM, Lehtimäki T, Mihailov E, Rose RJ, de Craen AJ, Gieger C, Kähönen M, Perola M, Blankenberg S, Savolainen MJ, Verhoeven A, Viikari J, Willemsen G, Boomsma DI, van Duijn CM, Eriksson J, Jula A, Järvelin MR, Kaprio J, Metspalu A, Raitakari O, Salomaa V, Slagboom PE, Waldenberger M, Ripatti S and Ala-Korpela M

    Computational Medicine, Faculty of Medicine, University of Oulu, PO Box 5000, 90014 Oulu, Finland.

    Genome-wide association studies have identified numerous loci linked with complex diseases, for which the molecular mechanisms remain largely unclear. Comprehensive molecular profiling of circulating metabolites captures highly heritable traits, which can help to uncover metabolic pathophysiology underlying established disease variants. We conduct an extended genome-wide association study of genetic influences on 123 circulating metabolic traits quantified by nuclear magnetic resonance metabolomics from up to 24,925 individuals and identify eight novel loci for amino acids, pyruvate and fatty acids. The LPA locus link with cardiovascular risk exemplifies how detailed metabolic profiling may inform underlying aetiology via extensive associations with very-low-density lipoprotein and triglyceride metabolism. Genetic fine mapping and Mendelian randomization uncover wide-spread causal effects of lipoprotein(a) on overall lipoprotein metabolism and we assess potential pleiotropic consequences of genetically elevated lipoprotein(a) on diverse morbidities via electronic health-care records. Our findings strengthen the argument for safe LPA-targeted intervention to reduce cardiovascular risk.

    Nature communications 2016;7;11122

  • Diagnostic Yield of Sequencing Familial Hypercholesterolemia Genes in Patients with Severe Hypercholesterolemia.

    Khera AV, Won HH, Peloso GM, Lawson KS, Bartz TM, Deng X, van Leeuwen EM, Natarajan P, Emdin CA, Bick AG, Morrison AC, Brody JA, Gupta N, Nomura A, Kessler T, Duga S, Bis JC, van Duijn CM, Cupples LA, Psaty B, Rader DJ, Danesh J, Schunkert H, McPherson R, Farrall M, Watkins H, Lander E, Wilson JG, Correa A, Boerwinkle E, Merlini PA, Ardissino D, Saleheen D, Gabriel S and Kathiresan S

    Center for Human Genetic Research, Cardiovascular Research Center and Cardiology Division, Massachusetts General Hospital, Harvard Medical School, Boston MA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA.

    Background: About 7% of US adults have severe hypercholesterolemia (untreated LDL cholesterol ≥190 mg/dl). Such high LDL levels may be due to familial hypercholesterolemia (FH), a condition caused by a single mutation in any of three genes. Lifelong elevations in LDL cholesterol in FH mutation carriers may confer CAD risk beyond that captured by a single LDL cholesterol measurement.

    Objectives: Assess the prevalence of an FH mutation among those with severe hypercholesterolemia and determine whether CAD risk varies according to mutation status beyond the observed LDL cholesterol.

    Methods: Three genes causative for FH (LDLR, APOB, PCSK9) were sequenced in 26,025 participants from 7 case-control studies (5,540 CAD cases, 8,577 CAD-free controls) and 5 prospective cohort studies (11,908 participants). FH mutations included loss-of-function variants in LDLR, missense mutations in LDLR predicted to be damaging, and variants linked to FH in ClinVar, a clinical genetics database.

    Results: Among 8,577 CAD-free control participants, 430 had LDL cholesterol ≥190 mg/dl; of these, only eight (1.9%) carried an FH mutation. Similarly, among 11,908 participants from 5 prospective cohorts, 956 had LDL cholesterol ≥190 mg/dl and of these, only 16 (1.7%) carried an FH mutation. Within any stratum of observed LDL cholesterol, risk of CAD was higher among FH mutation carriers than among non-carriers. Compared with a reference group with LDL cholesterol <130 mg/dl and no mutation, participants with LDL cholesterol ≥190 mg/dl and no FH mutation had a sixfold higher risk for CAD (OR 6.0; 95%CI 5.2-6.9), whereas those with LDL cholesterol ≥190 mg/dl and an FH mutation had a 22-fold higher risk (OR 22.3; 95%CI 10.7-53.2).

    Conclusions: Among individuals with LDL cholesterol ≥190 mg/dl, gene sequencing identified an FH mutation in <2%. However, for any given observed LDL cholesterol, FH mutation carriers are at substantially increased risk for CAD.

    Journal of the American College of Cardiology 2016
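
    The diagnostic yields quoted in the Results above follow directly from the reported counts. A minimal sketch of that arithmetic is given here; the function and variable names are illustrative assumptions and are not taken from the study.

```python
# Illustrative check of the diagnostic-yield percentages quoted in the abstract.
# The counts come from the Results text; all names here are hypothetical.

def yield_percent(carriers: int, total: int) -> float:
    """Return the percentage of carriers among `total`, rounded to one decimal."""
    return round(100 * carriers / total, 1)

# CAD-free controls with LDL cholesterol >= 190 mg/dl: 8 carriers out of 430
controls_yield = yield_percent(8, 430)

# Prospective cohort participants with LDL cholesterol >= 190 mg/dl: 16 of 956
cohorts_yield = yield_percent(16, 956)

print(controls_yield, cohorts_yield)
```

    Both values fall below 2%, consistent with the figure stated in the Conclusions.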

  • Evolutionary dynamics of Anolis sex chromosomes revealed by sequencing of flow sorting-derived microchromosome-specific DNA.

    Kichigin IG, Giovannotti M, Makunin AI, Ng BL, Kabilov MR, Tupikin AE, Barucchi VC, Splendiani A, Ruggeri P, Rens W, O'Brien PC, Ferguson-Smith MA, Graphodatsky AS and Trifonov VA

    Institute of Molecular and Cellular Biology SB RAS, Novosibirsk, 630090, Russia.

    Squamate reptiles show a striking diversity in modes of sex determination, including both genetic (XY or ZW) and temperature-dependent sex determination systems. The genomes of only a handful of species have been sequenced, assembled and analyzed, including the genome of Anolis carolinensis. Despite high genome coverage, only the macrochromosomes of A. carolinensis were assembled, whereas the content of most microchromosomes remained unclear. Most Anolis species have a homomorphic XY sex chromosome system. However, some species have large heteromorphic XY chromosomes (e.g., A. sagrei) or even multiple sex chromosome systems (e.g., A. pogus), which were shown to be derived from fusions of the ancestral XY with microautosomes. We applied next-generation sequencing of flow sorting-derived chromosome-specific DNA pools to characterize the content and composition of microchromosomes in A. carolinensis and A. sagrei. Comparative analysis of the sequenced chromosome-specific DNA pools revealed that the A. sagrei XY sex chromosomes contain regions homologous to several microautosomes of A. carolinensis. We suggest that the sex chromosomes of A. sagrei were derived by fusions of the ancestral sex chromosome with three microautosomes and subsequent loss of some genetic content on the Y chromosome.

    Molecular genetics and genomics : MGG 2016;291;5;1955-66

  • Genome-wide meta-analysis uncovers novel loci influencing circulating leptin levels.

    Kilpeläinen TO, Carli JF, Skowronski AA, Sun Q, Kriebel J, Feitosa MF, Hedman ÅK, Drong AW, Hayes JE, Zhao J, Pers TH, Schick U, Grarup N, Kutalik Z, Trompet S, Mangino M, Kristiansson K, Beekman M, Lyytikäinen LP, Eriksson J, Henneman P, Lahti J, Tanaka T, Luan J, Del Greco M F, Pasko D, Renström F, Willems SM, Mahajan A, Rose LM, Guo X, Liu Y, Kleber ME, Pérusse L, Gaunt T, Ahluwalia TS, Ju Sung Y, Ramos YF, Amin N, Amuzu A, Barroso I, Bellis C, Blangero J, Buckley BM, Böhringer S, I Chen YD, de Craen AJ, Crosslin DR, Dale CE, Dastani Z, Day FR, Deelen J, Delgado GE, Demirkan A, Finucane FM, Ford I, Garcia ME, Gieger C, Gustafsson S, Hallmans G, Hankinson SE, Havulinna AS, Herder C, Hernandez D, Hicks AA, Hunter DJ, Illig T, Ingelsson E, Ioan-Facsinay A, Jansson JO, Jenny NS, Jørgensen ME, Jørgensen T, Karlsson M, Koenig W, Kraft P, Kwekkeboom J, Laatikainen T, Ladwig KH, LeDuc CA, Lowe G, Lu Y, Marques-Vidal P, Meisinger C, Menni C, Morris AP, Myers RH, Männistö S, Nalls MA, Paternoster L, Peters A, Pradhan AD, Rankinen T, Rasmussen-Torvik LJ, Rathmann W, Rice TK, Brent Richards J, Ridker PM, Sattar N, Savage DB, Söderberg S, Timpson NJ, Vandenput L, van Heemst D, Uh HW, Vohl MC, Walker M, Wichmann HE, Widén E, Wood AR, Yao J, Zeller T, Zhang Y, Meulenbelt I, Kloppenburg M, Astrup A, Sørensen TI, Sarzynski MA, Rao DC, Jousilahti P, Vartiainen E, Hofman A, Rivadeneira F, Uitterlinden AG, Kajantie E, Osmond C, Palotie A, Eriksson JG, Heliövaara M, Knekt PB, Koskinen S, Jula A, Perola M, Huupponen RK, Viikari JS, Kähönen M, Lehtimäki T, Raitakari OT, Mellström D, Lorentzon M, Casas JP, Bandinelli S, März W, Isaacs A, van Dijk KW, van Duijn CM, Harris TB, Bouchard C, Allison MA, Chasman DI, Ohlsson C, Lind L, Scott RA, Langenberg C, Wareham NJ, Ferrucci L, Frayling TM, Pramstaller PP, Borecki IB, Waterworth DM, Bergmann S, Waeber G, Vollenweider P, Vestergaard H, Hansen T, Pedersen O, Hu FB, Eline Slagboom P, Grallert H, Spector TD, Jukema JW, Klein RJ, Schadt EE, Franks PW, Lindgren CM, Leibel RL and Loos RJ

    The Novo Nordisk Foundation Center for Basic Metabolic Research, Section of Metabolic Genetics, Faculty of Health and Medical Sciences, University of Copenhagen, Universitetsparken 1, DIKU Building, Copenhagen 2100, Denmark.

    Leptin is an adipocyte-secreted hormone, the circulating levels of which correlate closely with overall adiposity. Although rare mutations in the leptin (LEP) gene are well known to cause leptin deficiency and severe obesity, no common loci regulating circulating leptin levels have been uncovered. Therefore, we performed a genome-wide association study (GWAS) of circulating leptin levels from 32,161 individuals and followed up loci reaching P<10⁻⁶ in 19,979 additional individuals. We identify five loci robustly associated (P<5 × 10⁻⁸) with leptin levels in/near LEP, SLC32A1, GCKR, CCNL1 and FTO. Although the association of the FTO obesity locus with leptin levels is abolished by adjustment for BMI, associations of the four other loci are independent of adiposity. The GCKR locus was found associated with multiple metabolic traits in previous GWAS and the CCNL1 locus with birth weight. Knockdown experiments in mouse adipose tissue explants show convincing evidence for adipogenin, a regulator of adipocyte differentiation, as the novel causal gene in the SLC32A1 locus influencing leptin levels. Our findings provide novel insights into the regulation of leptin production by adipose tissue and open new avenues for examining the influence of variation in leptin levels on adiposity and metabolic health.

    Funded by: British Heart Foundation: PG/07/131/24254, PG/13/66/30442; Canadian Institutes of Health Research: FRCN-CCT-83028; Intramural NIH HHS; Medical Research Council: G0701863, G9815508, MC_U106179471, MC_U106179472, MC_U147574242, MC_UP_A620_1016, MC_UU_12011/3, MC_UU_12011/4, MC_UU_12013/3, MC_UU_12013/8, MC_UU_12015/1, MC_UU_12015/2, MR/J012165/1; NCATS NIH HHS: UL1 TR000040, UL1 TR000124, UL1 TR001079, UL1-TR-000040, UL1-TR-001079; NCI NIH HHS: CA047988, CA055075, CA087969, CA49449, CA50385, CA65725, CA67262, P01 CA055075, P01 CA087969, R01 CA047988, R01 CA049449, R01 CA050385, R01 CA065725, R01 CA067262, U01 CA049449, U01 CA067262, U01CA098233, UM1 CA182913; NCRR NIH HHS: UL1 RR024156, UL1 RR025005, UL1-RR-24156, UL1-RR-25005; NHGRI NIH HHS: HG004399, HG004446, HHSN268200782096C, U01 HG004399, U01 HG007033, U01-HG007033; NHLBI NIH HHS: 5R01HL068891, 5R01HL087700, HL-043851, HL-045670, HL080467, N01-HC-65226, N01-HC-95160, N01-HC-95161, N01-HC-95162, N01-HC-95163, N01-HC-95164, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, N01-HC95159, N01HC65226, N01HC95159, N01HC95160, N01HC95161, N01HC95162, N01HC95163, N01HC95164, N01HC95165, N01HC95166, N01HC95167, N01HC95168, N01HC95169, N02HL64278, R00 HL098459, R00-HL-098459, R01 HL043851, R01 HL045670, R01 HL068891, R01 HL071051, R01 HL071205, R01 HL071250, R01 HL071251, R01 HL071252, R01 HL071258, R01 HL071259, R01 HL080467, R01 HL087700, R01 HL088451, R01 HL117078, R01-HL-071051, R01-HL-071205, R01-HL-071250, R01-HL-071251, R01-HL-071252, R01-HL-071258, R01-HL-071259, R01-HL-088451, R01HL117078; NIA NIH HHS: 1R01AG032098-01A1, N01 AG062101, N01 AG062106, N01AG62103, R01 AG032098; NIDDK NIH HHS: 1R01DK080015, 5R01DK068336, 5R01DK075681, 5R01DK07568102, DK-26687, DK058845, DK52431, P30 DK020541, P30 DK026687, P30 DK063491, R01 DK052431, R01 DK058845, R01 DK068336, R01 DK075681, R01 DK080015, R01 DK089256, R01DK089256; NIMHD NIH HHS: 263 MD 821336, 263MD9164, R01 MD009164; PHS HHS: HHSN26800625226C, HHSN268200782096C; Wellcome Trust: 081917/Z/07/Z, 086596/Z/08/Z, 090532, WT064890, WT089062, WT090532, WT091551, WT098017, WT098051

    Nature communications 2016;7;10494

  • De Novo Mutations in SON Disrupt RNA Splicing of Genes Essential for Brain Development and Metabolism, Causing an Intellectual-Disability Syndrome.

    Kim JH, Shinde DN, Reijnders MRF, Hauser NS, Belmonte RL, Wilson GR, Bosch DGM, Bubulya PA, Shashi V, Petrovski S, Stone JK, Park EY, Veltman JA, Sinnema M, Stumpel CTRM, Draaisma JM, Nicolai J, University of Washington Center for Mendelian Genomics, Yntema HG, Lindstrom K, de Vries BBA, Jewett T, Santoro SL, Vogt J, Deciphering Developmental Disorders Study, Bachman KK,