Sanger Institute - Publications 2017

Number of papers published in 2017: 537

  • Proteomic analysis of extracellular vesicles from a Plasmodium falciparum Kenyan clinical isolate defines a core parasite secretome.

    Abdi A, Yu L, Goulding D, Rono MK, Bejon P, Choudhary J and Rayner J

    Pwani University Bioscience Research Centre, Pwani University, Kilifi, Kenya.

    Background: Many pathogens secrete effector molecules to subvert host immune responses, to acquire nutrients, and/or to prepare host cells for invasion. One of the ways that effector molecules are secreted is through extracellular vesicles (EVs) such as exosomes. Recently, the malaria parasite P. falciparum has been shown to produce EVs that can mediate transfer of genetic material between parasites and induce sexual commitment. Characterizing the content of these vesicles may improve our understanding of P. falciparum pathogenesis and virulence.

    Methods: Previous studies of P. falciparum EVs have been limited to long-term adapted laboratory isolates. In this study, we isolated EVs from a Kenyan P. falciparum clinical isolate adapted to in vitro culture for a short period and characterized their protein content by mass spectrometry (data are available via ProteomeXchange, with identifier PXD006925).

    Results: We show that P. falciparum extracellular vesicles (PfEVs) are enriched in proteins found within the exomembrane compartments of infected erythrocytes such as Maurer's clefts (MCs), as well as the secretory endomembrane compartments in the apical end of the merozoites, suggesting that these proteins play a role in parasite-host interactions. Comparison of this novel clinically relevant dataset with previously published datasets helps to define a core secretome present in Plasmodium EVs.

    Conclusions: P. falciparum extracellular vesicles contain virulence-associated parasite proteins. Therefore, analysis of PfEV contents from a range of clinical isolates, together with their functional validation, may improve our understanding of the virulence mechanisms of the parasite, and potentially identify targets for interventions or diagnostics.

    Wellcome open research 2017;2;50

  • Rapid identification of genes controlling virulence and immunity in malaria parasites.

    Abkallo HM, Martinelli A, Inoue M, Ramaprasad A, Xangsayarath P, Gitaka J, Tang J, Yahata K, Zoungrana A, Mitaka H, Acharjee A, Datta PP, Hunt P, Carter R, Kaneko O, Mustonen V, Illingworth CJR, Pain A and Culleton R

    Malaria Unit, Department of Pathology, Institute of Tropical Medicine, Nagasaki University, Nagasaki, Japan.

    Identifying the genetic determinants of phenotypes that impact disease severity is of fundamental importance for the design of new interventions against malaria. Here we present a rapid genome-wide approach capable of identifying multiple genetic drivers of medically relevant phenotypes within malaria parasites via a single experiment at single gene or allele resolution. In a proof of principle study, we found that a previously undescribed single nucleotide polymorphism in the binding domain of the erythrocyte binding like protein (EBL) conferred a dramatic change in red blood cell invasion in mutant rodent malaria parasites Plasmodium yoelii. In the same experiment, we implicated merozoite surface protein 1 (MSP1) and other polymorphic proteins, as the major targets of strain-specific immunity. Using allelic replacement, we provide functional validation of the substitution in the EBL gene controlling the growth rate in the blood stages of the parasites.

    PLoS pathogens 2017;13;7;e1006447

  • Phylogenetic characterisation of circulating, clinical influenza isolates from Bali, Indonesia: preliminary report from the BaliMEI project.

    Adisasmito W, Budayanti SN, Aisyah DN, Gallo Cassarino T, Rudge JW, Watson SJ, Kozlakidis Z, Smith GJD and Coker R

    Universitas Indonesia, Depok, Indonesia.

    Background: Human influenza represents a major public health concern, especially in south-east Asia where the risk of emergence and spread of novel influenza viruses is particularly high. The BaliMEI study aims to conduct five-year active surveillance and characterisation of influenza viruses in Bali using an extensive network of participating healthcare facilities.

    Methods: Samples were collected during routine diagnostic treatment in healthcare facilities. In addition to standard clinical and molecular methods for influenza typing, next generation sequencing and subsequent de novo genome assembly were performed to investigate the phylogeny of the collected patient samples.

    Results: The samples collected are characteristic of the seasonally circulating influenza viruses with indications of phylogenetic links to other samples characterised in neighbouring countries during the same time period.

    Conclusions: There were some strong phylogenetic links with sequences from samples collected in geographically proximal regions, with some of the samples from the same time period resulting in small clusters at the tree end points. However, this work, which is the first of its kind completely performed within Indonesia, supports the view that the circulating seasonal influenza in Bali reflects the strains circulating in geographically neighbouring areas, as would be expected to occur within a busy regional transit centre.

    BMC infectious diseases 2017;17;1;583

  • Enhanced Nasopharyngeal Infection and Shedding Associated with an Epidemic Lineage of emm3 group A Streptococcus.

    Afshar B, Turner CE, Lamagni T, Smith K, Al-Shahib A, Underwood A, Holden MTG, Efstratiou A and Sriskandan S

    Department of Medicine, Imperial College London, London, United Kingdom.

    Background: A group A Streptococcus (GAS) lineage of genotype emm3, sequence type 15 (ST15) was associated with a six-month upsurge in invasive GAS disease in the UK. The epidemic lineage (Lineage C) had lost two typical emm3 prophages, Φ315.1 and Φ315.2 associated with the superantigen ssa, but gained a different prophage (ΦUK-M3.1) associated with a different superantigen, speC, and a DNase, spd1.

    Methods and Results: The presence of speC and spd1 in Lineage C ST15 strains enhanced both in vitro mitogenic and DNase activities over non-Lineage C ST15 strains. Invasive disease models in Galleria mellonella and SPEC-sensitive transgenic mice revealed no difference in overall invasiveness of Lineage C ST15 strains compared to non-Lineage C ST15 strains, consistent with clinical and epidemiological analysis. Lineage C strains did, however, markedly prolong murine nasal infection with enhanced nasal and airborne shedding compared to non-Lineage C strains. Deletion of speC or spd1 in two Lineage C strains identified a possible role for spd1 in airborne shedding from the murine nasopharynx.

    Conclusions: Nasopharyngeal infection and shedding of Lineage C strains were enhanced compared to non-Lineage C strains and this was, in part, mediated by the gain of the DNase spd1 through prophage acquisition.

    Virulence 2017;0

  • Embedding gender equality into institutional strategy.

    Ahmed S

    Wellcome Trust Sanger Institute, Human Genetics, Cambridge, Cambridgeshire, UK.

    The SiS (Sex in Science) Programme on the WGC (Wellcome Genome Campus) was established in 2011. Key participants include the Wellcome Trust Sanger Institute, EMBL-EBI (EMBL-European Bioinformatics Institute), Open Targets and Elixir. The key objectives are to catalyse cultural change, develop partnerships, communicate activities and champion our women in science work at a national and international level. In this paper, we highlight some of the many initiatives that have taken place since 2013 to address gender inequality at the highest levels, the challenges we have faced and how we have overcome these, and the future direction of travel.

    Global health, epidemiology and genomics 2017;2;e5

  • The Helicase Aquarius/EMB-4 Is Required to Overcome Intronic Barriers to Allow Nuclear RNAi Pathways to Heritably Silence Transcription.

    Akay A, Di Domenico T, Suen KM, Nabih A, Parada GE, Larance M, Medhi R, Berkyurek AC, Zhang X, Wedeles CJ, Rudolph KLM, Engelhardt J, Hemberg M, Ma P, Lamond AI, Claycomb JM and Miska EA

    Wellcome Trust Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK; Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK.

    Small RNAs play a crucial role in genome defense against transposable elements and guide Argonaute proteins to nascent RNA transcripts to induce co-transcriptional gene silencing. However, the molecular basis of this process remains unknown. Here, we identify the conserved RNA helicase Aquarius/EMB-4 as a direct and essential link between small RNA pathways and the transcriptional machinery in Caenorhabditis elegans. Aquarius physically interacts with the germline Argonaute HRDE-1. Aquarius is required to initiate small-RNA-induced heritable gene silencing. HRDE-1 and Aquarius silence overlapping sets of genes and transposable elements. Surprisingly, removal of introns from a target gene abolishes the requirement for Aquarius, but not HRDE-1, for small RNA-dependent gene silencing. We conclude that Aquarius allows small RNA pathways to compete for access to nascent transcripts undergoing co-transcriptional splicing in order to detect and silence transposable elements. Thus, Aquarius and HRDE-1 act as gatekeepers coordinating gene expression and genome defense.

    Funded by: CIHR: MOP-274660; Cancer Research UK: C13474/A18583, C6946/A14492; European Research Council: 260688; NIGMS NIH HHS: R01 GM113242, R01 GM122080; Wellcome Trust: 092096/Z/10/Z, 104640/Z/14/Z, 108058/Z/15/Z

    Developmental cell 2017;42;3;241-255.e6

  • Genetic association analysis identifies variants associated with disease progression in primary sclerosing cholangitis.

    Alberts R, de Vries EMG, Goode EC, Jiang X, Sampaziotis F, Rombouts K, Böttcher K, Folseraas T, Weismüller TJ, Mason AL, Wang W, Alexander G, Alvaro D, Bergquist A, Björkström NK, Beuers U, Björnsson E, Boberg KM, Bowlus CL, Bragazzi MC, Carbone M, Chazouillères O, Cheung A, Dalekos G, Eaton J, Eksteen B, Ellinghaus D, Färkkilä M, Festen EAM, Floreani A, Franceschet I, Gotthardt DN, Hirschfield GM, Hoek BV, Holm K, Hohenester S, Hov JR, Imhann F, Invernizzi P, Juran BD, Lenzen H, Lieb W, Liu JZ, Marschall HU, Marzioni M, Melum E, Milkiewicz P, Müller T, Pares A, Rupp C, Rust C, Sandford RN, Schramm C, Schreiber S, Schrumpf E, Silverberg MS, Srivastava B, Sterneck M, Teufel A, Vallier L, Verheij J, Vila AV, Vries B, Zachou K, International PSC Study Group, The UK PSC Consortium, Chapman RW, Manns MP, Pinzani M, Rushbrook SM, Lazaridis KN, Franke A, Anderson CA, Karlsen TH, Ponsioen CY and Weersma RK

    Department of Gastroenterology and Hepatology, University of Groningen and University Medical Centre Groningen, Groningen, The Netherlands.

    Objective: Primary sclerosing cholangitis (PSC) is a genetically complex, inflammatory bile duct disease of largely unknown aetiology often leading to liver transplantation or death. Little is known about the genetic contribution to the severity and progression of PSC. The aim of this study is to identify genetic variants associated with PSC disease progression and development of complications.

    Design: We collected standardised PSC subphenotypes in a large cohort of 3402 patients with PSC. After quality control, we combined 130 422 single nucleotide polymorphisms of all patients, obtained using the Illumina immunochip, with their disease subphenotypes. Using logistic regression and Cox proportional hazards models, we identified genetic variants associated with binary and time-to-event PSC subphenotypes.

    Results: We identified genetic variant rs853974 to be associated with liver transplant-free survival (p=6.07×10⁻⁹). Kaplan-Meier survival analysis showed a 50.9% (95% CI 41.5% to 59.5%) transplant-free survival for homozygous AA allele carriers of rs853974 compared with 72.8% (95% CI 69.6% to 75.7%) for GG carriers at 10 years after PSC diagnosis. For the candidate gene in the region, RSPO3, we demonstrated expression in key liver-resident effector cells, such as human and murine cholangiocytes and human hepatic stellate cells.

    Conclusion: We present a large international PSC cohort, and report genetic loci associated with PSC disease progression. For liver transplant-free survival, we identified a genome-wide significant signal and demonstrated expression of the candidate gene RSPO3 in key liver-resident effector cells. This warrants further assessments of the role of this potential key PSC modifier gene.

    Gut 2017

  • Genetic markers associated with dihydroartemisinin-piperaquine failure in Plasmodium falciparum malaria in Cambodia: a genotype-phenotype association study.

    Amato R, Lim P, Miotto O, Amaratunga C, Dek D, Pearson RD, Almagro-Garcia J, Neal AT, Sreng S, Suon S, Drury E, Jyothi D, Stalker J, Kwiatkowski DP and Fairhurst RM

    Wellcome Trust Sanger Institute, Hinxton, UK; Centre for Genomics and Global Health, Wellcome Trust Centre for Human Genetics, Oxford, UK.

    Background: As the prevalence of artemisinin-resistant Plasmodium falciparum malaria increases in the Greater Mekong subregion, emerging resistance to partner drugs in artemisinin combination therapies seriously threatens global efforts to treat and eliminate this disease. Molecular markers that predict failure of artemisinin combination therapy are urgently needed to monitor the spread of partner drug resistance, and to recommend alternative treatments in southeast Asia and beyond.

    Methods: We did a genome-wide association study of 297 P falciparum isolates from Cambodia to investigate the relationship of 11 630 exonic single-nucleotide polymorphisms (SNPs) and 43 copy number variations (CNVs) with in-vitro piperaquine 50% inhibitory concentrations (IC50s), and tested whether these genetic variants are markers of treatment failure with dihydroartemisinin-piperaquine. We then did a survival analysis of 133 patients to determine whether candidate molecular markers predicted parasite recrudescence following dihydroartemisinin-piperaquine treatment.

    Findings: Piperaquine IC50s increased significantly from 2011 to 2013 in three Cambodian provinces (2011 vs 2013 median IC50s: 20·0 nmol/L [IQR 13·7-29·0] vs 39·2 nmol/L [32·8-48·1] for Ratanakiri, 19·3 nmol/L [15·1-26·2] vs 66·2 nmol/L [49·9-83·0] for Preah Vihear, and 19·6 nmol/L [11·9-33·9] vs 81·1 nmol/L [61·3-113·1] for Pursat; all p≤10⁻³; Kruskal-Wallis test). Genome-wide analysis of SNPs identified a chromosome 13 region that associates with raised piperaquine IC50s. A non-synonymous SNP (encoding a Glu415Gly substitution) in this region, within a gene encoding an exonuclease, associates with parasite recrudescence following dihydroartemisinin-piperaquine treatment. Genome-wide analysis of CNVs revealed that a single copy of the mdr1 gene on chromosome 5 and a novel amplification of the plasmepsin 2 and plasmepsin 3 genes on chromosome 14 also associate with raised piperaquine IC50s. After adjusting for covariates, both exo-E415G and plasmepsin 2-3 markers significantly associate (p=3·0 × 10⁻⁸ and p=1·7 × 10⁻⁷, respectively) with decreased treatment efficacy (survival rates 0·38 [95% CI 0·25-0·51] and 0·41 [0·28-0·53], respectively).

    Interpretation: The exo-E415G SNP and plasmepsin 2-3 amplification are markers of piperaquine resistance and dihydroartemisinin-piperaquine failures in Cambodia, and can help monitor the spread of these phenotypes into other countries of the Greater Mekong subregion, and elucidate the mechanism of piperaquine resistance. Since plasmepsins are involved in the parasite's haemoglobin-to-haemozoin conversion pathway, targeted by related antimalarials, plasmepsin 2-3 amplification probably mediates piperaquine resistance.

    Funding: Intramural Research Program of the US National Institute of Allergy and Infectious Diseases, National Institutes of Health, Wellcome Trust, Bill & Melinda Gates Foundation, Medical Research Council, and UK Department for International Development.

    Funded by: Medical Research Council: G0600718, MR/M006212/1; Wellcome Trust

    The Lancet. Infectious diseases 2017;17;2;164-173

  • Adipocyte Accumulation in the Bone Marrow during Obesity and Aging Impairs Stem Cell-Based Hematopoietic and Bone Regeneration.

    Ambrosi TH, Scialdone A, Graja A, Gohlke S, Jank AM, Bocian C, Woelk L, Fan H, Logan DW, Schürmann A, Saraiva LR and Schulz TJ

    German Institute of Human Nutrition Potsdam-Rehbrücke, 14558 Nuthetal, Germany.

    Aging and obesity induce ectopic adipocyte accumulation in bone marrow cavities. This process is thought to impair osteogenic and hematopoietic regeneration. Here we specify the cellular identities of the adipogenic and osteogenic lineages of the bone. While aging impairs the osteogenic lineage, high-fat diet feeding activates expansion of the adipogenic lineage, an effect that is significantly enhanced in aged animals. We further describe a mesenchymal sub-population with stem cell-like characteristics that gives rise to both lineages and, at the same time, acts as a principal component of the hematopoietic niche by promoting competitive repopulation following lethal irradiation. Conversely, bone-resident cells committed to the adipocytic lineage inhibit hematopoiesis and bone healing, potentially by producing excessive amounts of Dipeptidyl peptidase-4, a protease that is a target of diabetes therapies. These studies delineate the molecular identity of the bone-resident adipocytic lineage, and they establish its involvement in age-dependent dysfunction of bone and hematopoietic regeneration.

    Cell stem cell 2017

  • The OncoArray Consortium: A Network for Understanding the Genetic Architecture of Common Cancers.

    Amos CI, Dennis J, Wang Z, Byun J, Schumacher FR, Gayther SA, Casey G, Hunter DJ, Sellers TA, Gruber SB, Dunning AM, Michailidou K, Fachal L, Doheny K, Spurdle AB, Li Y, Xiao X, Romm J, Pugh E, Coetzee GA, Hazelett DJ, Bojesen SE, Caga-Anan C, Haiman CA, Kamal A, Luccarini C, Tessier D, Vincent D, Bacot F, Van Den Berg DJ, Nelson S, Demetriades S, Goldgar DE, Couch FJ, Forman JL, Giles GG, Conti DV, Bickeböller H, Risch A, Waldenberger M, Brüske-Hohlfeld I, Hicks BD, Ling H, McGuffog L, Lee A, Kuchenbaecker K, Soucy P, Manz J, Cunningham JM, Butterbach K, Kote-Jarai Z, Kraft P, FitzGerald L, Lindström S, Adams M, McKay JD, Phelan CM, Benlloch S, Kelemen LE, Brennan P, Riggan M, O'Mara TA, Shen H, Shi Y, Thompson DJ, Goodman MT, Nielsen SF, Berchuck A, Laboissiere S, Schmit SL, Shelford T, Edlund CK, Taylor JA, Field JK, Park SK, Offit K, Thomassen M, Schmutzler R, Ottini L, Hung RJ, Marchini J, Amin Al Olama A, Peters U, Eeles RA, Seldin MF, Gillanders E, Seminara D, Antoniou AC, Pharoah PD, Chenevix-Trench G, Chanock SJ, Simard J and Easton DF

    Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire.

    Background: Common cancers develop through a multistep process often including inherited susceptibility. Collaboration among multiple institutions, and funding from multiple sources, has allowed the development of an inexpensive genotyping microarray, the OncoArray. The array includes a genome-wide backbone, comprising 230,000 SNPs tagging most common genetic variants, together with dense mapping of known susceptibility regions, rare variants from sequencing experiments, pharmacogenetic markers, and cancer-related traits.

    Methods: The OncoArray can be genotyped using a novel technology developed by Illumina to facilitate efficient genotyping. The consortium developed standard approaches for selecting SNPs for study, for quality control of markers, and for ancestry analysis. The array was genotyped at selected sites and with prespecified replicate samples to permit evaluation of genotyping accuracy among centers and by ethnic background.

    Results: The OncoArray consortium genotyped 447,705 samples. A total of 494,763 SNPs passed quality control steps, with a sample success rate of 97%. Participating sites performed ancestry analysis using a common set of markers and a scoring algorithm based on principal components analysis.

    Conclusions: Results from these analyses will enable researchers to identify new susceptibility loci, perform fine-mapping of new or known loci associated with either single or multiple cancers, assess the degree of overlap in cancer causation and pleiotropic effects of loci that have been identified for disease-specific risk, and jointly model genetic, environmental, and lifestyle-related exposures.

    Impact: Ongoing analyses will shed light on etiology and risk assessment for many types of cancer.

    Funded by: Cancer Research UK: 10118, 10124, 11174; NCI NIH HHS: P30 CA008748, P30 CA014089, P30 CA015083, P30 CA023108, P30 CA138313, P50 CA116201, P50 CA136393, R01 CA081488, R01 CA122443, R01 CA133996, R01 CA136924, R01 CA149429, R01 CA190182, R01 CA192393, R25 CA134286, U01 CA196386, U19 CA148065, U19 CA148107, U19 CA148112, U19 CA148127, U19 CA148537, UM1 CA164920, UM1 CA167551; NIGMS NIH HHS: P20 GM103534; NIH HHS: S10 OD020069; NLM NIH HHS: T32 LM012204; World Health Organization: 001

    Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology 2017;26;1;126-135

  • mRNA processing in mutant zebrafish lines generated by chemical and CRISPR-mediated mutagenesis produces unexpected transcripts that escape nonsense-mediated decay.

    Anderson JL, Mulligan TS, Shen MC, Wang H, Scahill CM, Tan FJ, Du SJ, Busch-Nentwich EM and Farber SA

    Carnegie Institution for Science, Department of Embryology, Baltimore, Maryland, United States of America.

    As model organism-based research shifts from forward to reverse genetics approaches, largely due to the ease of genome editing technology, a low frequency of abnormal phenotypes is being observed in lines with mutations predicted to lead to deleterious effects on the encoded protein. In zebrafish, this low frequency is in part explained by compensation by genes of redundant or similar function, often resulting from the additional round of teleost-specific whole genome duplication within vertebrates. Here we offer additional explanations for the low frequency of mutant phenotypes. We analyzed mRNA processing in seven zebrafish lines with mutations expected to disrupt gene function, generated by CRISPR/Cas9 or ENU mutagenesis methods. Five of the seven lines showed evidence of altered mRNA processing: one through a skipped exon that did not lead to a frame shift, one through nonsense-associated splicing that did not lead to a frame shift, and three through the use of cryptic splice sites. These results highlight the need for a methodical analysis of the mRNA produced in mutant lines before making conclusions or embarking on studies that assume loss of function as a result of a given genomic change. Furthermore, recognition of the types of adaptations that can occur may inform the strategies of mutant generation.

    PLoS genetics 2017;13;11;e1007105

  • One-step generation of conditional and reversible gene knockouts.

    Andersson-Rolf A, Mustata RC, Merenda A, Kim J, Perera S, Grego T, Andrews K, Tremble K, Silva JC, Fink J, Skarnes WC and Koo BK

    Wellcome Trust-Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge, UK.

    Loss-of-function studies are key for investigating gene function, and CRISPR technology has made genome editing widely accessible in model organisms and cells. However, conditional gene inactivation in diploid cells is still difficult to achieve. Here, we present CRISPR-FLIP, a strategy that provides an efficient, rapid and scalable method for biallelic conditional gene knockouts in diploid or aneuploid cells, such as pluripotent stem cells, 3D organoids and cell lines, by co-delivery of CRISPR-Cas9 and a universal conditional intronic cassette.

    Funded by: Medical Research Council: MC_PC_12009; Wellcome Trust

    Nature methods 2017;14;3;287-289

  • Identifying cell populations with scRNASeq.

    Andrews TS and Hemberg M

    Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, UK.

    Single-cell RNASeq (scRNASeq) has emerged as a powerful method for quantifying the transcriptome of individual cells. However, the data from scRNASeq experiments is often both noisy and high dimensional, making the computational analysis non-trivial. Here we provide an overview of different experimental protocols and the most popular methods for facilitating the computational analysis. We focus on approaches for identifying biologically important genes, projecting data into lower dimensions and clustering data into putative cell-populations. Finally we discuss approaches to validation and biological interpretation of the identified cell-types or cell-states.

    Molecular aspects of medicine 2017

  • DeepCpG: accurate prediction of single-cell DNA methylation states using deep learning.

    Angermueller C, Lee HJ, Reik W and Stegle O

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

    Recent technological advances have enabled DNA methylation to be assayed at single-cell resolution. However, current protocols are limited by incomplete CpG coverage and hence methods to predict missing methylation states are critical to enable genome-wide analyses. We report DeepCpG, a computational approach based on deep neural networks to predict methylation states in single cells. We evaluate DeepCpG on single-cell methylation data from five cell types generated using alternative sequencing protocols. DeepCpG yields substantially more accurate predictions than previous methods. Additionally, we show that the model parameters can be interpreted, thereby providing insights into how sequence composition affects methylation variability.

    Funded by: Wellcome Trust

    Genome biology 2017;18;1;67

  • Genetic diversity of the African malaria vector Anopheles gambiae.

    Anopheles gambiae 1000 Genomes Consortium, Data analysis group, Partner working group, Sample collections—Angola:, Burkina Faso:, Cameroon:, Gabon:, Guinea:, Guinea-Bissau:, Kenya:, Uganda:, Crosses:, Sequencing and data production, Web application development and Project coordination

    The sustainability of malaria control in Africa is threatened by the rise of insecticide resistance in Anopheles mosquitoes, which transmit the disease. To gain a deeper understanding of how mosquito populations are evolving, here we sequenced the genomes of 765 specimens of Anopheles gambiae and Anopheles coluzzii sampled from 15 locations across Africa, and identified over 50 million single nucleotide polymorphisms within the accessible genome. These data revealed complex population structure and patterns of gene flow, with evidence of ancient expansions, recent bottlenecks, and local variation in effective population size. Strong signals of recent selection were observed in insecticide-resistance genes, with several sweeps spreading over large geographical distances and between species. The design of new tools for mosquito control using gene-drive systems will need to take account of high levels of genetic diversity in natural mosquito populations.

    Funded by: Medical Research Council: G0600718, G1002624, G1100339, MR/M006212/1; NIAID NIH HHS: R01 AI082734, U19 AI089674; NIGMS NIH HHS: R01 GM117241; Wellcome Trust: 090532/Z/09/Z, 090770/Z/09/Z, 098051

    Nature 2017;552;7683;96-100

  • Molecular markers for artemisinin and partner drug resistance in natural Plasmodium falciparum populations following increased insecticide treated net coverage along the slope of mount Cameroon: cross-sectional study.

    Apinjoh TO, Mugri RN, Miotto O, Chi HF, Tata RB, Anchang-Kimbi JK, Fon EM, Tangoh DA, Nyingchu RV, Jacob C, Amato R, Djimde A, Kwiatkowski D, Achidi EA and Amambua-Ngwa A

    Department of Biochemistry and Molecular Biology, University of Buea, Buea, Cameroon.

    Background: Drug resistance is one of the greatest challenges of malaria control programmes, with the monitoring of parasite resistance to artemisinins or to Artemisinin Combination Therapy (ACT) partner drugs critical to elimination efforts. Markers of resistance to a wide panel of antimalarials were assessed in natural parasite populations from southwestern Cameroon.

    Methods: Individuals with asymptomatic parasitaemia or uncomplicated malaria were enrolled through cross-sectional surveys from May 2013 to March 2014 along the slope of mount Cameroon. Plasmodium falciparum malaria parasitaemic blood, screened by light microscopy, was depleted of leucocytes using CF11 cellulose columns and the parasite genotype ascertained by sequencing on the Illumina HiSeq platform.

    Results: A total of 259 participants were enrolled in this study from three different altitudes. While some alleles associated with drug resistance in pfdhfr, pfmdr1 and pfcrt were highly prevalent, less than 3% of all samples carried mutations in the pfkelch13 gene, none of which were amongst those associated with slow artemisinin parasite clearance rates in Southeast Asia. The most prevalent haplotypes were triple mutants Pfdhfr I51R59N108I164 (99%) and Pfcrt C72V73I74E75T76 (47.3%), and single mutants Pfdhps S436G437K540A581A613 (69%) and Pfmdr1 N86F184D1246 (53.2%).

    Conclusions: The predominance of the Pfcrt CVIET and Pfdhfr IRN triple mutant parasites and absence of pfkelch13 resistance alleles suggest that the amodiaquine and pyrimethamine components of AS-AQ and SP may no longer be effective in their role while chloroquine resistance still persists in southwestern Cameroon.

    Infectious diseases of poverty 2017;6;1;136

  • Rare Variant, Gene-Based Association Study of Hereditary Melanoma Using Whole-Exome Sequencing.

    Artomov M, Stratigos AJ, Kim I, Kumar R, Lauss M, Reddy BY, Miao B, Robles-Espinoza CD, Sankar A, Njauw CN, Shannon K, Gragoudas ES, Lane AM, Iyer V, Newton-Bishop JA, Bishop DT, Holland EA, Mann GJ, Singh T, Daly MJ and Tsao H

    MGH Analytic and Translational Genetics Unit, MGH and Broad Institute, Boston, MA.

    Background: Extraordinary progress has been made in our understanding of common variants in many diseases, including melanoma. Because the contribution of rare coding variants is not as well characterized, we performed an exome-wide, gene-based association study of familial cutaneous melanoma (CM) and ocular melanoma (OM).

    Methods: Using 11 990 jointly processed individual DNA samples, whole-exome sequencing was performed, followed by large-scale joint variant calling using GATK (Genome Analysis ToolKit). PLINK/SEQ was used for statistical analysis of genetic variation. Four models were used to estimate the association among different types of variants. In vitro functional validation was performed using three human melanoma cell lines in 2D and 3D proliferation assays. In vivo tumor growth was assessed using xenografts of human A375 melanoma cells in nude mice (eight mice per group). All statistical tests were two-sided.

    Results: Strong signals were detected for CDKN2A (Pmin = 6.16 × 10⁻⁸) in the CM cohort (n = 273) and BAP1 (Pmin = 3.83 × 10⁻⁶) in the OM (n = 99) cohort. Eleven genes that exhibited borderline association (P < 10⁻⁴) were independently validated using The Cancer Genome Atlas melanoma cohort (379 CM, 47 OM) and a matched set of 3563 European controls with CDKN2A (P = .009), BAP1 (P = .03), and EBF3 (P = 4.75 × 10⁻⁴), a candidate risk locus, all showing evidence of replication. EBF3 was then evaluated using germline data from a set of 132 familial melanoma cases and 4769 controls of UK origin (joint P = 1.37 × 10⁻⁵). Somatically, loss of EBF3 expression correlated with progression, poorer outcome, and high MITF tumors. Functionally, induction of EBF3 in melanoma cells reduced cell growth in vitro, retarded tumor formation in vivo, and reduced MITF levels.

    Conclusions: The results of this large rare variant germline association study further define the mutational landscape of hereditary melanoma and implicate EBF3 as a possible CM predisposition gene.

    Funded by: NCI NIH HHS: K24 CA149202

    Journal of the National Cancer Institute 2017;109;12

  • A two-stage inter-rater approach for enrichment testing of variants associated with multiple traits.

    Asimit JL, Payne F, Morris AP, Cordell HJ and Barroso I

    Wellcome Trust Sanger Institute, Hinxton, UK.

    Shared genetic aetiology may explain the co-occurrence of diseases in individuals more often than expected by chance. On identifying associated variants shared between two traits, one objective is to determine whether such overlap may be explained by specific genomic characteristics (eg, functional annotation). In clinical studies, inter-rater agreement approaches assess concordance among expert opinions on the presence/absence of a complex disease for each subject. We adapt a two-stage inter-rater agreement model to the genetic association setting to identify features predictive of overlap variants, while accounting for their marginal trait associations. The resulting corrected overlap and marginal enrichment test (COMET) also assesses enrichment at the individual trait level. Multiple categories may be tested simultaneously and the method is computationally efficient, not requiring permutations to assess significance. In an extensive simulation study, COMET identifies features predictive of enrichment with high power and has well-calibrated type I error. In contrast, testing for overlap with a single-trait enrichment test has inflated type I error. COMET is applied to three glycaemic traits using a set of functional annotation categories as predictors, followed by further analyses that focus on tissue-specific regulatory variants. The results support previous findings that regulatory variants in pancreatic islets are enriched for fasting glucose-associated variants, and give insight into differences/similarities between characteristics of variants associated with glycaemic traits. Also, despite regulatory variants in pancreatic islets being enriched for variants that are marginally associated with fasting glucose and fasting insulin, there is no enrichment of shared variants between the traits.

    Funded by: Medical Research Council: MR/K021486/1; Wellcome Trust: 098017, 098051, 102858

    European journal of human genetics : EJHG 2017;25;3;341-349

  • Single-cell RNA-sequencing uncovers transcriptional states and fate decisions in haematopoiesis.

    Athanasiadis EI, Botthof JG, Andres H, Ferreira L, Lio P and Cvejic A

    Department of Haematology, University of Cambridge, Cambridge, CB2 0XY, UK.

    The success of marker-based approaches for dissecting haematopoiesis in mouse and human is reliant on the presence of well-defined cell surface markers specific for diverse progenitor populations. An inherent problem with this approach is that the presence of specific cell surface markers does not directly reflect the transcriptional state of a cell. Here, we used a marker-free approach to computationally reconstruct the blood lineage tree in zebrafish and order cells along their differentiation trajectory, based on their global transcriptional differences. Within the population of transcriptionally similar stem and progenitor cells, our analysis reveals considerable cell-to-cell differences in their probability to transition to another committed state. Once fate decision is executed, the suppression of transcription of ribosomal genes and upregulation of lineage-specific factors coordinately controls lineage differentiation. Evolutionary analysis further demonstrates that this haematopoietic programme is highly conserved between zebrafish and higher vertebrates.

    Funded by: Cancer Research UK: C45041/A14953; Medical Research Council: MC_PC_12009; Wellcome Trust

    Nature communications 2017;8;1;2045

  • Science Forum: The Human Cell Atlas

    Aviv Regev, Sarah A Teichmann, Eric S Lander, Ido Amit, Christophe Benoist, Ewan Birney, Bernd Bodenmiller, Peter Campbell, Piero Carninci, Menna Clatworthy, Hans Clevers, Bart Deplancke, Ian Dunham, James Eberwine, Roland Eils, Wolfgang Enard, Andrew Farmer, Lars Fugger, Berthold Göttgens, Nir Hacohen, Muzlifah Haniffa, Martin Hemberg, Seung Kim, Paul Klenerman, Arnold Kriegstein, Ed Lein, Sten Linnarsson, Emma Lundberg, Joakim Lundeberg, Partha Majumder, John C Marioni, Miriam Merad, Musa Mhlanga, Martijn Nawijn, Mihai Netea, Garry Nolan, Dana Pe'er, Anthony Phillipakis, Chris P Ponting, Stephen Quake, Wolf Reik, Orit Rozenblatt-Rosen, Joshua Sanes, Rahul Satija, Ton N Schumacher, Alex Shalek, Ehud Shapiro, Padmanee Sharma, Jay W Shin, Oliver Stegle, Michael Stratton, Michael J T Stubbington, Fabian J Theis, Matthias Uhlen, Alexander van Oudenaarden, Allon Wagner, Fiona Watt, Jonathan Weissman, Barbara Wold, Ramnik Xavier, Nir Yosef and Human Cell Atlas Meeting Participants

    Advances in techniques for analysing single cells and tissues have inspired an international effort to create comprehensive reference maps of all human cells - the fundamental units of life - as a basis for both understanding human health and diagnosing, monitoring and treating disease.

    eLife 2017

  • Heterogeneity of the Epstein-Barr virus major internal repeat reveals evolutionary mechanisms of EBV and a functional defect in the prototype EBV strain B95-8.

    Ba Abdullah M, Palermo R, Palser A, Grayson NE, Kellam P, Correia S, Szymula A and White R

    Section of Virology, Imperial College Faculty of Medicine, St Mary's Hospital, London, UK.

    Epstein-Barr virus (EBV) is a ubiquitous pathogen of humans that can cause several types of lymphoma and carcinoma. Like other herpesviruses, EBV has diversified both through co-evolution with its host, and genetic exchange between virus strains. Sequence analysis of the EBV genome is unusually challenging, because of the large number and length of repeat regions within the virus. Here we describe the sequence assembly and analysis of the large internal repeat of EBV (IR1 or BamW repeats) from over 70 strains. Diversity of the latency protein EBNA-LP resides predominantly within the exons downstream of IR1. The integrity of the putative BWRF1 ORF is retained in over 80% of strains, and deletions truncating IR1 always spare BWRF1. Conserved regions include the IR1 latency promoter (Wp), and one zone upstream of and two within BWRF1. IR1 is heterogeneous in 70% of strains, and this heterogeneity arises from sequence exchange between strains as well as spontaneous mutation, with inter-strain recombination more common in tumour-derived viruses. This genetic exchange often incorporates regions of <1kb, and allelic gene conversion changes the frequency of small regions within the repeat, but not close to the flanks. These observations suggest that IR1 - and by extension EBV - diversifies through both recombination and breakpoint repair, while concerted evolution of IR1 is driven by gene conversion of small regions. Finally, the prototype EBV strain B95-8 contains four non-consensus variants within a single IR1 repeat unit, including a STOP codon in EBNA-LP. Repairing IR1 improves EBNA-LP levels and the quality of transformation by the B95-8 BAC.

    Importance: Epstein-Barr virus (EBV) infects the majority of the world population, but only causes illness in a small minority. Nevertheless, over 1% of cancers worldwide are attributable to EBV. Recent sequencing projects investigating virus diversity, to see if different strains have different disease impacts, have excluded regions of repeating sequence, as they are more technically challenging. Here we analyse the sequence of the largest repeat in EBV (IR1). We first characterised the variations in protein sequences encoded across IR1. In studying variations within the repeat of each strain, we identified a mutation in the main laboratory strain of EBV that impairs virus function, and suggest that tumour-associated viruses may be more likely to contain DNA mixed from two strains. Patterns of this mixing suggest that sequences can spread between strains (and also within the repeat) by copying sequence from another strain (or repeat unit) to repair DNA damage.

    Journal of virology 2017

  • Differentiation dynamics of mammary epithelial cells revealed by single-cell RNA sequencing.

    Bach K, Pensa S, Grzelak M, Hadfield J, Adams DJ, Marioni JC and Khaled WT

    Department of Pharmacology, University of Cambridge, Cambridge, CB2 1PD, UK.

    Characterising the hierarchy of mammary epithelial cells (MECs) and how they are regulated during adult development is important for understanding how breast cancer arises. Here we report the use of single-cell RNA sequencing to determine the gene expression profile of MECs across four developmental stages: nulliparous, mid gestation, lactation and post involution. Our analysis of 23,184 cells identifies 15 clusters, few of which could be fully characterised by a single marker gene. We argue instead that the epithelial cells, especially in the luminal compartment, should rather be conceptualised as being part of a continuous spectrum of differentiation. Furthermore, our data support the existence of a common luminal progenitor cell giving rise to intermediate, restricted alveolar and hormone-sensing progenitors. This luminal progenitor compartment undergoes transcriptional changes in response to a full pregnancy, lactation and involution. In summary, our results provide a global, unbiased view of adult mammary gland development.

    Funded by: Cancer Research UK: C47525/A17348

    Nature communications 2017;8;1;2128

  • Whole genome sequencing of Shigella sonnei through PulseNet Latin America and Caribbean: advancing global surveillance of foodborne illnesses.

    Baker KS, Campos J, Pichel M, Della Gaspera A, Duarte-Martínez F, Campos-Chacón E, Bolaños-Acuña HM, Guzman-Verri C, Mather AE, Velasco SD, Zamudio Rojas ML, Forbester J, Connor TR, Keddy KH, Smith AM, Lopez de Delgado EA, Angiolillo G, Cuaical N, Fernandez J, Aguayo C, Aguilar MM, Valenzuela C, Morales Medrano AJ, Esteve AS, Gustafson NW, Diaz Guevara PL, Montaño LA, Perez E and Thomson NR

    University of Liverpool, Department of Functional and Comparative Genomics, Liverpool, United Kingdom, L69 7ZB; Wellcome Trust Sanger Institute, Pathogen Variation Programme, Hinxton, United Kingdom, CB10 1SA.

    Objective: Shigella sonnei is a globally-important diarrhoeal pathogen tracked through the surveillance network PulseNet Latin America and Caribbean (PNLA&C), which participates in PulseNet International. PNLA&C laboratories use common molecular techniques to track pathogens causing foodborne illness. We aimed to demonstrate the possibility and advantages of transitioning to whole genome sequencing (WGS) for surveillance within existing networks across a continent where S. sonnei is endemic.

    Methods: We applied WGS to representative archive isolates of S. sonnei (n=323) from laboratories in nine PNLA&C countries to generate a regional phylogenomic reference for S. sonnei and put this in the global context. We used this reference to contextualise 16 S. sonnei from three Argentinian outbreaks, using locally-generated sequence data. Assembled genome sequences were used to predict antimicrobial resistance (AMR) phenotypes and identify AMR determinants.

    Results: S. sonnei isolates clustered in five Latin American sublineages in the global phylogeny, with many (46%, 149 of 323) belonging to previously undescribed sublineages. Predicted multiple drug resistance was common (77%, 249 of 323) and clinically-relevant differences in AMR were found among sublineages. The regional overview showed that Argentinian outbreak isolates belonged to distinct sublineages and had different epidemiological origins.

    Conclusions: Latin America contains novel genetic diversity of S. sonnei that is relevant on a global scale and commonly exhibits multiple drug resistance. Retrospective passive surveillance with WGS has utility for informing treatment, identifying regionally-epidemic sublineages and providing a framework for interpretation of prospective, locally-sequenced outbreaks.

    Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases 2017

  • Phylogeography of human Y-chromosome haplogroup Q3-L275 from an academic/citizen science collaboration.

    Balanovsky O, Gurianov V, Zaporozhchenko V, Balaganskaya O, Urasin V, Zhabagin M, Grugni V, Canada R, Al-Zahery N, Raveane A, Wen SQ, Yan S, Wang X, Zalloua P, Marafi A, Koshel S, Semino O, Tyler-Smith C and Balanovska E

    Vavilov Institute of General Genetics, Moscow, Russia.

    Background: The Y-chromosome haplogroup Q has three major branches: Q1, Q2, and Q3. Q1 is found in both Asia and the Americas where it accounts for about 90% of indigenous Native American Y-chromosomes; Q2 is found in North and Central Asia; but little is known about the third branch, Q3, also named Q1b-L275. Here, we combined the efforts of population geneticists and genetic genealogists to use the potential of full Y-chromosome sequencing for reconstructing haplogroup Q3 phylogeography and suggest possible linkages to events in population history.

    Results: We analyzed 47 fully sequenced Y-chromosomes and reconstructed the haplogroup Q3 phylogenetic tree in detail. Haplogroup Q3-L275, derived from the oldest known split within Eurasian/American haplogroup Q, most likely occurred in West or Central Asia in the Upper Paleolithic period. During the Mesolithic and Neolithic epochs, Q3 remained a minor component of the West Asian Y-chromosome pool and gave rise to five branches (Q3a to Q3e), which spread across West, Central and parts of South Asia. Around 3-4 millennia ago (Bronze Age), the Q3a branch underwent a rapid expansion, splitting into seven branches, some of which entered Europe. One of these branches, Q3a1, was acquired by a population ancestral to Ashkenazi Jews and grew within this population during the 1st millennium AD, reaching up to 5% in present day Ashkenazi.

    Conclusions: This study dataset was generated by a massive Y-chromosome genotyping effort in the genetic genealogy community, and phylogeographic patterns were revealed by a collaboration of population geneticists and genetic genealogists. This positive experience of collaboration between academic and citizen science provides a model for further joint projects. Merging data and skills of academic and citizen science promises to combine, respectively, quality and quantity, generalization and specialization, and achieve a well-balanced and careful interpretation of the paternal-side history of human populations.

    BMC evolutionary biology 2017;17;Suppl 1;18

  • Compound heterozygous variants in NBAS as a cause of atypical osteogenesis imperfecta.

    Balasubramanian M, Hurst J, Brown S, Bishop NJ, Arundel P, DeVile C, Pollitt RC, Crooks L, Longman D, Caceres JF, Shackley F, Connolly S, Payne JH, Offiah AC, Hughes D, DDD Study, Parker MJ, Hide W and Skerry TM

    Sheffield Clinical Genetics Service, Sheffield Children's NHS Foundation Trust, UK; Highly Specialised Service for Severe, Complex and Atypical OI, UK.

    Background: Osteogenesis imperfecta (OI), the commonest inherited bone fragility disorder, affects 1 in 15,000 live births resulting in frequent fractures and reduced mobility, with significant impact on quality of life. Early diagnosis is important, as therapeutic advances can lead to improved clinical outcome and patient benefit.

    Report: Whole exome sequencing in patients with OI identified, in two patients with a multi-system phenotype, compound heterozygous variants in NBAS (neuroblastoma amplified sequence). Patient 1: NBAS c.5741G>A p.(Arg1914His); c.3010C>T p.(Arg1004*) in a 10-year-old boy with significant short stature, bone fragility requiring treatment with bisphosphonates, developmental delay and immunodeficiency. Patient 2: NBAS c.5741G>A p.(Arg1914His); c.2032C>T p.(Gln678*) in a 5-year-old boy with similar presenting features, bone fragility, mild developmental delay, abnormal liver function tests and immunodeficiency.

    Discussion: Homozygous missense NBAS variants cause SOPH syndrome (short stature; optic atrophy; Pelger-Huet anomaly); the same missense variant was found in our patients on one allele, with a nonsense variant on the other allele. Recent literature suggests a multi-system phenotype. In this study, patient fibroblasts have shown reduced collagen expression compared to control cells, and RNAseq studies in bone cells show that NBAS is expressed in osteoblasts and osteocytes of rodents and primates. These findings provide proof-of-concept that NBAS mutations have mechanistic effects in bone, and that NBAS variants are a novel cause of bone fragility, which is distinguishable from 'Classical' OI.

    Conclusions: Here we report on variants in NBAS, as a cause of bone fragility in humans, and expand the phenotypic spectrum associated with NBAS. We explore the mechanism underlying NBAS and the striking skeletal phenotype in our patients.

    Funded by: Department of Health UK; Medical Research Council: MC_PC_15018, MC_PC_U127584479; Wellcome Trust: WT098051

    Bone 2017;94;65-74

  • Chitayat syndrome: hyperphalangism, characteristic facies, hallux valgus and bronchomalacia results from a recurrent c.266A>G p.(Tyr89Cys) variant in the <i>ERF</i> gene.

    Balasubramanian M, Lord H, Levesque S, Guturu H, Thuriot F, Sillon G, Wenger AM, Sureka DL, Lester T, Johnson DS, Bowen J, Calhoun AR, Viskochil DH, DDD Study, Bejerano G, Bernstein JA and Chitayat D

    Sheffield Clinical Genetics Service, Sheffield Children's NHS Foundation Trust, Sheffield, UK.

    Background: In 1993, Chitayat <i>et al.</i> reported a newborn with hyperphalangism, facial anomalies, and bronchomalacia. We identified three additional families with similar findings. Features include bilateral accessory phalanx resulting in shortened index fingers; hallux valgus; distinctive face; respiratory compromise.

    Objectives: To identify the genetic aetiology of Chitayat syndrome and identify a unifying cause for this specific form of hyperphalangism.

    Methods: Through ongoing collaboration, we had collected patients with a strikingly similar phenotype. Trio-based exome sequencing was first performed in Patient 2 through the Deciphering Developmental Disorders study. Proband-only exome sequencing had previously been independently performed in Patient 4. Following identification of a candidate gene variant in Patient 2, the same variant was subsequently confirmed from exome data in Patient 4. Sanger sequencing was used to validate this variant in Patients 1 and 3 and to confirm paternal inheritance in Patient 5.

    Results: A recurrent, novel variant NM_006494.2:c.266A>G p.(Tyr89Cys) in <i>ERF</i> was identified in five affected individuals: de novo (patient 1, 2 and 3) and inherited from an affected father (patient 4 and 5). p.Tyr89Cys is an aromatic polar neutral to polar neutral amino acid substitution, at a highly conserved position and lies within the functionally important ETS-domain of the protein. The recurrent <i>ERF</i> c.266A>G p.(Tyr89Cys) variant causes Chitayat syndrome.

    Discussion: <i>ERF</i> variants have previously been associated with complex craniosynostosis. In contrast, none of the patients with the c.266A>G p.(Tyr89Cys) variant have craniosynostosis.

    Conclusions: We report the molecular aetiology of Chitayat syndrome and discuss potential mechanisms for this distinctive phenotype associated with the p.Tyr89Cys substitution in <i>ERF</i>.

    Funded by: NIMH NIH HHS: U01 MH105949; Wellcome Trust: WT098051

    Journal of medical genetics 2017;54;3;157-165

  • Delineating the phenotypic spectrum of Bainbridge-Ropers syndrome: 12 new patients with <i>de novo</i>, heterozygous, loss-of-function mutations in <i>ASXL3</i> and review of published literature.

    Balasubramanian M, Willoughby J, Fry AE, Weber A, Firth HV, Deshpande C, Berg JN, Chandler K, Metcalfe KA, Lam W, Pilz DT and Tomkins S

    Sheffield Clinical Genetics Service, Sheffield Children's NHS Foundation Trust, Sheffield, UK.

    Background: Bainbridge-Ropers syndrome (BRPS) is a recently described developmental disorder caused by <i>de novo</i> truncating mutations in the additional sex combs like 3 (<i>ASXL3</i>) gene. To date, there have been fewer than 10 reported patients.

    Objectives: Here, we delineate the BRPS phenotype further by describing a series of 12 previously unreported patients identified by the Deciphering Developmental Disorders study.

    Methods: Trio-based exome sequencing was performed on all 12 patients included in this study, which found a <i>de novo</i> truncating mutation in <i>ASXL3</i>. Detailed phenotypic information and patient images were collected and summarised as part of this study.

    Results: By obtaining genotype:phenotype data, we have been able to demonstrate a second mutation cluster region within <i>ASXL3</i>. This report expands the phenotype of older patients with BRPS; common emerging features include severe intellectual disability (11/12), poor/absent speech (12/12), autistic traits (9/12), distinct face (arched eyebrows, prominent forehead, high-arched palate, hypertelorism and downslanting palpebral fissures) (9/12), hypotonia (11/12) and significant feeding difficulties (9/12) when young.

    Discussion: Similarities in the patients reported previously in comparison with this cohort included their distinctive craniofacial features, feeding problems, absent/limited speech and intellectual disability. Shared behavioural phenotypes include autistic traits, hand-flapping, rocking, aggressive behaviour and sleep disturbance.

    Conclusions: This series expands the phenotypic spectrum of this severe disorder and highlights its surprisingly high frequency. With the advent of advanced genomic screening, we are likely to identify more variants in this gene presenting with a variable phenotype, which this study will explore.

    Funded by: Wellcome Trust: WT098051

    Journal of medical genetics 2017;54;8;537-543

  • Promoter-bound METTL3 maintains myeloid leukaemia by m<sup>6</sup>A-dependent translation control.

    Barbieri I, Tzelepis K, Pandolfini L, Shi J, Millán-Zambrano G, Robson SC, Aspris D, Migliori V, Bannister AJ, Han N, De Braekeleer E, Ponstingl H, Hendrick A, Vakoc CR, Vassiliou GS and Kouzarides T

    The Gurdon Institute and Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK.

    N<sup>6</sup>-methyladenosine (m<sup>6</sup>A) is an abundant internal RNA modification in both coding and non-coding RNAs that is catalysed by the METTL3-METTL14 methyltransferase complex. However, the specific role of these enzymes in cancer is still largely unknown. Here we define a pathway that is specific for METTL3 and is implicated in the maintenance of a leukaemic state. We identify METTL3 as an essential gene for growth of acute myeloid leukaemia cells in two distinct genetic screens. Downregulation of METTL3 results in cell cycle arrest, differentiation of leukaemic cells and failure to establish leukaemia in immunodeficient mice. We show that METTL3, independently of METTL14, associates with chromatin and localizes to the transcriptional start sites of active genes. The vast majority of these genes have the CCAAT-box binding protein CEBPZ present at the transcriptional start site, and this is required for recruitment of METTL3 to chromatin. Promoter-bound METTL3 induces m<sup>6</sup>A modification within the coding region of the associated mRNA transcript, and enhances its translation by relieving ribosome stalling. We show that genes regulated by METTL3 in this way are necessary for acute myeloid leukaemia. Together, these data define METTL3 as a regulator of a chromatin-based pathway that is necessary for maintenance of the leukaemic state and identify this enzyme as a potential therapeutic target for acute myeloid leukaemia.

    Funded by: Cancer Research UK: A17001, A23015; European Research Council: 268569; Medical Research Council: MC_PC_12009; Wellcome Trust: 092096, 095663, 098051, C6946/AI4492, WT095663MA

    Nature 2017;552;7683;126-131

  • Evaluation of applicability of DNA microarray-based characterization of bovine Shiga toxin-producing Escherichia coli isolates using whole genome sequence analysis.

    Barth SA, Menge C, Eichhorn I, Semmler T, Pickard D and Geue L

    Friedrich-Loeffler-Institut/Federal Research Institute for Animal Health, Institute of Molecular Pathogenesis, Jena, Germany (Barth, Menge, Geue).

    We assessed the ability of a commercial DNA microarray to characterize bovine Shiga toxin-producing Escherichia coli (STEC) isolates and evaluated the results using in silico hybridization of the microarray probes within whole genome sequencing scaffolds. From a total of 69,954 reactions (393 probes with 178 isolates), 68,706 (98.2%) gave identical results by DNA microarray and in silico probe hybridization. Results were more congruent when detecting the genoserotype (209 differing results from 19,758 in total; 1.1%) or antimicrobial resistance genes (AMRGs; 141 of 26,878; 0.5%) than when detecting virulence-associated genes (VAGs; 876 of 22,072; 4.0%). Owing to the limited coverage of O-antigens by the microarray, only 37.2% of the isolates could be genoserotyped. However, the microarray proved suitable to rapidly screen bovine STEC strains for the occurrence of high numbers of VAGs and AMRGs and is suitable for molecular surveillance workflows.

    Journal of veterinary diagnostic investigation : official publication of the American Association of Veterinary Laboratory Diagnosticians, Inc 2017;29;5;721-724

  • Accurate characterization of the IFITM locus using MiSeq and PacBio sequencing shows genetic variation in Galliformes.

    Bassano I, Ong SH, Lawless N, Whitehead T, Fife M and Kellam P

    The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Background: Interferon inducible transmembrane (IFITM) proteins are effectors of the immune system widely characterized for their role in restricting infection by diverse enveloped and non-enveloped viruses. The chicken IFITM (chIFITM) genes are clustered on chromosome 5 and to date four genes have been annotated, namely chIFITM1, chIFITM3, chIFITM5 and chIFITM10. However, due to poor assembly of this locus in the Gallus gallus v4 genome, accurate characterization has so far proven problematic. Recently, a new chicken reference genome assembly, Gallus gallus v5, was generated using Sanger, 454, Illumina and PacBio sequencing technologies, identifying considerable differences in the chIFITM locus over the previous genome releases.

    Methods: We re-sequenced the locus using both Illumina MiSeq and PacBio RS II sequencing technologies and we mapped RNA-seq data from the European Nucleotide Archive (ENA) to this finalized chIFITM locus. Using SureSelect capture probes designed to the finalized chIFITM locus, we sequenced the locus of a different chicken breed, namely a White Leghorn, and a turkey.

    Results: We confirmed the Gallus gallus v5 consensus except for two insertions of 5 and 1 base pair within the chIFITM3 and B4GALNT4 genes, respectively, and a single base pair deletion within the B4GALNT4 gene. The pull down revealed a single amino acid substitution of A63V in the CIL domain of IFITM2 compared to Red Jungle fowl and 13, 13 and 11 differences between IFITM1, 2 and 3 of chickens and turkeys, respectively. RNA-seq shows chIFITM2 and chIFITM3 expression in numerous tissue types of different chicken breeds and avian cell lines, while the expression of the putative chIFITM1 is limited to the testis, caecum and ileum tissues.

    Conclusions: Locus resequencing using these capture probes and RNA-seq based expression analysis will allow the further characterization of genetic diversity within Galliformes.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/L00397X/2, BB/L00397X/1, BB/L003996/1

    BMC genomics 2017;18;1;419

  • Editing the genome of hiPSC with CRISPR/Cas9: disease models.

    Bassett AR

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    The advent of human-induced pluripotent stem cell (hiPSC) technology has provided a unique opportunity to establish cellular models of disease from individual patients, and to study the effects of the underlying genetic aberrations upon multiple different cell types, many of which would not normally be accessible. Combining this with recent advances in genome editing techniques such as the clustered regularly interspaced short palindromic repeat (CRISPR) system has provided an ability to repair putative causative alleles in patient lines, or introduce disease alleles into a healthy "WT" cell line. This has enabled analysis of isogenic cell pairs that differ in a single genetic change, which allows a thorough assessment of the molecular and cellular phenotypes that result from this abnormality. Importantly, this establishes the true causative lesion, which is often impossible to ascertain from human genetic studies alone. These isogenic cell lines can be used not only to understand the cellular consequences of disease mutations, but also to perform high throughput genetic and pharmacological screens to both understand the underlying pathological mechanisms and to develop novel therapeutic agents to prevent or treat such diseases. In the future, optimising and developing such genetic manipulation technologies may facilitate the provision of cellular or molecular gene therapies, to intervene and ultimately cure many debilitating genetic disorders.

    Funded by: Wellcome: Core funding; Wellcome Trust

    Mammalian genome : official journal of the International Mammalian Genome Society 2017;28;7-8;348-364

  • A Family Based Study of Carbon Monoxide and Nitric Oxide Signalling Genes and Preeclampsia.

    Bauer AE, Avery CL, Shi M, Weinberg CR, Olshan AF, Harmon QE, Luo J, Yang J, Manuck TA, Wu MC, Williams N, McGinnis R, Morgan L, Klungsøyr K, Trogstad L, Magnus P and Engel SM

    Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA.

    Background: Preeclampsia is thought to originate during placentation, with incomplete remodelling and perfusion of the spiral arteries leading to reduced placental vascular capacity. Nitric oxide (NO) and carbon monoxide (CO) are powerful vasodilators that play a role in the placental vascular system. Although family clustering of preeclampsia has been observed, the existing genetic literature is limited by a failure to consider both mother and child.

    Methods: We conducted a nested case-control study within the Norwegian Mother and Child Birth Cohort of 1545 case-pairs and 995 control-pairs from 2540 validated dyads (2011 complete pairs, 529 missing mother or child genotype). We selected 1518 single-nucleotide polymorphisms (SNPs) with minor allele frequency >5% in NO and CO signalling pathways. We used log-linear Poisson regression models and likelihood ratio tests to assess maternal and child effects.

    Results: One SNP met criteria for a false discovery rate Q-value <0.05. The child variant, rs12547243 in adenylate cyclase 8 (ADCY8), was associated with an increased risk (relative risk [RR] 1.42, 95% confidence interval [CI] 1.20, 1.69 for AG vs. GG, RR 2.14, 95% CI 1.47, 3.11 for AA vs. GG, Q = 0.03). The maternal variant, rs30593 in PDE1C was associated with a decreased risk for the subtype of preeclampsia accompanied by early delivery (RR 0.45, 95% CI 0.27, 0.75 for TC vs. CC; Q = 0.02). None of the associations were replicated after correction for multiple testing.

    Conclusions: This study uses a novel approach to disentangle maternal and child genotypic effects of NO and CO signalling genes on preeclampsia.

    Paediatric and perinatal epidemiology 2017

  • Evolution of complexity in the zebrafish synapse proteome.

    Bayés À, Collins MO, Reig-Viader R, Gou G, Goulding D, Izquierdo A, Choudhary JS, Emes RD and Grant SG

    Molecular Physiology of the Synapse Laboratory, Biomedical Research Institute Sant Pau (IIB Sant Pau), Sant Antoni Maria Claret 167, 08025 Barcelona, Spain.

    The proteome of human brain synapses is highly complex and is mutated in over 130 diseases. This complexity arose from two whole-genome duplications early in the vertebrate lineage. Zebrafish are used in modelling human diseases; however, their synapse proteome is uncharacterized, and whether the teleost-specific genome duplication (TSGD) influenced complexity is unknown. We report the characterization of the proteomes and ultrastructure of central synapses in zebrafish and analyse the importance of the TSGD. While the TSGD increases overall synapse proteome complexity, the postsynaptic density (PSD) proteome of zebrafish has lower complexity than that of mammals. A highly conserved set of ∼1,000 proteins is shared across vertebrates. PSD ultrastructural features are also conserved. Lineage-specific proteome differences indicate that vertebrate species evolved distinct synapse types and functions. The data sets are a resource for a wide range of studies and have important implications for the use of zebrafish in modelling human synaptic diseases.

    Nature communications 2017;8;14613

  • The evolving craniofacial phenotype of a patient with Sensenbrenner syndrome caused by IFT140 compound heterozygous mutations.

    Bayat A, Kerr B, Douzgou S and DDD Study

    Department of Pediatrics, University Hospital of Hvidovre, Hvidovre, Denmark; Manchester Centre for Genomic Medicine, St Mary's Hospital, Central Manchester University Hospitals NHS Foundation Trust, Manchester Academic Health Sciences Centre; School of Biological Sciences, Division of Evolution and Genomic Sciences, University of Manchester, Manchester; Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.

    Clinical dysmorphology 2017;26;4;247-251

  • Recurrent mutation of IGF signalling genes and distinct patterns of genomic rearrangement in osteosarcoma.

    Behjati S, Tarpey PS, Haase K, Ye H, Young MD, Alexandrov LB, Farndon SJ, Collord G, Wedge DC, Martincorena I, Cooke SL, Davies H, Mifsud W, Lidgren M, Martin S, Latimer C, Maddison M, Butler AP, Teague JW, Pillay N, Shlien A, McDermott U, Futreal PA, Baumhoer D, Zaikova O, Bjerkehagen B, Myklebost O, Amary MF, Tirabosco R, Van Loo P, Stratton MR, Flanagan AM and Campbell PJ

    Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    Osteosarcoma is a primary malignancy of bone that affects children and adults. Here, we present the largest sequencing study of osteosarcoma to date, comprising 112 childhood and adult tumours encompassing all major histological subtypes. A key finding of our study is the identification of mutations in insulin-like growth factor (IGF) signalling genes in 8/112 (7%) of cases. We validate this observation using fluorescence in situ hybridization (FISH) in an additional 87 osteosarcomas, with IGF1 receptor (IGF1R) amplification observed in 14% of tumours. These findings may inform patient selection in future trials of IGF1R inhibitors in osteosarcoma. Analysing patterns of mutation, we identify distinct rearrangement profiles including a process characterized by chromothripsis and amplification. This process operates recurrently at discrete genomic regions and generates driver mutations. It may represent an age-independent mutational mechanism that contributes to the development of osteosarcoma in children and adults alike.

    Funded by: Medical Research Council: MR/N005813/1; Wellcome Trust

    Nature communications 2017;8;15936

  • Prospects of Fine-Mapping Trait-Associated Genomic Regions by Using Summary Statistics from Genome-wide Association Studies.

    Benner C, Havulinna AS, Järvelin MR, Salomaa V, Ripatti S and Pirinen M

    Institute for Molecular Medicine Finland, University of Helsinki, 00014 Helsinki, Finland; Department of Public Health, University of Helsinki, 00014 Helsinki, Finland.

    During the past few years, various novel statistical methods have been developed for fine-mapping with the use of summary statistics from genome-wide association studies (GWASs). Although these approaches require information about the linkage disequilibrium (LD) between variants, there has not been a comprehensive evaluation of how estimation of the LD structure from reference genotype panels performs in comparison with that from the original individual-level GWAS data. Using population genotype data from Finland and the UK Biobank, we show here that a reference panel of 1,000 individuals from the target population is adequate for a GWAS cohort of up to 10,000 individuals, whereas smaller panels, such as those from the 1000 Genomes Project, should be avoided. We also show, both theoretically and empirically, that the size of the reference panel needs to scale with the GWAS sample size; this has important consequences for the application of these methods in ongoing GWAS meta-analyses and large biobank studies. We conclude by providing software tools and by recommending practices for sharing LD information to more efficiently exploit summary statistics in genetics research.

    American journal of human genetics 2017

  • Citrobacter rodentium Subverts ATP Flux and Cholesterol Homeostasis in Intestinal Epithelial Cells In Vivo.

    Berger CN, Crepin VF, Roumeliotis TI, Wright JC, Carson D, Pevsner-Fischer M, Furniss RCD, Dougan G, Bachash M, Yu L, Clements A, Collins JW, Elinav E, Larrouy-Maumus GJ, Choudhary JS and Frankel G

    MRC Centre for Molecular Bacteriology and Infection, Department of Life Sciences, Imperial College London, London, UK.

    The intestinal epithelial cells (IECs) that line the gut form a robust line of defense against ingested pathogens. We investigated the impact of infection with the enteric pathogen Citrobacter rodentium on mouse IEC metabolism using global proteomic and targeted metabolomics and lipidomics. The major signatures of the infection were upregulation of the sugar transporter Sglt4, aerobic glycolysis, and production of phosphocreatine, which mobilizes cytosolic energy. In contrast, biogenesis of mitochondrial cardiolipins, essential for ATP production, was inhibited, which coincided with increased levels of mucosal O2 and a reduction in colon-associated anaerobic commensals. In addition, IECs responded to infection by activating Srebp2 and the cholesterol biosynthetic pathway. Unexpectedly, infected IECs also upregulated the cholesterol efflux proteins AbcA1, AbcG8, and ApoA1, resulting in higher levels of fecal cholesterol and a bloom of Proteobacteria. These results suggest that C. rodentium manipulates host metabolism to evade innate immune responses and establish a favorable gut ecosystem.

    Cell metabolism 2017

  • Paleolithic networking.

    Bergström A and Tyler-Smith C

    The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.

    Funded by: Wellcome Trust

    Science (New York, N.Y.) 2017;358;6363;586-587

  • A Neolithic expansion, but strong genetic structure, in the independent history of New Guinea.

    Bergström A, Oppenheimer SJ, Mentzer AJ, Auckland K, Robson K, Attenborough R, Alpers MP, Koki G, Pomat W, Siba P, Xue Y, Sandhu MS and Tyler-Smith C

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    New Guinea shows human occupation since ~50 thousand years ago (ka), independent adoption of plant cultivation ~10 ka, and great cultural and linguistic diversity today. We performed genome-wide single-nucleotide polymorphism genotyping on 381 individuals from 85 language groups in Papua New Guinea and find a sharp divide originating 10 to 20 ka between lowland and highland groups and a lack of non-New Guinean admixture in the latter. All highlanders share ancestry within the last 10 thousand years, with major population growth in the same period, suggesting population structure was reshaped following the Neolithic lifestyle transition. However, genetic differentiation between groups in Papua New Guinea is much stronger than in comparable regions in Eurasia, demonstrating that such a transition does not necessarily limit the genetic and linguistic diversity of human societies.

    Funded by: European Research Council: 294557; Wellcome Trust: 090532, 098051, 106289

    Science (New York, N.Y.) 2017;357;6356;1160-1163

  • Cross-Species Y Chromosome Function Between Malaria Vectors of the Anopheles gambiae Species Complex.

    Bernardini F, Galizi R, Wunderlich M, Taxiarchi C, Kranjc N, Kyrou K, Hammond A, Nolan T, Lawniczak MNK, Papathanos PA, Crisanti A and Windbichler N

    Department of Life Sciences, Imperial College London, South Kensington Campus, SW7 2AZ, United Kingdom.

    Y chromosome function, structure and evolution are poorly understood in many species, including the Anopheles genus of mosquitoes, an emerging model system for studying speciation that also represents the major vectors of malaria. While the Anopheline Y had previously been implicated in male mating behavior, recent data from the Anopheles gambiae complex suggest that, apart from the putative primary sex-determiner, no other genes are conserved on the Y. Studying the functional basis of the evolutionary divergence of the Y chromosome in the gambiae complex is complicated by complete F1 male hybrid sterility. Here, we used an F1 × F0 crossing scheme to overcome a severe bottleneck of male hybrid incompatibilities that enabled us to experimentally purify a genetically labeled A. gambiae Y chromosome in an A. arabiensis background. Whole genome sequencing (WGS) confirmed that the A. gambiae Y retained its original sequence content in the A. arabiensis genomic background. In contrast to comparable experiments in Drosophila, we find that the presence of a heterospecific Y chromosome has no significant effect on the expression of A. arabiensis genes, and transcriptional differences can be explained almost exclusively as a direct consequence of transcripts arising from sequence elements present on the A. gambiae Y chromosome itself. We find that Y hybrids show no obvious fertility defects, and no substantial reduction in male competitiveness. Our results demonstrate that, despite their radically different structure, Y chromosomes of these two species of the gambiae complex that diverged an estimated 1.85 MYA function interchangeably, thus indicating that the Y chromosome does not harbor loci contributing to hybrid incompatibility. Therefore, Y chromosome gene flow between members of the gambiae complex is possible even at their current level of divergence. Importantly, this also suggests that malaria control interventions based on sex-distorting Y drive would be transferable, whether intentionally or contingent, between the major malaria vector species.

    Funded by: European Research Council: 335724; Medical Research Council: G1100339; Wellcome Trust: 098051

    Genetics 2017;207;2;729-740

  • An endosiRNA-Based Repression Mechanism Counteracts Transposon Activation during Global DNA Demethylation in Embryonic Stem Cells.

    Berrens RV, Andrews S, Spensberger D, Santos F, Dean W, Gould P, Sharif J, Olova N, Chandra T, Koseki H, von Meyenn F and Reik W

    Epigenetics Programme, Babraham Institute, Cambridge CB22 3AT, UK; University of Cambridge, The Old Schools, Trinity Lane, Cambridge CB2 1TN, UK.

    Erasure of DNA methylation and repressive chromatin marks in the mammalian germline leads to risk of transcriptional activation of transposable elements (TEs). Here, we used mouse embryonic stem cells (ESCs) to identify an endosiRNA-based mechanism involved in suppression of TE transcription. In ESCs with DNA demethylation induced by acute deletion of Dnmt1, we saw an increase in sense transcription at TEs, resulting in an abundance of sense/antisense transcripts leading to high levels of ARGONAUTE2 (AGO2)-bound small RNAs. Inhibition of Dicer or Ago2 expression revealed that small RNAs are involved in an immediate response to demethylation-induced transposon activation, while the deposition of repressive histone marks follows as a chronic response. In vivo, we also found TE-specific endosiRNAs present during primordial germ cell development. Our results suggest that antisense TE transcription is a "trap" that elicits an endosiRNA response to restrain acute transposon activity during epigenetic reprogramming in the mammalian germline.

    Cell stem cell 2017;21;5;694-703.e7

  • Complexity and conservation of regulatory landscapes underlie evolutionary resilience of mammalian gene expression.

    Berthelot C, Villar D, Horvath JE, Odom DT and Flicek P

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

    To gain insight into how mammalian gene expression is controlled by rapidly evolving regulatory elements, we jointly analysed promoter and enhancer activity with downstream transcription levels in liver samples from 15 species. Genes associated with complex regulatory landscapes generally exhibit high expression levels that remain evolutionarily stable. While the number of regulatory elements is the key driver of transcriptional output and resilience, regulatory conservation matters: elements active across mammals most effectively stabilize gene expression. In contrast, recently evolved enhancers typically contribute weakly, consistent with their high evolutionary plasticity. These effects are observed across the entire mammalian clade and are robust to potential confounders, such as the gene expression level. Using liver as a representative somatic tissue, our results illuminate how the evolutionary stability of gene expression is profoundly entwined with both the number and conservation of surrounding promoters and enhancers.

    Nature ecology & evolution 2017

  • Cracking Ali Baba's code.

    Billker O

    Wellcome Trust Sanger Institute, Hinxton, United Kingdom.

    A protein called P36 holds the key to how different species of malaria parasite invade liver cells.

    eLife 2017;6

  • Variants of AbGRI3 carrying the armA gene in extensively antibiotic-resistant Acinetobacter baumannii from Singapore.

    Blackwell GA, Holt KE, Bentley SD, Hsu LY and Hall RM

    School of Life and Environmental Sciences, The University of Sydney, NSW 2006, Australia.

    Objectives: To investigate the context of the ribosomal RNA methyltransferase gene armA in carbapenem-resistant global clone 2 (GC2) Acinetobacter baumannii isolates from Singapore.

    Methods: Antibiotic resistance was determined using disc diffusion; PCR was used to identify resistance genes. Whole genome sequences were determined and contigs were assembled and ordered using PCR. Resistance regions in unsequenced isolates were mapped.

    Results: Fifteen GC2 A. baumannii isolated at Singapore General Hospital over the period 2004-11 and found to carry the armA gene were resistant to carbapenems, third-generation cephalosporins, fluoroquinolones and most aminoglycosides. In these isolates, the armA gene was located in a third chromosomal resistance island, previously designated AbGRI3. In four isolates, armA was in a 19 kb IS26-bounded transposon, designated Tn6180. In three of them, a 2.7 kb transposon carrying the aphA1b gene, designated Tn6179, was found adjacent to and sharing an IS26 with Tn6180. However, in these four isolates a 3.1 kb segment of the adjacent chromosomal DNA has been inverted by an IS26-mediated event. The remaining 11 isolates all contained a derivative of Tn6180 that had lost part of the central segment and only one retained Tn6179. The chromosomal inversion was present in four of these and in seven the deletion extended beyond the inversion into adjacent chromosomal DNA. AbGRI3 forms were found in available GC2 sequences carrying armA.

    Conclusions: In GC2 A. baumannii, the armA gene is located in various forms of a third genomic resistance island named AbGRI3. An aphA1b transposon is variably present in AbGRI3.

    The Journal of antimicrobial chemotherapy 2017

  • Induction of Cell Cycle and NK Cell Responses by Live-Attenuated Oral Vaccines against Typhoid Fever.

    Blohmke CJ, Hill J, Darton TC, Carvalho-Burger M, Eustace A, Jones C, Schreiber F, Goodier MR, Dougan G, Nakaya HI and Pollard AJ

    Oxford Vaccine Group, Department of Paediatrics, University of Oxford, NIHR Oxford Biomedical Research Centre, Oxford, United Kingdom.

    The mechanisms by which oral, live-attenuated vaccines protect against typhoid fever are poorly understood. Here, we analyze transcriptional responses after vaccination with Ty21a or vaccine candidate, M01ZH09. Alterations in response profiles were related to vaccine-induced immune responses and subsequent outcome after wild-type Salmonella Typhi challenge. Despite broad genetic similarity, we detected differences in transcriptional responses to each vaccine. Seven days after M01ZH09 vaccination, marked cell cycle activation was identified and associated with humoral immunogenicity. By contrast, vaccination with Ty21a was associated with NK cell activity and validated in peripheral blood mononuclear cell stimulation assays confirming superior induction of an NK cell response. Moreover, transcriptional signatures of amino acid metabolism in Ty21a recipients were associated with protection against infection, including increased incubation time and decreased severity. Our data provide detailed insight into molecular immune responses to typhoid vaccines, which could aid the rational design of improved oral, live-attenuated vaccines against enteric pathogens.

    Funded by: Wellcome Trust

    Frontiers in immunology 2017;8;1276

  • Galleria mellonella is low cost and suitable surrogate host for studying virulence of human pathogenic Vibrio cholerae.

    Bokhari H, Ali A, Noreen Z, Thomson N and Wren BW

    Department of Biosciences, COMSATS Institute of Information Technology, Islamabad, Pakistan.

    Vibrio cholerae causes a severe diarrheal disease affecting millions of people worldwide, particularly in low-income countries. V. cholerae successfully persists in aquatic environments, and its pathogenic strains cause severe enteric disease in humans. This dual lifestyle contributes to its survival and persistence both inside the host gut and in the environment. Alternative animal replacement models are of great value in studying host-pathogen interaction and for quick screening of various pathogenic strains. One such model is Galleria mellonella, a wax moth with a complex innate immune system. Here we investigate its suitability as a model for clinical human isolates of O1 El Tor, Ogawa serotype, belonging to two genetically distinct subclades found in Pakistan (PSC-1 and PSC-2). We demonstrate that the PSC-2 strain D59, frequently isolated from inland areas, was more virulent than the PSC-1 strain K7, mainly isolated from coastal areas (p=0.0001). In addition, we compared the relative biofilm capability of the representative strains as an indicator of their survival and persistence in the environment; K7 showed enhanced biofilm-forming capability (p=0.004). Finally, we present the annotated genomes of strains D59 and K7 and compare them with the reference strain N16961.

    Gene 2017;628;1-7

  • Analysis of the genomic landscape of multiple myeloma highlights novel prognostic markers and disease subgroups.

    Bolli N, Biancon G, Moarii M, Gimondi S, Li Y, de Philippis C, Maura F, Sathiaseelan V, Tai YT, Mudie L, O'Meara S, Raine K, Teague JW, Butler AP, Carniti C, Gerstung M, Bagratuni T, Kastritis E, Dimopoulos M, Corradini P, Anderson K, Moreau P, Minvielle S, Campbell PJ, Papaemmanuil E, Avet-Loiseau H and Munshi NC

    University of Milan, Department of Oncology and Onco-Hematology, Milan, Italy.

    In multiple myeloma, next generation sequencing (NGS) has expanded our knowledge of genomic lesions, and highlighted a dynamic and heterogeneous composition of the tumor. Here, we used NGS to characterize the genomic landscape of 418 multiple myeloma cases at diagnosis and correlate this with prognosis and classification. Translocations and copy number changes (CNAs) had a preponderant contribution over gene mutations in defining the genotype and prognosis of each case. Known and novel independent prognostic markers were identified in our cohort of proteasome inhibitor and IMiD-treated patients with long follow-up, including events with context-specific prognostic value, such as deletions of the PRDM1 gene. Taking advantage of the comprehensive genomic annotation of each case, we used innovative statistical approaches to identify potential novel myeloma subgroups. We observed clusters of patients stratified based on the overall number of mutations and number/type of CNAs, with distinct effects on survival, suggesting that extended genotype of multiple myeloma at diagnosis may lead to improved disease classification and prognostication. Leukemia accepted article preview online, 06 December 2017. doi:10.1038/leu.2017.344.

    Funded by: BLRD VA: I01 BX001584; NCI NIH HHS: P01 CA155258; Wellcome Trust

    Leukemia 2017

  • The impact of rare and low-frequency genetic variants in common disease.

    Bomba L, Walter K and Soranzo N

    Human Genetics, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, CB10 1HH, UK.

    Despite thousands of genetic loci identified to date, a large proportion of genetic variation predisposing to complex disease and traits remains unaccounted for. Advances in sequencing technology enable focused explorations on the contribution of low-frequency and rare variants to human traits. Here we review experimental approaches and current knowledge on the contribution of these genetic variants in complex disease and discuss challenges and opportunities for personalised medicine.

    Funded by: Wellcome Trust

    Genome biology 2017;18;1;77

  • Revealing hidden complexities of genomic rearrangements generated with Cas9.

    Boroviak K, Fu B, Yang F, Doe B and Bradley A

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom.

    Modelling human diseases caused by large genomic rearrangements has become more accessible since the utilization of CRISPR/Cas9 in mammalian systems. In a previous study, we showed that genomic rearrangements of up to one million base pairs can be generated by direct injection of CRISPR/Cas9 reagents into mouse zygotes. Although these rearrangements are ascertained by junction PCR, we describe here a variety of unanticipated structural changes often involving reintegration of the region demarcated by the gRNAs in the vicinity of the edited locus. We illustrate here some of this diversity detected by high-resolution fibre-FISH and conclude that extensive molecular analysis is required to fully understand the structure of engineered chromosomes generated by Cas9.

    Funded by: NIH HHS: U42 OD011174; Wellcome Trust: WT206194

    Scientific reports 2017;7;1;12867

  • Whole Genome Sequencing for Surveillance of Antimicrobial Resistance in Actinobacillus pleuropneumoniae.

    Bossé JT, Li Y, Rogers J, Fernandez Crespo R, Li Y, Chaudhuri RR, Holden MT, Maskell DJ, Tucker AW, Wren BW, Rycroft AN and Langford PR

    Section of Paediatrics, Department of Medicine, Imperial College London, London, UK.

    The aim of this study was to evaluate the correlation between antimicrobial resistance (AMR) profiles of 96 clinical isolates of Actinobacillus pleuropneumoniae, an important porcine respiratory pathogen, and the identification of AMR genes in whole genome sequence (wgs) data. Susceptibility of the isolates to nine antimicrobial agents (ampicillin, enrofloxacin, erythromycin, florfenicol, sulfisoxazole, tetracycline, tilmicosin, trimethoprim, and tylosin) was determined by agar dilution susceptibility test. Except for the macrolides tested, elevated MICs were highly correlated to the presence of AMR genes identified in wgs data using ResFinder or BLASTn. Of the isolates tested, 57% were resistant to tetracycline [MIC ≥ 4 mg/L; 94.8% with either tet(B) or tet(H)], 48% to sulfisoxazole (MIC ≥ 256 mg/L or DD = 6; 100% with sul2), 20% to ampicillin (MIC ≥ 4 mg/L; 100% with blaROB-1), 17% to trimethoprim (MIC ≥ 32 mg/L; 100% with dfrA14), and 6% to enrofloxacin (MIC ≥ 0.25 mg/L; 100% with GyrAS83F). Only 33% of the isolates did not have detectable AMR genes, and were sensitive by MICs for the antimicrobial agents tested. Although 23 isolates had MIC ≥ 32 mg/L for tylosin, all isolates had MIC ≤ 16 mg/L for both erythromycin and tilmicosin, and no macrolide resistance genes or known point mutations were detected. Other than the GyrAS83F mutation, the AMR genes detected were mapped to potential plasmids. In addition to presence on plasmid(s), the tet(B) gene was also found chromosomally either as part of a 56 kb integrative conjugative element (ICEApl1) in 21, or as part of a Tn7 insertion in 15 isolates. Our results indicate that, with the exception of macrolides, wgs data can be used to accurately predict resistance of A. pleuropneumoniae to the tested antimicrobial agents and provide added value for routine surveillance.

    Frontiers in microbiology 2017;8;311

  • Loss of the homologous recombination gene rad51 leads to Fanconi anemia-like symptoms in zebrafish.

    Botthof JG, Bielczyk-Maczyńska E, Ferreira L and Cvejic A

    Department of Haematology, University of Cambridge, Addenbrookes Hospital, Cambridge CB2 0XY, United Kingdom.

    RAD51 is an indispensable homologous recombination protein, necessary for strand invasion and crossing over. It has recently been designated as a Fanconi anemia (FA) gene, following the discovery of two patients carrying dominant-negative mutations. FA is a hereditary DNA-repair disorder characterized by various congenital abnormalities, progressive bone marrow failure, and cancer predisposition. In this report, we describe a viable vertebrate model of RAD51 loss. Zebrafish rad51 loss-of-function mutants developed key features of FA, including hypocellular kidney marrow, sensitivity to cross-linking agents, and decreased size. We show that some of these symptoms stem from both decreased proliferation and increased apoptosis of embryonic hematopoietic stem and progenitor cells. Comutation of p53 was able to rescue the hematopoietic defects seen in the single mutants, but led to tumor development. We further demonstrate that prolonged inflammatory stress can exacerbate the hematological impairment, leading to an additional decrease in kidney marrow cell numbers. These findings strengthen the assignment of RAD51 as a Fanconi gene and provide more evidence for the notion that aberrant p53 signaling during embryogenesis leads to the hematological defects seen later in life in FA. Further research on this zebrafish FA model will lead to a deeper understanding of the molecular basis of bone marrow failure in FA and the cellular role of RAD51.

    Proceedings of the National Academy of Sciences of the United States of America 2017

  • Semantic prioritization of novel causative genomic variants.

    Boudellioua I, Mahamad Razali RB, Kulmanov M, Hashish Y, Bajic VB, Goncalves-Serra E, Schoenmakers N, Gkoutos GV, Schofield PN and Hoehndorf R

    King Abdullah University of Science and Technology, Computer, Electrical & Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, Thuwal, Saudi Arabia.

    Discriminating the causative disease variant(s) for individuals with inherited or de novo mutations presents one of the main challenges faced by the clinical genetics community today. Computational approaches for variant prioritization include machine learning methods utilizing a large number of features, including molecular information, interaction networks, or phenotypes. Here, we demonstrate the PhenomeNET Variant Predictor (PVP) system that exploits semantic technologies and automated reasoning over genotype-phenotype relations to filter and prioritize variants in whole exome and whole genome sequencing datasets. We demonstrate the performance of PVP in identifying causative variants on a large number of synthetic whole exome and whole genome sequences, covering a wide range of diseases and syndromes. In a retrospective study, we further illustrate the application of PVP for the interpretation of whole exome sequencing data in patients suffering from congenital hypothyroidism. We find that PVP accurately identifies causative variants in whole exome and whole genome sequencing datasets and provides a powerful resource for the discovery of causal variants.

    PLoS computational biology 2017;13;4;e1005500

  • Genome-wide chemical mutagenesis screens allow unbiased saturation of the cancer genome and identification of drug resistance mutations.

    Brammeld JS, Petljak M, Martincorena I, Williams SP, Alonso LG, Dalmases A, Bellosillo B, Robles-Espinoza CD, Price S, Barthorpe S, Tarpey P, Alifrangis C, Bignell G, Vidal J, Young J, Stebbings L, Beal K, Stratton MR, Saez-Rodriguez J, Garnett M, Montagut C, Iorio F and McDermott U

    Wellcome Trust Sanger Institute, Hinxton CB10 1SA, United Kingdom.

    Drug resistance is an almost inevitable consequence of cancer therapy and ultimately proves fatal for the majority of patients. In many cases, this is the consequence of specific gene mutations that have the potential to be targeted to resensitize the tumor. The ability to uniformly saturate the genome with point mutations without chromosome or nucleotide sequence context bias would open the door to identify all putative drug resistance mutations in cancer models. Here, we describe such a method for elucidating drug resistance mechanisms using genome-wide chemical mutagenesis allied to next-generation sequencing. We show that chemically mutagenizing the genome of cancer cells dramatically increases the number of drug-resistant clones and allows the detection of both known and novel drug resistance mutations. We used an efficient computational process that allows for the rapid identification of involved pathways and druggable targets. Such a priori knowledge would greatly empower serial monitoring strategies for drug resistance in the clinic as well as the development of trials for drug-resistant patients.

    Funded by: Cancer Research UK; Medical Research Council; Wellcome Trust

    Genome research 2017;27;4;613-625

  • Artificial and natural RNA interactions between bacteria and C. elegans.

    Braukmann F, Jordan D and Miska E

    Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, United Kingdom.

    Nineteen years after Lisa Timmons and Andy Fire first described RNA transfer from bacteria to C. elegans in an experimental setting [Timmons and Fire, 1998], the biological role of this trans-kingdom RNA-based communication remains unknown. Here we summarize our current understanding of the mechanism and potential role of such social RNA.

    RNA biology 2017;0

  • Longitudinal genomic surveillance of multidrug-resistant Escherichia coli carriage in a long-term care facility in the United Kingdom.

    Brodrick HJ, Raven KE, Kallonen T, Jamrozy D, Blane B, Brown NM, Martin V, Török ME, Parkhill J and Peacock SJ

    Department of Medicine, University of Cambridge, Box 157, Addenbrooke's Hospital, Hills Road, Cambridge, CB2 0QQ, UK.

    Background: Residents of long-term care facilities (LTCF) may have high carriage rates of multidrug-resistant pathogens, but are not currently included in surveillance programmes for antimicrobial resistance or healthcare-associated infections. Here, we describe the value derived from a longitudinal epidemiological and genomic surveillance study of drug-resistant Escherichia coli in a LTCF in the United Kingdom (UK).

    Methods: Forty-five of 90 (50%) residents were recruited and followed for six months in 2014. Participants were screened weekly for carriage of extended-spectrum beta-lactamase (ESBL) producing E. coli. Participants positive for ESBL E. coli were also screened for ESBL-negative E. coli. Phenotypic antibiotic susceptibility of E. coli was determined using the Vitek2 instrument and isolates were sequenced on an Illumina HiSeq2000 instrument. Information was collected on episodes of clinical infection and antibiotic consumption.

    Results: Seventeen of 45 participants (38%) carried ESBL E. coli. Twenty-three of the 45 participants (51%) had 63 documented episodes of clinical infection treated with antibiotics. Treatment with antibiotics was associated with higher risk of carrying ESBL E. coli. ESBL E. coli was mainly sequence type (ST)131 (16/17, 94%). Non-ESBL E. coli from these 17 cases was more genetically diverse, but ST131 was found in eight (47%) cases. Whole-genome analysis of 297 ST131 E. coli from the 17 cases demonstrated highly related strains from six participants, indicating acquisition from a common source or person-to-person transmission. Five participants carried highly related strains of both ESBL-positive and ESBL-negative ST131. Genome-based comparison of ST131 isolates from the LTCF study participants with ST131 associated with bloodstream infection at a nearby acute hospital and in hospitals across England revealed sharing of highly related lineages between the LTCF and a local hospital.

    Conclusions: This study demonstrates the power of genomic surveillance to detect multidrug-resistant pathogens and confirm their connectivity within a healthcare network.

    Genome medicine 2017;9;1;70

  • Targeting DNA Repair in Cancer: Beyond PARP Inhibitors.

    Brown JS, O'Carrigan B, Jackson SP and Yap TA

    Royal Marsden NHS Foundation Trust, London, United Kingdom.

    Germline aberrations in critical DNA-repair and DNA damage-response (DDR) genes cause cancer predisposition, whereas various tumors harbor somatic mutations causing defective DDR/DNA repair. The concept of synthetic lethality can be exploited in such malignancies, as exemplified by approval of poly(ADP-ribose) polymerase inhibitors for treating BRCA1/2-mutated ovarian cancers. Herein, we detail how cellular DDR processes engage various proteins that sense DNA damage, initiate signaling pathways to promote cell-cycle checkpoint activation, trigger apoptosis, and coordinate DNA repair. We focus on novel therapeutic strategies targeting promising DDR targets and discuss challenges of patient selection and the development of rational drug combinations.

    Significance: Various inhibitors of DDR components are in preclinical and clinical development. A thorough understanding of DDR pathway complexities must now be combined with strategies and lessons learned from the successful registration of PARP inhibitors in order to fully exploit the potential of DDR inhibitors and to ensure their long-term clinical success. Cancer Discov; 7(1); 20-37. ©2016 AACR.

    Cancer discovery 2017;7;1;20-37

  • Transmission of the gut microbiota: spreading of health.

    Browne HP, Neville BA, Forster SC and Lawley TD

    Host-Microbiota Interactions Laboratory, Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK.

    Transmission of commensal intestinal bacteria between humans could promote health by establishing, maintaining and replenishing microbial diversity in the microbiota of an individual. Unlike pathogens, the routes of transmission for commensal bacteria remain unappreciated and poorly understood, despite the likely commonalities between both. Consequently, broad infection control measures that are designed to prevent pathogen transmission and infection, such as oversanitation and the overuse of antibiotics, may inadvertently affect human health by altering normal commensal transmission. In this Review, we discuss the mechanisms and factors that influence host-to-host transmission of the intestinal microbiota and examine how a better understanding of these processes will identify new approaches to nurture and restore transmission routes that are used by beneficial bacteria.

    Funded by: Medical Research Council: MR/K000551/1; Wellcome Trust: 098051

    Nature reviews. Microbiology 2017;15;9;531-543

  • Expanding the clinical spectrum of recessive truncating mutations of KLHL7 to a Bohring-Opitz-like phenotype.

    Bruel AL, Bigoni S, Kennedy J, Whiteford M, Buxton C, Parmeggiani G, Wherlock M, Woodward G, Greenslade M, Williams M, St-Onge J, Ferlini A, Garani G, Ballardini E, van Bon BW, Acuna-Hidalgo R, Bohring A, Deleuze JF, Boland A, Meyer V, Olaso R, Ginglinger E, Study D, Rivière JB, Brunner HG, Hoischen A, Newbury-Ecob R, Faivre L, Thauvin-Robinet C and Thevenon J

    Inserm UMR 1231 GAD Team, Genetics of Developmental Anomalies, Université de Bourgogne-Franche Comté, Dijon, France.

    Background: Bohring-Opitz syndrome (BOS) is a rare genetic disorder characterised by a recognisable craniofacial appearance and a typical 'BOS' posture. BOS is caused by sporadic mutations of ASXL1. However, several typical patients with BOS have no molecular diagnosis, suggesting clinical and genetic heterogeneity.

    Objectives: To expand the phenotypical spectrum of autosomal recessive variants of KLHL7, reported as causing Crisponi syndrome/cold-induced sweating syndrome type 1 (CS/CISS1)-like syndrome.

    Methods: We performed whole-exome sequencing in two families with a suspected recessive mode of inheritance. We used the Matchmaker Exchange initiative to identify additional patients.

    Results: Here, we report six patients with microcephaly, facial dysmorphism, including exophthalmos, nevus flammeus of the glabella and joint contractures with a suspected BOS posture in five out of six patients. We identified autosomal recessive truncating mutations in the KLHL7 gene. KLHL7 encodes a BTB-kelch protein implicated in the cell cycle and in protein degradation by the ubiquitin-proteasome pathway. Recently, biallelic mutations in the KLHL7 gene were reported in four families and associated with CS/CISS1, characterised by clinical features overlapping with our patients.

    Conclusion: We have expanded the clinical spectrum of KLHL7 autosomal recessive variants by describing a syndrome with features overlapping CS/CISS1 and BOS.

    Journal of medical genetics 2017;54;12;830-835

  • f-scLVM: scalable and versatile factor analysis for single-cell RNA-seq.

    Buettner F, Pratanwanich N, McCarthy DJ, Marioni JC and Stegle O

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

    Single-cell RNA-sequencing (scRNA-seq) allows studying heterogeneity in gene expression in large cell populations. Such heterogeneity can arise due to technical or biological factors, making decomposing sources of variation difficult. We here describe f-scLVM (factorial single-cell latent variable model), a method based on factor analysis that uses pathway annotations to guide the inference of interpretable factors underpinning the heterogeneity. Our model jointly estimates the relevance of individual factors, refines gene set annotations, and infers factors without annotation. In applications to multiple scRNA-seq datasets, we find that f-scLVM robustly decomposes scRNA-seq datasets into interpretable components, thereby facilitating the identification of novel subpopulations.

    Genome biology 2017;18;1;212

  • Chromosome contacts in activated T cells identify autoimmune disease candidate genes.

    Burren OS, Rubio García A, Javierre BM, Rainbow DB, Cairns J, Cooper NJ, Lambourne JJ, Schofield E, Castro Dopico X, Ferreira RC, Coulson R, Burden F, Rowlston SP, Downes K, Wingett SW, Frontini M, Ouwehand WH, Fraser P, Spivakov M, Todd JA, Wicker LS, Cutler AJ and Wallace C

    Department of Medicine, University of Cambridge, Addenbrooke's Hospital, Cambridge, CB2 0SP, UK.

    Background: Autoimmune disease-associated variants are preferentially found in regulatory regions in immune cells, particularly CD4(+) T cells. Linking such regulatory regions to gene promoters in disease-relevant cell contexts facilitates identification of candidate disease genes.

    Results: Within 4 h, activation of CD4(+) T cells invokes changes in histone modifications and enhancer RNA transcription that correspond to altered expression of the interacting genes identified by promoter capture Hi-C. By integrating promoter capture Hi-C data with genetic associations for five autoimmune diseases, we prioritised 245 candidate genes with a median distance from peak signal to prioritised gene of 153 kb. Just under half (108/245) prioritised genes related to activation-sensitive interactions. This included IL2RA, where allele-specific expression analyses were consistent with its interaction-mediated regulation, illustrating the utility of the approach.

    Conclusions: Our systematic experimental framework offers an alternative approach to candidate causal gene identification for variants with cell state-specific functional effects, with achievable sample sizes.

    Genome biology 2017;18;1;165

  • Functional Profiling of a Plasmodium Genome Reveals an Abundance of Essential Genes.

    Bushell E, Gomes AR, Sanderson T, Anar B, Girling G, Herd C, Metcalf T, Modrzynska K, Schwach F, Martin RE, Mather MW, McFadden GI, Parts L, Rutledge GG, Vaidya AB, Wengelnik K, Rayner JC and Billker O

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK.

    The genomes of malaria parasites contain many genes of unknown function. To assist drug development through the identification of essential genes and pathways, we have measured competitive growth rates in mice of 2,578 barcoded Plasmodium berghei knockout mutants, representing >50% of the genome, and created a phenotype database. At a single stage of its complex life cycle, P. berghei requires two-thirds of genes for optimal growth, the highest proportion reported from any organism and a probable consequence of functional optimization necessitated by genomic reductions during the evolution of parasitism. In contrast, extreme functional redundancy has evolved among expanded gene families operating at the parasite-host interface. The level of genetic redundancy in a single-celled organism may thus reflect the degree of environmental variation it experiences. In the case of Plasmodium parasites, this helps rationalize both the relative successes of drugs and the greater difficulty of making an effective vaccine.

    Funded by: NIAID NIH HHS: R01 AI028398, R56 AI028398; Wellcome Trust

    Cell 2017;170;2;260-272.e8

  • Synergistic malaria vaccine combinations identified by systematic antigen screening.

    Bustamante LY, Powell GT, Lin YC, Macklin MD, Cross N, Kemp A, Cawkill P, Sanderson T, Crosnier C, Muller-Sienerth N, Doumbo OK, Traore B, Crompton PD, Cicuta P, Tran TM, Wright GJ and Rayner JC

    Malaria Programme, Wellcome Trust Sanger Institute, Cambridge CB10 1SA, United Kingdom.

    A highly effective vaccine would be a valuable weapon in the drive toward malaria elimination. No such vaccine currently exists, and only a handful of the hundreds of potential candidates in the parasite genome have been evaluated. In this study, we systematically evaluated 29 antigens likely to be involved in erythrocyte invasion, an essential developmental stage during which the malaria parasite is vulnerable to antibody-mediated inhibition. Testing antigens alone and in combination identified several strain-transcending targets that had synergistic combinatorial effects in vitro, while studies in an endemic population revealed that combinations of the same antigens were associated with protection from febrile malaria. Video microscopy established that the most effective combinations targeted multiple discrete stages of invasion, suggesting a mechanistic explanation for synergy. Overall, this study both identifies specific antigen combinations for high-priority clinical testing and establishes a generalizable approach that is more likely to produce effective vaccines.

    Funded by: Medical Research Council: MR/J002283/1; NCATS NIH HHS: KL2 TR000163; NIAID NIH HHS: K08 AI125682; Wellcome Trust: 090851

    Proceedings of the National Academy of Sciences of the United States of America 2017;114;45;12045-12050

  • Guideline for the investigation and management of eosinophilia.

    Butt NM, Lambert J, Ali S, Beer PA, Cross NC, Duncombe A, Ewing J, Harrison CN, Knapper S, McLornan D, Mead AJ, Radia D, Bain BJ and British Committee for Standards in Haematology

    Royal Liverpool and Broadgreen University Teaching Hospitals NHS Trust, Liverpool, UK.

    British journal of haematology 2017

  • 11,670 whole-genome sequences representative of the Han Chinese population from the CONVERGE project.

    Cai N, Bigdeli TB, Kretzschmar WW, Li Y, Liang J, Hu J, Peterson RE, Bacanu S, Webb BT, Riley B, Li Q, Marchini J, Mott R, Kendler KS and Flint J

    Wellcome Trust Centre for Human Genetics, OX3 7BN Oxford, UK.

    The China, Oxford and Virginia Commonwealth University Experimental Research on Genetic Epidemiology (CONVERGE) project on Major Depressive Disorder (MDD) sequenced 11,670 female Han Chinese at low-coverage (1.7X), providing the first large-scale whole genome sequencing resource representative of the largest ethnic group in the world. Samples are collected from 58 hospitals from 23 provinces around China. We are able to call 22 million high quality single nucleotide polymorphisms (SNP) from the nuclear genome, representing the largest SNP call set from an East Asian population to date. We use these variants for imputation of genotypes across all samples, and this has allowed us to perform a successful genome wide association study (GWAS) on MDD. The utility of these data can be extended to studies of genetic ancestry in the Han Chinese and evolutionary genetics when integrated with data from other populations. Molecular phenotypes, such as copy number variations and structural variations can be detected, quantified and analysed in similar ways.

    Funded by: European Research Council: 617306; NIMH NIH HHS: R01 MH100549, T32 MH020030; Wellcome Trust

    Scientific data 2017;4;170011

  • The AMP-activated protein kinase beta 1 subunit modulates erythrocyte integrity.

    Cambridge EL, McIntyre Z, Clare S, Arends MJ, Goulding D, Isherwood C, Caetano SS, Reviriego CB, Swiatkowska A, Kane L, Harcourt K, Sanger Mouse Genetics Project, Adams DJ, White JK and Speak AO

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, UK.

    Failure to maintain a normal in vivo erythrocyte half-life results in the development of hemolytic anemia. Half-life is affected by numerous factors, including energy balance, electrolyte gradients, reactive oxygen species, and membrane plasticity. The heterotrimeric AMP-activated protein kinase (AMPK) is an evolutionarily conserved serine/threonine kinase that acts as a critical regulator of cellular energy balance. Previous roles for the alpha 1 and gamma 1 subunits in the control of erythrocyte survival have been reported. In the work described here, we studied the role of the beta 1 subunit in erythrocytes and observed microcytic anemia with compensatory extramedullary hematopoiesis together with splenomegaly and increased osmotic resistance.

    Funded by: Cancer Research UK: 13031; Wellcome Trust

    Experimental hematology 2017;45;64-68.e5

  • Cliques and Schisms of Cancer Genes.

    Campbell PJ

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK; Department of Haematology, University of Cambridge, Cambridge CB2 2XY, UK.

    With a few exceptions, cancers typically carry more than one driver mutation, sometimes five, ten, or more, and these driver mutations do not necessarily assort randomly. In this issue of Cancer Cell, Mina et al. systematically characterize patterns of co-mutation and mutual exclusivity in 6,456 cancers across 23 tumor types.

    Cancer cell 2017;32;2;129-130

  • CamOptimus: a tool for exploiting complex adaptive evolution to optimize experiments and processes in biotechnology.

    Cankorur-Cetinkaya A, Dias JML, Kludas J, Slater NKH, Rousu J, Oliver SG and Dikicioglu D

    Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK.

    Multiple interacting factors affect the performance of engineered biological systems in synthetic biology projects. The complexity of these biological systems means that experimental design should often be treated as a multiparametric optimization problem. However, the available methodologies are either impractical, due to a combinatorial explosion in the number of experiments to be performed, or are inaccessible to most experimentalists due to the lack of publicly available, user-friendly software. Although evolutionary algorithms may be employed as alternative approaches to optimize experimental design, the lack of simple-to-use software again restricts their use to specialist practitioners. In addition, the lack of subsidiary approaches to further investigate critical factors and their interactions prevents the full analysis and exploitation of the biotechnological system. We have addressed these problems and, here, provide a simple-to-use and freely available graphical user interface to empower a broad range of experimental biologists to employ complex evolutionary algorithms to optimize their experimental designs. Our approach exploits a Genetic Algorithm to discover the subspace containing the optimal combination of parameters, and Symbolic Regression to construct a model to evaluate the sensitivity of the experiment to each parameter under investigation. We demonstrate the utility of this method using an example in which the culture conditions for the microbial production of a bioactive human protein are optimized. CamOptimus is available through: (
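
    The genetic-algorithm step described in this abstract can be illustrated with a minimal sketch. This is not the CamOptimus implementation: the response surface `toy_yield`, the two parameters (temperature and pH), their bounds, and every setting below are hypothetical, chosen only to show how selection, crossover and mutation search a parameter space. The symbolic-regression step is not shown.

```python
import random

# Hypothetical culture parameters and their allowed ranges (assumptions).
BOUNDS = [(20.0, 40.0), (4.0, 8.0)]  # temperature in deg C, pH


def toy_yield(params):
    """Hypothetical response surface with its optimum at 30 deg C and pH 6."""
    temperature, ph = params
    return -((temperature - 30.0) ** 2) / 25.0 - (ph - 6.0) ** 2


def clip(value, low, high):
    """Keep a mutated parameter inside its allowed range."""
    return max(low, min(high, value))


def evolve(fitness, bounds, pop_size=20, generations=30, seed=0):
    """Run a small genetic algorithm; return the best individual and the
    best fitness recorded at each generation."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]   # truncation selection
        children = [population[0][:]]           # elitism: keep the best as-is
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover followed by Gaussian mutation.
            child = [rng.choice(pair) for pair in zip(a, b)]
            child = [clip(g + rng.gauss(0.0, 0.05 * (hi - lo)), lo, hi)
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve(toy_yield, BOUNDS)
print("best parameters:", best)
print("best fitness:", history[-1])
```

    Because the best individual is copied unchanged into each new generation, the recorded best fitness can never decrease. In a real experiment each fitness evaluation would be a wet-lab measurement rather than a function call.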

    Microbiology (Reading, England) 2017

  • Adipose Tissue Function and Expandability as Determinants of Lipotoxicity and the Metabolic Syndrome.

    Carobbio S, Pellegrinelli V and Vidal-Puig A

    Metabolic Research Laboratories, Wellcome Trust-MRC Institute of Metabolic Science, University of Cambridge, Addenbrooke's Hospital, Box 289, Cambridge, CB2 0QQ, UK.

    The adipose tissue organ is organised as distinct anatomical depots located all along the body axis and is constituted of three different types of adipocytes: white, beige and brown, which are integrated with vascular, immune, neural and extracellular stroma cells. These distinct adipocytes serve different specialised functions. The main function of white adipocytes is to ensure healthy storage of excess nutrients/energy and its rapid mobilisation to supply the demand of energy imposed by physiological cues in other organs, whereas brown and beige adipocytes are designed for heat production through uncoupling lipid oxidation from energy production. The concerted action of the three types of adipocytes/tissues has been reported to ensure an optimal metabolic status in rodents. However, when one or more of these adipose depots become dysfunctional as a consequence of sustained lipid/nutrient overload, insulin resistance and associated metabolic complications ensue. These metabolic alterations negatively affect adipose tissue functionality and compromise global metabolic homeostasis. Optimising white adipose tissue expandability and its functional metabolic flexibility and/or promoting brown/beige mediated thermogenic activity counteracts obesity and its associated lipotoxic metabolic effects. The development of these therapeutic approaches requires a deep understanding of adipose tissue in all its aspects. In this chapter we will discuss the characteristics of the different adipose tissue depots with respect to origins and precursor recruitment, plasticity, cellular composition and expandability capacity as well as molecular and metabolic signatures in both physiological and pathophysiological conditions.

    Advances in experimental medicine and biology 2017;960;161-196

  • Genome-wide association study of nevirapine hypersensitivity in a sub-Saharan African HIV-infected population.

    Carr DF, Bourgeois S, Chaponda M, Takeshita LY, Morris AP, Castro EM, Alfirevic A, Jones AR, Rigden DJ, Haldenby S, Khoo S, Lalloo DG, Heyderman RS, Dandara C, Kampira E, van Oosterhout JJ, Ssali F, Munderi P, Novelli G, Borgiani P, Nelson MR, Holden A, Deloukas P and Pirmohamed M

    Department of Molecular and Clinical Pharmacology, University of Liverpool, Liverpool, UK

    Background: The antiretroviral nevirapine is associated with hypersensitivity reactions in 6%-10% of patients, including hepatotoxicity, maculopapular exanthema, Stevens-Johnson syndrome (SJS) and toxic epidermal necrolysis (TEN).

    Objectives: To undertake a genome-wide association study (GWAS) to identify genetic predisposing factors for the different clinical phenotypes associated with nevirapine hypersensitivity.

    Methods: A GWAS was undertaken in a discovery cohort of 151 nevirapine-hypersensitive and 182 tolerant, HIV-infected Malawian adults. Replication of signals was determined in a cohort of 116 cases and 68 controls obtained from Malawi, Uganda and Mozambique. Interaction with ERAP genes was determined in patients positive for HLA-C*04:01. In silico docking studies were also performed for HLA-C*04:01.

    Results: Fifteen SNPs demonstrated nominal significance (P < 1 × 10(-5)) with one or more of the hypersensitivity phenotypes. The most promising signal was seen in SJS/TEN, where rs5010528 (HLA-C locus) approached genome-wide significance (P < 8.5 × 10(-8)) and was below HLA-wide significance (P < 2.5 × 10(-4)) in the meta-analysis of discovery and replication cohorts [OR 4.84 (95% CI 2.71-8.61)]. rs5010528 is a strong proxy for HLA-C*04:01 carriage: in silico docking showed that two residues (33 and 123) in the B pocket were the most likely nevirapine interactors. There was no interaction between HLA-C*04:01 and ERAP1, but there is a potential protective effect with ERAP2 [P = 0.019, OR 0.43 (95% CI 0.21-0.87)].

    Conclusions: HLA-C*04:01 predisposes to nevirapine-induced SJS/TEN in sub-Saharan Africans, but not to other hypersensitivity phenotypes. This is likely to be mediated via binding to the B pocket of the HLA-C peptide. Whether this risk is modulated by ERAP2 variants requires further study.

    The Journal of antimicrobial chemotherapy 2017

  • TCTE1 is a conserved component of the dynein regulatory complex and is required for motility and metabolism in mouse spermatozoa.

    Castaneda JM, Hua R, Miyata H, Oji A, Guo Y, Cheng Y, Zhou T, Guo X, Cui Y, Shen B, Wang Z, Hu Z, Zhou Z, Sha J, Prunskaite-Hyyrylainen R, Yu Z, Ramirez-Solis R, Ikawa M, Matzuk MM and Liu M

    Department of Pathology and Immunology, Baylor College of Medicine, Houston, TX 77030.

    Flagella and cilia are critical cellular organelles that provide a means for cells to sense and progress through their environment. The central component of flagella and cilia is the axoneme, which comprises the "9+2" microtubule arrangement, dynein arms, radial spokes, and the nexin-dynein regulatory complex (N-DRC). Failure to properly assemble components of the axoneme leads to defective flagella and in humans leads to a collection of diseases referred to as ciliopathies. Ciliopathies can manifest as severe syndromic diseases that affect lung and kidney function, central nervous system development, bone formation, visceral organ organization, and reproduction. T-Complex-Associated-Testis-Expressed 1 (TCTE1) is an evolutionarily conserved axonemal protein present from Chlamydomonas (DRC5) to mammals that localizes to the N-DRC. Here, we show that mouse TCTE1 is testis-enriched in its expression, with its mRNA appearing in early round spermatids and protein localized to the flagellum. TCTE1 is 498 aa in length with a leucine rich repeat domain at the C terminus and is present in eukaryotes containing a flagellum. Knockout of Tcte1 results in male sterility because Tcte1-null spermatozoa show aberrant motility. Although the axoneme is structurally normal in Tcte1 mutant spermatozoa, Tcte1-null sperm demonstrate a significant decrease of ATP, which is used by dynein motors to generate the bending force of the flagellum. These data provide a link to defining the molecular intricacies required for axoneme function, sperm motility, and male fertility.

    Proceedings of the National Academy of Sciences of the United States of America 2017

  • Transcriptional repression of Plxnc1 by Lmx1a and Lmx1b directs topographic dopaminergic circuit formation.

    Chabrat A, Brisson G, Doucet-Beaupré H, Salesse C, Schaan Profes M, Dovonou A, Akitegetse C, Charest J, Lemstra S, Côté D, Pasterkamp RJ, Abrudan MI, Metzakopian E, Ang SL and Lévesque M

    Department of Psychiatry and Neurosciences, Faculty of Medicine, Université Laval, Québec, Quebec, G1V 0A6, Canada.

    Mesodiencephalic dopamine neurons play central roles in the regulation of a wide range of brain functions, including voluntary movement and behavioral processes. These functions are served by distinct subtypes of mesodiencephalic dopamine neurons located in the substantia nigra pars compacta and the ventral tegmental area, which form the nigrostriatal, mesolimbic, and mesocortical pathways. Until now, mechanisms involved in dopaminergic circuit formation remained largely unknown. Here, we show that Lmx1a, Lmx1b, and Otx2 transcription factors control subtype-specific mesodiencephalic dopamine neurons and their appropriate axon innervation. Our results revealed that the expression of Plxnc1, an axon guidance receptor, is repressed by Lmx1a/b and enhanced by Otx2. We also found that Sema7a/Plxnc1 interactions are responsible for the segregation of nigrostriatal and mesolimbic dopaminergic pathways. These findings identify Lmx1a/b, Otx2, and Plxnc1 as determinants of dopaminergic circuit formation and should assist in engineering mesodiencephalic dopamine neurons capable of regenerating appropriate connections for cell therapy.

    Midbrain dopaminergic neurons (mDAs) in the VTA and SNpc project to different regions and form distinct circuits. Here the authors show that transcription factors Lmx1a, Lmx1b, and Otx2 control the axon guidance of mDAs and the segregation of mesolimbic and nigrostriatal dopaminergic pathways.

    Funded by: Parkinson's UK: G-0906; Wellcome Trust

    Nature communications 2017;8;1;933

  • Adaptation... that's what you need?

    Chaguza C and Bentley SD

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Nature reviews. Microbiology 2017;15;8;452

  • Population genetic structure, antibiotic resistance, capsule switching and evolution of invasive pneumococci before conjugate vaccination in Malawi.

    Chaguza C, Cornick JE, Andam CP, Gladstone RA, Alaerts M, Musicha P, Peno C, Bar-Zeev N, Kamng'ona AW, Kiran AM, Msefula CL, McGee L, Breiman RF, Kadioglu A, French N, Heyderman RS, Hanage WP, Bentley SD and Everett DB

    Department of Clinical Infection, Microbiology and Immunology, Institute of Infection and Global Health, University of Liverpool, Liverpool, UK; Malawi-Liverpool-Wellcome Trust Clinical Research Programme, Blantyre, Malawi.

    Introduction: Pneumococcal infections cause a high death toll in sub-Saharan Africa (SSA) but the recently rolled out pneumococcal conjugate vaccines (PCV) will reduce the disease burden. To better understand the population impact of these vaccines, comprehensive analysis of large collections of pneumococcal isolates sampled prior to vaccination is required. Here we present a population genomic study of the invasive pneumococcal isolates sampled before the implementation of PCV13 in Malawi.

    Materials and methods: We retrospectively sampled and whole genome sequenced 585 invasive isolates from 2004 to 2010. We determined the pneumococcal population genetic structure and assessed serotype prevalence, antibiotic resistance rates, and the occurrence of serotype switching.

    Results: Population structure analysis revealed 22 genetically distinct sequence clusters (SCs), which consisted of closely related isolates. Serotype 1 (ST217), a vaccine-associated serotype in clade SC2, showed the highest prevalence (19.3%), and was associated with the highest MDR rate (81.9%), followed by serotype 12F, a non-vaccine serotype in clade SC10 with an MDR rate of 57.9%. Prevalence of serotypes was stable prior to vaccination, although there was an increase in the PMEN19 clone, serotype 5 ST289, in clade SC1 in 2010, suggesting a potential undetected local outbreak. Coalescent analysis revealed recent emergence of the SCs and there was evidence of natural capsule switching in the absence of vaccine-induced selection pressure. Furthermore, the majority of the highly prevalent capsule-switched isolates were associated with acquisition of vaccine-targeted capsules.

    Conclusions: This study provides descriptions of capsule-switched serotypes and serotypes with potential to cause serotype replacement post-vaccination such as 12F. Continued surveillance is critical to monitor these serotypes and antibiotic resistance in order to design better infection prevention and control measures such as inclusion of emerging replacement serotypes in future conjugate vaccines.

    Funded by: Wellcome Trust: 084679/Z/08/Z, OPP1023440, OPP1034556

    Vaccine 2017;35;35 Pt B;4594-4602

  • The evolutionary and phylogeographic history of woolly mammoths: a comprehensive mitogenomic analysis.

    Chang D, Knapp M, Enk J, Lippold S, Kircher M, Lister A, MacPhee RD, Widga C, Czechowski P, Sommer R, Hodges E, Stümpel N, Barnes I, Dalén L, Derevianko A, Germonpré M, Hillebrand-Voiculescu A, Constantin S, Kuznetsova T, Mol D, Rathgeber T, Rosendahl W, Tikhonov AN, Willerslev E, Hannon G, Lalueza-Fox C, Joger U, Poinar H, Hofreiter M and Shapiro B

    Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA.

    Near the end of the Pleistocene epoch, populations of the woolly mammoth (Mammuthus primigenius) were distributed across parts of three continents, from western Europe and northern Asia through Beringia to the Atlantic seaboard of North America. Nonetheless, questions about the connectivity and temporal continuity of mammoth populations and species remain unanswered. We use a combination of targeted enrichment and high-throughput sequencing to assemble and interpret a data set of 143 mammoth mitochondrial genomes, sampled from fossils recovered from across their Holarctic range. Our dataset includes 54 previously unpublished mitochondrial genomes and significantly increases the coverage of the Eurasian range of the species. The resulting global phylogeny confirms that the Late Pleistocene mammoth population comprised three distinct mitochondrial lineages that began to diverge ~1.0-2.0 million years ago (Ma). We also find that mammoth mitochondrial lineages were strongly geographically partitioned throughout the Pleistocene. In combination, our genetic results and the pattern of morphological variation in time and space suggest that male-mediated gene flow, rather than large-scale dispersals, was important in the Pleistocene evolutionary history of mammoths.

    Scientific reports 2017;7;44585

  • THE REAL McCOIL: A method for the concurrent estimation of the complexity of infection and SNP allele frequency for malaria parasites.

    Chang HH, Worby CJ, Yeka A, Nankabirwa J, Kamya MR, Staedke SG, Dorsey G, Murphy M, Neafsey DE, Jeffreys AE, Hubbart C, Rockett KA, Amato R, Kwiatkowski DP, Buckee C and Greenhouse B

    Center for Communicable Disease Dynamics, Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States.

    As many malaria-endemic countries move towards elimination of Plasmodium falciparum, the most virulent human malaria parasite, effective tools for monitoring malaria epidemiology are urgent priorities. P. falciparum population genetic approaches offer promising tools for understanding transmission and spread of the disease, but a high prevalence of multi-clone or polygenomic infections can render estimation of even the most basic parameters, such as allele frequencies, challenging. A previous method, COIL, was developed to estimate complexity of infection (COI) from single nucleotide polymorphism (SNP) data, but relies on monogenomic infections to estimate allele frequencies or requires external allele frequency data which may not be available. Estimates limited to monogenomic infections may not be representative, however, and when the average COI is high, they can be difficult or impossible to obtain. Therefore, we developed THE REAL McCOIL, Turning HEterozygous SNP data into Robust Estimates of ALelle frequency, via Markov chain Monte Carlo, and Complexity Of Infection using Likelihood, to incorporate polygenomic samples and simultaneously estimate allele frequency and COI. This approach was tested via simulations then applied to SNP data from cross-sectional surveys performed in three Ugandan sites with varying malaria transmission. We show that THE REAL McCOIL consistently outperforms COIL on simulated data, particularly when most infections are polygenomic. Using field data we show that, unlike with COIL, we can distinguish epidemiologically relevant differences in COI between and within these sites. Surprisingly, for example, we estimated high average COI in a peri-urban subregion with lower transmission intensity, suggesting that many of these cases were imported from surrounding regions with higher transmission intensity. THE REAL McCOIL therefore provides a robust tool for understanding the molecular epidemiology of malaria across transmission settings.

    PLoS computational biology 2017;13;1;e1005348

  • The exported chaperone Hsp70-x supports virulence functions for Plasmodium falciparum blood stage parasites.

    Charnaud SC, Dixon MWA, Nie CQ, Chappell L, Sanders PR, Nebl T, Hanssen E, Berriman M, Chan JA, Blanch AJ, Beeson JG, Rayner JC, Przyborski JM, Tilley L, Crabb BS and Gilson PR

    Burnet Institute, Melbourne, Victoria, Australia.

    Malaria is caused by five different Plasmodium spp. in humans each of which modifies the host erythrocyte to survive and replicate. The two main causes of malaria, P. falciparum and P. vivax, differ in their ability to cause severe disease, mainly due to differences in the cytoadhesion of infected erythrocytes (IE) in the microvasculature. Cytoadhesion of P. falciparum in the brain leads to a large number of deaths each year and is a consequence of exported parasite proteins, some of which modify the erythrocyte cytoskeleton while others such as PfEMP1 project onto the erythrocyte surface where they bind to endothelial cells. Here we investigate the effects of knocking out an exported Hsp70-type chaperone termed Hsp70-x that is present in P. falciparum but not P. vivax. Although the growth of Δhsp70-x parasites was unaffected, the export of PfEMP1 cytoadherence proteins was delayed and Δhsp70-x IE had reduced adhesion. The Δhsp70-x IE were also more rigid than wild-type controls indicating changes in the way the parasites modified their host erythrocyte. To investigate the cause of this, transcriptional and translational changes in exported and chaperone proteins were monitored and some changes were observed. We propose that PfHsp70-x is not essential for survival in vitro, but may be required for the efficient export and functioning of some P. falciparum exported proteins.

    PloS one 2017;12;7;e0181656

  • "Like sugar in milk": reconstructing the genetic history of the Parsi population.

    Chaubey G, Ayub Q, Rai N, Prakash S, Mushrif-Tripathy V, Mezzavilla M, Pathak AK, Tamang R, Firasat S, Reidla M, Karmin M, Rani DS, Reddy AG, Parik J, Metspalu E, Rootsi S, Dalal K, Khaliq S, Mehdi SQ, Singh L, Metspalu M, Kivisild T, Tyler-Smith C, Villems R and Thangaraj K

    Evolutionary Biology Group, Estonian Biocentre, Riia23b, Tartu, 51010, Estonia.

    Background: The Parsis are one of the smallest religious communities in the world. To understand the population structure and demographic history of this group in detail, we analyzed Indian and Pakistani Parsi populations using high-resolution genetic variation data on autosomal and uniparental loci (Y-chromosomal and mitochondrial DNA). Additionally, we also assayed mitochondrial DNA polymorphisms among ancient Parsi DNA samples excavated from Sanjan, in present day Gujarat, the place of their original settlement in India.

    Results: Among present-day populations, the Parsis are genetically closest to Iranian and the Caucasus populations rather than their South Asian neighbors. They also share the highest number of haplotypes with present-day Iranians and we estimate that the admixture of the Parsis with Indian populations occurred ~1,200 years ago. Enriched homozygosity in the Parsi reflects their recent isolation and inbreeding. We also observed 48% South-Asian-specific mitochondrial lineages among the ancient samples, which might have resulted from the assimilation of local females during the initial settlement. Finally, we show that Parsis are genetically closer to Neolithic Iranians than to modern Iranians, who have witnessed a more recent wave of admixture from the Near East.

    Conclusions: Our results are consistent with the historically-recorded migration of the Parsi populations to South Asia in the 7th century and in agreement with their assimilation into the Indian sub-continent's population and cultural milieu "like sugar in milk". Moreover, in a wider context our results support a major demographic transition in West Asia due to the Islamic conquest.

    Funded by: Wellcome Trust

    Genome biology 2017;18;1;110

  • Transposon insertional mutagenesis in mice identifies human breast cancer susceptibility genes and signatures for stratification.

    Chen L, Jenjaroenpun P, Pillai AM, Ivshina AV, Ow GS, Efthimios M, Zhiqun T, Tan TZ, Lee SC, Rogers K, Ward JM, Mori S, Adams DJ, Jenkins NA, Copeland NG, Ban KH, Kuznetsov VA and Thiery JP

    Institute of Molecular and Cell Biology, Singapore 138673.

    Robust prognostic gene signatures and therapeutic targets are difficult to derive from expression profiling because of the significant heterogeneity within breast cancer (BC) subtypes. Here, we performed forward genetic screening in mice using Sleeping Beauty transposon mutagenesis to identify candidate BC driver genes in an unbiased manner, using a stabilized N-terminal truncated β-catenin gene as a sensitizer. We identified 134 mouse susceptibility genes from 129 common insertion sites within 34 mammary tumors. Of these, 126 genes were orthologous to protein-coding genes in the human genome (hereafter, human BC susceptibility genes, hBCSGs), 70% of which are previously reported cancer-associated genes, and ∼16% are known BC suppressor genes. Network analysis revealed a gene hub consisting of E1A binding protein P300 (EP300), CD44 molecule (CD44), neurofibromin (NF1) and phosphatase and tensin homolog (PTEN), which are linked to a significant number of mutated hBCSGs. From our survival prediction analysis of the expression of human BC genes in 2,333 BC cases, we isolated a six-gene-pair classifier that stratifies BC patients with high confidence into prognostically distinct low-, moderate-, and high-risk subgroups. Furthermore, we proposed prognostic classifiers identifying three basal and three claudin-low tumor subgroups. Intriguingly, our hBCSGs are mostly unrelated to cell cycle/mitosis genes and are distinct from the prognostic signatures currently used for stratifying BC patients. Our findings illustrate the strength and validity of integrating functional mutagenesis screens in mice with human cancer transcriptomic data to identify highly prognostic BC subtyping biomarkers.

    Funded by: Cancer Research UK: 13031

    Proceedings of the National Academy of Sciences of the United States of America 2017;114;11;E2215-E2224

  • Pan-cancer analysis of homozygous deletions in primary tumours uncovers rare tumour suppressors.

    Cheng J, Demeulemeester J, Wedge DC, Vollan HKM, Pitt JJ, Russnes HG, Pandey BP, Nilsen G, Nord S, Bignell GR, White KP, Børresen-Dale AL, Campbell PJ, Kristensen VN, Stratton MR, Lingjærde OC, Moreau Y and Loo PV

    Department of Electrical Engineering (ESAT) and iMinds Future Health Department, University of Leuven, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium.

    Homozygous deletions are rare in cancers and often target tumour suppressor genes. Here, we build a compendium of 2218 primary tumours across 12 human cancer types and systematically screen for homozygous deletions, aiming to identify rare tumour suppressors. Our analysis defines 96 genomic regions recurrently targeted by homozygous deletions. These recurrent homozygous deletions occur either over tumour suppressors or over fragile sites, regions of increased genomic instability. We construct a statistical model that separates fragile sites from regions showing signatures of positive selection for homozygous deletions and identify candidate tumour suppressors within those regions. We find 16 established tumour suppressors and propose 27 candidate tumour suppressors. Several of these genes (including MGMT, RAD17, and USP44) show prior evidence of a tumour suppressive function. Other candidate tumour suppressors, such as MAFTRR, KIAA1551, and IGF2BP2, are novel. Our study demonstrates how rare tumour suppressors can be identified through copy number meta-analysis.

    Nature communications 2017;8;1;1221

  • Whole-genome view of the consequences of a population bottleneck using 2926 genome sequences from Finland and United Kingdom.

    Chheda H, Palta P, Pirinen M, McCarthy S, Walter K, Koskinen S, Salomaa V, Daly M, Durbin R, Palotie A, Aittokallio T and Ripatti S

    Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland.

    Isolated populations with enrichment of variants due to recent population bottlenecks provide a powerful resource for identifying disease-associated genetic variants and genes. As a model of an isolate population, we sequenced the genomes of 1463 Finnish individuals as part of the Sequencing Initiative Suomi (SISu) Project. We compared the genomic profiles of the 1463 Finns to a sample of 1463 British individuals that were sequenced in parallel as part of the UK10K Project. Whereas there were no major differences in the allele frequency of common variants, a significant depletion of variants in the rare frequency spectrum was observed in Finns when comparing the two populations. On the other hand, we observed >2.1 million variants that were twice as frequent among Finns compared with Britons and 800 000 variants that were more than 10 times more frequent in Finns. Furthermore, in Finns we observed a relative proportional enrichment of variants in the minor allele frequency range between 2 and 5% (P < 2.2 × 10⁻¹⁶). When stratified by their functional annotations, loss-of-function variants showed the highest proportional enrichment in Finns (P = 0.0291). In the non-coding part of the genome, variants in conserved regions (P = 0.002) and promoters (P = 0.01) were also significantly enriched in the Finnish samples. These functional categories represent the highest a priori power for downstream association studies of rare variants using population isolates.

    Funded by: Wellcome Trust

    European journal of human genetics : EJHG 2017;25;4;477-484

  • Pathways to understanding the genomic aetiology of osteoarthritis.

    Cibrián Uhalte E, Wilkinson JM, Southam L and Zeggini E

    Human Genetics and Cellular Genetics, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    Osteoarthritis is a common, complex disease with no curative therapy. In this review, we summarize current knowledge on disease aetiopathogenesis and outline genetics and genomics approaches that are helping catalyse a much-needed improved understanding of the biological underpinning of disease development and progression.

    Human molecular genetics 2017;26;R2;R193-R201

  • Culture adaptation of malaria parasites selects for convergent loss-of-function mutants.

    Claessens A, Affara M, Assefa SA, Kwiatkowski DP and Conway DJ

    London School of Hygiene and Tropical Medicine, London, UK.

    Cultured human pathogens may differ significantly from source populations. To investigate the genetic basis of laboratory adaptation in malaria parasites, clinical Plasmodium falciparum isolates were sampled from patients and cultured in vitro for up to three months. Genome sequence analysis was performed on multiple culture time point samples from six monoclonal isolates, and single nucleotide polymorphism (SNP) variants emerging over time were detected. Out of a total of five positively selected SNPs, four represented nonsense mutations resulting in stop codons, three of these in a single ApiAP2 transcription factor gene, and one in SRPK1. To survey further for nonsense mutants associated with culture, genome sequences of eleven long-term laboratory-adapted parasite strains were examined, revealing four independently acquired nonsense mutations in two other ApiAP2 genes, and five in Epac. No mutants of these genes exist in a large database of parasite sequences from uncultured clinical samples. This implicates putative master regulator genes in which multiple independent stop codon mutations have convergently led to culture adaptation, affecting most laboratory lines of P. falciparum. Understanding the adaptive processes should guide development of experimental models, which could include targeted gene disruption to adapt fastidious malaria parasite species to culture.

    Scientific reports 2017;7;41303

  • Genome-wide base-resolution mapping of DNA methylation in single cells using single-cell bisulfite sequencing (scBS-seq).

    Clark SJ, Smallwood SA, Lee HJ, Krueger F, Reik W and Kelsey G

    Epigenetics Programme, Babraham Institute, Cambridge, UK.

    DNA methylation (DNAme) is an important epigenetic mark in diverse species. Our current understanding of DNAme is based on measurements from bulk cell samples, which obscures intercellular differences and prevents analyses of rare cell types. Thus, the ability to measure DNAme in single cells has the potential to make important contributions to the understanding of several key biological processes, such as embryonic development, disease progression and aging. We have recently reported a method for generating genome-wide DNAme maps from single cells, using single-cell bisulfite sequencing (scBS-seq), allowing the quantitative measurement of DNAme at up to 50% of CpG dinucleotides throughout the mouse genome. Here we present a detailed protocol for scBS-seq that includes our most recent developments to optimize recovery of CpGs, mapping efficiency and success rate; reduce hands-on time; and increase sample throughput with the option of using an automated liquid handler. We provide step-by-step instructions for each stage of the method, comprising cell lysis and bisulfite (BS) conversion, preamplification and adaptor tagging, library amplification, sequencing and, lastly, alignment and methylation calling. An individual with relevant molecular biology expertise can complete library preparation within 3 d. Subsequent computational steps require 1-3 d for someone with bioinformatics expertise.

    Nature protocols 2017;12;3;534-547

  • Longitudinal genomic surveillance of MRSA in the UK reveals transmission patterns in hospitals and the community.

    Coll F, Harrison EM, Toleman MS, Reuter S, Raven KE, Blane B, Palmer B, Kappeler ARM, Brown NM, Török ME, Parkhill J and Peacock SJ

    London School of Hygiene and Tropical Medicine, London, UK.

    Genome sequencing has provided snapshots of the transmission of methicillin-resistant Staphylococcus aureus (MRSA) during suspected outbreaks in isolated hospital wards. Scale-up to populations is now required to establish the full potential of this technology for surveillance. We prospectively identified all individuals over a 12-month period who had at least one MRSA-positive sample processed by a routine diagnostic microbiology laboratory in the East of England, which received samples from three hospitals and 75 general practitioner (GP) practices. We sequenced at least 1 MRSA isolate from 1465 individuals (2282 MRSA isolates) and recorded epidemiological data. An integrated epidemiological and phylogenetic analysis revealed 173 transmission clusters containing between 2 and 44 cases and involving 598 people (40.8%). Of these, 118 clusters (371 people) involved hospital contacts alone, 27 clusters (72 people) involved community contacts alone, and 28 clusters (157 people) had both types of contact. Community- and hospital-associated MRSA lineages were equally capable of transmission in the community, with instances of spread in households, long-term care facilities, and GP practices. Our study provides a comprehensive picture of MRSA transmission in a sampled population of 1465 people and suggests the need to review existing infection control policy and practice.

    Funded by: Medical Research Council: G1000803; Wellcome Trust

    Science translational medicine 2017;9;413

  • Global, site-specific analysis of neuronal protein S-acylation.

    Collins MO, Woodley KT and Choudhary JS

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK.

    Protein S-acylation (palmitoylation) is a reversible lipid modification that is an important regulator of dynamic membrane-protein interactions. Proteomic approaches have uncovered many putative palmitoylated proteins; however, methods for comprehensive palmitoylation site characterization are lacking. We demonstrate a quantitative site-specific-Acyl-Biotin-Exchange (ssABE) method that allowed the identification of 906 putative palmitoylation sites on 641 proteins from mouse forebrain. 62% of sites map to known palmitoylated proteins and 102 individual palmitoylation sites are known from the literature. 54% of palmitoylation sites map to synaptic proteins including many GPCRs, receptors/ion channels and peripheral membrane proteins. Phosphorylation sites were also identified on a subset of peptides that were palmitoylated, demonstrating for the first time co-identification of these modifications by mass spectrometry. Palmitoylation sites were identified on over half of the family of palmitoyl-acyltransferases (PATs) that mediate protein palmitoylation, including active site thioester-linked palmitoyl intermediates. Distinct palmitoylation motifs and site topology were identified for integral membrane and soluble proteins, indicating potential differences in associated PAT specificity and palmitoylation function. ssABE allows the global identification of palmitoylation sites as well as measurement of the active site modification state of PATs, enabling palmitoylation to be studied at a systems level.

    Funded by: Wellcome Trust

    Scientific reports 2017;7;1;4683

  • Clonal haematopoiesis is not prevalent in survivors of childhood cancer.

    Collord G, Park N, Podestà M, Dagnino M, Cilloni D, Jones D, Varela I, Frassoni F and Vassiliou GS

    Wellcome Trust Sanger Institute, Cambridge, UK.

    British journal of haematology 2017

  • The Driver Mutational Landscape of Ovarian Squamous Cell Carcinomas Arising in Mature Cystic Teratoma.

    Cooke SL, Ennis D, Evers L, Dowson S, Chan MY, Paul J, Hirschowitz L, Glasspool RM, Singh N, Bell S, Day E, Kochman A, Wilkinson N, Beer P, Martin S, Millan D, Biankin AV, McNeish IA and Scottish Genomes Partnership

    Institute of Cancer Sciences, University of Glasgow, Glasgow, United Kingdom.

    Purpose: We sought to identify the genomic abnormalities in squamous cell carcinomas (SCC) arising in ovarian mature cystic teratoma (MCT), a rare gynecological malignancy of poor prognosis.

    Experimental design: We performed copy number, mutational state, and zygosity analysis of 151 genes in SCC arising in MCT (n = 25) using next-generation sequencing. The presence of high-/intermediate-risk HPV genotypes was assessed by quantitative PCR. Genomic events were correlated with clinical features and outcome.

    Results: MCT had a low mutation burden with a mean of only one mutation per case. Zygosity analyses of MCT indicated four separate patterns, suggesting that MCT can arise from errors at various stages of oogenesis. A total of 244 abnormalities were identified in 79 genes in MCT-associated SCC, and the overall mutational burden was high (mean 10.2 mutations per megabase). No SCC was positive for HPV. The most frequently altered genes in SCC were TP53 (20/25 cases, 80%), PIK3CA (13/25 cases, 52%), and CDKN2A (11/25 cases, 44%). Mutation in TP53 was associated with improved overall survival. In 8 of 20 cases with TP53 mutations, two or more variants were identified, which were bi-allelic.

    Conclusions: Ovarian SCC arising in MCT has a high mutational burden, with TP53 mutation the most common abnormality. The presence of TP53 mutation is a good prognostic factor. SCC arising in MCT share similar mutation profiles to other SCC. Given their rarity, they should be included in basket studies that recruit patients with SCC of other organs. Clin Cancer Res; 23(24); 7633-40. ©2017 AACR.

    Funded by: Chief Scientist Office: SGP/1; Medical Research Council: G0501974, G0601891, MC_PC_15080, MR/N005813/1

    Clinical cancer research : an official journal of the American Association for Cancer Research 2017;23;24;7633-7640

  • Frequency-dependent selection in vaccine-associated pneumococcal population dynamics.

    Corander J, Fraser C, Gutmann MU, Arnold B, Hanage WP, Bentley SD, Lipsitch M and Croucher NJ

    Helsinki Institute for Information Technology, Department of Mathematics and Statistics, University of Helsinki, 00014, Helsinki, Finland.

    Many bacterial species are composed of multiple lineages distinguished by extensive variation in gene content. These often cocirculate in the same habitat, but the evolutionary and ecological processes that shape these complex populations are poorly understood. Addressing these questions is particularly important for Streptococcus pneumoniae, a nasopharyngeal commensal and respiratory pathogen, because the changes in population structure associated with the recent introduction of partial-coverage vaccines have substantially reduced pneumococcal disease. Here we show that pneumococcal lineages from multiple populations each have a distinct combination of intermediate-frequency genes. Functional analysis suggested that these loci may be subject to negative frequency-dependent selection (NFDS) through interactions with other bacteria, hosts or mobile elements. Correspondingly, these genes had similar frequencies in four populations with dissimilar lineage compositions. These frequencies were maintained following substantial alterations in lineage prevalences once vaccination programmes began. Fitting a multilocus NFDS model of post-vaccine population dynamics to three genomic datasets using Approximate Bayesian Computation generated reproducible estimates of the influence of NFDS on pneumococcal evolution, the strength of which varied between loci. Simulations replicated the stable frequency of lineages unperturbed by vaccination, patterns of serotype switching and clonal replacement. This framework highlights how bacterial ecology affects the impact of clinical interventions.

    Funded by: NIAID NIH HHS: R01 AI048935, R01 AI106786; Wellcome Trust

    Nature ecology & evolution 2017;1;12;1950-1960

  • From clinical sample to complete genome: Comparing methods for the extraction of HIV-1 RNA for high-throughput deep sequencing.

    Cornelissen M, Gall A, Vink M, Zorgdrager F, Binter Š, Edwards S, Jurriaans S, Bakker M, Ong SH, Gras L, van Sighem A, Bezemer D, de Wolf F, Reiss P, Kellam P, Berkhout B, Fraser C, van der Kuyl AC and BEEHIVE Consortium

    Laboratory of Experimental Virology, Department of Medical Microbiology, Center for Infection and Immunity Amsterdam (CINIMA), Academic Medical Center of the University of Amsterdam, Meibergdreef 15, 1105 AZ Amsterdam, The Netherlands.

    The BEEHIVE (Bridging the Evolution and Epidemiology of HIV in Europe) project aims to analyse nearly-complete viral genomes from >3000 HIV-1 infected Europeans using high-throughput deep sequencing techniques to investigate the virus genetic contribution to virulence. Following the development of a computational pipeline, including a new de novo assembler for RNA virus genomes, to generate larger contiguous sequences (contigs) from the abundance of short sequence reads that characterise the data, another area that determines genome sequencing success is the quality and quantity of the input RNA. A pilot experiment with 125 patient plasma samples was performed to investigate the optimal method for isolation of HIV-1 viral RNA for long amplicon genome sequencing. Manual isolation with the QIAamp Viral RNA Mini Kit (Qiagen) was superior to robotic extraction of RNA using either the QIAcube robotic system, the mSample Preparation Systems RNA kit with automated extraction by the m2000sp system (Abbott Molecular), or the MagNA Pure 96 System in combination with the MagNA Pure 96 Instrument (Roche Diagnostics). We scored amplification of a set of four HIV-1 amplicons of ∼1.9, 3.6, 3.0 and 3.5 kb, and subsequent recovery of near-complete viral genomes. Subsequently, 616 BEEHIVE patient samples were analysed to determine factors that influence successful amplification of the genome in four overlapping amplicons using the QIAamp Viral RNA Kit for viral RNA isolation. Both low plasma viral load and high sample age (stored before 1999) negatively influenced the amplification of viral amplicons >3 kb. A plasma viral load of >100,000 copies/ml resulted in successful amplification of all four amplicons for 86% of the samples; this value dropped to only 46% for samples with viral loads of <20,000 copies/ml.

    Virus research 2017;239;10-16

  • Natural variation of Epstein-Barr virus genes, proteins and pri-miRNA (revised).

    Correia S, Palser A, Elgueta Karstegl C, Middeldorp JM, Ramayanti O, Cohen JI, Hildesheim A, Fellner MD, Wiels J, White RE, Kellam P and Farrell PJ

    Section of Virology, Imperial College Faculty of Medicine, Norfolk Place, London W2 1PG, UK.

    Viral gene sequences from an enlarged set of about 200 Epstein-Barr virus (EBV) strains including many primary isolates have been used to investigate variation in key viral genetic regions, particularly LMP1, Zp, gp350, EBNA1 and the BART miRNA cluster 2. Determination of type 1 and type 2 EBV in saliva samples from people from a wide range of geographic and ethnic backgrounds demonstrates a small percentage of healthy white Caucasian British people carrying predominantly type 2 EBV. Linkage of Zp and gp350 variants to type 2 EBV is likely to be due to their genes being adjacent to the EBNA3 locus, which is one of the major determinants of the type 1/type 2 distinction. A novel classification of EBNA1 DNA binding domains named QCIGP results from phylogeny analysis of their protein sequences but is not linked to the type 1/type 2 classification. The BART cluster 2 miRNA region is classified into three major variants through SNPs in the pri-miRNA outside of the mature miRNA sequences. These SNPs can result in altered levels of expression of some miRNAs from the BART variant frequently present in Chinese and Indonesian nasopharyngeal carcinoma (NPC) samples. The EBV genetic variants identified here provide a basis for future more directed analysis of association of specific EBV variation with EBV biology and EBV associated diseases.

    Importance: Incidence of diseases associated with EBV varies greatly in different parts of the world. Relationships between EBV genome sequence variation and health, disease, geography and ethnicity of the host may thus be important for understanding the role of EBV in diseases and for development of an effective EBV vaccine. This paper provides the most comprehensive analysis so far of variation in specific EBV genes relevant to these diseases and proposed EBV vaccines. By focussing on variation in LMP1, Zp, gp350, EBNA1 and the BART miRNA cluster 2, new relationships to the known type 1/type 2 strains are demonstrated and novel classification of EBNA1 and the BART miRNAs is proposed.

    Journal of virology 2017

  • The Expanding World of Human Leishmaniasis.

    Cotton JA

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambs, CB10 1SA, UK.

    New Leishmania isolates form a novel group of human parasites related to Leishmania enrietti, with cases in Ghana, Thailand, and Martinique; other relatives infect Australian and South American wildlife. These parasites apparently cause both cutaneous and visceral disease, and may have evolved a novel transmission mechanism exploiting blood-feeding midges.

    Trends in parasitology 2017

  • The genome of Leishmania adleri from a mammalian host highlights chromosome fission in Sauroleishmania.

    Coughlan S, Mulhair P, Sanders M, Schonian G, Cotton JA and Downing T

    School of Mathematics, Applied Mathematics and Statistics, National University of Ireland, Galway, Ireland.

    Control of pathogens arising from humans, livestock and wild animals can be enhanced by genome-based investigation. Phylogenetic classification and optimal construction of these genomes using short sequence reads are key to this process. We examined the mammal-infecting unicellular parasite Leishmania adleri belonging to the lizard-infecting Sauroleishmania subgenus. L. adleri has been associated with cutaneous disease in humans, but can be asymptomatic in wild animals. We sequenced, assembled and investigated the L. adleri genome isolated from an asymptomatic Ethiopian rodent (MARV/ET/75/HO174) and verified it as L. adleri by comparison with other Sauroleishmania species. Chromosome-level scaffolding was achieved by combining reference-guided with de novo assembly followed by extensive improvement steps to produce a final draft genome with contiguity comparable with other references. L. tarentolae and L. major genome annotation was transferred and these gene models were manually verified and improved. This first high-quality draft Leishmania adleri reference genome is also the first Sauroleishmania genome from a non-reptilian host. Comparison of the L. adleri HO174 genome with those of L. tarentolae Parrot-TarII and lizard-infecting L. adleri RLAT/KE/1957/SKINK-7 showed extensive gene amplifications, pervasive aneuploidy, and fission of chromosomes 30 and 36. There was little genetic differentiation between L. adleri extracted from mammals and reptiles, highlighting challenges for leishmaniasis surveillance.

    Scientific reports 2017;7;43747

  • Using whole genome sequencing to investigate transmission in a multi-host system: bovine tuberculosis in New Zealand.

    Crispell J, Zadoks RN, Harris SR, Paterson B, Collins DM, de-Lisle GW, Livingstone P, Neill MA, Biek R, Lycett SJ, Kao RR and Price-Carter M

    Institute of Biodiversity, Animal Health, and Comparative Medicine, University of Glasgow, Glasgow, Scotland, G61 1QH, UK.

    Background: Bovine tuberculosis (bTB), caused by Mycobacterium bovis, is an important livestock disease raising public health and economic concerns around the world. In New Zealand, a number of wildlife species are implicated in the spread and persistence of bTB in cattle populations, most notably the brushtail possum (Trichosurus vulpecula). Whole Genome Sequenced (WGS) M. bovis isolates sourced from infected cattle and wildlife across New Zealand were analysed. Bayesian phylogenetic analyses were conducted to estimate the substitution rate of the sampled population and investigate the role of wildlife. In addition, the utility of WGS was examined with a view to these methods being incorporated into routine bTB surveillance.

    Results: A high rate of exchange was evident between the sampled wildlife and cattle populations, but directional estimates of inter-species transmission were sensitive to the sampling strategy employed. A relatively high substitution rate was estimated; this, in combination with a strong spatial signature and good agreement with previous typing methods, endorses WGS as a typing tool.

    Conclusions: In agreement with the current knowledge of bTB in New Zealand, transmission of M. bovis between cattle and wildlife was evident. Without direction, these estimates are less informative but taken in conjunction with the low prevalence of bTB in New Zealand's cattle population it is likely that, currently, wildlife populations are acting as the main bTB reservoir. Wildlife should therefore continue to be targeted if bTB is to be eradicated from New Zealand. WGS will be a considerable aid to bTB eradication by greatly improving the discriminatory power of molecular typing data. The substitution rates estimated here will be an important part of epidemiological investigations using WGS data.

    BMC genomics 2017;18;1;180

  • Diverse evolutionary patterns of pneumococcal antigens identified by pangenome-wide immunological screening.

    Croucher NJ, Campo JJ, Le TQ, Liang X, Bentley SD, Hanage WP and Lipsitch M

    Department of Infectious Disease Epidemiology, Imperial College London, London W2 1PG, United Kingdom;

    Characterizing the immune response to pneumococcal proteins is critical in understanding this bacterium's epidemiology and vaccinology. Probing a custom-designed proteome microarray with sera from 35 healthy US adults revealed a continuous distribution of IgG affinities for 2,190 potential antigens from the species-wide pangenome. Reproducibly elevated IgG binding was elicited by 208 "antibody binding targets" (ABTs), which included 109 variants of the diverse pneumococcal surface proteins A and C (PspA and PspC) and zinc metalloprotease A and B (ZmpA and ZmpB) proteins. Functional analysis found ABTs were enriched in motifs for secretion and cell surface association, with extensive representation of cell wall synthesis machinery, adhesins, transporter solute-binding proteins, and degradative enzymes. ABTs were associated with stronger evidence for evolving under positive selection, although this varied between functional categories, as did rates of diversification through recombination. Particularly rapid variation was observed at some immunogenic accessory loci, including a phage protein and a phase-variable glycosyltransferase ubiquitous among the diverse set of genomic islands encoding the serine-rich PsrP glycoprotein. Nevertheless, many antigens were conserved in the core genome, and strains' antigenic profiles were generally stable. No strong evidence was found for any epistasis between antigens driving population dynamics, or redundancy between functionally similar accessory ABTs, or age stratification of antigen profiles. These results highlight the paradox of why substantial variation is observed in only a subset of epitopes. This result may indicate only some interactions between immunoglobulins and ABTs clear pneumococcal colonization or that acquired immunity to pneumococci is an accumulation of individually weak responses to ABTs evolving under different levels of functional constraint.

    Proceedings of the National Academy of Sciences of the United States of America 2017

  • ACTB Loss-of-Function Mutations Result in a Pleiotropic Developmental Disorder.

    Cuvertino S, Stuart HM, Chandler KE, Roberts NA, Armstrong R, Bernardini L, Bhaskar S, Callewaert B, Clayton-Smith J, Davalillo CH, Deshpande C, Devriendt K, Digilio MC, Dixit A, Edwards M, Friedman JM, Gonzalez-Meneses A, Joss S, Kerr B, Lampe AK, Langlois S, Lennon R, Loget P, Ma DYT, McGowan R, Des Medt M, O'Sullivan J, Odent S, Parker MJ, Pebrel-Richard C, Petit F, Stark Z, Stockler-Ipsiroglu S, Tinschert S, Vasudevan P, Villa O, White SM, Zahir FR, DDD Study, Woolf AS and Banka S

    Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine, and Health, The University of Manchester, M13 9PL Manchester, UK.

    ACTB encodes β-actin, an abundant cytoskeletal housekeeping protein. In humans, postulated gain-of-function missense mutations cause Baraitser-Winter syndrome (BRWS), characterized by intellectual disability, cortical malformations, coloboma, sensorineural deafness, and typical facial features. To date, the consequences of loss-of-function ACTB mutations have not been proven conclusively. We describe heterozygous ACTB deletions and nonsense and frameshift mutations in 33 individuals with developmental delay, apparent intellectual disability, increased frequency of internal organ malformations (including those of the heart and the renal tract), growth retardation, and a recognizable facial gestalt (interrupted wavy eyebrows, dense eyelashes, wide nose, wide mouth, and a prominent chin) that is distinct from characteristics of individuals with BRWS. Strikingly, this spectrum overlaps with that of several chromatin-remodeling developmental disorders. In wild-type mouse embryos, β-actin expression was prominent in the kidney, heart, and brain. ACTB mRNA expression levels in lymphoblastic lines and fibroblasts derived from affected individuals were decreased in comparison to those in control cells. Fibroblasts derived from an affected individual and ACTB siRNA knockdown in wild-type fibroblasts showed altered cell shape and migration, consistent with known roles of cytoplasmic β-actin. We also demonstrate that ACTB haploinsufficiency leads to reduced cell proliferation, altered expression of cell-cycle genes, and decreased amounts of nuclear, but not cytoplasmic, β-actin. In conclusion, we show that heterozygous loss-of-function ACTB mutations cause a distinct pleiotropic malformation syndrome with intellectual disability. Our biological studies suggest that a critically reduced amount of this protein alters cell shape, migration, proliferation, and gene expression to the detriment of brain, heart, and kidney development.

    Funded by: Medical Research Council: MR/L002744/1; Wellcome Trust

    American journal of human genetics 2017;101;6;1021-1033

  • BCFtools/csq: haplotype-aware variant consequences.

    Danecek P and McCarthy SA

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.

    Motivation: Prediction of functional variant consequences is an important part of sequencing pipelines, allowing the categorization and prioritization of genetic variants for follow up analysis. However, current predictors analyze variants as isolated events, which can lead to incorrect predictions when adjacent variants alter the same codon, or when a frame-shifting indel is followed by a frame-restoring indel. Exploiting known haplotype information when making consequence predictions can resolve these issues.
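
    The codon-level problem described above can be illustrated with a toy example. The sketch below is not the BCFtools/csq implementation; the codon, the two substitutions, the function names and the reduced codon table are all hypothetical choices made for illustration. It compares the consequence predicted for each variant in isolation with the consequence predicted when both phased variants are applied to the same haplotype.

```python
# Toy illustration only: this is NOT how BCFtools/csq is implemented.
# The codon, variants and helper names below are hypothetical.

# Reduced codon table covering only the codons used in this example.
CODON_TABLE = {
    "CAG": "Gln",
    "TAG": "Stop",
    "CGG": "Arg",
    "TGG": "Trp",
}


def apply_snvs(codon, snvs):
    """Return the codon after applying (position, alt_base) substitutions."""
    bases = list(codon)
    for pos, alt in snvs:
        bases[pos] = alt
    return "".join(bases)


def consequence(ref_codon, alt_codon):
    """Classify the change between two codons with a simplified vocabulary."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "Stop":
        return "stop_gained"
    return "missense"


ref = "CAG"                    # reference codon (Gln)
snvs = [(0, "T"), (1, "G")]    # two SNVs assumed to lie on the same haplotype

# Each variant analysed as an isolated event.
isolated = [consequence(ref, apply_snvs(ref, [v])) for v in snvs]

# Both variants applied together, as on the phased haplotype.
joint = consequence(ref, apply_snvs(ref, snvs))

print("isolated:", isolated)
print("haplotype-aware:", joint)
```

    In this constructed case the first substitution on its own would be called a stop gain (CAG to TAG), whereas the haplotype carrying both substitutions encodes tryptophan (TGG), so the combined change is missense.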

    Results: BCFtools/csq is a fast program for haplotype-aware consequence calling which can take into account known phase. Consequence predictions are changed for 501 of 5019 compound variants found in the 81.7M variants in the 1000 Genomes Project data, with an average of 139 compound variants per haplotype. Predictions match existing tools when run in localized mode, but the program is an order of magnitude faster and requires an order of magnitude less memory.

    Availability and implementation: The program is freely available for commercial and non-commercial use in the BCFtools package.


    Supplementary information: Supplementary data are available at Bioinformatics online.

    Funded by: Wellcome Trust

    Bioinformatics (Oxford, England) 2017;33;13;2037-2039

  • The STRATAA study protocol: a programme to assess the burden of enteric fever in Bangladesh, Malawi and Nepal using prospective population census, passive surveillance, serological studies and healthcare utilisation surveys.

    Darton TC, Meiring JE, Tonks S, Khan MA, Khanam F, Shakya M, Thindwa D, Baker S, Basnyat B, Clemens JD, Dougan G, Dolecek C, Dunstan SJ, Gordon MA, Heyderman RS, Holt KE, Pitzer VE, Qadri F, Zaman K, Pollard AJ and STRATAA Study Consortium

    The Hospital for Tropical Diseases, Wellcome Trust Major Overseas Programme, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam.

    Introduction: Invasive infections caused by Salmonella enterica serovar Typhi and Paratyphi A are estimated to account for 12-27 million febrile illness episodes worldwide annually. Determining the true burden of typhoidal Salmonellae infections is hindered by lack of population-based studies and adequate laboratory diagnostics. The Strategic Typhoid alliance across Africa and Asia study takes a systematic approach to measuring the age-stratified burden of clinical and subclinical disease caused by typhoidal Salmonellae infections at three high-incidence urban sites in Africa and Asia. We aim to explore the natural history of Salmonella transmission in endemic settings, addressing key uncertainties relating to the epidemiology of enteric fever identified through mathematical models, and enabling optimisation of vaccine strategies.

    Methods/design: Using census-defined denominator populations of ≥100 000 individuals at sites in Malawi, Bangladesh and Nepal, the primary outcome is to characterise the burden of enteric fever in these populations over a 24-month period. During passive surveillance, clinical and household data, and laboratory samples will be collected from febrile individuals. In parallel, healthcare utilisation and water, sanitation and hygiene surveys will be performed to characterise healthcare-seeking behaviour and assess potential routes of transmission. The rates of both undiagnosed and subclinical exposure to typhoidal Salmonellae (seroincidence), identification of chronic carriage and population seroprevalence of typhoid infection will be assessed through age-stratified serosurveys performed at each site. Secondary attack rates will be estimated among household contacts of acute enteric fever cases and possible chronic carriers.

    Ethics and dissemination: This protocol has been ethically approved by the Oxford Tropical Research Ethics Committee, the icddr,b Institutional Review Board, the Malawian National Health Sciences Research Committee and College of Medicine Research Ethics Committee and Nepal Health Research Council. The study is being conducted in accordance with the principles of the Declaration of Helsinki and Good Clinical Practice. Informed consent was obtained before study enrolment. Results will be submitted to international peer-reviewed journals and presented at international conferences.

    Trial registration number: ISRCTN 12131979.

    Ethics references: Oxford (Oxford Tropical Research Ethics Committee 39-15). Bangladesh (icddr,b Institutional Review Board PR-15119). Malawi (National Health Sciences Research Committee 15/5/1599). Nepal (Nepal Health Research Council 306/2015).

    BMJ open 2017;7;6;e016283

  • No evidence for maintenance of a sympatric Heliconius species barrier by chromosomal inversions.

    Davey JW, Barker SL, Rastas PM, Pinharanda A, Martin SH, Durbin R, McMillan WO, Merrill RM and Jiggins CD

    Department of Zoology University of Cambridge Downing Street Cambridge CB2 3EJ United Kingdom.

    Mechanisms that suppress recombination are known to help maintain species barriers by preventing the breakup of coadapted gene combinations. The sympatric butterfly species Heliconius melpomene and Heliconius cydno are separated by many strong barriers, but the species still hybridize infrequently in the wild, and around 40% of the genome is influenced by introgression. We tested the hypothesis that genetic barriers between the species are maintained by inversions or other mechanisms that reduce between-species recombination rate. We constructed fine-scale recombination maps for Panamanian populations of both species and their hybrids to directly measure recombination rate within and between species, and generated long sequence reads to detect inversions. We find no evidence for a systematic reduction in recombination rates in F1 hybrids, and also no evidence for inversions longer than 50 kb that might be involved in generating or maintaining species barriers. This suggests that mechanisms leading to global or local reduction in recombination do not play a significant role in the maintenance of species barriers between H. melpomene and H. cydno.

    Funded by: Wellcome Trust

    Evolution letters 2017;1;3;138-154

  • Seeding and Establishment of Legionella pneumophila in Hospitals: Implications for Genomic Investigations of Nosocomial Legionnaires' Disease.

    David S, Afshar B, Mentasti M, Ginevra C, Podglajen I, Harris SR, Chalker VJ, Jarraud S, Harrison TG and Parkhill J

    Pathogen Genomics, Wellcome Trust Sanger Institute, Cambridge, UK.

    Background: Legionnaires' disease is an important cause of hospital-acquired pneumonia and is caused by infection with the bacterium Legionella. Because current typing methods often fail to resolve the infection source in possible nosocomial cases, we aimed to determine whether whole-genome sequencing (WGS) could be used to support or refute suspected links between cases and hospitals. We focused on cases involving a major nosocomial-associated strain, L. pneumophila sequence type (ST) 1.

    Methods: WGS data from 229 L. pneumophila ST1 isolates were analyzed, including 99 isolates from the water systems of 17 hospitals and 42 clinical isolates from patients with confirmed or suspected hospital-acquired infections, as well as isolates obtained from or associated with community-acquired sources of Legionnaires' disease.

    Results: Phylogenetic analysis demonstrated that all hospitals from which multiple isolates were obtained have been colonized by 1 or more distinct ST1 populations. However, deep sampling of 1 hospital also revealed the existence of substantial diversity and ward-specific microevolution within the population. Across all hospitals, suspected links with cases were supported with WGS, although the degree of support was dependent on the depth of environmental sampling and available contextual information. Finally, phylogeographic analysis revealed that hospitals have been seeded with L. pneumophila via both local and international spread of ST1.

    Conclusions: WGS can be used to support or refute suspected links between hospitals and Legionnaires' disease cases. However, deep hospital sampling is frequently required due to the potential coexistence of multiple populations, existence of substantial diversity, and similarity of hospital isolates to local populations.

    Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 2017;64;9;1251-1259

  • Dynamics and impact of homologous recombination on the evolution of Legionella pneumophila.

    David S, Sánchez-Busó L, Harris SR, Marttinen P, Rusniok C, Buchrieser C, Harrison TG and Parkhill J

    Pathogen Genomics, Wellcome Trust Sanger Institute, Cambridge, United Kingdom.

    Legionella pneumophila is an environmental bacterium and the causative agent of Legionnaires' disease. Previous genomic studies have shown that recombination accounts for a high proportion (>96%) of diversity within several major disease-associated sequence types (STs) of L. pneumophila. This suggests that recombination represents a potentially important force shaping adaptation and virulence. Despite this, little is known about the biological effects of recombination in L. pneumophila, particularly with regards to homologous recombination (whereby genes are replaced with alternative allelic variants). Using newly available population genomic data, we have disentangled events arising from homologous and non-homologous recombination in six major disease-associated STs of L. pneumophila (subsp. pneumophila), and subsequently performed a detailed characterisation of the dynamics and impact of homologous recombination. We identified genomic "hotspots" of homologous recombination that include regions containing outer membrane proteins, the lipopolysaccharide (LPS) region and Dot/Icm effectors, which provide interesting clues to the selection pressures faced by L. pneumophila. Inference of the origin of the recombined regions showed that isolates have most frequently imported DNA from isolates belonging to their own clade, but also occasionally from other major clades of the same subspecies. This supports the hypothesis that the possibility for horizontal exchange of new adaptations between major clades of the subspecies may have been a critical factor in the recent emergence of several clinically important STs from diverse genomic backgrounds. However, acquisition of recombined regions from another subspecies, L. pneumophila subsp. fraseri, was rarely observed, suggesting the existence of a recombination barrier and/or the possibility of ongoing speciation between the two subspecies. Finally, we suggest that multi-fragment recombination may occur in L. pneumophila, whereby multiple non-contiguous segments that originate from the same molecule of donor DNA are imported into a recipient genome during a single episode of recombination.

    Funded by: Wellcome Trust

    PLoS genetics 2017;13;6;e1006855

  • A point mutation in the ion conduction pore of AMPA receptor GRIA3 causes dramatically perturbed sleep patterns as well as intellectual disability.

    Davies B, Brown LA, Cais O, Watson J, Clayton AJ, Chang VT, Biggs D, Preece C, Hernandez-Pliego P, Krohn J, Bhomra A, Twigg SRF, Rimmer A, Kanapin A, WGS500 Consortium, Sen A, Zaiwalla Z, McVean G, Foster R, Donnelly P, Taylor JC, Blair E, Nutt D, Aricescu AR, Greger IH, Peirson SN, Flint J and Martin HC

    Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, Oxfordshire OX3 7BN, UK.

    The discovery of genetic variants influencing sleep patterns can shed light on the physiological processes underlying sleep. As part of a large clinical sequencing project, WGS500, we sequenced a family in which the two male children had severe developmental delay and a dramatically disturbed sleep-wake cycle, with very long wake and sleep durations, reaching up to 106-h awake and 48-h asleep. The most likely causal variant identified was a novel missense variant in the X-linked GRIA3 gene, which has been implicated in intellectual disability. GRIA3 encodes GluA3, a subunit of AMPA-type ionotropic glutamate receptors (AMPARs). The mutation (A653T) falls within the highly conserved transmembrane domain of the ion channel gate, immediately adjacent to the analogous residue in the Grid2 (glutamate receptor) gene, which is mutated in the mouse neurobehavioral mutant, Lurcher. In vitro, the GRIA3(A653T) mutation stabilizes the channel in a closed conformation, in contrast to Lurcher. We introduced the orthologous mutation into a mouse strain by CRISPR-Cas9 mutagenesis and found that hemizygous mutants displayed significant differences in the structure of their activity and sleep compared to wild-type littermates. Typically, mice are polyphasic, exhibiting multiple bouts of sleep several minutes long within a 24-h period. The Gria3A653T mouse showed significantly fewer brief bouts of activity and sleep than the wild-types. Furthermore, Gria3A653T mice showed enhanced period lengthening under constant light compared to wild-type mice, suggesting an increased sensitivity to light. Our results suggest a role for GluA3 channel activity in the regulation of sleep behavior in both mice and humans.

    Funded by: Medical Research Council: MC_U105174197, MR/L009609/1, MR/L016265/1; Wellcome Trust

    Human molecular genetics 2017;26;20;3869-3882

  • Whole-Genome Sequencing Reveals Breast Cancers with Mismatch Repair Deficiency.

    Davies H, Morganella S, Purdie CA, Jang SJ, Borgen E, Russnes H, Glodzik D, Zou X, Viari A, Richardson AL, Børresen-Dale AL, Thompson A, Eyfjord JE, Kong G, Stratton MR and Nik-Zainal S

    Wellcome Trust Sanger Institute, Hinxton, United Kingdom.

    Mismatch repair (MMR)-deficient cancers have been discovered to be highly responsive to immune therapies such as PD-1 checkpoint blockade, making their definition in patients, where they may be relatively rare, paramount for treatment decisions. In this study, we utilized patterns of mutagenesis known as mutational signatures, which are imprints of the mutagenic processes associated with MMR deficiency, to identify MMR-deficient breast tumors from a whole-genome sequencing dataset comprising a cohort of 640 patients. We identified 11 of 640 tumors as MMR deficient, but only 2 of 11 exhibited germline mutations in MMR genes or Lynch Syndrome. Two additional tumors had a substantially reduced proportion of mutations attributed to MMR deficiency, where the predominant mutational signatures were related to APOBEC enzymatic activity. Overall, 6 of 11 of the MMR-deficient cases in this cohort were confirmed genetically or epigenetically as having abrogation of MMR genes. However, IHC analysis of MMR-related proteins revealed all but one of 10 samples available for testing as MMR deficient. Thus, the mutational signatures more faithfully reported MMR deficiency than sequencing of MMR genes, because they represent a direct pathophysiologic readout of repair pathway abnormalities. As whole-genome sequencing continues to become more affordable, it could be used to expose individually abnormal tumors in tissue types where MMR deficiency has been rarely detected, but also rarely sought. Cancer Res; 77(18); 4755-62. ©2017 AACR.

    Funded by: Cancer Research UK: C60100/A23916; NCI NIH HHS: P30 CA016672, P50 CA168504; Wellcome Trust: WT101126/B/13/Z

    Cancer research 2017;77;18;4755-4762

  • Cytogenetic Resources and Information.

    De Braekeleer E, Huret JL, Mossafa H and Dessen P

    Haematological Cancer Genetics & Stem Cell Genetics, Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK.

    The main databases devoted stricto sensu to cancer cytogenetics are the "Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer", the "Atlas of Genetics and Cytogenetics in Oncology and Haematology", and COSMIC. However, being a complex multistep process, cancer cytogenetics are broadened to "cytogenomics," with complementary resources on: general databases (nucleic acid and protein sequences databases; cartography browsers: GenBank, RefSeq, UCSC, Ensembl, UniProtKB, and Entrez Gene), cancer genomic portals associated with recent international integrated programs, such as TCGA or ICGC, other fusion genes databases, array CGH databases, copy number variation databases, and mutation databases. Other resources such as the International System for Human Cytogenomic Nomenclature (ISCN), the International Classification of Diseases for Oncology (ICD-O), and the Human Gene Nomenclature Database (HGNC) allow a common language. Data within the scientific/medical community should be freely available. However, most of the institutional stakeholders are now gradually disengaging, and well-known databases are forced to beg or to disappear (which may happen!).

    Methods in molecular biology (Clifton, N.J.) 2017;1541;311-331

  • A single-copy Sleeping Beauty transposon mutagenesis screen identifies new PTEN-cooperating tumor suppressor genes.

    de la Rosa J, Weber J, Friedrich MJ, Li Y, Rad L, Ponstingl H, Liang Q, de Quirós SB, Noorani I, Metzakopian E, Strong A, Li MA, Astudillo A, Fernández-García MT, Fernández-García MS, Hoffman GJ, Fuente R, Vassiliou GS, Rad R, López-Otín C, Bradley A and Cadiñanos J

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK.

    The overwhelming number of genetic alterations identified through cancer genome sequencing requires complementary approaches to interpret their significance and interactions. Here we developed a novel whole-body insertional mutagenesis screen in mice, which was designed for the discovery of Pten-cooperating tumor suppressors. Toward this aim, we coupled mobilization of a single-copy inactivating Sleeping Beauty transposon to Pten disruption within the same genome. The analysis of 278 transposition-induced prostate, breast and skin tumors detected tissue-specific and shared data sets of known and candidate genes involved in cancer. We validated ZBTB20, CELF2, PARD3, AKAP13 and WAC, which were identified by our screens in multiple cancer types, as new tumor suppressor genes in prostate cancer. We demonstrated their synergy with PTEN in preventing invasion in vitro and confirmed their clinical relevance. Further characterization of Wac in vivo showed obligate haploinsufficiency for this gene (which encodes an autophagy-regulating factor) in a Pten-deficient context. Our study identified complex PTEN-cooperating tumor suppressor networks in different cancer types, with potential clinical implications.

    Funded by: Medical Research Council: MC_PC_12009; Wellcome Trust

    Nature genetics 2017;49;5;730-741

  • Disentangling PTEN-cooperating tumor suppressor gene networks in cancer.

    de la Rosa J, Weber J, Rad R, Bradley A and Cadiñanos J

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, UK.

    We have recently performed a whole-body, genome-wide screen in mice using a single-copy inactivating transposon for the identification of Pten (phosphatase and tensin homolog)-cooperating tumor suppressor genes (TSGs). We identified known and putative TSGs in multiple cancer types and validated the functional and clinical relevance of several promising candidates for human prostate cancer.

    Molecular & cellular oncology 2017;4;4;e1325550

  • Identifying transposon insertions and their effects from RNA-sequencing data.

    de Ruiter JR, Kas SM, Schut E, Adams DJ, Koudijs MJ, Wessels LFA and Jonkers J

    Division of Molecular Pathology and Cancer Genomics Netherlands, Netherlands Cancer Institute, Plesmanlaan 121, Amsterdam 1066 CX, The Netherlands.

    Insertional mutagenesis using engineered transposons is a potent forward genetic screening technique used to identify cancer genes in mouse model systems. In the analysis of these screens, transposon insertion sites are typically identified by targeted DNA-sequencing and subsequently assigned to predicted target genes using heuristics. As such, these approaches provide no direct evidence that insertions actually affect their predicted targets or how transcripts of these genes are affected. To address this, we developed IM-Fusion, an approach that identifies insertion sites from gene-transposon fusions in standard single- and paired-end RNA-sequencing data. We demonstrate IM-Fusion on two separate transposon screens of 123 mammary tumors and 20 B-cell acute lymphoblastic leukemias, respectively. We show that IM-Fusion accurately identifies transposon insertions and their true target genes. Furthermore, by combining the identified insertion sites with expression quantification, we show that we can determine the effect of a transposon insertion on its target gene(s) and prioritize insertions that have a significant effect on expression. We expect that IM-Fusion will significantly enhance the accuracy of cancer gene discovery in forward genetic screens and provide initial insight into the biological effects of insertions on candidate cancer genes.

    Funded by: Cancer Research UK: 13031; European Research Council: 319661

    Nucleic acids research 2017;45;12;7064-7077

  • Rapid establishment of the European Bank for induced Pluripotent Stem Cells (EBiSC) - the Hot Start experience.

    De Sousa PA, Steeg R, Wachter E, Bruce K, King J, Hoeve M, Khadun S, McConnachie G, Holder J, Kurtz A, Seltmann S, Dewender J, Reimann S, Stacey G, O'Shea O, Chapman C, Healy L, Zimmermann H, Bolton B, Rawat T, Atkin I, Veiga A, Kuebler B, Serano BM, Saric T, Hescheler J, Brüstle O, Peitz M, Thiele C, Geijsen N, Holst B, Clausen C, Lako M, Armstrong L, Gupta SK, Kvist AJ, Hicks R, Jonebring A, Brolén G, Ebneth A, Cabrera-Socorro A, Foerch P, Geraerts M, Stummann TC, Harmon S, George C, Streeter I, Clarke L, Parkinson H, Harrison PW, Faulconbridge A, Cherubin L, Burdett T, Trigueros C, Patel MJ, Lucas C, Hardy B, Predan R, Dokler J, Brajnik M, Keminer O, Pless O, Gribbon P, Claussen C, Ringwald A, Kreisel B, Courtney A and Allsopp TE

    Centre for Clinical Brain Sciences, Chancellors Building, 49 Little France Crescent, University of Edinburgh, Edinburgh EH16 4SB, UK; Roslin Cells Ltd(1), Head office, Nine Edinburgh Bioquarter, 9 Little France Rd, Edinburgh EH16 4UX, UK; EBiSC banking facility, Babraham Research Campus, B260 Meditrina, Cambridge CB22 3AT, UK.

    A fast track "Hot Start" process was implemented to launch the European Bank for Induced Pluripotent Stem Cells (EBiSC) to provide early release of a range of established control and disease linked human induced pluripotent stem cell (hiPSC) lines. Established practice amongst consortium members was surveyed to arrive at harmonised and publicly accessible Standard Operating Procedures (SOPs) for tissue procurement, bio-sample tracking, iPSC expansion, cryopreservation, qualification and distribution to the research community. These were implemented to create a quality managed foundational collection of lines and associated data made available for distribution. Here we report on the successful outcome of this experience and work flow for banking and facilitating access to an otherwise disparate European resource, with lessons to benefit the international research community. ETOC: The report focuses on the EBiSC experience of rapidly establishing an operational capacity to procure, bank and distribute a foundational collection of established hiPSC lines. It validates the feasibility and defines the challenges of harnessing and integrating the capability and productivity of centres across Europe using commonly available resources currently in the field.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/E012841/1, BBS/B/14779; Medical Research Council: G0301182

    Stem cell research 2017;20;105-114

  • Phase-variable methylation and epigenetic regulation by type I restriction-modification systems.

    De Ste Croix M, Vacca I, Kwun MJ, Ralph JD, Bentley SD, Haigh R, Croucher NJ and Oggioni MR

    Department of Genetics, University of Leicester, Leicester LE1 7RH, UK.

    Epigenetic modifications in bacteria, such as DNA methylation, have been shown to affect gene regulation, thereby generating cells that are isogenic but with distinctly different phenotypes. Restriction-modification (RM) systems contain prototypic methylases that are responsible for much of bacterial DNA methylation. This review focuses on a distinctive group of type I RM loci that, through phase variation, can modify their methylation target specificity and can thereby switch bacteria between alternative patterns of DNA methylation. Phase variation occurs at the level of the target recognition domains of the hsdS (specificity) gene via reversible recombination processes acting upon multiple hsdS alleles. We describe the global distribution of such loci throughout the prokaryotic kingdom and highlight the differences in loci structure across the various bacterial species. Although RM systems are often considered simply as an evolutionary response to bacteriophages, these multi-hsdS type I systems have also shown the capacity to change bacterial phenotypes. The ability of these RM systems to allow bacteria to reversibly switch between different physiological states, combined with the existence of such loci across many species of medical and industrial importance, highlights the potential of phase-variable DNA methylation to act as a global regulatory mechanism in bacteria.

    FEMS microbiology reviews 2017;41;Supp_1;S3-S15

  • Prevalence and architecture of de novo mutations in developmental disorders.

    Deciphering Developmental Disorders Study

    The genomes of individuals with severe, undiagnosed developmental disorders are enriched in damaging de novo mutations (DNMs) in developmentally important genes. Here we have sequenced the exomes of 4,293 families containing individuals with developmental disorders, and meta-analysed these data with data from another 3,287 individuals with similar disorders. We show that the most important factors influencing the diagnostic yield of DNMs are the sex of the affected individual, the relatedness of their parents, whether close relatives are affected and the parental ages. We identified 94 genes enriched in damaging DNMs, including 14 that previously lacked compelling evidence of involvement in developmental disorders. We have also characterized the phenotypic diversity among these disorders. We estimate that 42% of our cohort carry pathogenic DNMs in coding sequences; approximately half of these DNMs disrupt gene function and the remainder result in altered protein function. We estimate that developmental disorders caused by DNMs have an average prevalence of 1 in 213 to 1 in 448 births, depending on parental age. Given current global demographics, this equates to almost 400,000 children born per year.

    Funded by: Medical Research Council: G0800674, MC_PC_U127561093, MR/M014568/1; Wellcome Trust; Wellcome Trust Sanger Institute: WT098051

    Nature 2017;542;7642;433-438

  • Environmental DNA metabarcoding: Transforming how we survey animal and plant communities.

    Deiner K, Bik HM, Mächler E, Seymour M, Lacoursière-Roussel A, Altermatt F, Creer S, Bista I, Lodge DM, de Vere N, Pfrender ME and Bernatchez L

    Atkinson Center for a Sustainable Future, Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, USA.

    The genomic revolution has fundamentally changed how we survey biodiversity on earth. High-throughput sequencing ("HTS") platforms now enable the rapid sequencing of DNA from diverse kinds of environmental samples (termed "environmental DNA" or "eDNA"). Coupling HTS with our ability to associate sequences from eDNA with a taxonomic name is called "eDNA metabarcoding" and offers a powerful molecular tool capable of noninvasively surveying species richness from many ecosystems. Here, we review the use of eDNA metabarcoding for surveying animal and plant richness, and the challenges in using eDNA approaches to estimate relative abundance. We highlight eDNA applications in freshwater, marine and terrestrial environments, and in this broad context, we distill what is known about the ability of different eDNA sample types to approximate richness in space and across time. We provide guiding questions for study design and discuss the eDNA metabarcoding workflow with a focus on primers and library preparation methods. We additionally discuss important criteria for consideration of bioinformatic filtering of data sets, with recommendations for increasing transparency. Finally, looking to the future, we discuss emerging applications of eDNA metabarcoding in ecology, conservation, invasion biology, biomonitoring, and how eDNA metabarcoding can empower citizen science and biodiversity education.

    Molecular ecology 2017;26;21;5872-5895

  • Principles of Reconstructing the Subclonal Architecture of Cancers.

    Dentro SC, Wedge DC and Van Loo P

    Wellcome Trust Sanger Institute, Cambridge CB10 1HH, United Kingdom.

    Most cancers evolve from a single founder cell through a series of clonal expansions that are driven by somatic mutations. These clonal expansions can lead to several coexisting subclones sharing subsets of mutations. Analysis of massively parallel sequencing data can infer a tumor's subclonal composition through the identification of populations of cells with shared mutations. We describe the principles that underlie subclonal reconstruction through single nucleotide variants (SNVs) or copy number alterations (CNAs) from bulk or single-cell sequencing. These principles include estimating the fraction of tumor cells for SNVs and CNAs, performing clustering of SNVs from single- and multisample cases, and single-cell sequencing. The application of subclonal reconstruction methods is providing key insights into tumor evolution, identifying subclonal driver mutations, patterns of parallel evolution and differences in mutational signatures between cellular populations, and characterizing the mechanisms of therapy resistance, spread, and metastasis.

    Funded by: Wellcome Trust

    Cold Spring Harbor perspectives in medicine 2017;7;8
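
    The abstract above mentions estimating the fraction of tumor cells that carry an SNV. As a minimal sketch, the relation below is one commonly used in the subclonal reconstruction literature; the abstract itself does not state it, so treating it as the article's exact formulation is an assumption. The function name and parameter names are hypothetical. It converts a variant allele fraction (VAF) into a cancer cell fraction (CCF), given sample purity, the tumor's total copy number at the locus, and the number of mutated copies per tumor cell (multiplicity).

```python
def cancer_cell_fraction(vaf, purity, tumour_cn, multiplicity=1, normal_cn=2):
    """Estimate the fraction of tumor cells carrying an SNV (hypothetical helper).

    Assumed relation (not taken from the abstract):
        VAF = purity * multiplicity * CCF
              / (purity * tumour_cn + (1 - purity) * normal_cn)
    which is solved here for CCF.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")
    # Average number of copies of the locus per cell in the mixed sample.
    avg_copies = purity * tumour_cn + (1 - purity) * normal_cn
    return vaf * avg_copies / (purity * multiplicity)


if __name__ == "__main__":
    # A heterozygous mutation in a diploid region of an 80%-pure sample.
    print(cancer_cell_fraction(vaf=0.4, purity=0.8, tumour_cn=2))
    # The same region with half the VAF suggests a subclonal mutation.
    print(cancer_cell_fraction(vaf=0.2, purity=0.8, tumour_cn=2))
```

    Under these assumptions, a CCF near 1 suggests a clonal mutation, while values clearly below 1 suggest a subclone; real data would also need the sampling noise in the VAF to be modeled before clustering.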

  • Bacterial microbiota of the upper respiratory tract and childhood asthma.

    Depner M, Ege MJ, Cox MJ, Dwyer S, Walker AW, Birzele LT, Genuneit J, Horak E, Braun-Fahrländer C, Danielewicz H, Maier RM, Moffatt MF, Cookson WO, Heederik D, von Mutius E and Legatzki A

    Dr von Hauner Children's Hospital, LMU Munich, Munich, Germany.

    Background: Patients with asthma and healthy controls differ in bacterial colonization of the respiratory tract. The upper airways have been shown to reflect colonization of the lower airways, the actual site of inflammation in asthma, which is hardly accessible in population studies.

    Objective: We sought to characterize the bacterial communities at 2 sites of the upper respiratory tract obtained from children from a rural area and to relate these to asthma.

    Methods: The microbiota of 327 throat and 68 nasal samples from school-age farm and nonfarm children were analyzed by 454-pyrosequencing of the bacterial 16S ribosomal RNA gene.

    Results: Alterations in nasal microbiota but not of throat microbiota were associated with asthma. Children with asthma had lower α- and β-diversity of the nasal microbiota as compared with healthy control children. Furthermore, asthma presence was positively associated with a specific operational taxonomic unit from the genus Moraxella in children not exposed to farming, whereas in farm children Moraxella colonization was unrelated to asthma. In nonfarm children, Moraxella colonization explained the association between bacterial diversity and asthma to a large extent.

    Conclusions: Asthma was mainly associated with an altered nasal microbiota characterized by lower diversity and Moraxella abundance. Children living on farms might not be susceptible to the disadvantageous effect of Moraxella. Prospective studies may clarify whether Moraxella outgrowth is a cause or a consequence of loss in diversity.

    Funded by: Medical Research Council: G1000758

    The Journal of allergy and clinical immunology 2017;139;3;826-834.e13

  • Principles guiding embryo selection following genome-wide haplotyping of preimplantation embryos.

    Dimitriadou E, Melotte C, Debrock S, Esteki MZ, Dierickx K, Voet T, Devriendt K, de Ravel T, Legius E, Peeraer K, Meuleman C and Vermeesch JR

    Department of Human Genetics, Centre for Human Genetics, University Hospitals Leuven, O&N I Herestraat 49 - box 602, KU Leuven, 3000 Leuven, Belgium.

    Study question: How to select and prioritize embryos during PGD following genome-wide haplotyping?

    Summary answer: In addition to genetic disease-specific information, the embryo selected for transfer is based on ranking criteria including the existence of mitotic and/or meiotic aneuploidies, but not carriership of mutations causing recessive disorders.

    What is known already: Embryo selection for monogenic diseases has been mainly performed using targeted disease-specific assays. Recently, these targeted approaches are being complemented by generic genome-wide genetic analysis methods such as karyomapping or haplarithmisis, which are based on genomic haplotype reconstruction of cell(s) biopsied from embryos. This provides not only information about the inheritance of Mendelian disease alleles but also about numerical and structural chromosome anomalies and haplotypes genome-wide. Reflections on how to use this information in the diagnostic laboratory are lacking.

    Study design, size, duration: We present the results of the first 101 PGD cycles (373 embryos) using haplarithmisis, performed in the Centre for Human Genetics, UZ Leuven. The questions raised were addressed by a multidisciplinary team of clinical geneticists, fertility specialists and ethicists.

    Participants/materials, setting, methods: Sixty-three couples enrolled in the genome-wide haplotyping-based PGD program. Families presented with either inherited genetic variants causing known disorders and/or chromosomal rearrangements that could lead to unbalanced translocations in the offspring.

    Main results and the role of chance: Embryos were selected based on the absence or presence of the disease allele, a trisomy or other chromosomal abnormality leading to known developmental disorders. In addition, morphologically normal Day 5 embryos were prioritized for transfer based on the presence of other chromosomal imbalances and/or carrier information.

    Limitations, reasons for caution: Some of the choices made and principles put forward are specific for cleavage-stage-based genetic testing. The proposed guidelines are subject to continuous update based on the accumulating knowledge from the implementation of genome-wide methods for PGD in many different centers world-wide as well as the results of ongoing scientific research.

    Wider implications of the findings: Our embryo selection principles have a profound impact on the organization of PGD operations and on the information that is transferred among the genetic unit, the fertility clinic and the patients. These principles are also important for the organization of pre- and post-counseling and influence the interpretation and reporting of preimplantation genotyping results. As novel genome-wide approaches for embryo selection are revolutionizing the field of reproductive genetics, national and international discussions to set general guidelines are warranted.

    Study funding/competing interest(s): The European Union's Research and Innovation funding programs FP7-PEOPLE-2012-IAPP SARM: 324509 and Horizon 2020 WIDENLIFE: 692065 to J.R.V., T.V., E.D. and M.Z.E. J.R.V., T.V. and M.Z.E. have patents ZL910050-PCT/EP2011/060211-WO/2011/157846 ('Methods for haplotyping single cells') with royalties paid and ZL913096-PCT/EP2014/068315-WO/2015/028576 ('Haplotyping and copy-number typing using polymorphic variant allelic frequencies') with royalties paid, licensed to Cartagenia (Agilent Technologies). J.R.V. also has a patent ZL912076-PCT/EP2013/070858 ('High throughput genotyping by sequencing') with royalties paid.

    Trial registration number: N/A.

    Human reproduction (Oxford, England) 2017;32;3;687-697

  • Using reference-free compressed data structures to analyze sequencing reads from thousands of human genomes.

    Dolle DD, Liu Z, Cotten M, Simpson JT, Iqbal Z, Durbin R, McCarthy SA and Keane TM

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom.

    We are rapidly approaching the point where we have sequenced millions of human genomes. There is a pressing need for new data structures to store raw sequencing data and efficient algorithms for population scale analysis. Current reference-based data formats do not fully exploit the redundancy in population sequencing nor take advantage of shared genetic variation. In recent years, the Burrows-Wheeler transform (BWT) and FM-index have been widely employed as a full-text searchable index for read alignment and de novo assembly. We introduce the concept of a population BWT and use it to store and index the sequencing reads of 2705 samples from the 1000 Genomes Project. A key feature is that, as more genomes are added, identical read sequences are increasingly observed, and compression becomes more efficient. We assess the support in the 1000 Genomes read data for every base position of two human reference assembly versions, identifying that 3.2 Mbp with population support was lost in the transition from GRCh37 with 13.7 Mbp added to GRCh38. We show that the vast majority of variant alleles can be uniquely described by overlapping 31-mers and show how rapid and accurate SNP and indel genotyping can be carried out across the genomes in the population BWT. We use the population BWT to carry out nonreference queries to search for the presence of all known viral genomes and discover human T-lymphotropic virus 1 integrations in six samples in a recognized epidemiological distribution.

    Funded by: Wellcome Trust

    Genome research 2017;27;2;300-309
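
    The abstract above describes indexing sequencing reads with a Burrows-Wheeler transform (BWT) and FM-index so that short sequences (31-mers in the article) can be looked up without a reference. The following is a toy sketch of that idea, not the authors' implementation: it builds the BWT of a few concatenated reads with a naive suffix sort and counts k-mer occurrences by backward search. All function names are hypothetical, and the quadratic construction is only suitable for tiny inputs.

```python
from collections import Counter


def build_bwt(reads):
    """Return the BWT of the reads, each terminated by '$' (toy construction)."""
    # '$' sorts before A, C, G and T, and keeps matches from spanning two reads.
    text = "".join(read + "$" for read in reads)
    # Naive suffix sort; real tools use far more efficient construction.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT is the character preceding each sorted suffix (cyclic for i = 0).
    return "".join(text[i - 1] for i in suffix_array)


def build_fm_index(bwt):
    """Return (first, occ) tables used by backward search."""
    counts = Counter(bwt)
    # first[c]: number of characters in the text strictly smaller than c.
    first, total = {}, 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in counts:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return first, occ


def count_occurrences(pattern, bwt, first, occ):
    """Count occurrences of pattern (no '$') across all reads by backward search."""
    lo, hi = 0, len(bwt)  # half-open range of matching suffix-array rows
    for c in reversed(pattern):
        if c not in first:
            return 0
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGTTCGA", "ACGTACGT"]  # made-up reads
    bwt = build_bwt(reads)
    first, occ = build_fm_index(bwt)
    for kmer in ("ACGT", "CGTA", "TTC", "GGG"):
        print(kmer, count_occurrences(kmer, bwt, first, occ))
```

    Note that the first and third reads are identical: in the sorted suffix order their characters fall next to each other, which produces long runs of equal characters in the BWT. That is the property the abstract refers to when it says compression improves as more genomes are added.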

  • Integrated view of Vibrio cholerae in the Americas.

    Domman D, Quilici ML, Dorman MJ, Njamkepo E, Mutreja A, Mather AE, Delgado G, Morales-Espinosa R, Grimont PAD, Lizárraga-Partida ML, Bouchier C, Aanensen DM, Kuri-Morales P, Tarr CL, Dougan G, Parkhill J, Campos J, Cravioto A, Weill FX and Thomson NR

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.

    Latin America has experienced two of the largest cholera epidemics in modern history; one in 1991 and the other in 2010. However, confusion still surrounds the relationships between globally circulating pandemic Vibrio cholerae clones and local bacterial populations. We used whole-genome sequencing to characterize cholera across the Americas over a 40-year time span. We found that both epidemics were the result of intercontinental introductions of seventh pandemic El Tor V. cholerae and that at least seven lineages local to the Americas are associated with disease that differs epidemiologically from epidemic cholera. Our results consolidate historical accounts of pandemic cholera with data to show the importance of local lineages, presenting an integrated view of cholera that is important to the design of future disease control strategies.

    Funded by: Wellcome Trust

    Science (New York, N.Y.) 2017;358;6364;789-793

  • Wounding induces dedifferentiation of epidermal Gata6+ cells and acquisition of stem cell properties.

    Donati G, Rognoni E, Hiratsuka T, Liakath-Ali K, Hoste E, Kar G, Kayikci M, Russell R, Kretzschmar K, Mulder KW, Teichmann SA and Watt FM

    King's College London Centre for Stem Cells and Regenerative Medicine, 28th Floor, Tower Wing, Guy's Campus, Great Maze Pond, London SE1 9RT, UK.

    The epidermis is maintained by multiple stem cell populations whose progeny differentiate along diverse, and spatially distinct, lineages. Here we show that the transcription factor Gata6 controls the identity of the previously uncharacterized sebaceous duct (SD) lineage and identify the Gata6 downstream transcription factor network that specifies a lineage switch between sebocytes and SD cells. During wound healing differentiated Gata6+ cells migrate from the SD into the interfollicular epidermis and dedifferentiate, acquiring the ability to undergo long-term self-renewal and differentiate into a much wider range of epidermal lineages than in undamaged tissue. Our data not only demonstrate that the structural and functional complexity of the junctional zone is regulated by Gata6, but also reveal that dedifferentiation is a previously unrecognized property of post-mitotic, terminally differentiated cells that have lost contact with the basement membrane. This resolves the long-standing debate about the contribution of terminally differentiated cells to epidermal wound repair.

    Funded by: Medical Research Council: G1100073, MC_U105185859

    Nature cell biology 2017;19;6;603-613

  • Population genetic structuring of methicillin-resistant Staphylococcus aureus clone EMRSA-15 within UK reflects patient referral patterns.

    Donker T, Reuter S, Scriberras J, Reynolds R, Brown NM, Török ME, James R, Network EOEMR, Aanensen DM, Bentley SD, Holden MTG, Parkhill J, Spratt BG, Peacock SJ, Feil EJ and Grundmann H

    Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headley Way, Oxford OX3 9DU, UK.

    Antibiotic resistance forms a serious threat to the health of hospitalised patients, rendering otherwise treatable bacterial infections potentially life-threatening. A thorough understanding of the mechanisms by which resistance spreads between patients in different hospitals is required in order to design effective control strategies. We measured the differences between bacterial populations of 52 hospitals in the United Kingdom and Ireland, using whole-genome sequences from 1085 MRSA clonal complex 22 isolates collected between 1998 and 2012. The genetic differences between bacterial populations were compared with the number of patients transferred between hospitals and their regional structure. The MRSA populations within single hospitals, regions and countries were genetically distinct from the rest of the bacterial population at each of these levels. Hospitals from the same patient referral regions showed more similar MRSA populations, as did hospitals sharing many patients. Furthermore, the bacterial populations from different time-periods within the same hospital were generally more similar to each other than contemporaneous bacterial populations from different hospitals. We conclude that, while a large part of the dispersal and expansion of MRSA takes place among patients seeking care in single hospitals, inter-hospital spread of resistant bacteria is by no means a rare occurrence. Hospitals are exposed to constant introductions of MRSA on a number of levels: (1) most MRSA is received from hospitals that directly transfer large numbers of patients, while (2) fewer introductions happen between regions or (3) across national borders, reflecting lower numbers of transferred patients. A joint coordinated control effort between hospitals is therefore paramount for the national control of MRSA, antibiotic-resistant bacteria and other hospital-associated pathogens.

    Microbial genomics 2017;3;7;e000113

  • No Functional Role for microRNA-342 in a Mouse Model of Pancreatic Acinar Carcinoma.

    Dooley J, Lagou V, Pasciuto E, Linterman MA, Prosser HM, Himmelreich U and Liston A

    Translational Immunology Laboratory, VIB, Leuven, Belgium.

    The intronic microRNA (miR)-342 has been proposed as a potent tumor-suppressor gene. miR-342 is found to be downregulated or epigenetically silenced in multiple different tumor sites, and this loss of expression permits the upregulation of several key oncogenic pathways. In several different cell lines, lower miR-342 expression results in enhanced proliferation and metastasis potential, both in vitro and in xenogenic transplant conditions. Here, we sought to determine the function of miR-342 in an in vivo spontaneous cancer model, using the Ela1-TAg transgenic model of pancreatic acinar carcinoma. Through longitudinal magnetic resonance imaging monitoring of Ela1-TAg transgenic mice, either wild-type or knockout for miR-342, we found no role for miR-342 in the development, growth rate, or pathogenicity of pancreatic acinar carcinoma. These results indicate the importance of assessing miR function in the complex physiology of in vivo model systems and indicate that further functional testing of miR-342 is required before concluding it is a bona fide tumor-suppressor-miR.

    Frontiers in oncology 2017;7;101

  • Control of virulence gene transcription by indirect readout in Vibrio cholerae and Salmonella enterica serovar Typhimurium.

    Dorman CJ and Dorman MJ

    Department of Microbiology, Moyne Institute of Preventive Medicine, Trinity College Dublin, Dublin, Ireland.

    Indirect readout mechanisms of transcription control rely on the recognition of DNA shape by transcription factors (TFs). TFs may also employ a direct readout mechanism that involves the reading of the base sequence in the DNA major groove at the binding site. TFs with winged helix-turn-helix (wHTH) motifs use an alpha helix to read the base sequence in the major groove while inserting a beta sheet 'wing' into the adjacent minor groove. Such wHTH proteins are important regulators of virulence gene transcription in many pathogens; they also control housekeeping genes. This article considers the cases of the non-invasive Gram-negative pathogen Vibrio cholerae and the invasive pathogen Salmonella enterica serovar Typhimurium. Both possess clusters of A + T-rich horizontally acquired virulence genes that are silenced by the nucleoid-associated protein H-NS and regulated positively or negatively by wHTH TFs: for example, ToxR and LeuO in V. cholerae; HilA, LeuO, SlyA and OmpR in S. Typhimurium. Because of their relatively relaxed base sequence requirements for target recognition, indirect readout mechanisms have the potential to engage regulatory proteins with many more targets than might be the case using direct readout, making indirect readout an important, yet often ignored, contributor to the expression of pathogenic phenotypes.

    Funded by: Wellcome Trust: 098051

    Environmental microbiology 2017;19;10;3834-3845

  • Genome watch: Klebsiella pneumoniae: when a colonizer turns bad.

    Dorman MJ and Short FL

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Nature reviews. Microbiology 2017;15;7;384

  • Genome-wide analysis of ivermectin response by Onchocerca volvulus reveals that genetic drift and soft selective sweeps contribute to loss of drug sensitivity.

    Doyle SR, Bourguinat C, Nana-Djeunga HC, Kengne-Ouafo JA, Pion SDS, Bopda J, Kamgno J, Wanji S, Che H, Kuesel AC, Walker M, Basáñez MG, Boakye DA, Osei-Atweneboana MY, Boussinesq M, Prichard RK and Grant WN

    Department of Animal, Plant and Soil Sciences, La Trobe University, Bundoora, Australia.

    Background: Treatment of onchocerciasis using mass ivermectin administration has reduced morbidity and transmission throughout Africa and Central/South America. Mass drug administration is likely to exert selection pressure on parasites, and phenotypic and genetic changes in several Onchocerca volvulus populations from Cameroon and Ghana, exposed to more than a decade of regular ivermectin treatment, have raised concern that sub-optimal responses to ivermectin's anti-fecundity effect are becoming more frequent and may spread.

    Methodology/principal findings: Pooled next generation sequencing (Pool-seq) was used to characterise genetic diversity within and between 108 adult female worms differing in ivermectin treatment history and response. Genome-wide analyses revealed genetic variation that significantly differentiated good responder (GR) and sub-optimal responder (SOR) parasites. These variants were not randomly distributed but clustered in ~31 quantitative trait loci (QTLs), with little overlap in putative QTL position and gene content between the two countries. Published candidate ivermectin SOR genes were largely absent in these regions; QTLs differentiating GR and SOR worms were enriched for genes in molecular pathways associated with neurotransmission, development, and stress responses. Finally, single worm genotyping demonstrated that geographic isolation and genetic change over time (in the presence of drug exposure) had a significantly greater role in shaping genetic diversity than the evolution of SOR.

    Conclusions/significance: This study is one of the first genome-wide association analyses in a parasitic nematode, and provides insight into the genomics of ivermectin response and population structure of O. volvulus. We argue that ivermectin response is a polygenically-determined quantitative trait (QT) whereby identical or related molecular pathways but not necessarily individual genes are likely to determine the extent of ivermectin response in different parasite populations. Furthermore, we propose that genetic drift rather than genetic selection of SOR is the underlying driver of population differentiation, which has significant implications for the emergence and potential spread of SOR within and between these parasite populations.

    Funded by: Wellcome Trust; World Health Organization: 001

    PLoS neglected tropical diseases 2017;11;7;e0005816

  • Use of CRISPR-modified human stem cell organoids to study the origin of mutational signatures in cancer.

    Drost J, van Boxtel R, Blokzijl F, Mizutani T, Sasaki N, Sasselli V, de Ligt J, Behjati S, Grolleman JE, van Wezel T, Nik-Zainal S, Kuiper RP, Cuppen E and Clevers H

    Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences (KNAW) and UMC Utrecht, 3584CT Utrecht, Netherlands.

    Mutational processes underlie cancer initiation and progression. Signatures of these processes in cancer genomes may explain cancer etiology, and hold diagnostic and prognostic value. Here, we develop a strategy that can be used to explore the origin of cancer-associated mutational signatures. We used CRISPR/Cas9 technology to delete key DNA repair genes in human colon organoids, followed by delayed sub-cloning and whole-genome sequencing. We found that mutation accumulation in organoids deficient in the mismatch repair gene MLH1 is driven by replication errors and accurately models the mutation profiles observed in mismatch repair-deficient colorectal cancers. Application of this strategy to the cancer predisposition gene NTHL1, which encodes a base excision repair protein, revealed a mutational footprint (signature 30) previously observed in a breast cancer cohort. We show that signature 30 can arise from germline NTHL1 mutations.

    Science (New York, N.Y.) 2017

  • The Experimental Design Assistant.

    du Sert NP, Bamsey I, Bate ST, Berdoy M, Clark RA, Cuthill IC, Fry D, Karp NA, Macleod M, Moon L, Stanford SC and Lings B

    National Centre for the Replacement, Refinement and Reduction of Animals in Research (NC3Rs), London, UK.

    Nature methods 2017

  • Strategies for managing rival bacterial communities: lessons from burying beetles.

    Duarte A, Welch M, Swannack C, Wagner J and Kilner RM

    Department of Zoology, University of Cambridge, Cambridge, CB2 3EJ, UK.

    1. The role of bacteria in animal development, ecology and evolution is increasingly well-understood, yet little is known of how animal behaviour affects bacterial communities. Animals that benefit from defending a key resource from microbial competitors are likely to evolve behaviours to control or manipulate the animal's associated external microbiota. 2. We describe four possible mechanisms by which animals could gain a competitive edge by disrupting a rival bacterial community: 'weeding', 'seeding', 'replanting' and 'preserving'. By combining detailed behavioural observations with molecular and bioinformatic analyses, we then test which of these mechanisms best explains how burying beetles, Nicrophorus vespilloides, manipulate the bacterial communities on their carcass breeding resource. 3. Burying beetles are a suitable species to study how animals manage external microbiota because reproduction revolves around a small vertebrate carcass. Parents shave a carcass and apply antimicrobial exudates on its surface, shaping it into an edible nest for their offspring. We compared bacterial communities in mice carcasses that were either fresh, prepared by beetles or unprepared but buried underground for the same length of time. We also analysed bacterial communities in the burying beetle's gut, during and after breeding, to understand whether beetles could be 'seeding' the carcass with particular microbes. 4. We show that burying beetles do not 'preserve' the carcass by reducing bacterial load, as is commonly supposed. Instead, our results suggest they 'seed' the carcass with bacterial groups which are part of the Nicrophorus core microbiome. They may also 'replant' other bacteria from the carcass gut onto the surface of their carrion nest. Both these processes may lead to the observed increase in bacterial load on the carcass surface in the presence of beetles. Beetles may also 'weed' the bacterial community by eliminating some groups of bacteria on the carcass, perhaps through the production of antimicrobials themselves. 5. Whether these alterations to the bacterial community are adaptive from the beetle's perspective, or are simply a by-product of the way in which the beetles prepare the carcass for reproduction, remains to be determined in future work. In general, our work suggests that animals might use more sophisticated techniques for attacking and disrupting rival microbial communities than is currently appreciated.

    The Journal of animal ecology 2017

  • Population genetic structure and adaptation of malaria parasites on the edge of endemic distribution.

    Duffy CW, Ba H, Assefa S, Ahouidi AD, Deh YB, Tandia A, Kirsebom FC, Kwiatkowski DP and Conway DJ

    Department of Pathogen Molecular Biology, London School of Hygiene & Tropical Medicine, Keppel St, London, UK.

    To determine whether the major human malaria parasite Plasmodium falciparum exhibits fragmented population structure or local adaptation at the northern limit of its African distribution where the dry Sahel zone meets the Sahara, samples were collected from diverse locations within Mauritania over a range of ~1000 kilometres. Microsatellite genotypes were obtained for 203 clinical infection samples from eight locations, and Illumina paired-end sequences were obtained to yield high coverage genome-wide single nucleotide polymorphism (SNP) data for 65 clinical infection samples from four locations. Most infections contained single parasite genotypes, reflecting low rates of transmission and superinfection locally, in contrast to the situation seen in population samples from countries further south. A minority of infections shared related or identical genotypes locally, indicating some repeated transmission of parasite clones without recombination. This caused some multi-locus linkage disequilibrium and local divergence, but aside from the effect of repeated genotypes there was minimal differentiation between locations. Several chromosomal regions had elevated integrated haplotype scores (|iHS|) indicating recent selection, including those containing drug resistance genes. A genome-wide FST scan comparison with previous sequence data from an area in West Africa with higher infection endemicity indicates that regional gene flow prevents genetic isolation, but revealed allele frequency differentiation at three drug resistance loci and an erythrocyte invasion ligand gene. Contrast of extended haplotype signatures revealed none to be unique to Mauritania. Discrete foci of infection on the edge of the Sahara are genetically highly connected to the wider continental parasite population, and local elimination would be difficult to achieve without very substantial reduction in malaria throughout the region.

    Molecular ecology 2017

  • Modulation of Aneuploidy in Leishmania donovani during Adaptation to Different In Vitro and In Vivo Environments and Its Impact on Gene Expression.

    Dumetz F, Imamura H, Sanders M, Seblova V, Myskova J, Pescher P, Vanaerschot M, Meehan CJ, Cuypers B, De Muylder G, Späth GF, Bussotti G, Vermeesch JR, Berriman M, Cotton JA, Volf P, Dujardin JC and Domagalska MA

    Molecular Parasitology, Institute of Tropical Medicine, Antwerp, Belgium.

    Aneuploidy is usually deleterious in multicellular organisms but appears to be tolerated and potentially beneficial in unicellular organisms, including pathogens. Leishmania, a major protozoan parasite, is emerging as a new model for aneuploidy, since in vitro-cultivated strains are highly aneuploid, with interstrain diversity and intrastrain mosaicism. The alternation of two life stages in different environments (extracellular promastigotes and intracellular amastigotes) offers a unique opportunity to study the impact of environment on aneuploidy and gene expression. We sequenced the whole genomes and transcriptomes of Leishmania donovani strains throughout their adaptation to in vivo conditions mimicking natural vertebrate and invertebrate host environments. The nucleotide sequences were almost unchanged within a strain, in contrast to highly variable aneuploidy. Although high in promastigotes in vitro, aneuploidy dropped significantly in hamster amastigotes, in a progressive and strain-specific manner, accompanied by the emergence of new polysomies. After a passage through a sand fly, smaller yet consistent karyotype changes were detected. Changes in chromosome copy numbers were correlated with the corresponding transcript levels, but additional aneuploidy-independent regulation of gene expression was observed. This affected stage-specific gene expression, downregulation of the entire chromosome 31, and upregulation of gene arrays on chromosomes 5 and 8. Aneuploidy changes in Leishmania are probably adaptive and exploited to modulate the dosage and expression of specific genes; they are well tolerated, but additional mechanisms may exist to regulate the transcript levels of other genes located on aneuploid chromosomes. Our model should allow studies of the impact of aneuploidy on molecular adaptations and cellular fitness.

    Importance: Aneuploidy is usually detrimental in multicellular organisms, but in several microorganisms, it can be tolerated and even beneficial. Leishmania, a protozoan parasite that kills more than 30,000 people each year, is emerging as a new model for aneuploidy studies, as unexpectedly high levels of aneuploidy are found in clinical isolates. Leishmania lacks classical regulation of transcription at initiation through promoters, so aneuploidy could represent a major adaptive strategy of this parasite to modulate gene dosage in response to stressful environments. For the first time, we document the dynamics of aneuploidy throughout the life cycle of the parasite, in vitro and in vivo. We show its adaptive impact on transcription and its interaction with regulation. Besides offering a new model for aneuploidy studies, we show that further genomic studies should be done directly in clinical samples without parasite isolation and that adequate methods should be developed for this.

    mBio 2017;8;3

  • "Matching" consent to purpose: The example of the Matchmaker Exchange.

    Dyke SOM, Knoppers BM, Hamosh A, Firth HV, Hurles M, Brudno M, Boycott KM, Philippakis AA and Rehm HL

    Centre of Genomics and Policy, Faculty of Medicine, McGill University, Montreal, Quebec, Canada.

    The Matchmaker Exchange (MME) connects rare disease clinicians and researchers to facilitate the sharing of data from undiagnosed patients for the purpose of novel gene discovery. Such sharing raises the odds that two or more similar patients with candidate genes in common may be found, thereby allowing their condition to be more readily studied and understood. Consent considerations for data sharing in MME included both the ethical and legal differences between clinical and research settings and the level of privacy risk involved in sharing varying amounts of rare disease patient data to enable patient matches. In this commentary, we discuss these consent considerations and the resulting MME Consent Policy as they may be relevant to other international data sharing initiatives.

    Human mutation 2017

  • Genome-wide analysis of differential transcriptional and epigenetic variability across human immune cell types.

    Ecker S, Chen L, Pancaldi V, Bagger FO, Fernández JM, Carrillo de Santa Pau E, Juan D, Mann AL, Watt S, Casale FP, Sidiropoulos N, Rapin N, Merkel A, BLUEPRINT Consortium, Stunnenberg HG, Stegle O, Frontini M, Downes K, Pastinen T, Kuijpers TW, Rico D, Valencia A, Beck S, Soranzo N and Paul DS

    Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Melchor Fernández Almagro 3, 28029, Madrid, Spain.

    Background: A healthy immune system requires immune cells that adapt rapidly to environmental challenges. This phenotypic plasticity can be mediated by transcriptional and epigenetic variability.

    Results: We apply a novel analytical approach to measure and compare transcriptional and epigenetic variability genome-wide across CD14⁺CD16⁻ monocytes, CD66b⁺CD16⁺ neutrophils, and CD4⁺CD45RA⁺ naïve T cells from the same 125 healthy individuals. We discover substantially increased variability in neutrophils compared to monocytes and T cells. In neutrophils, genes with hypervariable expression are found to be implicated in key immune pathways and are associated with cellular properties and environmental exposure. We also observe increased sex-specific gene expression differences in neutrophils. Neutrophil-specific DNA methylation hypervariable sites are enriched at dynamic chromatin regions and active enhancers.

    Conclusions: Our data highlight the importance of transcriptional and epigenetic variability for the key role of neutrophils as the first responders to inflammatory stimuli. We provide a resource to enable further functional studies into the plasticity of immune cells.

    Funded by: British Heart Foundation: RG/08/014/24067, RG/13/13/30194; Medical Research Council: G0800270, MR/L003120/1; Wellcome Trust: WT091310, WT098051

    Genome biology 2017;18;1;18

  • Drug Resistance Mechanisms in Colorectal Cancer Dissected with Cell Type-Specific Dynamic Logic Models.

    Eduati F, Doldàn-Martelli V, Klinger B, Cokelaer T, Sieber A, Kogera F, Dorel M, Garnett MJ, Blüthgen N and Saez-Rodriguez J

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, United Kingdom.

    Genomic features are used as biomarkers of sensitivity to kinase inhibitors used widely to treat human cancer, but effective patient stratification based on these principles remains limited in impact. Insofar as kinase inhibitors interfere with signaling dynamics, and, in turn, signaling dynamics affects inhibitor responses, we investigated associations in this study between cell-specific dynamic signaling pathways and drug sensitivity. Specifically, we measured 14 phosphoproteins under 43 different perturbed conditions (combinations of 5 stimuli and 7 inhibitors) in 14 colorectal cancer cell lines, building cell line-specific dynamic logic models of underlying signaling networks. Model parameters representing pathway dynamics were used as features to predict sensitivity to a panel of 27 drugs. Specific parameters of signaling dynamics correlated strongly with drug sensitivity for 14 of the drugs, 9 of which had no genomic biomarker. Following one of these associations, we validated a drug combination predicted to overcome resistance to MEK inhibitors by coblockade of GSK3, which was not found based on associations with genomic data. These results suggest that to better understand the cancer resistance and move toward personalized medicine, it is essential to consider signaling network dynamics that cannot be inferred from static genotypes.

    Cancer research 2017;77;12;3364-3375

  • Phylogenetic Analysis of Klebsiella pneumoniae from Hospitalized Children, Pakistan.

    Ejaz H, Wang N, Wilksch JJ, Page AJ, Cao H, Gujaran S, Keane JA, Lithgow T, Ul-Haq I, Dougan G, Strugnell RA and Heinz E

    Klebsiella pneumoniae shows increasing emergence of multidrug-resistant lineages, including strains resistant to all available antimicrobial drugs. We conducted whole-genome sequencing of 178 highly drug-resistant isolates from a tertiary hospital in Lahore, Pakistan. Phylogenetic analyses to place these isolates into global context demonstrate the expansion of multiple independent lineages, including K. quasipneumoniae.

    Funded by: Wellcome Trust

    Emerging infectious diseases 2017;23;11;1872-1875

  • Deriving an optimal threshold of waist circumference for detecting cardiometabolic risk in sub-Saharan Africa.

    Ekoru K, Murphy GAV, Young EH, Delisle H, Jerome CS, Assah F, Longo-Mbenza B, Nzambi JPD, On'Kin JBK, Buntix F, Muyer MC, Christensen DL, Wesseh CS, Sabir A, Okafor C, Gezawa ID, Puepet F, Enang O, Raimi T, Ohwovoriole E, Oladapo OO, Bovet P, Mollentze W, Unwin N, Gray WK, Walker R, Agoudavi K, Siziya S, Chifamba J, Njelekela M, Fourie CM, Kruger S, Schutte AE, Walsh C, Gareta D, Kamali A, Seeley J, Norris SA, Crowther NJ, Pillay D, Kaleebu P, Motala AA and Sandhu MS

    Sandhu Group, Department of Medicine, University of Cambridge, Cambridge, UK.

    Background: Waist circumference (WC) thresholds derived from western populations continue to be used in sub-Saharan Africa (SSA) despite increasing evidence of ethnic variation in the association between adiposity and cardiometabolic disease and availability of data from African populations. We aimed to derive an SSA-specific optimal WC cut-point for identifying individuals at increased cardiometabolic risk.

    Methods: We used individual level cross-sectional data on 24 181 participants aged ⩾15 years from 17 studies conducted between 1990 and 2014 in eight countries in SSA. Receiver operating characteristic curves were used to derive optimal WC cut-points for detecting the presence of at least two components of metabolic syndrome (MS), excluding WC.

    Results: The optimal WC cut-point was 81.2 cm (95% CI 78.5-83.8 cm) and 81.0 cm (95% CI 79.2-82.8 cm) for men and women, respectively, with comparable accuracy in men and women. Sensitivity was higher in women (64%, 95% CI 63-65) than in men (53%, 95% CI 51-55), and increased with the prevalence of obesity. Having WC above the derived cut-point was associated with a twofold probability of having at least two components of MS (age-adjusted odds ratio 2.6, 95% CI 2.4-2.9, for men and 2.2, 95% CI 2.0-2.3, for women).

    Conclusion: The optimal WC cut-point for identifying men at increased cardiometabolic risk is lower (⩾81.2 cm) than current guidelines (⩾94.0 cm) recommend, and similar to that in women in SSA. Prospective studies are needed to confirm these cut-points based on cardiometabolic outcomes.

    International Journal of Obesity advance online publication, 31 October 2017; doi:10.1038/ijo.2017.240.

    International journal of obesity (2005) 2017

  • A reversible haploid mouse embryonic stem cell biobank resource for functional genomics.

    Elling U, Wimmer RA, Leibbrandt A, Burkard T, Michlits G, Leopoldi A, Micheler T, Abdeen D, Zhuk S, Aspalter IM, Handl C, Liebergesell J, Hubmann M, Husa AM, Kinzer M, Schuller N, Wetzel E, van de Loo N, Martinez JAZ, Estoppey D, Riedl R, Yang F, Fu B, Dechat T, Ivics Z, Agu CA, Bell O, Blaas D, Gerhardt H, Hoepfner D, Stark A and Penninger JM

    Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna Biocenter (VBC), Dr. Bohr Gasse 3, Vienna, Austria.

    The ability to directly uncover the contributions of genes to a given phenotype is fundamental for biology research. However, ostensibly homogeneous cell populations exhibit large clonal variance that can confound analyses and undermine reproducibility. Here we used genome-saturated mutagenesis to create a biobank of over 100,000 individual haploid mouse embryonic stem (mES) cell lines targeting 16,970 genes with genetically barcoded, conditional and reversible mutations. This Haplobank is, to our knowledge, the largest resource of hemi/homozygous mutant mES cells to date and is available to all researchers. Reversible mutagenesis overcomes clonal variance by permitting functional annotation of the genome directly in sister cells. We use the Haplobank in reverse genetic screens to investigate the temporal resolution of essential genes in mES cells, and to identify novel genes that control sprouting angiogenesis and lineage specification of blood vessels. Furthermore, a genome-wide forward screen with Haplobank identified PLA2G16 as a host factor that is required for cytotoxicity by rhinoviruses, which cause the common cold. Therefore, clones from the Haplobank combined with the use of reversible technologies enable high-throughput, reproducible, functional annotation of the genome.

    Funded by: Austrian Science Fund FWF: P 23308; European Research Council: 341036

    Nature 2017;550;7674;114-118

  • A non-endoscopic device to sample the oesophageal microbiota: a case-control study.

    Elliott DR, Walker AW, O'Donovan M, Parkhill J and Fitzgerald RC

    Medical Research Council Cancer Unit, Hutchison/MRC Research Centre, University of Cambridge, Cambridge, UK.

    Background: The strongest risk factor for oesophageal adenocarcinoma is reflux disease, and the rising incidence of this coincides with the eradication of Helicobacter pylori, both of which might alter the oesophageal microbiota. We aimed to profile the microbiota at different stages of Barrett's carcinogenesis and investigate the Cytosponge as a minimally invasive tool for sampling the oesophageal microbiota.

    Methods: In this case-control study, 16S rRNA gene amplicon sequencing was done on 210 oesophageal samples from 86 patients representing the Barrett's oesophagus progression sequence (normal squamous controls [n=20], non-dysplastic [n=24] and dysplastic Barrett's oesophagus [n=23], and oesophageal adenocarcinoma [n=19]), relevant negative controls, and replicates on the Illumina MiSeq platform. Samples were taken from patients enrolled in the BEST2 study at five UK hospitals and the OCCAMS study at six UK hospitals. We compared fresh frozen tissue, fresh frozen endoscopic brushings, and the Cytosponge device for microbial DNA yield (qPCR), diversity, and community composition.

    Findings: There was decreased microbial diversity in oesophageal adenocarcinoma tissue compared with tissue from healthy control patients as measured by the observed operational taxonomic unit (OTU) richness (p=0·0012), Chao estimated total richness (p=0·0004), and Shannon diversity index (p=0·0075). Lactobacillus fermentum was enriched in oesophageal adenocarcinoma (p=0·028), and lactic acid bacteria dominated the microenvironment in seven (47%) of 15 cases of oesophageal adenocarcinoma. Comparison of oesophageal sampling methods showed that the Cytosponge yielded more than ten-times higher quantities of microbial DNA than did endoscopic brushes or biopsies using quantitative PCR (p<0·0001). The Cytosponge samples contained the majority of taxa detected in biopsy and brush samples, but were enriched for genera from the oral cavity and stomach, including Fusobacterium, Megasphaera, Campylobacter, Capnocytophaga, and Dialister. The Cytosponge detected decreased microbial diversity in patients with high-grade dysplasia in comparison to control patients, as measured by the observed OTU richness (p=0·0147), Chao estimated total richness (p=0·023), and Shannon diversity index (p=0·0085).

    Interpretation: Alterations in microbial communities occur in the lower oesophagus in Barrett's carcinogenesis, which can be detected at the pre-invasive stage of high-grade dysplasia with the novel Cytosponge device. Our findings are potentially applicable to early disease detection, and future test development should focus on longitudinal sampling of the microbiota to monitor for changes in microbial diversity in a larger cohort of patients.

    Funding: Cancer Research UK, National Institute for Health Research, Medical Research Council, Wellcome Trust, The Scottish Government (RESAS).

    The lancet. Gastroenterology & hepatology 2017;2;1;32-42

  • Phenotypic Consequences of a Genetic Predisposition to Enhanced Nitric Oxide Signaling.

    Emdin CA, Khera AV, Klarin D, Natarajan P, Zekavat SM, Nomura A, Haas ME, Aragam K, Ardissino D, Wilson JG, Schunkert H, McPherson R, Watkins H, Elosua R, Bown MJ, Samani NJ, Baber U, Erdmann J, Gormley P, Palotie A, Stitziel N, Gupta N, Danesh JN, Saleheen D, Gabriel SB and Kathiresan S

    Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA; Cardiology Division Department of Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA.

    Background: Nitric oxide signaling plays a key role in regulation of vascular tone and platelet activation. Here, we seek to understand the impact of a genetic predisposition to enhanced nitric oxide signaling on risk for cardiovascular diseases, thus informing the potential utility of pharmacologic stimulation of the nitric oxide pathway as a therapeutic strategy.

    Methods: We analyzed the association of common and rare genetic variants in two genes that mediate nitric oxide signaling [Nitric Oxide Synthase 3 (NOS3) and Guanylate Cyclase 1, Soluble, Alpha 3 (GUCY1A3)] with a range of human phenotypes. We selected two common variants (rs3918226 in NOS3 and rs7692387 in GUCY1A3) known to associate with increased NOS3 and GUCY1A3 expression and reduced mean arterial pressure, combined them into a genetic score, and standardized this exposure to a 5 mm Hg reduction in mean arterial pressure. Using individual-level data from 335,464 participants in the UK Biobank and summary association results from seven large-scale genome wide association studies, we examined the effect of this nitric oxide signaling score on cardiometabolic and other diseases. We also examined whether rare loss-of-function mutations in NOS3 and GUCY1A3 were associated with coronary heart disease using gene sequencing data from the Myocardial Infarction Genetics Consortium (n=27,815).

    Results: A genetic predisposition to enhanced nitric oxide signaling was associated with reduced risks of coronary heart disease (OR 0.37, 95% CI 0.31, 0.45; p=5.5×10⁻²⁶), peripheral arterial disease (OR 0.42, CI 0.26, 0.68; p=0.0005) and stroke (OR 0.53, CI 0.37, 0.76; p=0.0006). In a mediation analysis, the effect of the genetic score on decreased coronary heart disease risk extended beyond its effect on blood pressure. Conversely, rare variants that inactivate the NOS3 or GUCY1A3 genes were associated with a 23 mm Hg higher systolic blood pressure (CI 12, 34 mm Hg; p=5.6×10⁻⁵) and a three-fold higher risk of coronary heart disease (OR 3.03, CI 1.29, 7.12; p=0.01).

    Conclusions: A genetic predisposition to enhanced nitric oxide signaling is associated with reduced risks of coronary heart disease, peripheral arterial disease and stroke. Pharmacologic stimulation of nitric oxide signaling may prove useful in the prevention or treatment of cardiovascular disease.

    Circulation 2017

  • Application of rare variant transmission disequilibrium tests to epileptic encephalopathy trio sequence data.

    Epi4K Consortium, EuroEPINOMICS-RES Consortium and Epilepsy Phenome Genome Project

    The classic epileptic encephalopathies, including infantile spasms (IS) and Lennox-Gastaut syndrome (LGS), are severe seizure disorders that usually arise sporadically. De novo variants in genes mainly encoding ion channel and synaptic proteins have been found to account for over 15% of patients with IS or LGS. The contribution of autosomal recessive genetic variation, however, is less well understood. We implemented a rare variant transmission disequilibrium test (TDT) to search for autosomal recessive epileptic encephalopathy genes in a cohort of 320 outbred patient-parent trios that were generally prescreened for rare metabolic disorders. In the current sample, our rare variant transmission disequilibrium test did not identify individual genes with significantly distorted transmission over expectation after correcting for the multiple tests. While the rare variant transmission disequilibrium test did not find evidence of a role for individual autosomal recessive genes, our current sample is insufficiently powered to assess the overall role of autosomal recessive genotypes in an outbred epileptic encephalopathy population.

    Funded by: NHLBI NIH HHS: RC2 HL102923, RC2 HL102924, RC2 HL102925, RC2 HL102926, RC2 HL103010, UC2 HL102923, UC2 HL102924, UC2 HL102925, UC2 HL102926, UC2 HL103010; NIA NIH HHS: P30 AG028377; NIAID NIH HHS: R56 AI098588, U19 AI067854, UM1 AI100645; NIMH NIH HHS: K01 MH098126, R01 MH097993; NINDS NIH HHS: U01 NS053998, U01 NS077274, U01 NS077276, U01 NS077303, U01 NS077364; Wellcome Trust

    European journal of human genetics : EJHG 2017;25;7;894-899

  • Integration of Tmc1/2 into the mechanotransduction complex in zebrafish hair cells is regulated by Transmembrane O-methyltransferase (Tomt).

    Erickson T, Morgan CP, Olt J, Hardy K, Busch-Nentwich EM, Maeda R, Clemens-Grisham R, Krey JF, Nechiporuk AV, Barr-Gillespie PG, Marcotti W and Nicolson T

    Oregon Hearing Research Center and the Vollum Institute, Oregon Health and Science University, Portland, United States.

    Transmembrane O-methyltransferase (TOMT / LRTOMT) is responsible for non-syndromic deafness DFNB63. However, the specific defects that lead to hearing loss have not been described. Using a zebrafish model of DFNB63, we show that the auditory and vestibular phenotypes are due to a lack of mechanotransduction (MET) in Tomt-deficient hair cells. GFP-tagged Tomt is enriched in the Golgi of hair cells, suggesting that Tomt might regulate the trafficking of other MET components to the hair bundle. We found that Tmc1/2 proteins are specifically excluded from the hair bundle in tomt mutants, whereas other MET complex proteins can still localize to the bundle. Furthermore, mouse TOMT and TMC1 can directly interact in HEK 293 cells, and this interaction is modulated by His183 in TOMT. Thus, we propose a model of MET complex assembly where Tomt and the Tmcs interact within the secretory pathway to traffic Tmc proteins to the hair bundle.

    Funded by: NICHD NIH HHS: R01 HD072844

    eLife 2017;6

  • A Temporal Proteomic Map of Epstein-Barr Virus Lytic Replication in B Cells.

    Ersing I, Nobre L, Wang LW, Soday L, Ma Y, Paulo JA, Narita Y, Ashbaugh CW, Jiang C, Grayson NE, Kieff E, Gygi SP, Weekes MP and Gewurz BE

    Division of Infectious Disease, Department of Medicine, Brigham & Women's Hospital, Harvard Medical School, 181 Longwood Avenue, Boston, MA 02115, USA; Institut für Klinische und Molekulare Virologie, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany.

    Epstein-Barr virus (EBV) replication contributes to multiple human diseases, including infectious mononucleosis, nasopharyngeal carcinoma, B cell lymphomas, and oral hairy leukoplakia. We performed systematic quantitative analyses of temporal changes in host and EBV proteins during lytic replication to gain insights into virus-host interactions, using conditional Burkitt lymphoma models of type I and II EBV infection. We quantified profiles of >8,000 cellular and 69 EBV proteins, including >500 plasma membrane proteins, providing temporal views of the lytic B cell proteome and EBV virome. Our approach revealed EBV-induced remodeling of cell cycle, innate and adaptive immune pathways, including upregulation of the complement cascade and proteasomal degradation of the B cell receptor complex, conserved between EBV types I and II. Cross-comparison with proteomic analyses of human cytomegalovirus infection and of a Kaposi-sarcoma-associated herpesvirus immunoevasin identified host factors targeted by multiple herpesviruses. Our results provide an important resource for studies of EBV replication.

    Cell reports 2017;19;7;1479-1493

  • Loss of PBRM1 rescues VHL dependent replication stress to promote renal carcinogenesis.

    Espana-Agusti J, Warren A, Chew SK, Adams DJ and Matakidou A

    Department of Oncology, University of Cambridge, CRUK Cambridge Institute, Cambridge, CB2 0RE, UK.

    Inactivation of the VHL (Von Hippel Lindau) tumour suppressor has long been recognised as necessary for the pathogenesis of clear cell renal cancer (ccRCC); however, the molecular mechanisms underlying transformation and the requirement for additional genetic hits remain unclear. Here, we show that loss of VHL alone results in DNA replication stress and damage accumulation, effects that constrain cellular growth and transformation. By contrast, concomitant loss of the chromatin remodelling factor PBRM1 (mutated in 40% of ccRCC) rescues VHL-induced replication stress, maintaining cellular fitness and allowing proliferation. In line with these data we demonstrate that combined deletion of Vhl and Pbrm1 in the mouse kidney is sufficient for the development of fully-penetrant, multifocal carcinomas, closely mimicking human ccRCC. Our results illustrate how VHL and PBRM1 co-operate to drive renal transformation and uncover replication stress as an underlying vulnerability of all VHL mutated renal cancers that could be therapeutically exploited.

    Funded by: Cancer Research UK: C37839/A12177; Wellcome Trust

    Nature communications 2017;8;1;2026

  • Identification and initial characterisation of a protein involved in Campylobacter jejuni cell shape.

    Esson D, Gupta S, Bailey D, Wigley P, Wedley A, Mather AE, Méric G, Mastroeni P, Sheppard SK, Thomson NR, Parkhill J, Maskell DJ, Christie G and Grant AJ

    Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge, UK.

    Campylobacter jejuni is the leading cause of bacterial foodborne illness. While helical cell shape is considered important for C. jejuni pathogenesis, this bacterium is capable of adopting other morphologies. To better understand how helical-shaped C. jejuni maintain their shape and thus any associated colonisation, pathogenicity or other advantage, it is first important to identify the genes and proteins involved. So far, two peptidoglycan modifying enzymes Pgp1 and Pgp2 have been shown to be required for C. jejuni helical cell shape. We performed a visual screen of ∼2000 transposon mutants of C. jejuni for cell shape mutants. Whole genome sequence data of the mutants with altered cell shape, directed mutants, wild type stocks and isolated helical and rod-shaped 'wild type' C. jejuni, identified a number of different mutations in pgp1 and pgp2, which result in a change in helical to rod bacterial cell shape. We also identified an isolate with a loss of curvature. In this study, we have identified the genomic change in this isolate, and found that targeted deletion of the gene with the change resulted in bacteria with loss of curvature. Helical cell shape was restored by supplying the gene in trans. We examined the effect of loss of the gene on bacterial motility, adhesion and invasion of tissue culture cells and chicken colonisation, as well as the effect on the muropeptide profile of the peptidoglycan sacculus. Our work identifies another factor involved in helical cell shape.

    Microbial pathogenesis 2017;104;202-211

  • Structural analysis of pathogenic mutations in the DYRK1A gene in patients with developmental disorders.

    Evers JM, Laskowski RA, Bertolli M, Clayton-Smith J, Deshpande C, Eason J, Elmslie F, Flinter F, Gardiner C, Hurst JA, Kingston H, Kini U, Lampe AK, Lim D, Male A, Naik S, Parker MJ, Price S, Robert L, Sarkar A, Straub V, Woods G, Thornton JM, DDD Study and Wright CF

    European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK.

    Haploinsufficiency in DYRK1A is associated with a recognizable developmental syndrome, though the mechanism of action of pathogenic missense mutations is currently unclear. Here we present 19 de novo mutations in this gene, including five missense mutations, identified by the Deciphering Developmental Disorders study. Protein structural analysis reveals that the missense mutations are either close to the ATP or peptide binding-sites within the kinase domain, or are important for protein stability, suggesting a loss-of-function mechanism. Furthermore, there is some correlation between the magnitude of the change and the severity of the resultant phenotype. A comparison of the distribution of the pathogenic mutations along the length of DYRK1A with that of natural variants, as found in the ExAC database, confirms that mutations in the N-terminal end of the kinase domain are more disruptive of protein function. In particular, pathogenic mutations occur in significantly closer proximity to the ATP and the substrate peptide than the natural variants. Overall, we suggest that de novo dominant mutations in DYRK1A account for nearly 0.5% of severe developmental disorders due to substantially reduced kinase function.

    Funded by: Wellcome Trust: WT098051

    Human molecular genetics 2017;26;3;519-526

  • Integrated genome and transcriptome sequencing identifies a noncoding mutation in the genome replication factor DONSON as the cause of microcephaly-micromelia syndrome.

    Evrony GD, Cordero DR, Shen J, Partlow JN, Yu TW, Rodin RE, Hill RS, Coulter ME, Lam AN, Jayaraman D, Gerrelli D, Diaz DG, Santos C, Morrison V, Galli A, Tschulena U, Wiemann S, Martel MJ, Spooner B, Ryu SC, Elhosary PC, Richardson JM, Tierney D, Robinson CA, Chibbar R, Diudea D, Folkerth R, Wiebe S, Barkovich AJ, Mochida GH, Irvine J, Lemire EG, Blakley P and Walsh CA

    Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, Massachusetts 02115, USA.

    While next-generation sequencing has accelerated the discovery of human disease genes, progress has been largely limited to the "low hanging fruit" of mutations with obvious exonic coding or canonical splice site impact. In contrast, the lack of high-throughput, unbiased approaches for functional assessment of most noncoding variants has bottlenecked gene discovery. We report the integration of transcriptome sequencing (RNA-seq), which surveys all mRNAs to reveal functional impacts of variants at the transcription level, into the gene discovery framework for a unique human disease, microcephaly-micromelia syndrome (MMS). MMS is an autosomal recessive condition described thus far in only a single First Nations population and causes intrauterine growth restriction, severe microcephaly, craniofacial anomalies, skeletal dysplasia, and neonatal lethality. Linkage analysis of affected families, including a very large pedigree, identified a single locus on Chromosome 21 linked to the disease (LOD > 9). Comprehensive genome sequencing did not reveal any pathogenic coding or canonical splicing mutations within the linkage region but identified several nonconserved noncoding variants. RNA-seq analysis detected aberrant splicing in DONSON due to one of these noncoding variants, showing a causative role for DONSON disruption in MMS. We show that DONSON is expressed in progenitor cells of embryonic human brain and other proliferating tissues, is co-expressed with components of the DNA replication machinery, and that Donson is essential for early embryonic development in mice as well, suggesting an essential conserved role for DONSON in the cell cycle. Our results demonstrate the utility of integrating transcriptomics into the study of human genetic disease when DNA sequencing alone is not sufficient to reveal the underlying pathogenic mutation.

    Funded by: NICHD NIH HHS: K12 HD001255; NIDCD NIH HHS: R03 DC013866; NIGMS NIH HHS: T32 GM007753; NIMH NIH HHS: U24 MH081810; NINDS NIH HHS: R01 NS035129

    Genome research 2017;27;8;1323-1335

  • The Reactome Pathway Knowledgebase.

    Fabregat A, Jupe S, Matthews L, Sidiropoulos K, Gillespie M, Garapati P, Haw R, Jassal B, Korninger F, May B, Milacic M, Roca CD, Rothfels K, Sevilla C, Shamovsky V, Shorser S, Varusai T, Viteri G, Weiser J, Wu G, Stein L, Hermjakob H and D'Eustachio P

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.

    The Reactome Knowledgebase provides molecular details of signal transduction, transport, DNA replication, metabolism, and other cellular processes as an ordered network of molecular transformations, an extended version of a classic metabolic map, in a single consistent data model. Reactome functions both as an archive of biological processes and as a tool for discovering unexpected functional relationships in data such as gene expression profiles or somatic mutation catalogues from tumor cells. To support the continued brisk growth in the size and complexity of Reactome, we have implemented a graph database, improved performance of data analysis tools, and designed new data structures and strategies to boost diagram viewer performance. To make our website more accessible to human users, we have improved pathway display and navigation by implementing interactive Enhanced High Level Diagrams (EHLDs) with an associated icon library, and subpathway highlighting and zooming, in a simplified and reorganized web site with adaptive design. To encourage re-use of our content, we have enabled export of pathway diagrams as 'PowerPoint' files.

    Funded by: NHGRI NIH HHS: U41 HG003751

    Nucleic acids research 2017

  • Multiple short windows of Calcium-Dependent Protein Kinase 4 activity coordinate distinct cell cycle events during Plasmodium gametogenesis.

    Fang H, Klages N, Baechler B, Hillner E, Yu L, Pardo M, Choudhary J and Brochet M

    Department of Microbiology and Molecular Medicine, University of Geneva, Geneva, Switzerland.

    Malaria transmission relies on the production of gametes following ingestion by a mosquito. Here, we show that Ca2+-dependent protein kinase 4 (CDPK4) controls three processes essential to progress from a single haploid microgametocyte to the release of eight flagellated microgametes in Plasmodium berghei. A myristoylated isoform is activated by Ca2+ to initiate a first genome replication within twenty seconds of activation. This role is mediated by a protein of the SAPS-domain family involved in S-phase entry. At the same time, CDPK4 is required for the assembly of the subsequent mitotic spindle and to phosphorylate a microtubule-associated protein important for mitotic spindle formation. Finally, a non-myristoylated isoform is essential to complete cytokinesis by activating motility of the male flagellum. This role has been linked to phosphorylation of an uncharacterised flagellar protein. Altogether, this study reveals how a kinase integrates and transduces multiple signals to control key cell-cycle transitions during Plasmodium gametogenesis.

    eLife 2017;6

  • Neutrophil-mediated IL-6 receptor trans-signaling and the risk of chronic obstructive pulmonary disease and asthma.

    Farahi N, Paige E, Balla J, Prudence E, Ferreira RC, Southwood M, Appleby SL, Bakke P, Gulsvik A, Litonjua AA, Sparrow D, Silverman EK, Cho MH, Danesh J, Paul DS, Freitag DF and Chilvers ER

    Division of Respiratory Medicine, Department of Medicine, University of Cambridge School of Clinical Medicine, Cambridge CB2 0QQ, UK.

    The Asp358Ala variant in the interleukin-6 receptor (IL-6R) gene has been implicated in asthma, autoimmune and cardiovascular disorders, but its role in other respiratory conditions such as chronic obstructive pulmonary disease (COPD) has not been investigated. The aims of this study were to evaluate whether there is an association between Asp358Ala and COPD or asthma risk, and to explore the role of the Asp358Ala variant in sIL-6R shedding from neutrophils and its pro-inflammatory effects in the lung. We undertook logistic regression using data from the UK Biobank and the ECLIPSE COPD cohort. Results were meta-analyzed with summary data from a further three COPD cohorts (7,519 total cases and 35,653 total controls), showing no association between Asp358Ala and COPD (OR = 1.02 [95% CI: 0.96, 1.07]). Data from the UK Biobank showed a positive association between the Asp358Ala variant and atopic asthma (OR = 1.07 [1.01, 1.13]). In a series of in vitro studies using blood samples from 37 participants, we found that shedding of sIL-6R from neutrophils was greater in carriers of the Asp358Ala minor allele than in non-carriers. Human pulmonary artery endothelial cells cultured with serum from homozygous minor-allele carriers showed increased MCP-1 release, with the difference eliminated upon addition of tocilizumab. In conclusion, there is evidence that neutrophils may be an important source of sIL-6R in the lungs, and the Asp358Ala variant may have pro-inflammatory effects in lung cells. However, we were unable to identify evidence for an association between Asp358Ala and COPD.

    Funded by: British Heart Foundation: RG/08/014/24067, RG/13/13/30194; Medical Research Council: G0800270, MC_QA137853, MR/J00345X/1, MR/L003120/1; NHLBI NIH HHS: R01 HL089856, R01 HL089897, U01 HL089856, U01 HL089897; NIEHS NIH HHS: R25 ES011080

    Human molecular genetics 2017;26;8;1584-1596

  • The mountainous Cretan dietary patterns and their relationship with cardiovascular risk factors: the Hellenic Isolated Cohorts MANOLIS study.

    Farmaki AE, Rayner NW, Matchan A, Spiliopoulou P, Gilly A, Kariakli V, Kiagiadaki C, Tsafantakis E, Zeggini E and Dedoussis G

    Department of Nutrition and Dietetics, School of Health Science and Education, Harokopio University, 70 El Venizelou Avenue, 17671 Athens, Greece.

    Objective: We carried out de novo recruitment of a population-based cohort (MANOLIS study) and describe the specific population, which displays interesting characteristics in terms of diet and health in old age, through deep phenotyping.

    Design: Cross-sectional study where anthropometric, biochemical and clinical measurements were taken in addition to interview-based completion of an extensive questionnaire on health and lifestyle parameters. Dietary patterns were derived through principal component analysis based on a validated FFQ.

    Setting: Geographically isolated Mylopotamos villages on Mount Idi, Crete, Greece.

    Subjects: Adults (n 1553).

    Results: Mean age of the participants was 61·6 years and 55·8 % were women. Of the population, 82·7 % were overweight or obese with a significantly different prevalence between overweight men and women (43·4 v. 34·7 %, P=0·002). The majority (70·6 %) of participants were married, while a larger proportion of women were widowed than men (27·8 v. 3·5 %, P<0·001). Smoking was more prevalent in men (38·7 v. 8·2 %, P<0·001), as 88·8 % of women had never smoked. Four dietary patterns emerged as characteristic of the population; these were termed 'local', 'high fat and sugar', 'Greek café/tavern' and 'olive oil, fruits and vegetables'. Individuals more adherent to the local dietary pattern presented higher blood glucose (β=4·026, P<0·001). Similarly, individuals with higher compliance with the Greek café/tavern pattern had higher waist-to-hip ratio (β=0·012, P<0·001), blood pressure (β=1·015, P=0·005) and cholesterol (β=5·398, P<0·001).

    Conclusions: Profiling of the MANOLIS elderly population identifies unique unhealthy dietary patterns that are associated with cardiometabolic indices.

    Public health nutrition 2017;20;6;1063-1074

  • How to use… lymph node biopsy in paediatrics.

    Farndon S, Behjati S, Jonas N and Messahel B

    Cancer Genome Project, Wellcome Trust Sanger Institute, Cambridge, UK.

    Lymphadenopathy is a common finding in children. It often causes anxiety among parents and healthcare professionals because it can be a sign of cancer. There is limited high-quality evidence to guide clinicians as to which children should be referred for lymph node biopsy. The gold standard method for evaluating lymphadenopathy of unknown cause is an excision biopsy. In this Interpretation, we discuss the use of lymph node biopsy in children.

    Archives of disease in childhood. Education and practice edition 2017

  • Association of Genetic Variants Related to CETP Inhibitors and Statins With Lipoprotein Levels and Cardiovascular Risk.

    Ference BA, Kastelein JJP, Ginsberg HN, Chapman MJ, Nicholls SJ, Ray KK, Packard CJ, Laufs U, Brook RD, Oliver-Williams C, Butterworth AS, Danesh J, Smith GD, Catapano AL and Sabatine MS

    Division of Cardiovascular Medicine, Wayne State University School of Medicine, Detroit, Michigan.

    Importance: Some cholesteryl ester transfer protein (CETP) inhibitors lower low-density lipoprotein cholesterol (LDL-C) levels without reducing cardiovascular events, suggesting that the clinical benefit of lowering LDL-C may depend on how LDL-C is lowered.

    Objective: To estimate the association between changes in levels of LDL-C (and other lipoproteins) and the risk of cardiovascular events related to variants in the CETP gene, both alone and in combination with variants in the 3-hydroxy-3-methylglutaryl-CoA reductase (HMGCR) gene.

    Design, setting, and participants: Mendelian randomization analyses evaluating the association between CETP and HMGCR scores, changes in lipid and lipoprotein levels, and the risk of cardiovascular events involving 102 837 participants from 14 cohort or case-control studies conducted in North America or the United Kingdom between 1948 and 2012. The associations with cardiovascular events were externally validated in 189 539 participants from 48 studies conducted between 2011 and 2015.

    Exposures: Differences in mean high-density lipoprotein cholesterol (HDL-C), LDL-C, and apolipoprotein B (apoB) levels in participants with CETP scores at or above vs below the median.

    Main outcomes and measures: Odds ratio (OR) for major cardiovascular events.

    Results: The primary analysis included 102 837 participants (mean age, 59.9 years; 58% women) who experienced 13 821 major cardiovascular events. The validation analyses included 189 539 participants (mean age, 58.5 years; 39% women) with 62 240 cases of coronary heart disease (CHD). Considered alone, the CETP score was associated with higher levels of HDL-C, lower LDL-C, concordantly lower apoB, and a corresponding lower risk of major vascular events (OR, 0.946 [95% CI, 0.921-0.972]) that was similar in magnitude to the association between the HMGCR score and risk of major cardiovascular events per unit change in levels of LDL-C (and apoB). When combined with the HMGCR score, the CETP score was associated with the same reduction in LDL-C levels but an attenuated reduction in apoB levels and a corresponding attenuated nonsignificant risk of major cardiovascular events (OR, 0.985 [95% CI, 0.955-1.015]). In external validation analyses, a genetic score consisting of variants with naturally occurring discordance between levels of LDL-C and apoB was associated with a similar risk of CHD per unit change in apoB level (OR, 0.782 [95% CI, 0.720-0.845] vs 0.793 [95% CI, 0.774-0.812]; P = .79 for difference), but a significantly attenuated risk of CHD per unit change in LDL-C level (OR, 0.916 [95% CI, 0.890-0.943] vs 0.831 [95% CI, 0.816-0.847]; P < .001) compared with a genetic score associated with concordant changes in levels of LDL-C and apoB.

    Conclusions and relevance: Combined exposure to variants in the genes that encode the targets of CETP inhibitors and statins was associated with discordant reductions in LDL-C and apoB levels and a corresponding risk of cardiovascular events that was proportional to the attenuated reduction in apoB but significantly less than expected per unit change in LDL-C. The clinical benefit of lowering LDL-C levels may therefore depend on the corresponding reduction in apoB-containing lipoprotein particles.

    JAMA 2017

  • Arc Requires PSD95 for Assembly into Postsynaptic Complexes Involved with Neural Dysfunction and Intelligence.

    Fernández E, Collins MO, Frank RAW, Zhu F, Kopanitsa MV, Nithianantharajah J, Lemprière SA, Fricker D, Elsegood KA, McLaughlin CL, Croning MDR, Mclean C, Armstrong JD, Hill WD, Deary IJ, Cencelli G, Bagni C, Fromer M, Purcell SM, Pocklington AJ, Choudhary JS, Komiyama NH and Grant SGN

    Genes to Cognition Programme, The Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, UK; KU Leuven, Center for Human Genetics and Leuven Institute for Neurodegenerative Diseases (LIND), and VIB Center for the Biology of Disease, Leuven, Belgium.

    Arc is an activity-regulated neuronal protein, but little is known about its interactions, assembly into multiprotein complexes, and role in human disease and cognition. We applied an integrated proteomic and genetic strategy by targeting a tandem affinity purification (TAP) tag and Venus fluorescent protein into the endogenous Arc gene in mice. This allowed biochemical and proteomic characterization of native complexes in wild-type and knockout mice. We identified many Arc-interacting proteins, of which PSD95 was the most abundant. PSD95 was essential for Arc assembly into 1.5-MDa complexes and activity-dependent recruitment to excitatory synapses. Integrating human genetic data with proteomic data showed that Arc-PSD95 complexes are enriched in schizophrenia, intellectual disability, autism, and epilepsy mutations and normal variants in intelligence. We propose that Arc-PSD95 postsynaptic complexes potentially affect human cognitive function.

    Cell reports 2017;21;3;679-691

  • Beyond the lysosome: Cholesterol role on endoplasmic reticulum and lipid droplets in Parkinson's disease.

    Fernandes HJR

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom.

    Movement disorders : official journal of the Movement Disorder Society 2017

  • An efficient method for generation of bi-allelic null mutant mouse embryonic stem cells and its application for investigating epigenetic modifiers.

    Fisher CL, Marks H, Cho LT, Andrews R, Wormald S, Carroll T, Iyer V, Tate P, Rosen B, Stunnenberg HG, Fisher AG and Skarnes WC

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Mouse embryonic stem (ES) cells are a popular model system to study biological processes, though uncovering recessive phenotypes requires inactivating both alleles. Building upon resources from the International Knockout Mouse Consortium (IKMC), we developed a targeting vector for second allele inactivation in conditional-ready IKMC 'knockout-first' ES cell lines. We applied our technology to several epigenetic regulators, recovering bi-allelic targeted clones with a high efficiency of 60% and used Flp recombinase to restore expression in two null cell lines to demonstrate how our system confirms causality through mutant phenotype reversion. We designed our strategy to select against re-targeting the 'knockout-first' allele and identify essential genes in ES cells, including the histone methyltransferase Setdb1. For confirmation, we exploited the flexibility of our system, enabling tamoxifen inducible conditional gene ablation while controlling for genetic background and tamoxifen effects. Setdb1 ablated ES cells exhibit severe growth inhibition, which is not rescued by exogenous Nanog expression or culturing in naive pluripotency '2i' media, suggesting that the self-renewal defect is mediated through pluripotency network independent pathways. Our strategy to generate null mutant mouse ES cells is applicable to thousands of genes and repurposes existing IKMC Intermediate Vectors.

    Nucleic acids research 2017

  • Genome editing reveals a role for OCT4 in human embryogenesis.

    Fogarty NME, McCarthy A, Snijders KE, Powell BE, Kubikova N, Blakeley P, Lea R, Elder K, Wamaitha SE, Kim D, Maciulyte V, Kleinjung J, Kim JS, Wells D, Vallier L, Bertero A, Turner JMA and Niakan KK

    Human Embryo and Stem Cell Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK.

    Despite their fundamental biological and clinical importance, the molecular mechanisms that regulate the first cell fate decisions in the human embryo are not well understood. Here we use CRISPR-Cas9-mediated genome editing to investigate the function of the pluripotency transcription factor OCT4 during human embryogenesis. We identified an efficient OCT4-targeting guide RNA using an inducible human embryonic stem cell-based system and microinjection of mouse zygotes. Using these refined methods, we efficiently and specifically targeted the gene encoding OCT4 (POU5F1) in diploid human zygotes and found that blastocyst development was compromised. Transcriptomics analysis revealed that, in POU5F1-null cells, gene expression was downregulated not only for extra-embryonic trophectoderm genes, such as CDX2, but also for regulators of the pluripotent epiblast, including NANOG. By contrast, Pou5f1-null mouse embryos maintained the expression of orthologous genes, and blastocyst development was established, but maintenance was compromised. We conclude that CRISPR-Cas9-mediated genome editing is a powerful method for investigating gene function in the context of human development.

    Funded by: British Heart Foundation: FS/11/77/39327; Wellcome Trust: FC001120, FC001193

    Nature 2017;550;7674;67-73

  • Deletion of the MAD2L1 spindle assembly checkpoint gene is tolerated in mouse models of acute T-cell lymphoma and hepatocellular carcinoma.

    Foijer F, Albacker LA, Bakker B, Spierings DC, Yue Y, Xie SZ, Davis SH, Lutum-Jehle A, Takemoto D, Hare B, Furey B, Bronson RT, Lansdorp PM, Bradley A and Sorger PK

    European Research Institute for the Biology of Ageing, University Medical Center Groningen, Groningen, Netherlands.

    Chromosome instability (CIN) is deleterious to normal cells because of the burden of aneuploidy. However, most human solid tumors have an abnormal karyotype implying that gain and loss of chromosomes by cancer cells confers a selective advantage. CIN can be induced in the mouse by inactivating the spindle assembly checkpoint. This is lethal in the germline but we show here that adult T cells and hepatocytes can survive conditional inactivation of the Mad2l1 SAC gene and resulting CIN. This causes rapid onset of acute lymphoblastic leukemia (T-ALL) and progressive development of hepatocellular carcinoma (HCC), both lethal diseases. The resulting DNA copy number variation and patterns of chromosome loss and gain are tumor-type specific, suggesting differential selective pressures on the two tumor cell types.

    eLife 2017;6

  • Conservation and diversification of small RNA pathways within flatworms.

    Fontenla S, Rinaldi G, Smircich P and Tort JF

    Departamento de Genética, Facultad de Medicina, Universidad de la República (UDELAR), Gral. Flores 2125, CP11800, Montevideo, MVD, Uruguay.

    Background: Small non-coding RNAs, including miRNAs, and gene silencing mediated by RNA interference have been described in free-living and parasitic lineages of flatworms, but only a few key factors of the small RNA pathways have been exhaustively investigated in a limited number of species. The availability of flatworm draft genomes and predicted proteomes allowed us to perform an extended survey of the genes involved in small non-coding RNA pathways in this phylum.

    Results: Overall, findings show that the small non-coding RNA pathways are conserved in all the analyzed flatworm lineages; however, notable peculiarities were identified. While Piwi genes are amplified in free-living worms, they are completely absent in all parasitic species. Remarkably, all flatworms share a specific Argonaute family (FL-Ago) that has been independently amplified in different lineages. Other key factors such as Dicer are also duplicated, with Dicer-2 showing structural differences between trematodes, cestodes and free-living flatworms. Similarly, a very divergent GW182 Argonaute-interacting protein was identified in all flatworm lineages. In contrast, genes involved in the amplification of the RNAi signal were detected only in the ancestral free-living species Macrostomum lignano. Here we describe all the putative small RNA pathways present in both free-living and parasitic flatworm lineages.

    Conclusion: These findings highlight innovations specifically evolved in platyhelminths, presumably associated with novel mechanisms of gene expression regulation mediated by small RNA pathways that differ from what has been classically described in model organisms. Understanding these phylum-specific innovations and the differences between free-living and parasitic species might provide clues to adaptations to parasitism, and would be relevant for gene-silencing technology development for parasitic flatworms that infect hundreds of millions of people worldwide.

    BMC evolutionary biology 2017;17;1;215

  • COSMIC: somatic cancer genetics at high-resolution.

    Forbes SA, Beare D, Boutselakis H, Bamford S, Bindal N, Tate J, Cole CG, Ward S, Dawson E, Ponting L, Stefancsik R, Harsha B, Kok CY, Jia M, Jubb H, Sondka Z, Thompson S, De T and Campbell PJ

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK

    COSMIC, the Catalogue of Somatic Mutations in Cancer, is a high-resolution resource for exploring targets and trends in the genetics of human cancer. Currently the broadest database of mutations in cancer, the information in COSMIC is curated by expert scientists, primarily by scrutinizing large numbers of scientific publications. Over 4 million coding mutations are described in v78 (September 2016), combining genome-wide sequencing results from 28 366 tumours with complete manual curation of 23 489 individual publications focused on 186 key genes and 286 key fusion pairs across all cancers. Molecular profiling of large tumour numbers has also allowed the annotation of more than 13 million non-coding mutations, 18 029 gene fusions, 187 429 genome rearrangements, 1 271 436 abnormal copy number segments, 9 175 462 abnormal expression variants and 7 879 142 differentially methylated CpG dinucleotides. COSMIC now details the genetics of drug resistance, novel somatic gene mutations which allow a tumour to evade therapeutic cancer drugs. Focusing initially on highly characterized drugs and genes, COSMIC v78 contains wide resistance mutation profiles across 20 drugs, detailing the recurrence of 301 unique resistance alleles across 1934 drug-resistant tumours. All information from the COSMIC database is available freely on the COSMIC website.

    Funded by: Wellcome Trust: 077012/Z/05/Z

    Nucleic acids research 2017;45;D1;D777-D783

  • Genome-wide genetic screening with chemically mutagenized haploid embryonic stem cells.

    Forment JV, Herzog M, Coates J, Konopka T, Gapp BV, Nijman SM, Adams DJ, Keane TM and Jackson SP

    The Wellcome Trust and Cancer Research UK Gurdon Institute, and Department of Biochemistry, University of Cambridge, Cambridge, UK.

    In model organisms, classical genetic screening via random mutagenesis provides key insights into the molecular bases of genetic interactions, helping to define synthetic lethality, synthetic viability and drug-resistance mechanisms. The limited genetic tractability of diploid mammalian cells, however, precludes this approach. Here, we demonstrate the feasibility of classical genetic screening in mammalian systems by using haploid cells, chemical mutagenesis and next-generation sequencing, providing a new tool to explore mammalian genetic interactions.

    Funded by: Cancer Research UK: 13031, A11224; European Research Council: 311166

    Nature chemical biology 2017;13;1;12-14

  • Illuminating microbial diversity.

    Forster SC

    Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK and the Hudson Institute of Medical Research, Clayton, Victoria 3168, Australia.

    Nature reviews. Microbiology 2017;15;10;578

  • Identification of highly-protective combinations of Plasmodium vivax recombinant proteins for vaccine development.

    França CT, White MT, He WQ, Hostetler JB, Brewster J, Frato G, Malhotra I, Gruszczyk J, Huon C, Lin E, Kiniboro B, Yadava A, Siba P, Galinski MR, Healer J, Chitnis C, Cowman AF, Takashima E, Tsuboi T, Tham WH, Fairhurst RM, Rayner JC, King CL and Mueller I

    Division of Population Health and Immunity, Walter and Eliza Hall Institute of Medical Research, Parkville, Australia.

    The study of antigenic targets of naturally-acquired immunity is essential to identify and prioritize antigens for further functional characterization. We measured total IgG antibodies to 38 P. vivax antigens, investigating their relationship with prospective risk of malaria in a cohort of Papua New Guinean children aged 1-3 years. Using simulated annealing algorithms, the potential protective efficacy of antibodies to multiple antigen-combinations, and the antibody thresholds associated with protection, were investigated for the first time. High antibody levels to multiple known and newly identified proteins were strongly associated with protection (IRR 0.44-0.74, P<0.001-0.041). Among five-antigen combinations with the strongest protective effect (>90%), EBP, DBPII, RBP1a, CyRPA, and PVX_081550 were most frequently identified, several of which required very low antibody levels to show a protective association. These data identify individual antigens that should be prioritized for further functional testing and establish a clear path to testing a multicomponent P. vivax vaccine.

    eLife 2017;6

  • Document retrieval on repetitive string collections.

    Gagie T, Hartikainen A, Karhu K, Kärkkäinen J, Navarro G, Puglisi SJ and Sirén J

    CeBiB - Center of Biotechnology and Bioengineering, School of Computer Science and Telecommunications, Diego Portales University, Santiago, Chile.

    Most of the fastest-growing string collections today are repetitive, that is, most of the constituent documents are similar to many others. As these collections keep growing, a key approach to handling them is to exploit their repetitiveness, which can reduce their space usage by orders of magnitude. We study the problem of indexing repetitive string collections in order to perform efficient document retrieval operations on them. Document retrieval problems are routinely solved by search engines on large natural language collections, but the techniques are less developed on generic string collections. The case of repetitive string collections is even less understood, and there are very few existing solutions. We develop two novel ideas, interleaved LCPs and precomputed document lists, that yield highly compressed indexes solving the problem of document listing (find all the documents where a string appears), top-k document retrieval (find the k documents where a string appears most often), and document counting (count the number of documents where a string appears). We also show that a classical data structure supporting the latter query becomes highly compressible on repetitive data. Finally, we show how the tools we developed can be combined to solve ranked conjunctive and disjunctive multi-term queries under the simple tf-idf model of relevance. We thoroughly evaluate the resulting techniques in various real-life repetitiveness scenarios, and recommend the best choices for each case.

    Funded by: Wellcome Trust

    Information retrieval 2017;20;3;253-291
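
    The three queries named in the abstract above (document listing, top-k document retrieval and document counting) can be illustrated with a plain, uncompressed baseline. The sketch below is not the interleaved-LCP or precomputed-document-list method of the paper; it assumes a naively built suffix array plus a document array, and every name in it (build_index, sa_range, document_listing, document_counting, top_k) is hypothetical. It also assumes that no document contains the separator character "\x00".

```python
from collections import Counter


def build_index(docs):
    """Concatenate documents with a separator and build a suffix array
    together with a document array (the document owning each suffix).
    Naive O(n^2 log n) construction, for illustration only."""
    text = ""
    doc_of_pos = []
    for d, doc in enumerate(docs):
        text += doc + "\x00"  # assumed separator, never part of a pattern
        doc_of_pos.extend([d] * (len(doc) + 1))
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    da = [doc_of_pos[i] for i in sa]
    return text, sa, da


def sa_range(text, sa, pattern):
    """Return the half-open suffix-array range of suffixes that start
    with `pattern`, found by two binary searches."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start, hi = lo, len(sa)
    while lo < hi:  # first suffix whose m-prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


def document_listing(index, pattern):
    """All documents in which `pattern` appears."""
    text, sa, da = index
    start, end = sa_range(text, sa, pattern)
    return sorted(set(da[start:end]))


def document_counting(index, pattern):
    """Number of distinct documents in which `pattern` appears."""
    return len(document_listing(index, pattern))


def top_k(index, pattern, k):
    """The k documents in which `pattern` appears most often, as
    (document, frequency) pairs; ties are broken by document number."""
    text, sa, da = index
    start, end = sa_range(text, sa, pattern)
    counts = Counter(da[start:end])
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


index = build_index(["banana", "bandana", "cabana", "nab"])
print(document_listing(index, "ana"))
print(document_counting(index, "ana"))
print(top_k(index, "ana", 1))
```

    This baseline scans the whole suffix-array range for a pattern, so its cost grows with the number of occurrences rather than the number of documents, and it makes no use of repetitiveness. Those are the limitations the compressed indexes described in the abstract are aimed at.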

  • Wheeler graphs: A framework for BWT-based data structures.

    Gagie T, Manzini G and Sirén J

    Diego Portales University and CEBIB, Santiago, Chile.

    The famous Burrows-Wheeler Transform (BWT) was originally defined for a single string but variations have been developed for sets of strings, labeled trees, de Bruijn graphs, etc. In this paper we propose a framework that includes many of these variations and that we hope will simplify the search for more. We first define Wheeler graphs and show they have a property we call path coherence. We show that if the state diagram of a finite-state automaton is a Wheeler graph then, by its path coherence, we can order the nodes such that, for any string, the nodes reachable from the initial state or states by processing that string are consecutive. This means that even if the automaton is non-deterministic, we can still store it compactly and process strings with it quickly. We then rederive several variations of the BWT by designing straightforward finite-state automata for the relevant problems and showing that their state diagrams are Wheeler graphs.

    Theoretical computer science 2017;698;67-78
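
    The consecutive-range property described in the abstract above can be checked on a toy example. The sketch below assumes the commonly cited ordering conditions for Wheeler graphs, which the abstract does not spell out and which should be confirmed against the paper: nodes with no incoming edges come first; for edges (u, v) labeled a and (u', v') labeled a', a < a' implies v < v', and a = a' with u < u' implies v <= v'. The function names and the five-node example graph are hypothetical.

```python
from itertools import combinations


def is_wheeler_order(num_nodes, edges):
    """Check the assumed Wheeler conditions for nodes numbered
    0..num_nodes-1 in the candidate order; edges are (u, v, label)."""
    has_incoming = [False] * num_nodes
    for _, v, _ in edges:
        has_incoming[v] = True
    # Nodes with in-degree 0 must precede nodes with positive in-degree.
    seen_incoming = False
    for flag in has_incoming:
        if flag:
            seen_incoming = True
        elif seen_incoming:
            return False
    for e1, e2 in combinations(edges, 2):
        (u1, v1, a1), (u2, v2, a2) = e1, e2
        if (a1, u1) > (a2, u2):  # order the pair by (label, source)
            (u1, v1, a1), (u2, v2, a2) = e2, e1
        if a1 < a2 and not v1 < v2:
            return False
        if a1 == a2 and u1 < u2 and not v1 <= v2:
            return False
    return True


def reach(edges, start_nodes, string):
    """Nodes reachable from start_nodes by processing the string
    (non-deterministic automaton semantics)."""
    current = set(start_nodes)
    for ch in string:
        current = {v for (u, v, a) in edges if u in current and a == ch}
    return sorted(current)


def is_consecutive(nodes):
    """True if the sorted node list forms one contiguous range."""
    return all(b - a == 1 for a, b in zip(nodes, nodes[1:]))


# Hypothetical non-deterministic example: node 0 has two outgoing 'a' edges.
EDGES = [
    (0, 1, "a"), (0, 2, "a"), (1, 2, "a"), (3, 2, "a"),
    (0, 3, "b"), (1, 4, "b"), (2, 4, "b"),
]

print(is_wheeler_order(5, EDGES))
for s in ["a", "b", "aa", "ab", "ba", "aab"]:
    nodes = reach(EDGES, [0], s)
    print(s, nodes, is_consecutive(nodes))
```

    Because the reachable nodes always form a single range under such an ordering, each step of processing a string only needs to track two endpoints instead of a set of states, which is the intuition behind the compact storage and fast string processing the abstract mentions.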

  • P113 is a merozoite surface protein that binds the N terminus of Plasmodium falciparum RH5.

    Galaway F, Drought LG, Fala M, Cross N, Kemp AC, Rayner JC and Wright GJ

    Cell Surface Signalling Laboratory, Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK.

    Invasion of erythrocytes by Plasmodium falciparum merozoites is necessary for malaria pathogenesis and is therefore a primary target for vaccine development. RH5 is a leading subunit vaccine candidate because anti-RH5 antibodies inhibit parasite growth and the interaction with its erythrocyte receptor basigin is essential for invasion. RH5 is secreted, complexes with other parasite proteins including CyRPA and RIPR, and contains a conserved N-terminal region (RH5Nt) of unknown function that is cleaved from the native protein. Here, we identify P113 as a merozoite surface protein that directly interacts with RH5Nt. Using recombinant proteins and a sensitive protein interaction assay, we establish the binding interdependencies of all the other known RH5 complex components and conclude that the RH5Nt-P113 interaction provides a releasable mechanism for anchoring RH5 to the merozoite surface. We exploit these findings to design a chemically synthesized peptide corresponding to RH5Nt, which could contribute to a cost-effective malaria vaccine.

    Nature communications 2017;8;14333

  • APOBEC3A/B deletion polymorphism and cancer risk.

    Gansmo LB, Romundstad P, Hveem K, Vatten L, Nik-Zainal S, Lønning PE and Knappskog S

    Section of Oncology, Department of Clinical Science, University of Bergen, Bergen, Norway.

    Activity of the APOBEC enzymes has been linked to specific mutational processes in human cancer genomes. A germline APOBEC3A/B deletion polymorphism is associated with APOBEC-dependent mutational signatures, and the deletion allele has been reported to confer an elevated risk of some cancers in Asian populations, while the results in European populations, so far, have been conflicting. We genotyped the APOBEC3A/B deletion polymorphism in a large population-based sample consisting of 11,106 Caucasian (Norwegian) individuals, including 7,279 incident cancer cases (1,769 breast, 1,360 lung, 1,585 colon, and 2,565 prostate cancer) and a control group of 3,827 matched individuals without cancer (1,918 females and 1,909 males) from the same population. Overall, the APOBEC3A/B deletion polymorphism was not associated with risk of any of the four cancer types. However, in subgroup analyses stratified by age, we found that the deletion allele was associated with increased risk for lung cancer among individuals <50 years of age (OR 2.17, CI 1.19-3.97), and that the association was gradually reduced with increasing age (p=0.01). A similar but weaker pattern was observed for prostate cancer. In support of these findings, the APOBEC3A/B deletion was associated with young age at diagnosis among the cancer cases for both cancer forms (lung cancer: p=0.02, dominant model; prostate cancer: p=0.03, recessive model). No such associations were observed for breast or colon cancer.

    Carcinogenesis 2017

  • Epigenetic germline inheritance in mammals: looking to the past to understand the future.

    Gapp K and Bohacek J

    Gurdon Institute, University of Cambridge, Cambridge, UK.

    Life experiences can induce epigenetic changes in mammalian germ cells, which can influence the developmental trajectory of the offspring and impact health and disease across generations. While this concept of epigenetic germline inheritance has long been met with skepticism, evidence in support of this route of information transfer is now overwhelming, and some key mechanisms underlying germline transmission of acquired information are emerging. This review focuses specifically on sperm RNAs as causal vectors of inheritance. We examine how they might become altered in the germline, and how different classes of sperm RNAs might interact with other epimodifications in germ cells or in the zygote. We integrate the latest findings with earlier pioneering work in this field, point out major questions and challenges, and suggest how new experiments could address them.

    Genes, brain, and behavior 2017

  • Minimal genetic change in Vibrio cholerae in Mozambique over time: Multilocus variable number tandem repeat analysis and whole genome sequencing.

    Garrine M, Mandomando I, Vubil D, Nhampossa T, Acacio S, Li S, Paulson JN, Almeida M, Domman D, Thomson NR, Alonso P and Stine OC

    Centro de Investigação em Saúde de Manhiça (CISM), Maputo, Mozambique.

    Although cholera is a major public health concern in Mozambique, its transmission patterns remain unknown. We surveyed the genetic relatedness of 75 Vibrio cholerae isolates from patients at Manhiça District Hospital between 2002 and 2012 and 3 isolates from river using multilocus variable-number tandem-repeat analysis (MLVA) and whole genome sequencing (WGS). MLVA revealed 22 genotypes in two clonal complexes and four unrelated genotypes. WGS revealed i) the presence of recombination, ii) 67 isolates descended monophyletically from a single source connected to Wave 3 of the Seventh Pandemic, and iii) four clinical isolates lacking the cholera toxin gene. This Wave 3 strain persisted for at least eight years in either an environmental reservoir or circulating within the human population. Our data raise important questions related to where these isolates persist and how identical isolates can be collected years apart despite our understanding of the high change rate of MLVA loci and the V. cholerae molecular clock.

    Funded by: NIAID NIH HHS: R01 AI123422

    PLoS neglected tropical diseases 2017;11;6;e0005671

  • No genetic association between attention-deficit/hyperactivity disorder (ADHD) and Parkinson's disease in nine ADHD candidate SNPs.

    Geissler JM, International Parkinson Disease Genomics Consortium members, Romanos M, Gerlach M, Berg D and Schulte C

    Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Center of Mental Health, University Hospital of Würzburg, Margarete-Höppel-Platz 1, 97080, Würzburg, Germany.

    Attention-deficit/hyperactivity disorder (ADHD) and Parkinson's disease (PD) involve pathological changes in brain structures such as the basal ganglia, which are essential for the control of motor and cognitive behavior and impulsivity. The cause of ADHD and PD remains unknown, but there is increasing evidence that both seem to result from a complicated interplay of genetic and environmental factors affecting numerous cellular processes and brain regions. To explore the possibility of common genetic pathways within the respective pathophysiologies, nine ADHD candidate single nucleotide polymorphisms (SNPs) in seven genes were tested for association with PD in 5333 cases and 12,019 healthy controls: one variant, respectively, in the genes coding for synaptosomal-associated protein 25 k (SNAP25), the dopamine (DA) transporter (SLC6A3; DAT1), DA receptor D4 (DRD4), serotonin receptor 1B (HTR1B), tryptophan hydroxylase 2 (TPH2), the norepinephrine transporter SLC6A2 and three SNPs in cadherin 13 (CDH13). Information was extracted from a recent meta-analysis of five genome-wide association studies, in which 7,689,524 SNPs in European samples were successfully imputed. No significant association was observed after correction for multiple testing. Therefore, it is reasonable to conclude that candidate variants implicated in the pathogenesis of ADHD do not play a substantial role in PD.

    Attention deficit and hyperactivity disorders 2017

  • Precision oncology for acute myeloid leukemia using a knowledge bank approach.

    Gerstung M, Papaemmanuil E, Martincorena I, Bullinger L, Gaidzik VI, Paschka P, Heuser M, Thol F, Bolli N, Ganly P, Ganser A, McDermott U, Döhner K, Schlenk RF, Döhner H and Campbell PJ

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK.

    Underpinning the vision of precision medicine is the concept that causative mutations in a patient's cancer drive its biology and, by extension, its clinical features and treatment response. However, considerable between-patient heterogeneity in driver mutations complicates evidence-based personalization of cancer care. Here, by reanalyzing data from 1,540 patients with acute myeloid leukemia (AML), we explore how large knowledge banks of matched genomic-clinical data can support clinical decision-making. Inclusive, multistage statistical models accurately predicted likelihoods of remission, relapse and mortality, which were validated using data from independent patients in The Cancer Genome Atlas. Comparison of long-term survival probabilities under different treatments enables therapeutic decision support, which is available in exploratory form online. Personally tailored management decisions could reduce the number of hematopoietic cell transplants in patients with AML by 20-25% while maintaining overall survival rates. Power calculations show that databases require information from thousands of patients for accurate decision support. Knowledge banks facilitate personally tailored therapeutic decisions but require sustainable updating, inclusive cohorts and large sample sizes.

    Funded by: NCI NIH HHS: P30 CA008748; Wellcome Trust

    Nature genetics 2017;49;3;332-340

  • Evidence for Contemporary Switching of the O-Antigen Gene Cluster between Shiga Toxin-Producing Escherichia coli Strains Colonizing Cattle.

    Geue L, Menge C, Eichhorn I, Semmler T, Wieler LH, Pickard D, Berens C and Barth SA

    Friedrich-Loeffler-Institut/Federal Research Institute for Animal Health, Institute of Molecular Pathogenesis Jena, Germany.

    Shiga toxin-producing Escherichia coli (STEC) comprise a group of zoonotic enteric pathogens with ruminants, especially cattle, as the main reservoir. O-antigens are instrumental for host colonization and bacterial niche adaptation. They are highly immunogenic and, therefore, targeted by the adaptive immune system. The O-antigen is one of the most diverse bacterial cell constituents and variation not only exists between different bacterial species, but also between individual isolates/strains within a single species. We recently identified STEC persistently infecting cattle and belonging to the different serotypes O156:H25 (n = 21) and O182:H25 (n = 15) that were of the MLST sequence types ST300 or ST688. These STs differ by a single nucleotide in purA only. Fitness-, virulence-associated genome regions, and CRISPR/CAS (clustered regularly interspaced short palindromic repeats/CRISPR associated sequence) arrays of these STEC O156:H25 and O182:H25 isolates were highly similar, and identical genomic integration sites for the stx converting bacteriophages and the core LEE, identical Shiga toxin converting bacteriophage genes for stx1a, identical complete LEE loci, and identical sets of chemotaxis and flagellar genes were identified. In contrast to this genomic similarity, the nucleotide sequences of the O-antigen gene cluster (O-AGC) regions between galF and gnd and very few flanking genes differed fundamentally and were specific for the respective serotype. Sporadic aEPEC O156:H8 isolates (n = 5) were isolated in temporal and spatial proximity. While the O-AGC and the corresponding 5' and 3' flanking regions of these aEPEC isolates were identical to the respective region in the STEC O156:H25 isolates, the core genome, the virulence associated genome regions and the CRISPR/CAS elements differed profoundly. Our cumulative epidemiological and molecular data suggests a recent switch of the O-AGC between isolates with O156:H8 strains having served as DNA donors. Such O-antigen switches can affect the evaluation of a strain's pathogenic and virulence potential, suggesting that NGS methods might lead to a more reliable risk assessment.

    Frontiers in microbiology 2017;8;424

  • A staging system for correct phenotype interpretation of mouse embryos harvested on embryonic day 14 (E14.5).

    Geyer SH, Reissig L, Rose J, Wilson R, Prin F, Szumska D, Ramirez-Solis R, Tudor C, White J, Mohun TJ and Weninger WJ

    Centre for Anatomy and Cell Biology & MIC, Medical University of Vienna, Vienna, Austria.

    We present a simple and quick system for accurately scoring the developmental progress of mouse embryos harvested on embryonic day 14 (E14.5). Based solely on the external appearance of the maturing forelimb, we provide a convenient way to distinguish six developmental sub-stages. Using a variety of objective morphometric data obtained from the commonly used C57BL/6N mouse strain, we show that these stages correlate precisely with the growth of the entire embryo and its organs. Applying the new staging system to phenotype analyses of E14.5 embryos of 58 embryonic lethal null mutant lines from the DMDD research programme and its pilot, we show that homozygous mutant embryos are frequently delayed in development. To demonstrate the importance of our staging system for correct phenotype interpretation, we describe stage-specific changes of the palate, heart and gut, and provide examples in which correct diagnosis of malformations relies on correct staging.

    Journal of anatomy 2017

  • Morphology, topology and dimensions of the heart and arteries of genetically normal and mutant mouse embryos at stages S21-S23.

    Geyer SH, Reissig LF, Hüsemann M, Höfle C, Wilson R, Prin F, Szumska D, Galli A, Adams DJ, White J, Mohun TJ and Weninger WJ

    Division of Anatomy & MIC, Medical University of Vienna, Vienna, Austria.

    Accurate identification of abnormalities in the mouse embryo depends not only on comparisons with appropriate, developmental stage-matched controls, but also on an appreciation of the range of anatomical variation that can be expected during normal development. Here we present a morphological, topological and metric analysis of the heart and arteries of mouse embryos harvested on embryonic day (E)14.5, based on digital volume data of whole embryos analysed by high-resolution episcopic microscopy (HREM). By comparing data from 206 genetically normal embryos, we have analysed the range and frequency of normal anatomical variations in the heart and major arteries across Theiler stages S21-S23. Using this, we have identified abnormalities in these structures among 298 embryos from mutant mouse lines carrying embryonic lethal gene mutations produced for the Deciphering the Mechanisms of Developmental Disorders (DMDD) programme. We present examples of both commonly occurring abnormal phenotypes and novel pathologies that most likely alter haemodynamics in these genetically altered mouse embryos. Our findings offer a reference baseline for accurately identifying abnormalities of the heart and arteries in embryos that have largely completed organogenesis.

    Journal of anatomy 2017

  • Activation of the Aryl Hydrocarbon Receptor Interferes with Early Embryonic Development.

    Gialitakis M, Tolaini M, Li Y, Pardo M, Yu L, Toribio A, Choudhary JS, Niakan K, Papayannopoulos V and Stockinger B

    The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK.

    The transcriptional program of early embryonic development is tightly regulated by a set of well-defined transcription factors that suppress premature expression of differentiation genes and sustain the pluripotent identity. It is generally accepted that this program can be perturbed by environmental factors such as chemical pollutants; however, the precise molecular mechanisms remain unknown. The aryl hydrocarbon receptor (AHR) is a widely expressed nuclear receptor that senses environmental stimuli and modulates target gene expression. Here, we have investigated the AHR interactome in embryonic stem cells by mass spectrometry and show that ectopic activation of AHR during early differentiation disrupts the differentiation program via the chromatin remodeling complex NuRD (nucleosome remodeling and deacetylation). The activated AHR/NuRD complex altered the expression of differentiation-specific genes that control the first two developmental decisions without affecting the pluripotency program. These findings identify a mechanism that allows environmental stimuli to disrupt embryonic development through AHR signaling.

    Stem cell reports 2017

  • Increased Expression of a MicroRNA Correlates with Anthelmintic Resistance in Parasitic Nematodes.

    Gillan V, Maitland K, Laing R, Gu H, Marks ND, Winter AD, Bartley D, Morrison A, Skuce PJ, Rezansoff AM, Gilleard JS, Martinelli A, Britton C and Devaney E

    Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, United Kingdom.

    Resistance to anthelmintic drugs is a major problem in the global fight against parasitic nematodes infecting humans and animals. While previous studies have identified mutations in drug target genes in resistant parasites, changes in the expression levels of both targets and transporters have also been reported. The mechanisms underlying these changes in gene expression are unresolved. Here, we take a novel approach to this problem by investigating the role of small regulatory RNAs in drug resistant strains of the important parasite Haemonchus contortus. microRNAs (miRNAs) are small (22 nt) non-coding RNAs that regulate gene expression by binding predominantly to the 3' UTR of mRNAs. Changes in miRNA expression have been implicated in drug resistance in a variety of tumor cells. In this study, we focused on two geographically distinct ivermectin resistant strains of H. contortus and two lines generated by multiple rounds of backcrossing between susceptible and resistant parents, with ivermectin selection. All four resistant strains showed significantly increased expression of a single miRNA, hco-miR-9551, compared to the susceptible strain. This same miRNA is also upregulated in a multi-drug-resistant strain of the related nematode Teladorsagia circumcincta. hco-miR-9551 is enriched in female worms, is likely to be located on the X chromosome and is restricted to clade V parasitic nematodes. Genes containing predicted binding sites for hco-miR-9551 were identified computationally and refined based on differential expression in a transcriptomic dataset prepared from the same drug resistant and susceptible strains. This analysis identified three putative target mRNAs, one of which, a CHAC domain containing protein, is located in a region of the H. contortus genome introgressed from the resistant parent. hco-miR-9551 was shown to interact with the 3' UTR of this gene by dual luciferase assay. This study is the first to suggest a role for miRNAs and the genes they regulate in drug resistant parasitic nematodes. miR-9551 also has potential as a biomarker of resistance in different nematode species.

    Frontiers in cellular and infection microbiology 2017;7;452

  • De novo yeast genome assemblies from MinION, PacBio and MiSeq platforms.

    Giordano F, Aigrain L, Quail MA, Coupland P, Bonfield JK, Davies RM, Tischler G, Jackson DK, Keane TM, Li J, Yue JX, Liti G, Durbin R and Ning Z

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Long-read sequencing technologies such as Pacific Biosciences and Oxford Nanopore MinION are capable of producing long sequencing reads with average fragment lengths of over 10,000 base-pairs and maximum lengths reaching 100,000 base-pairs. Compared with short reads, the assemblies obtained from long-read sequencing platforms have much higher contig continuity and genome completeness as long fragments are able to extend paths into problematic or repetitive regions. Many successful assembly applications of the Pacific Biosciences technology have been reported ranging from small bacterial genomes to large plant and animal genomes. Recently, genome assemblies using Oxford Nanopore MinION data have attracted much attention due to the portability and low cost of this novel sequencing instrument. In this paper, we re-sequenced a well characterized genome, the Saccharomyces cerevisiae S288C strain using three different platforms: MinION, PacBio and MiSeq. We present a comprehensive metric comparison of assemblies generated by various pipelines and discuss how the platform associated data characteristics affect the assembly quality. With a given read depth of 31X, the assemblies from both Pacific Biosciences and Oxford Nanopore MinION show excellent continuity and completeness for the 16 nuclear chromosomes, but not for the mitochondrial genome, whose reconstruction still represents a significant challenge.

    Scientific reports 2017;7;1;3935

  • New insights into sex chromosome evolution in anole lizards (Reptilia, Dactyloidae).

    Giovannotti M, Trifonov VA, Paoletti A, Kichigin IG, O'Brien PC, Kasai F, Giovagnoli G, Ng BL, Ruggeri P, Cerioni PN, Splendiani A, Pereira JC, Olmo E, Rens W, Caputo Barucchi V and Ferguson-Smith MA

    Dipartimento di Scienze della Vita e dell'Ambiente, Università Politecnica delle Marche, via Brecce Bianche, 60131, Ancona, Italy.

    Anoles are a clade of iguanian lizards that underwent an extensive radiation between 125 and 65 million years ago. Their karyotypes show wide variation in diploid number spanning from 26 (Anolis evermanni) to 44 (A. insolitus). This chromosomal variation involves their sex chromosomes, ranging from simple systems (XX/XY), with heterochromosomes represented by either micro- or macrochromosomes, to multiple systems (X1X1X2X2/X1X2Y). Here, for the first time, the homology relationships of sex chromosomes have been investigated in nine anole lizards at the whole chromosome level. Cross-species chromosome painting using sex chromosome paints from A. carolinensis, Ctenonotus pogus and Norops sagrei and gene mapping of X-linked genes demonstrated that the anole ancestral sex chromosome system constituted by microchromosomes is retained in all the species with the ancestral karyotype (2n = 36, 12 macro- and 24 microchromosomes). On the contrary, species with a derived karyotype, namely those belonging to genera Ctenonotus and Norops, show a series of rearrangements (fusions/fissions) involving autosomes/microchromosomes that led to the formation of their current sex chromosome systems. These results demonstrate that different autosomes were involved in translocations with sex chromosomes in closely related lineages of anole lizards and that several sequential microautosome/sex chromosome fusions led to a remarkable increase in size of Norops sagrei sex chromosomes.

    Chromosoma 2017;126;2;245-260

  • Immunogenomic approaches to understand the function of immune disease variants.

    Glinos DA, Soskic B and Trynka G

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.

    Mapping hundreds of genetic variants through genome wide association studies provided an opportunity to gain insights into the pathobiology of immune-mediated diseases. However, as most of the disease variants fall outside the gene coding sequences, the functional interpretation of the exact role of the associated variants remains to be determined. The integration of disease-associated variants with large scale genomic maps of cell-type-specific gene regulation at both chromatin and transcript levels delivers examples of functionally prioritized causal variants and genes. In particular, the enrichment of disease variants with histone marks can point towards the cell types most relevant to disease development. Furthermore, chromatin contact maps that link enhancers to promoter regions in a direct way allow the identification of genes that can be regulated by the disease variants. Candidate genes implicated with such approaches can be further examined through the correlation of gene expression with genotypes. Additionally, in the context of immune-mediated diseases it is important to combine genomics with immunology approaches. Genotype correlations with the immune system as a whole, as well as with cellular responses to different stimuli, provide a valuable platform for understanding the functional impact of disease-associated variants. The intersection of immunogenomic resources with disease-associated variants paints a detailed picture of disease causal mechanisms. Here, we provide an overview of recent studies that combine these approaches to identify disease vulnerable pathways.

    Funded by: Wellcome Trust: WT206194

    Immunology 2017;152;4;527-535

  • A somatic-mutational process recurrently duplicates germline susceptibility loci and tissue-specific super-enhancers in breast cancers.

    Glodzik D, Morganella S, Davies H, Simpson PT, Li Y, Zou X, Diez-Perez J, Staaf J, Alexandrov LB, Smid M, Brinkman AB, Rye IH, Russnes H, Raine K, Purdie CA, Lakhani SR, Thompson AM, Birney E, Stunnenberg HG, van de Vijver MJ, Martens JW, Børresen-Dale AL, Richardson AL, Kong G, Viari A, Easton D, Evan G, Campbell PJ, Stratton MR and Nik-Zainal S

    Wellcome Trust Sanger Institute, Cambridge, UK.

    Somatic rearrangements contribute to the mutagenized landscape of cancer genomes. Here, we systematically interrogated rearrangements in 560 breast cancers by using a piecewise constant fitting approach. We identified 33 hotspots of large (>100 kb) tandem duplications, a mutational signature associated with homologous-recombination-repair deficiency. Notably, these tandem-duplication hotspots were enriched in breast cancer germline susceptibility loci (odds ratio (OR) = 4.28) and breast-specific 'super-enhancer' regulatory elements (OR = 3.54). These hotspots may be sites of selective susceptibility to double-strand-break damage due to high transcriptional activity or, through incrementally increasing copy number, may be sites of secondary selective pressure. The transcriptomic consequences ranged from strong individual oncogene effects to weak but quantifiable multigene expression effects. We thus present a somatic-rearrangement mutational process affecting coding sequences and noncoding regulatory elements and contributing a continuum of driver consequences, from modest to strong effects, thereby supporting a polygenic model of cancer development.

    Funded by: European Research Council: 322737; Wellcome Trust: 098051

    Nature genetics 2017;49;3;341-348

  • Genetic diversity of next generation antimalarial targets: A baseline for drug resistance surveillance programmes.

    Gomes AR, Ravenhall M, Benavente ED, Talman A, Sutherland C, Roper C, Clark TG and Campino S

    Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, UK.

    Drug resistance is a recurrent problem in the fight against malaria. Genetic and epidemiological surveillance of antimalarial resistant parasite alleles is crucial to guide drug therapies and clinical management. New antimalarial compounds are currently at various stages of clinical trials and regulatory evaluation. Using ∼2000 Plasmodium falciparum genome sequences, we investigated the genetic diversity of eleven gene-targets of promising antimalarial compounds and assessed their potential efficiency across malaria endemic regions. We determined if the loci are under selection prior to the introduction of new drugs and established a baseline of genetic variance, including potential resistant alleles, for future surveillance programmes.

    International journal for parasitology. Drugs and drug resistance 2017;7;2;174-180

  • Immuno-oncology from the perspective of somatic evolution.

    González S, Volkova N, Beer P and Gerstung M

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK.

    The past years have witnessed significant success for cancer immunotherapies that activate a patient's immune system against their cancer cells. At the same time our understanding of the genetic changes driving tumor evolution has progressed dramatically. The study of cancer genomes has shown that tumors are best understood as cell populations governed by the rules of evolution, leading to the emergence and spread of cell lineages with pathogenic mutations. Moreover, somatic evolution can explain the acquisition of mutations conferring drug resistance in the ever-lasting battle for reaching even fitter cell states. Here, we review the current state of the art of somatic cancer evolution and mechanisms of immune control and escape. We also revisit the principles of immunotherapy from the perspective of somatic evolution and discuss the basic rules of resistance to immunotherapies as dictated by evolution.

    Seminars in cancer biology 2017

  • Stem cell senescence drives age-attenuated induction of pituitary tumours in mouse models of paediatric craniopharyngioma.

    Gonzalez-Meljem JM, Haston S, Carreno G, Apps JR, Pozzi S, Stache C, Kaushal G, Virasami A, Panousopoulos L, Mousavy-Gharavy SN, Guerrero A, Rashid M, Jani N, Goding CR, Jacques TS, Adams DJ, Gil J, Andoniadou CL and Martinez-Barbera JP

    Developmental Biology and Cancer Programme, Birth Defects Research Centre, UCL Institute of Child Health, London, WC1N 1EH, UK.

    Senescent cells may promote tumour progression through the activation of a senescence-associated secretory phenotype (SASP), but whether these cells are capable of initiating tumourigenesis in vivo is not known. Expression of oncogenic β-catenin in Sox2+ young adult pituitary stem cells leads to formation of clusters of stem cells and induction of tumours resembling human adamantinomatous craniopharyngioma (ACP), derived from Sox2- cells in a paracrine manner. Here, we uncover the mechanisms underlying this paracrine tumourigenesis. We show that expression of oncogenic β-catenin in Hesx1+ embryonic precursors also results in stem cell clusters and paracrine tumours. We reveal that human and mouse clusters are analogous and share a common signature of senescence and SASP. Finally, we show that mice with reduced senescence and SASP responses exhibit decreased tumour-inducing potential. Together, we provide evidence that senescence and a stem cell-associated SASP drive cell transformation and tumour initiation in vivo in an age-dependent fashion.

    Funded by: Cancer Research UK: 13031; Medical Research Council: MC_U120085810, MR/L016729/1, MR/M000125/1; Wellcome Trust

    Nature communications 2017;8;1;1819

  • Gastrointestinal carriage is a major reservoir of K. pneumoniae infection in intensive care patients.

    Gorrie CL, Mirceta M, Wick RR, Edwards DJ, Thomson NR, Strugnell RA, Pratt N, Garlick J, Watson K, Pilcher D, McGloughlin S, Spelman DW, Jenney AW and Holt KE

    Centre for Systems Genomics, The University of Melbourne, Victoria, Australia.

    Background: Klebsiella pneumoniae (Kp) is an opportunistic pathogen and leading cause of hospital-associated infections. Intensive care unit (ICU) patients are particularly at risk. Kp is part of the healthy human microbiome, providing a potential reservoir for infection. However, the frequency of gut colonization and its contribution to infections are not well characterized.

    Methods: We conducted a one-year prospective cohort study in which 498 ICU patients were screened for rectal and throat carriage of Kp shortly after admission. Kp isolated from screening swabs and clinical diagnostic samples were characterized using whole genome sequencing and combined with epidemiological data to identify likely transmission events.

    Results: Kp carriage frequencies were estimated at 6% (95% CI, 3%-8%) amongst ICU patients admitted direct from the community, and 19% (95% CI, 14%-51%) amongst those with recent healthcare contact. Gut colonisation on admission was significantly associated with subsequent infection (infection risk 16% vs 3%, OR=6.9, p<0.001), and genome data indicated matching carriage and infection isolates in 80% of isolate pairs. Five likely transmission chains were identified, responsible for 12% of Kp infections in ICU. 49% of Kp infections were caused by the patients' own unique strain, and 48% of screened patients with infections were positive for prior colonisation.

    Conclusions: These data confirm Kp colonisation is a significant risk factor for infection in ICU, and indicate ~50% of Kp infections result from patients' own microbiota. Screening for colonisation on admission could limit risk of infection in the colonised patient and others.

    Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 2017

  • De novo SETD5 loss-of-function variant as a cause for intellectual disability in a 10-year old boy with an aberrant blind ending bronchus.

    Green C, Willoughby J, DDD Study and Balasubramanian M

    Sheffield Clinical Genetics Service, Sheffield Children's NHS Foundation Trust, Sheffield, UK.

    Although rare, 3p microdeletion cases have been well described in the clinical literature. The clinical phenotype includes intellectual disability (ID), growth retardation, facial dysmorphism, and cardiac malformations. Advances in chromosome microarray (CMA) testing narrowed the 3p25 critical region to a 124 kb region, and recent Whole Exome Sequencing (WES) studies have suggested that the SETD5 gene contributes significantly to the 3p25 phenotype. Loss-of-Function (LoF) variants in SETD5 are now considered a likely cause of ID. We report here a patient with a frameshift LoF variant in exon 12 of SETD5. This patient has features overlapping with other patients described with LoF SETD5 variants, including similar facial morphology, feeding difficulties, ID, behavioral abnormalities and leg length discrepancy. In addition, he presents with an aberrant blind ending bronchus. This report adds to publications describing intragenic mutations in SETD5 and supports the assertion that de novo LoF mutations in SETD5 present with an overlapping but distinct phenotype in comparison with 3p25 microdeletion syndromes.

    American journal of medical genetics. Part A 2017;173;12;3165-3171

  • Mosaic autosomal aneuploidies are detectable from single-cell RNAseq data.

    Griffiths JA, Scialdone A and Marioni JC

    Cancer Research UK Cambridge Institute, University of Cambridge, CB2 0RE, Cambridge, UK.

    Background: Aneuploidies are copy number variants that affect entire chromosomes. They are seen commonly in cancer, embryonic stem cells, human embryos, and in various trisomic diseases. Aneuploidies frequently affect only a subset of cells in a sample; this is known as "mosaic" aneuploidy. A cell that harbours an aneuploidy exhibits disrupted gene expression patterns which can alter its behaviour. However, detection of aneuploidies using conventional single-cell DNA-sequencing protocols is slow and expensive.

    Methods: We have developed a method that uses chromosome-wide expression imbalances to identify aneuploidies from single-cell RNA-seq data. The method provides quantitative aneuploidy calls, and is integrated into an R software package available on GitHub and as an Additional file of this manuscript.

    Results: We validate our approach using data with known copy number, identifying the vast majority of aneuploidies with a low rate of false discovery. We show further support for the method's efficacy by exploiting allele-specific gene expression levels, and differential expression analyses.

    Conclusions: The method is quick and easy to apply, straightforward to interpret, and represents a substantial cost saving compared to single-cell genome sequencing techniques. However, the method is less well suited to data where gene expression is highly variable. The results obtained from the method can be used to investigate the consequences of aneuploidy itself, or to exclude aneuploidy-affected expression values from conventional scRNA-seq data analysis.

    Funded by: Cancer Research UK: A17197; Wellcome Trust: 105031/B/14/Z, 109081/Z/15/A

    BMC genomics 2017;18;1;904

  • Megakaryocytes in Myeloproliferative Neoplasms Have Unique Somatic Mutations.

    Guo BB, Nigel Allcock RJ, Mirzai B, Jacobus Malherbe JA, Choudry FA, Frontini M, Chuah H, Liang J, Kavanagh SE, Howman R, Ouwehand WH, Fuller KA and Erber WN

    School of Biomedical Sciences, University of Western Australia, Crawley, Western Australia, Australia.

    Myeloproliferative neoplasms (MPNs) are a group of related clonal hemopoietic stem cell disorders associated with hyperproliferation of myeloid cells. They are driven by mutations in the hemopoietic stem cell, most notably JAK2(V617F), CALR, and MPL. Clinically, they have the propensity to progress to myelofibrosis and transform to acute myeloid leukemia. Megakaryocytic hyperplasia with abnormal features is characteristic, and it is thought that these cells stimulate and drive fibrotic progression. The biological defects underpinning this remain to be explained. In this study we examined the megakaryocyte genome in 12 patients with MPNs to determine whether there are somatic variants and whether there is any association with marrow fibrosis. We performed targeted next-generation sequencing for 120 genes associated with myeloid neoplasms on megakaryocytes isolated from aspirated bone marrow. Eleven of the 12 patients had genomic defects in megakaryocytes that were not present in nonmegakaryocytic hemopoietic marrow cells from the same patient. The greatest allelic burden was in patients with increased reticulin deposition. The megakaryocyte-unique mutations were predominantly in genes that regulate chromatin remodeling, chromosome alignment, and stability. These findings show that genomic abnormalities are present in megakaryocytes in MPNs and that these appear to be associated with progression to bone marrow fibrosis.

    The American journal of pathology 2017

  • First Draft Genome Sequence of the Dourine Causative Agent: Trypanosoma Equiperdum Strain OVI.

    Hébert L, Moumen B, Madeline A, Steinbiss S, Lakhdar L, Van Reet N, Büscher P, Laugier C, Cauchard J and Petry S

    ANSES, Dozulé Laboratory for Equine Diseases, Bacteriology and Parasitology Unit, 14430 Goustranville, France.

    Trypanosoma equiperdum is the causative agent of dourine, a sexually transmitted infection of horses. This parasite belongs to the subgenus Trypanozoon, which also includes the agents of sleeping sickness (Trypanosoma brucei) and surra (Trypanosoma evansi). We herein report the genome sequence of T. equiperdum strain OVI, isolated from a horse in South Africa in 1976. This is the first genome sequence of the T. equiperdum species, and its availability will provide important insights for future studies on genetic classification of the subgenus Trypanozoon.

    Journal of genomics 2017;5;1-3

  • Statistical methods to detect pleiotropy in human complex traits.

    Hackinger S and Zeggini E

    Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    In recent years pleiotropy, the phenomenon of one genetic locus influencing several traits, has become a widely researched field in human genetics. With the increasing availability of genome-wide association study summary statistics, as well as the establishment of deeply phenotyped sample collections, it is now possible to systematically assess the genetic overlap between multiple traits and diseases. In addition to increasing power to detect associated variants, multi-trait methods can also aid our understanding of how different disorders are aetiologically linked by highlighting relevant biological pathways. A plethora of available tools to perform such analyses exists, each with their own advantages and limitations. In this review, we outline some of the currently available methods to conduct multi-trait analyses. First, we briefly introduce the concept of pleiotropy and outline the current landscape of pleiotropy research in human genetics; second, we describe analytical considerations and analysis methods; finally, we discuss future directions for the field.

    Open biology 2017;7;11

  • The Hidden Genomics of Chlamydia trachomatis.

    Hadfield J, Bénard A, Domman D and Thomson N

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.

    The application of whole-genome sequencing has moved us on from sequencing single genomes to unravelling population structures in different niches, at the species, serotype or even genus level, and in local, national and global settings. This has been instrumental in cataloguing and revealing a huge range of diversity in this bacterium, when at first we thought there was little. Genomics has challenged assumptions and added insight, as well as confusion and glimpses of truths. What is clear is that, at a time when we start to realise the extent and nature of the diversity contained within a genus or a species like this, the huge depth of knowledge communities have developed through cell biology, as well as the newfound molecular approaches, will be more precious than ever to link genotype to phenotype. Here we detail the technological developments and insights we have seen during the relatively short time since we began to see the hidden genome of Chlamydia trachomatis.

    Current topics in microbiology and immunology 2017

  • Phandango: an interactive viewer for bacterial population genomics.

    Hadfield J, Croucher NJ, Goater RJ, Abudahab K, Aanensen DM and Harris SR

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK.

    Summary: Fully exploiting the wealth of data in current bacterial population genomics datasets requires synthesising and integrating different types of analysis across millions of base pairs in hundreds or thousands of isolates. Current approaches often use static representations of phylogenetic, epidemiological, statistical and evolutionary analysis results that are difficult to relate to one another. Phandango is an interactive application running in a web browser allowing fast exploration of large-scale population genomics datasets combining the output from multiple genomic analysis methods in an intuitive and interactive manner.

    Availability: Phandango is a freely available web application and includes a diverse collection of datasets as examples. Source code together with a detailed wiki page is available on GitHub.


    Bioinformatics (Oxford, England) 2017

  • Dietary restriction protects from age-associated DNA methylation and induces epigenetic reprogramming of lipid metabolism.

    Hahn O, Grönke S, Stubbs TM, Ficz G, Hendrich O, Krueger F, Andrews S, Zhang Q, Wakelam MJ, Beyer A, Reik W and Partridge L

    Max Planck Institute for Biology of Ageing, 50931, Cologne, Germany.

    Background: Dietary restriction (DR), a reduction in food intake without malnutrition, increases most aspects of health during aging and extends lifespan in diverse species, including rodents. However, the mechanisms by which DR interacts with the aging process to improve health in old age are poorly understood. DNA methylation could play an important role in mediating the effects of DR because it is sensitive to the effects of nutrition and can affect gene expression memory over time.

    Results: Here, we profile genome-wide changes in DNA methylation, gene expression and lipidomics in response to DR and aging in female mouse liver. DR is generally strongly protective against age-related changes in DNA methylation. During aging with DR, DNA methylation becomes targeted to gene bodies and is associated with reduced gene expression, particularly of genes involved in lipid metabolism. The lipid profile of the livers of DR mice is correspondingly shifted towards lowered triglyceride content and shorter chain length of triglyceride-associated fatty acids, and these effects become more pronounced with age.

    Conclusions: Our results indicate that DR remodels genome-wide patterns of DNA methylation so that age-related changes are profoundly delayed, while changes at loci involved in lipid metabolism affect gene expression and the resulting lipid profile.

    Genome biology 2017;18;1;56

  • The Y chromosomes of the great apes.

    Hallast P and Jobling MA

    Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia.

    The great apes (orangutans, gorillas, chimpanzees, bonobos and humans) descended from a common ancestor around 13 million years ago, and since then their sex chromosomes have followed very different evolutionary paths. While great-ape X chromosomes are highly conserved, their Y chromosomes, reflecting the general lability and degeneration of this male-specific part of the genome since its early mammalian origin, have evolved rapidly both between and within species. Understanding great-ape Y chromosome structure, gene content and diversity would provide a valuable evolutionary context for the human Y, and would also illuminate sex-biased behaviours, and the effects of the evolutionary pressures exerted by different mating strategies on this male-specific part of the genome. High-quality Y-chromosome sequences are available for human and chimpanzee (and low-quality for gorilla). The chromosomes differ in size, sequence organisation and content, and while retaining a relatively stable set of ancestral single-copy genes, show considerable variation in content and copy number of ampliconic multi-copy genes. Studies of Y-chromosome diversity in other great apes are relatively undeveloped compared to those in humans, but have nevertheless provided insights into speciation, dispersal, and mating patterns. Future studies, including data from larger sample sizes of wild-born and geographically well-defined individuals, and full Y-chromosome sequences from bonobos, gorillas and orangutans, promise to further our understanding of population histories, male-biased behaviours, mutation processes, and the functions of Y-chromosomal genes.

    Human genetics 2017

  • Extreme mutation bias and high AT content in Plasmodium falciparum.

    Hamilton WL, Claessens A, Otto TD, Kekre M, Fairhurst RM, Rayner JC and Kwiatkowski D

    Malaria Programme, Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, UK.

    For reasons that remain unknown, the Plasmodium falciparum genome has an exceptionally high AT content compared to other Plasmodium species and eukaryotes in general - nearly 80% in coding regions and approaching 90% in non-coding regions. Here, we examine how this phenomenon relates to genome-wide patterns of de novo mutation. Mutation accumulation experiments were performed by sequential cloning of six P. falciparum isolates growing in human erythrocytes in vitro for 4 years, with 279 clones sampled for whole genome sequencing at different time points. Genome sequence analysis of these samples revealed a significant excess of G:C to A:T transitions compared to other types of nucleotide substitution, which would naturally cause AT content to equilibrate close to the level seen across the P. falciparum reference genome (80.6% AT). These data also uncover an extremely high rate of small indel mutation relative to other species, primarily associated with repetitive AT-rich sequences, in addition to larger-scale structural rearrangements focused in antigen-coding var genes. In conclusion, high AT content in P. falciparum is driven by a systematic mutational bias and ultimately leads to an unusual level of microstructural plasticity, raising the question of whether this contributes to adaptive evolution.

    Funded by: Medical Research Council: G0600718, MR/M006212/1; Wellcome Trust: 098051

    Nucleic acids research 2017;45;4;1889-1901

  • Comparing Ancient DNA Preservation in Petrous Bone and Tooth Cementum.

    Hansen HB, Damgaard PB, Margaryan A, Stenderup J, Lynnerup N, Willerslev E and Allentoft ME

    Centre for GeoGenetics, Natural History Museum, University of Copenhagen, Copenhagen, Denmark.

    Large-scale genomic analyses of ancient human populations have become feasible partly due to refined sampling methods. The inner part of petrous bones and the cementum layer in tooth roots are currently recognized as the best substrates for such research. We present a comparative analysis of DNA preservation in these two substrates obtained from the same human skulls, across a range of different ages and preservation environments. Both substrates display significantly higher endogenous DNA content (average of 16.4% and 40.0% for teeth and petrous bones, respectively) than parietal skull bone (average of 2.2%). Despite sample-to-sample variation, petrous bone overall performs better than tooth cementum (p = 0.001). This difference, however, is driven largely by a cluster of Viking skeletons from one particular locality, showing relatively poor molecular tooth preservation (<10% endogenous DNA). In the remaining skeletons there is no systematic difference between the two substrates. A crude preservation classification (good/bad) applied to each sample prior to DNA extraction predicted the above/below 10% endogenous DNA threshold in 80% of the cases. Interestingly, we observe significantly higher levels of cytosine to thymine deamination damage and lower proportions of mitochondrial/nuclear DNA in petrous bone compared to tooth cementum. Lastly, we show that petrous bones from ancient cremated individuals contain no measurable levels of authentic human DNA. Based on these findings we discuss the pros and cons of sampling the different elements.

    PloS one 2017;12;1;e0170940

  • A practical guide to single-cell RNA-sequencing for biomedical research and clinical applications.

    Haque A, Engel J, Teichmann SA and Lönnberg T

    QIMR Berghofer Medical Research Institute, Herston, Brisbane, Queensland, 4006, Australia.

    RNA sequencing (RNA-seq) is a genomic approach for the detection and quantitative analysis of messenger RNA molecules in a biological sample and is useful for studying cellular responses. RNA-seq has fueled much discovery and innovation in medicine over recent years. For practical reasons, the technique is usually conducted on samples comprising thousands to millions of cells. However, this has hindered direct assessment of the fundamental unit of biology: the cell. Since the first single-cell RNA-sequencing (scRNA-seq) study was published in 2009, many more have been conducted, mostly by specialist laboratories with unique skills in wet-lab single-cell genomics, bioinformatics, and computation. However, with the increasing commercial availability of scRNA-seq platforms, and the rapid ongoing maturation of bioinformatics approaches, a point has been reached where any biomedical researcher or clinician can use scRNA-seq to make exciting discoveries. In this review, we present a practical guide to help researchers design their first scRNA-seq studies, including introductory information on experimental hardware, protocol choice, quality control, data analysis and biological interpretation.

    Genome medicine 2017;9;1;75

  • The micro-evolution and epidemiology of Staphylococcus aureus colonization during atopic eczema disease flare.

    Harkins CP, Pettigrew KA, Oravcová K, Gardner J, Hearn RMR, Rice D, Mather AE, Parkhill J, Brown SJ, Proby CM and Holden MTG

    School of Medicine, University of St Andrews, St Andrews, KY16 9TF, UK; Department of Dermatology, Ninewells Hospital, Dundee, DD1 9SY, UK; School of Medicine, University of Dundee, Dundee, DD1 9SY, UK.

    Staphylococcus aureus is an opportunistic pathogen and variable component of the human microbiota. A characteristic of atopic eczema (AE) is colonization by S. aureus, with exacerbations associated with an increased bacterial burden of the organism. Despite this, the origins and genetic diversity of S. aureus colonizing individual patients during AE disease flares are poorly understood. To examine the micro-evolution of S. aureus colonization we deep-sequenced S. aureus populations from nine children with moderate to severe AE, and 18 non-atopic children asymptomatically carrying S. aureus nasally. Colonization by clonal S. aureus populations was observed in both AE cases and controls, with all but one of the individuals containing colonies belonging to a single sequence type. Phylogenetic analysis revealed that disease flares were associated with the clonal expansion of the S. aureus population, occurring over a period of weeks to months. There was a significant difference in the genetic backgrounds of S. aureus colonizing AE patients versus controls (Fisher's Exact test, p=0.03). Examination of intra-host genetic heterogeneity of the colonizing S. aureus populations identified evidence of within-host selection in the AE patients, with AE variants being potentially selectively advantageous for intracellular persistence and treatment resistance.

    The Journal of investigative dermatology 2017

  • Methicillin-resistant Staphylococcus aureus emerged long before the introduction of methicillin into clinical practice.

    Harkins CP, Pichon B, Doumith M, Parkhill J, Westh H, Tomasz A, de Lencastre H, Bentley SD, Kearns AM and Holden MTG

    School of Medicine, University of St Andrews, St Andrews, KY16 9TF, UK.

    Background: The spread of drug-resistant bacterial pathogens poses a major threat to global health. It is widely recognised that the widespread use of antibiotics has generated selective pressures that have driven the emergence of resistant strains. Methicillin-resistant Staphylococcus aureus (MRSA) was first observed in 1960, less than one year after the introduction of this second-generation beta-lactam antibiotic into clinical practice. Epidemiological evidence has always suggested that resistance arose around this period, when the mecA gene encoding methicillin resistance, carried on an SCCmec element, was horizontally transferred to an intrinsically sensitive strain of S. aureus.

    Results: Whole genome sequencing of a collection of the first MRSA isolates allows us to reconstruct the evolutionary history of the archetypal MRSA. We apply Bayesian phylogenetic reconstruction to infer the time point at which this early MRSA lineage arose and when SCCmec was acquired. MRSA emerged in the mid-1940s, following the acquisition of an ancestral type I SCCmec element, some 14 years before the first therapeutic use of methicillin.

    Conclusions: Methicillin use was not the original driving factor in the evolution of MRSA as previously thought. Rather it was the widespread use of first generation beta-lactams such as penicillin in the years prior to the introduction of methicillin, which selected for S. aureus strains carrying the mecA determinant. Crucially this highlights how new drugs, introduced to circumvent known resistance mechanisms, can be rendered ineffective by unrecognised adaptations in the bacterial population due to the historic selective landscape created by the widespread use of other antibiotics.

    Funded by: Wellcome Trust

    Genome biology 2017;18;1;130

  • Genomic surveillance reveals low prevalence of livestock-associated methicillin-resistant Staphylococcus aureus in the East of England.

    Harrison EM, Coll F, Toleman MS, Blane B, Brown NM, Török ME, Parkhill J and Peacock SJ

    Department of Medicine, University of Cambridge, Box 157 Addenbrooke's Hospital, Hills Road, Cambridge, CB2 0QQ, United Kingdom.

    Livestock-associated methicillin-resistant Staphylococcus aureus (LA-MRSA) is an emerging problem in many parts of the world. LA-MRSA has been isolated previously from animals and humans in the United Kingdom (UK), but the prevalence is unknown. The aim of this study was to determine the prevalence and to describe the molecular epidemiology of LA-MRSA isolated in the East of England (broadly Cambridge and the surrounding area). We accessed whole genome sequence data for 2,283 MRSA isolates from 1,465 people identified during a 12-month prospective study between 2012 and 2013 conducted in the East of England, United Kingdom. This laboratory serves four hospitals and 75 general practices. We screened the collection for multilocus sequence types (STs) and for host-specific resistance and virulence factors previously associated with LA-MRSA. We identified 13 putative LA-MRSA isolates from 12 individuals, giving an estimated prevalence of 0.82% (95% CI 0.47% to 1.43%). Twelve isolates were mecC-MRSA (ten CC130, one ST425 and one ST1943) and a single isolate was ST398. Our data demonstrate a low burden of LA-MRSA in the East of England, but the detection of mecC-MRSA and ST398 indicates the need for vigilance. Genomic surveillance provides a mechanism to detect and track the emergence and spread of MRSA clones of human importance.

    Scientific reports 2017;7;1;7406
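
    The prevalence estimate quoted in the abstract above (12 of 1,465 individuals, 0.82%, 95% CI 0.47% to 1.43%) can be checked with a short calculation. The abstract does not say which interval method was used; the sketch below assumes a Wilson score interval, which appears to give bounds consistent with the reported figures. The function name wilson_interval and the variable names are illustrative only and do not come from the paper.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion.

    Assumption: the abstract does not name its interval method; the Wilson
    score interval is used here purely as an illustration.
    """
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the abstract: 12 individuals with putative LA-MRSA
# among 1,465 people with MRSA isolates.
individuals_with_la_mrsa = 12
individuals_screened = 1465

prevalence = individuals_with_la_mrsa / individuals_screened
lower, upper = wilson_interval(individuals_with_la_mrsa, individuals_screened)

print(f"prevalence = {prevalence * 100:.2f}%")
print(f"95% CI = {lower * 100:.2f}% to {upper * 100:.2f}%")
```

    Note that the denominator is the number of individuals (1,465), not the number of isolates (2,283), because the prevalence is reported per person.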

  • Antimicrobial resistance determinants and susceptibility profiles of pneumococcal isolates recovered in Trinidad and Tobago.

    Hawkins PA, Akpaka PE, Nurse-Lucas M, Gladstone R, Bentley SD, Breiman RF, McGee L and Swanston WH

    Emory University, Atlanta, Georgia, USA; Centers for Disease Control and Prevention (CDC), Atlanta, Georgia, USA.

    Introduction: In Latin America and the Caribbean, pneumococcal infections were estimated to account for 12,000-18,000 deaths, 327,000 cases of pneumonia, 4,000 cases of meningitis and 1,229 cases of sepsis each year in children under five years old. Resistance of pneumococci to antimicrobial agents has evolved into a worldwide health problem in the last few decades.

    Objective: The aim of this study was to determine the antimicrobial susceptibility profiles of 98 pneumococcal isolates collected in Trinidad and Tobago and associated genetic determinants.

    Methods: Whole genome sequences were obtained from 98 pneumococcal isolates recovered at several regional hospitals, including 83 invasive and 15 non-invasive strains, recovered before (n=25) and after (n=73) the introduction of two pneumococcal conjugate vaccines. A bioinformatics pipeline was used to identify core genomic and accessory elements that conferred antimicrobial resistance phenotypes, including β-lactam non-susceptibility.

    Results and discussion: Forty-one (41.8%) isolates were predicted as resistant to at least one antimicrobial class, including 13 (13.3%) isolates resistant to at least three classes. The most common serotypes associated with antimicrobial resistance were 23F (n=10), 19F (n=8), 6B (n=6), and 14 (n=5). The most common serotypes associated with penicillin non-susceptibility were 19F (n=7) and 14 (n=5). Thirty-nine (39.8%) isolates were positive for PI-1 or PI-2 type pili: 30 (76.9%) were PI-1+, 4 (10.3%) were PI-2+, and 5 (12.8%) were positive for both PI-1 and PI-2. Of the 13 isolates with multidrug resistance, 10 belonged to globally distributed clones PMEN3 and PMEN14 and were isolated in the post-PCV period, suggesting a clonal expansion.

    Journal of global antimicrobial resistance 2017

  • Circulating and Tissue-Resident CD4+ T Cells With Reactivity to Intestinal Microbiota Are Abundant in Healthy Individuals and Function Is Altered During Inflammation.

    Hegazy AN, West NR, Stubbington MJT, Wendt E, Suijker KIM, Datsi A, This S, Danne C, Campion S, Duncan SH, Owens BMJ, Uhlig HH, McMichael A, Oxford IBD Cohort Investigators, Bergthaler A, Teichmann SA, Keshav S and Powrie F

    Translational Gastroenterology Unit, Nuffield Department of Clinical Medicine, Experimental Medicine Division, John Radcliffe Hospital, University of Oxford, United Kingdom; Kennedy Institute of Rheumatology, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, United Kingdom.

    Background & aims: Interactions between commensal microbes and the immune system are tightly regulated and maintain intestinal homeostasis, but little is known about these interactions in humans. We investigated responses of human CD4+ T cells to the intestinal microbiota. We measured the abundance of T cells in circulation and intestinal tissues that respond to intestinal microbes and determined their clonal diversity. We also assessed their functional phenotypes and effects on intestinal resident cell populations, and studied alterations in microbe-reactive T cells in patients with chronic intestinal inflammation.

    Methods: We collected samples of peripheral blood mononuclear cells and intestinal tissues from healthy individuals (controls, n = 13-30) and patients with inflammatory bowel diseases (n = 119; 59 with ulcerative colitis and 60 with Crohn's disease). We used 2 independent assays (CD154 detection and carboxy-fluorescein succinimidyl ester dilution assays) and 9 intestinal bacterial species (Escherichia coli, Lactobacillus acidophilus, Bifidobacterium animalis subsp lactis, Faecalibacterium prausnitzii, Bacteroides vulgatus, Roseburia intestinalis, Ruminococcus obeum, Salmonella typhimurium, and Clostridium difficile) to quantify, expand, and characterize microbe-reactive CD4+ T cells. We sequenced T-cell receptor Vβ genes in expanded microbe-reactive T-cell lines to determine their clonal diversity. We examined the effects of microbe-reactive CD4+ T cells on intestinal stromal and epithelial cell lines. Cytokines, chemokines, and gene expression patterns were measured by flow cytometry and quantitative polymerase chain reaction.

    Results: Circulating and gut-resident CD4+ T cells from controls responded to bacteria at frequencies of 40-4000 per million for each bacterial species tested. Microbiota-reactive CD4+ T cells were mainly of a memory phenotype, present in peripheral blood mononuclear cells and intestinal tissue, and had a diverse T-cell receptor Vβ repertoire. These cells were functionally heterogeneous, produced barrier-protective cytokines, and stimulated intestinal stromal and epithelial cells via interleukin 17A, interferon gamma, and tumor necrosis factor. In patients with inflammatory bowel diseases, microbiota-reactive CD4+ T cells were reduced in the blood compared with intestine; T-cell responses that we detected had an increased frequency of interleukin 17A production compared with responses of T cells from blood or intestinal tissues of controls.

    Conclusions: In an analysis of peripheral blood mononuclear cells and intestinal tissues from patients with inflammatory bowel diseases vs controls, we found that reactivity to intestinal bacteria is a normal property of the human CD4+ T-cell repertoire, and does not necessarily indicate disrupted interactions between immune cells and the commensal microbiota. T-cell responses to commensals might support intestinal homeostasis, by producing barrier-protective cytokines and providing a large pool of T cells that react to pathogens.

    Funded by: European Research Council: 260507; NIAID NIH HHS: UM1 AI100645; Wellcome Trust

    Gastroenterology 2017;153;5;1320-1337.e16

  • The great escape.

    Heinz E

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Nature reviews. Microbiology 2017;16;1;4

  • Summing up the parts of the hypothalamus.

    Hemberg M

    Wellcome Trust Sanger Institute, Hinxton, UK.

    Nature neuroscience 2017;20;3;378-379

  • Rare Variant Analysis of Human and Rodent Obesity Genes in Individuals with Severe Childhood Obesity.

    Hendricks AE, Bochukova EG, Marenne G, Keogh JM, Atanassova N, Bounds R, Wheeler E, Mistry V, Henning E, Körner A, Muddyman D, McCarthy S, Hinney A, Hebebrand J, Scott RA, Langenberg C, Wareham NJ, Surendran P, Howson JM, Butterworth AS, Danesh J, Nordestgaard BG, Nielsen SF, Afzal S, Papadia S, Ashford S, Garg S, Millhauser GL, Palomino RI, Kwasniewska A, Tachmazidou I, O'Rahilly S, Zeggini E, Barroso I, Farooqi IS, Understanding Society Scientific Group, EPIC-CVD Consortium and UK10K Consortium

    Wellcome Trust Sanger Institute, Cambridge, UK.

    Obesity is a genetically heterogeneous disorder. Using targeted and whole-exome sequencing, we studied 32 human and 87 rodent obesity genes in 2,548 severely obese children and 1,117 controls. We identified 52 variants contributing to obesity in 2% of cases including multiple novel variants in GNAS, which were sometimes found with accelerated growth rather than short stature as described previously. Nominally significant associations were found for rare functional variants in BBS1, BBS9, GNAS, MKKS, CLOCK and ANGPTL6. The p.S284X variant in ANGPTL6 drives the association signal (rs201622589, MAF~0.1%, odds ratio = 10.13, p-value = 0.042) and results in complete loss of secretion in cells. Further analysis including additional case-control studies and population controls (N = 260,642) did not support association of this variant with obesity (odds ratio = 2.34, p-value = 2.59 × 10^-3), highlighting the challenges of testing rare variant associations and the need for very large sample sizes. Further validation in cohorts with severe obesity and engineering the variants in model organisms will be needed to explore whether human variants in ANGPTL6 and other genes that lead to obesity when deleted in mice do contribute to obesity. Such studies may yield druggable targets for weight loss therapies.

    Funded by: British Heart Foundation: RG/08/014/24067; European Research Council: 268834; Medical Research Council: G0900554, MC_UU_12012/1, MC_UU_12012/5, MC_UU_12015/1, MR/L003120/1, MR/L010305/1; NIDDK NIH HHS: R01 DK064265, R01 DK110403; NIGMS NIH HHS: R25 GM058903; Wellcome Trust

    Scientific reports 2017;7;1;4394

  • Genetic Screen for Postembryonic Development in the Zebrafish (Danio rerio): Dominant Mutations Affecting Adult Form.

    Henke K, Daane JM, Hawkins MB, Dooley CM, Busch-Nentwich EM, Stemple DL and Harris MP

    Department of Orthopedic Research, Boston Children's Hospital, Massachusetts 02115

    Large-scale forward genetic screens have been instrumental for identifying genes that regulate development, homeostasis, and regeneration, as well as the mechanisms of disease. The zebrafish, Danio rerio, is an established genetic and developmental model used in genetic screens to uncover genes necessary for early development. However, the regulation of postembryonic development has received less attention as these screens are more labor intensive and require extensive resources. The lack of systematic interrogation of late development leaves large aspects of the genetic regulation of adult form and physiology unresolved. To understand the genetic control of postembryonic development, we performed a dominant screen for phenotypes affecting the adult zebrafish. In our screen, we identified 72 adult viable mutants showing changes in the shape of the skeleton as well as defects in pigmentation. For efficient mapping of these mutants and mutation identification, we devised a new mapping strategy based on identification of mutant-specific haplotypes. Using this method in combination with a candidate gene approach, we were able to identify linked mutations for 22 out of 25 mutants analyzed. Broadly, our mutational analysis suggests that there are key genes and pathways associated with late development. Many of these pathways are shared with humans and are affected in various disease conditions, suggesting constraint in the genetic pathways that can lead to change in adult form. Taken together, these results show that dominant screens are a feasible and productive means to identify mutations that can further our understanding of gene function during postembryonic development and in disease.

    Funded by: NIDCR NIH HHS: U01 DE024434

    Genetics 2017;207;2;609-623

  • HLA haplotypes in primary sclerosing cholangitis patients of admixed and non-European ancestry.

    Henriksen EKK, Viken MK, Wittig M, Holm K, Folseraas T, Mucha S, Melum E, Hov JR, Lazaridis KN, Juran BD, Chazouillères O, Färkkilä M, Gotthardt DN, Invernizzi P, Carbone M, Hirschfield GM, Rushbrook SM, Goode E, UK-PSC Consortium, Ponsioen CY, Weersma RK, Eksteen B, Yimam KK, Gordon SC, Goldberg D, Yu L, Bowlus CL, Franke A, Lie BA and Karlsen TH

    Norwegian PSC Research Center, Department of Transplantation Medicine, Division of Surgery, Inflammatory Medicine and Transplantation, Oslo University Hospital Rikshospitalet, Oslo, Norway.

    Primary sclerosing cholangitis (PSC) is strongly associated with several human leukocyte antigen (HLA) haplotypes. Due to extensive linkage disequilibrium and multiple polymorphic candidate genes in the HLA complex, identifying the alleles responsible for these associations has proven difficult. We aimed to evaluate whether studying populations of admixed or non-European descent could help in defining the causative HLA alleles. When assessing haplotypes carrying HLA-DRB1*13:01 (hypothesized to specifically increase the susceptibility to chronic cholangitis), we observed that every haplotype in the Scandinavian PSC population carried HLA-DQB1*06:03. In contrast, only 65% of HLA-DRB1*13:01 haplotypes in an admixed/non-European PSC population carried this allele, suggesting that further assessments of the PSC-associated haplotype HLA-DRB1*13:01-DQA1*01:03-DQB1*06:03 in admixed or multi-ethnic populations could aid in identifying the causative allele.

    Funded by: NIDDK NIH HHS: R01 DK084960

    HLA 2017;90;4;228-233

  • Molecular epidemiology of Klebsiella pneumoniae invasive infections over a decade at Kilifi County Hospital in Kenya.

    Henson SP, Boinett CJ, Ellington MJ, Kagia N, Mwarumba S, Nyongesa S, Mturi N, Kariuki S, Scott JAG, Thomson NR and Morpeth SC

    KEMRI-Wellcome Trust Research Programme, Kilifi, Kenya; Centre for Tropical Medicine and Global Health, Nuffield Department of Clinical Medicine, Oxford University, Oxford, United Kingdom.

    Multidrug resistant (MDR) Klebsiella pneumoniae is a common cause of nosocomial infections worldwide. Recent years have seen an explosion of resistance conferred by extended-spectrum β-lactamases (ESBLs) and emergence of carbapenem resistance. Here, we examine 198 invasive K. pneumoniae isolates collected over a decade at Kilifi County Hospital (KCH) in Kenya. We observe a significant increase in MDR K. pneumoniae isolates, particularly in resistance to third generation cephalosporins conferred by ESBLs. Using whole-genome sequences, we describe the population structure and the distribution of antimicrobial resistance genes within it. More than half of the isolates examined in this study were ESBL-positive, encoding CTX-M-15, SHV-2, SHV-12 and SHV-27, and 79% were MDR, with resistance to at least three antimicrobial classes. Although no isolates in our dataset were found to be resistant to carbapenems, we did find a plasmid with the genetic architecture of a known New Delhi metallo-β-lactamase-1 (NDM)-carrying plasmid in 25 isolates. In the absence of carbapenem use in KCH and because of the instability of the NDM-1 gene in the plasmid, the NDM-1 gene has been lost in these isolates. Our data suggest that isolates that encode NDM-1 could be present in the population; should carbapenems be introduced as treatment in public hospitals in Kenya, resistance is likely to ensue rapidly.

    International journal of medical microbiology : IJMM 2017;307;7;422-429

  • PGBD5 promotes site-specific oncogenic mutations in human tumors.

    Henssen AG, Koche R, Zhuang J, Jiang E, Reed C, Eisenberg A, Still E, MacArthur IC, Rodríguez-Fos E, Gonzalez S, Puiggròs M, Blackford AN, Mason CE, de Stanchina E, Gönen M, Emde AK, Shah M, Arora K, Reeves C, Socci ND, Perlman E, Antonescu CR, Roberts CWM, Steen H, Mullen E, Jackson SP, Torrents D, Weng Z, Armstrong SA and Kentsis A

    Molecular Pharmacology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York, USA.

    Genomic rearrangements are a hallmark of human cancers. Here, we identify the piggyBac transposable element derived 5 (PGBD5) gene as encoding an active DNA transposase expressed in the majority of childhood solid tumors, including lethal rhabdoid tumors. Using assembly-based whole-genome DNA sequencing, we found previously undefined genomic rearrangements in human rhabdoid tumors. These rearrangements involved PGBD5-specific signal (PSS) sequences at their breakpoints and recurrently inactivated tumor-suppressor genes. PGBD5 was physically associated with genomic PSS sequences that were also sufficient to mediate PGBD5-induced DNA rearrangements in rhabdoid tumor cells. Ectopic expression of PGBD5 in primary immortalized human cells was sufficient to promote cell transformation in vivo. This activity required specific catalytic residues in the PGBD5 transposase domain as well as end-joining DNA repair and induced structural rearrangements with PSS breakpoints. These results define PGBD5 as an oncogenic mutator and provide a plausible mechanism for site-specific DNA rearrangements in childhood and adult solid tumors.

    Nature genetics 2017

  • Ecto-5'-nucleotidase (CD73) regulates peripheral chemoreceptor activity and cardiorespiratory responses to hypoxia.

    Holmes AP, Ray CJ, Pearson SA, Coney AM and Kumar P

    Institute of Cardiovascular Sciences, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK.

    Augmented sensory neuronal activity from the carotid body (CB) has emerged as a principal cause of hypertension in a number of cardiovascular related pathologies including obstructive sleep apnoea, heart failure and diabetes. Development of new targets and pharmacological treatment strategies aiming to reduce CB sensory activity may thus improve outcomes in these key patient cohorts. The current study tested whether ecto-5'-nucleotidase (CD73), an enzyme that generates adenosine, is functionally important in modifying CB sensory activity and cardiovascular respiratory responses to hypoxia. Inhibition of ecto-5'-nucleotidase by α,β-methylene ADP (AOPCP) in the whole CB preparation in vitro reduced basal discharge frequency by 76 ± 5% and reduced sensory activity throughout graded hypoxia. AOPCP also significantly attenuated elevations in sensory activity evoked by mitochondrial inhibition. These effects were mimicked by antagonism of adenosine receptors with 8-(p-sulfophenyl) theophylline. Infusion of AOPCP in vivo significantly decreased the hypoxic ventilatory response (ΔV˙E control 74 ± 6%, ΔV˙E AOPCP 64 ± 5%, P < 0.05). AOPCP also modified cardiovascular responses to hypoxia, as evidenced by reduced elevations in heart rate and exaggerated changes in femoral vascular conductance and mean arterial blood pressure. Thus we identify ecto-5'-nucleotidase as a novel regulator of CB sensory activity. Future investigations are warranted to evaluate whether inhibition of ecto-5'-nucleotidase can effectively reduce CB activity in CB-mediated cardiovascular pathology.

    The Journal of physiology 2017

  • The Human Cell Atlas: Technical approaches and challenges.

    Hon CC, Shin JW, Carninci P and Stubbington MJT

    The Human Cell Atlas is a large, international consortium that aims to identify and describe every cell type in the human body. The comprehensive cellular maps that arise from this ambitious effort have the potential to transform many aspects of fundamental biology and clinical practice. Here, we discuss the technical approaches that could be used today to generate such a resource and also the technical challenges that will be encountered.

    Briefings in functional genomics 2017

  • Fine-mapping inflammatory bowel disease loci to single-variant resolution.

    Huang H, Fang M, Jostins L, Umićević Mirkov M, Boucher G, Anderson CA, Andersen V, Cleynen I, Cortes A, Crins F, D'Amato M, Deffontaine V, Dmitrieva J, Docampo E, Elansary M, Farh KK, Franke A, Gori AS, Goyette P, Halfvarson J, Haritunians T, Knight J, Lawrance IC, Lees CW, Louis E, Mariman R, Meuwissen T, Mni M, Momozawa Y, Parkes M, Spain SL, Théâtre E, Trynka G, Satsangi J, van Sommeren S, Vermeire S, Xavier RJ, International Inflammatory Bowel Disease Genetics Consortium, Weersma RK, Duerr RH, Mathew CG, Rioux JD, McGovern DPB, Cho JH, Georges M, Daly MJ and Barrett JC

    Analytic and Translational Genetics Unit, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA.

    Inflammatory bowel diseases are chronic gastrointestinal inflammatory disorders that affect millions of people worldwide. Genome-wide association studies have identified 200 inflammatory bowel disease-associated loci, but few have been conclusively resolved to specific functional variants. Here we report fine-mapping of 94 inflammatory bowel disease loci using high-density genotyping in 67,852 individuals. We pinpoint 18 associations to a single causal variant with greater than 95% certainty, and an additional 27 associations to a single variant with greater than 50% certainty. These 45 variants are significantly enriched for protein-coding changes (n = 13), direct disruption of transcription-factor binding sites (n = 3), and tissue-specific epigenetic marks (n = 10), with the last category showing enrichment in specific immune cells among associations stronger in Crohn's disease and in gut mucosa among associations stronger in ulcerative colitis. The results of this study suggest that high-resolution fine-mapping in large samples can convert many discoveries from genome-wide association studies into statistically convincing causal variants, providing a powerful substrate for experimental elucidation of disease mechanisms.

    Funded by: Chief Scientist Office: ETM/137; Medical Research Council: G0600329, G0800759; NCI NIH HHS: R01 CA141743; NIAID NIH HHS: U01 AI067068; NIDCR NIH HHS: U54 DE023789; NIDDK NIH HHS: P01 DK046763, P30 DK043351, R01 DK064869, R01 DK092235, R01 DK106593, U01 DK062413, U01 DK062420, U01 DK062422, U01 DK062429, U01 DK062432, U24 DK062429; Wellcome Trust: 098051, 098759

    Nature 2017;547;7662;173-178

  • Novel viral vectors in infectious diseases.

    Humphreys IR and Sebastian S

    Institute of Infection and Immunity/Systems Immunity, University Research Institute, Cardiff University, Cardiff, UK.

    Since the development of Vaccinia virus as a vaccine vector in 1984, the utility of numerous viruses in vaccination strategies has been explored. In recent years, key improvements to existing vectors such as those based on adenovirus have led to significant improvements in immunogenicity and efficacy. Furthermore, exciting new vectors that exploit viruses such as cytomegalovirus (CMV) and vesicular stomatitis virus (VSV) have emerged. Herein, we summarize these recent developments in viral vector technologies, focusing on novel vectors based on CMV, VSV, measles and modified adenovirus. We discuss the potential utility of these exciting approaches in eliciting protection against infectious diseases.

    Immunology 2017

  • Pooling strategy and chromosome painting characterize a living zebroid for the first time.

    Iannuzzi A, Pereira J, Iannuzzi C, Fu B and Ferguson-Smith M

    Laboratory of Animal Cytogenetics and Genomics, National Research Council of Italy, Institute of Animal Production Systems in Mediterranean Environments (ISPAAM), Naples, Italy.

    We have investigated the complex karyotype of a living zebra-donkey hybrid for the first time using chromosome-specific painting probes produced from flow-sorted chromosomes from a zebra (Equus burchelli) and horse (Equus caballus). As the chromosomes proved difficult to distinguish from one another, a successful new strategy was devised to resolve the difficulty and characterize each chromosome. This was based on selecting five panels of whole chromosome painting probes that could differentiate zebra and donkey chromosomes by labelling the probes with either FITC or Cy3 fluorochromes. Each panel was hybridized sequentially to the same G-Q-banded metaphases and the results combined so that every zebra and donkey chromosome in each suitable metaphase could be identified. A diploid number of 2n = 53, XY was found, containing haploid sets of 22 chromosomes from the zebra and 31 chromosomes from the donkey, without evidence of chromosome rearrangement. This new strategy, developed for the first time, may have several applications in the resolution of other complex hybrid karyotypes and chromosomal aberrations.

    PloS one 2017;12;7;e0180158

  • Gene Expression in Leishmania Is Regulated Predominantly by Gene Dosage.

    Iantorno SA, Durrant C, Khan A, Sanders MJ, Beverley SM, Warren WC, Berriman M, Sacks DL, Cotton JA and Grigg ME

    Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland, USA.

    Leishmania tropica, a unicellular eukaryotic parasite present in North and East Africa, the Middle East, and the Indian subcontinent, has been linked to large outbreaks of cutaneous leishmaniasis in displaced populations in Iraq, Jordan, and Syria. Here, we report the genome sequence of this pathogen and 7,863 identified protein-coding genes, and we show that the majority of clinical isolates possess high levels of allelic diversity, genetic admixture, heterozygosity, and extensive aneuploidy. By utilizing paired genome-wide high-throughput DNA sequencing (DNA-seq) with RNA-seq, we found that gene dosage, at the level of individual genes or chromosomal "somy" (a general term covering disomy, trisomy, tetrasomy, etc.), accounted for greater than 85% of total gene expression variation in genes with a 2-fold or greater change in expression. High gene copy number variation (CNV) among membrane-bound transporters, a class of proteins previously implicated in drug resistance, was found for the most highly differentially expressed genes. Our results suggest that gene dosage is an adaptive trait that confers phenotypic plasticity among natural Leishmania populations by rapid down- or upregulation of transporter proteins to limit the effects of environmental stresses, such as drug selection.

    Importance: Leishmania is a genus of unicellular eukaryotic parasites that is responsible for a spectrum of human diseases that range from cutaneous leishmaniasis (CL) and mucocutaneous leishmaniasis (MCL) to life-threatening visceral leishmaniasis (VL). Developmental and strain-specific gene expression is largely thought to be due to mRNA message stability or posttranscriptional regulatory networks for this species, whose genome is organized into polycistronic gene clusters in the absence of promoter-mediated regulation of transcription initiation of nuclear genes. Genetic hybridization has been demonstrated to yield dramatic structural genomic variation, but whether such changes in gene dosage impact gene expression has not been formally investigated. Here we show that the predominant mechanism determining transcript abundance differences (>85%) in Leishmania tropica is that of gene dosage at the level of individual genes or chromosomal somy.

    Funded by: NIAID NIH HHS: R01 AI029646; Wellcome Trust: 206194

    mBio 2017;8;5

  • Variation in olfactory neuron repertoires is genetically controlled and environmentally modulated.

    Ibarra-Soria X, Nakahara TS, Lilue J, Jiang Y, Trimmer C, Souza MA, Netto PH, Ikegami K, Murphy NR, Kusma M, Kirton A, Saraiva LR, Keane TM, Matsunami H, Mainland J, Papes F and Logan DW

    Wellcome Trust Sanger Institute, Cambridge, United Kingdom.

    The mouse olfactory sensory neuron (OSN) repertoire is composed of 10 million cells and each expresses one olfactory receptor (OR) gene from a pool of over 1000. Thus, the nose is sub-stratified into more than a thousand OSN subtypes. Here, we employ and validate an RNA-sequencing-based method to quantify the abundance of all OSN subtypes in parallel, and investigate the genetic and environmental factors that contribute to neuronal diversity. We find that the OSN subtype distribution is stereotyped in genetically identical mice, but varies extensively between different strains. Further, we identify cis-acting genetic variation as the greatest component influencing OSN composition and demonstrate independence from OR function. However, we show that olfactory stimulation with particular odorants results in modulation of dozens of OSN subtypes in a subtle but reproducible, specific and time-dependent manner. Together, these mechanisms generate a highly individualized olfactory sensory system by promoting neuronal diversity.

    Funded by: Medical Research Council: MR/L007428/1; NIDCD NIH HHS: F32 DC014202, P30 DC011735, R01 DC013339, R01 DC014423

    eLife 2017;6

  • An untypeable enterotoxigenic Escherichia coli represents one of the dominant types causing human disease.

    Iguchi A, von Mentzer A, Kikuchi T and Thomson NR

    University of Miyazaki, Miyazaki, Japan.

    Enterotoxigenic Escherichia coli (ETEC) is a major cause of diarrhoea in children below 5 years of age in endemic areas, and is a primary cause of diarrhoea in travellers visiting developing countries. Epidemiological analysis of E. coli pathovars is traditionally carried out based on the results of serotyping. However, genomic analysis of a global ETEC collection of 362 isolates taken from patients revealed nine novel O-antigen biosynthesis gene clusters that were previously unrecognized, and have collectively been called unclassified. When put in the context of all isolates sequenced, one of the novel O-genotypes, OgN5, was found to be the second most common ETEC O-genotype causing disease, after O6, in a globally representative ETEC collection. It's also clear that ETEC OgN5 isolates have spread globally. These novel O-genotypes have now been included in our comprehensive O-genotyping scheme, and can be detected using a PCR-based and an in silico typing method. This will assist in epidemiological studies, as well as in ETEC vaccine development.

    Microbial genomics 2017;3;9;e000121

  • The role of sex and body weight on the metabolic effects of high-fat diet in C57BL/6N mice.

    Ingvorsen C, Karp NA and Lelliott CJ

    Mouse Pipelines, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK.

    Background: Metabolic disorders are commonly investigated using knockout and transgenic mouse models on the C57BL/6N genetic background due to its genetic susceptibility to the deleterious metabolic effects of high-fat diet (HFD). There is growing awareness of the need to consider sex in disease progression, but limited attention has been paid to sexual dimorphism in mouse models and its impact in metabolic phenotypes. We assessed the effect of HFD and the impact of sex on metabolic variables in this strain.

    Methods: We generated a reference data set encompassing glucose tolerance, body composition and plasma chemistry data from 586 C57BL/6N mice fed a standard chow and 733 fed a HFD collected as part of a high-throughput phenotyping pipeline. Linear mixed model regression analysis was used in a dual analysis to assess the effect of HFD as an absolute change in phenotype, but also as a relative change accounting for the potential confounding effect of body weight.

    Results: HFD had a significant impact on all variables tested with an average absolute effect size of 29%. For the majority of variables (78%), the treatment effect was modified by sex, and this was dominated by effects that were male-specific or stronger in males. On average, there was a 13.2% difference in the effect size between the male and female mice for sexually dimorphic variables. HFD led to a significant body weight phenotype (24% increase), which acts as a confounding effect on the other analysed variables. For 79% of the variables, body weight was found to be a significant source of variation, but even after accounting for this confounding effect, the HFD-induced phenotypic changes were similar to those found when not accounting for weight.

    Conclusion: HFD and sex are powerful modifiers of metabolic parameters in C57BL/6N mice. We also demonstrate the value of considering body size as a covariate to obtain a richer understanding of metabolic phenotypes.

    Nutrition & diabetes 2017;7;4;e261

  • Sub-minute Phosphoregulation of Cell Cycle Systems during Plasmodium Gamete Formation.

    Invergo BM, Brochet M, Yu L, Choudhary J, Beltrao P and Billker O

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire CB10 1SD, UK; Malaria Programme, Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK.

    The transmission of malaria parasites to mosquitoes relies on the rapid induction of sexual reproduction upon their ingestion into a blood meal. Haploid female and male gametocytes become activated and emerge from their host cells, and the males enter the cell cycle to produce eight microgametes. The synchronized nature of gametogenesis allowed us to investigate phosphorylation signaling during its first minute in Plasmodium berghei via a high-resolution time course of the phosphoproteome. This revealed an unexpectedly broad response, with proteins related to distinct cell cycle events undergoing simultaneous phosphoregulation. We implicate several protein kinases in the process, and we validate our analyses on the plant-like calcium-dependent protein kinase 4 (CDPK4) and a homolog of serine/arginine-rich protein kinases (SRPK1). Mutants in these kinases displayed distinct phosphoproteomic disruptions, consistent with differences in their phenotypes. The results reveal the central role of protein phosphorylation in the atypical cell cycle regulation of a divergent eukaryote.

    Cell reports 2017;21;7;2017-2029

  • Insecticide-induced leg loss does not eliminate biting and reproduction in Anopheles gambiae mosquitoes.

    Isaacs AT, Lynd A and Donnelly MJ

    Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool, UK.

    Recent successes in malaria control have been largely attributable to the deployment of insecticide-based vector control tools such as bed nets and indoor residual spraying. Pyrethroid-treated bed nets are acutely neurotoxic to mosquitoes, inducing symptoms such as loss of coordination, paralysis, and violent spasms. One result of pyrethroid exposure often seen in laboratory tests is mosquito leg loss, a condition that has thus far been assumed to equate to mortality, as females are not expected to blood feed. However, whilst limb loss is unlikely to be adaptive, females with missing limbs may play a role in the propagation of both their species and pathogens. To test the hypothesis that leg loss inhibits mosquitoes from biting and reproducing, mosquitoes with one, two, or six legs were evaluated for their success in feeding upon a human. These experiments demonstrated that insecticide-induced leg loss had no significant effect upon blood feeding or egg laying success. We conclude that studies of pyrethroid efficacy should not discount mosquitoes that survive insecticide exposure with fewer than six legs, as they may still be capable of biting humans, reproducing, and contributing to malaria transmission.

    Scientific reports 2017;7;46674

  • DNA methylation homeostasis in human and mouse development.

    Iurlaro M, von Meyenn F and Reik W

    Epigenetics Programme, Babraham Institute, Cambridge CB22 3AT, UK.

    The molecular pathways that regulate gain and loss of DNA methylation during mammalian development need to be tightly balanced to maintain a physiological equilibrium. Here we explore the relative contributions of the different pathways and enzymatic activities involved in methylation homeostasis in the context of genome-wide and locus-specific epigenetic reprogramming in mammals. An adaptable epigenetic machinery allows global epigenetic reprogramming to concur with local maintenance of critical epigenetic memory in the genome, and appears to regulate the tempo of global reprogramming in different cell lineages and species.

    Current opinion in genetics & development 2017;43;101-109

  • Disentangling Immediate Adaptive Introgression from Selection on Standing Introgressed Variation in Humans.

    Jagoda E, Lawson DJ, Wall JD, Lambert D, Muller C, Westaway M, Leavesley M, Capellini TD, Mirazón Lahr M, Gerbault P, Thomas MG, Migliano AB, Willerslev E, Metspalu M and Pagani L

    Human Evolutionary Biology, Harvard University 11 Divinity Avenue Cambridge MA 02138 USA.

    Recent studies have reported evidence suggesting that portions of contemporary human genomes introgressed from archaic hominin populations went to high frequencies due to positive selection. However, no study to date has specifically addressed the post-introgression population dynamics of these putative cases of adaptive introgression. Here, for the first time, we specifically define cases of immediate adaptive introgression (iAI), in which archaic haplotypes rose to high frequencies in humans as a result of a selective sweep that occurred shortly after the introgression event. We define these cases as distinct from instances of selection on standing introgressed variation (SI), in which an introgressed haplotype initially segregated neutrally and subsequently underwent positive selection. Using a geographically diverse dataset, we report novel cases of selection on introgressed variation in living humans and shortlisted among these cases those whose selective sweeps are more consistent with having been the product of iAI rather than SI. Many of these novel inferred iAI haplotypes have potential biological relevance, including three introgressed haplotypes that contain immune-related genes in West Siberians, South Asians, and West Eurasians. Overall, our results suggest that iAI may not represent the full picture of positive selection on archaically introgressed haplotypes in humans and that more work needs to be done to analyze the role of SI in the archaic introgression landscape of living humans.

    Molecular biology and evolution 2017

  • Clonal Hematopoiesis and Risk of Atherosclerotic Cardiovascular Disease.

    Jaiswal S, Natarajan P, Silver AJ, Gibson CJ, Bick AG, Shvartz E, McConkey M, Gupta N, Gabriel S, Ardissino D, Baber U, Mehran R, Fuster V, Danesh J, Frossard P, Saleheen D, Melander O, Sukhova GK, Neuberg D, Libby P, Kathiresan S and Ebert BL

    From the Department of Medicine, Division of Hematology, Brigham and Women's Hospital (S.J., A.J.S., M.M.) and Harvard Medical School (B.L.E.), the Department of Medicine, Division of Cardiovascular Medicine, Brigham and Women's Hospital (E.S.) and Harvard Medical School (G.K.S., P.L.), the Department of Pathology (S.J.) and the Center for Genomic Medicine (P.N., S.K.), Massachusetts General Hospital, the Department of Medicine, Division of Cardiology, and Cardiovascular Research Center (P.N., S.K.), and the Department of Medicine (A.G.B.), Massachusetts General Hospital and Harvard Medical School, and the Departments of Medical Oncology (C.J.G.) and Biostatistics and Computational Biology (D.N.), Dana-Farber Cancer Institute, Boston, and the Program in Medical and Population Genetics, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge (P.N., A.G.B., N.G., S.G., S.K.) - all in Massachusetts; the Department of Cardiology, University Hospital, Parma, Italy (D.A.); the Department of Medicine, Division of Cardiology, Mt. Sinai School of Medicine, New York (U.B., R.M., V.F.); Centro Nacional de Investigaciones Cardiovasculares Carlos III, Madrid (V.F.); Medical Research Council-British Heart Foundation Cardiovascular Epidemiology Unit and National Institute for Health Research Blood and Transplant Research Unit in Donor Health and Genomics, Department of Public Health and Primary Care, and the British Heart Foundation, Cambridge Centre of Excellence, Department of Medicine, University of Cambridge, Cambridge (J.D.), and the Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton (J.D.) - both in the United Kingdom; the Center for Non-Communicable Diseases, Karachi, Pakistan (P.F., D.S.); the Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia (D.S.); and the Department of Clinical Sciences Malmö, Lund University, Lund, Sweden (O.M.).

    Background: Clonal hematopoiesis of indeterminate potential (CHIP), which is defined as the presence of an expanded somatic blood-cell clone in persons without other hematologic abnormalities, is common among older persons and is associated with an increased risk of hematologic cancer. We previously found preliminary evidence for an association between CHIP and atherosclerotic cardiovascular disease, but the nature of this association was unclear.

    Methods: We used whole-exome sequencing to detect the presence of CHIP in peripheral-blood cells and associated such presence with coronary heart disease using samples from four case-control studies that together enrolled 4726 participants with coronary heart disease and 3529 controls. To assess causality, we perturbed the function of Tet2, the second most commonly mutated gene linked to clonal hematopoiesis, in the hematopoietic cells of atherosclerosis-prone mice.

    Results: In nested case-control analyses from two prospective cohorts, carriers of CHIP had a risk of coronary heart disease that was 1.9 times as great as in noncarriers (95% confidence interval [CI], 1.4 to 2.7). In two retrospective case-control cohorts for the evaluation of early-onset myocardial infarction, participants with CHIP had a risk of myocardial infarction that was 4.0 times as great as in noncarriers (95% CI, 2.4 to 6.7). Mutations in DNMT3A, TET2, ASXL1, and JAK2 were each individually associated with coronary heart disease. CHIP carriers with these mutations also had increased coronary-artery calcification, a marker of coronary atherosclerosis burden. Hypercholesterolemia-prone mice that were engrafted with bone marrow obtained from homozygous or heterozygous Tet2 knockout mice had larger atherosclerotic lesions in the aortic root and aorta than did mice that had received control bone marrow. Analyses of macrophages from Tet2 knockout mice showed elevated expression of several chemokine and cytokine genes that contribute to atherosclerosis.

    Conclusions: The presence of CHIP in peripheral-blood cells was associated with nearly a doubling in the risk of coronary heart disease in humans and with accelerated atherosclerosis in mice. (Funded by the National Institutes of Health and others.).

    Funded by: FIC NIH HHS: RC1 TW008485; NHGRI NIH HHS: U54 HG003067; NHLBI NIH HHS: R01 HL080472, R01 HL082945, RC2 HL101834, T32 HL116324

    The New England journal of medicine 2017;377;2;111-121

  • Evolution of mobile genetic element composition in an epidemic methicillin-resistant Staphylococcus aureus: temporal changes correlated with frequent loss and gain events.

    Jamrozy D, Coll F, Mather AE, Harris SR, Harrison EM, MacGowan A, Karas A, Elston T, Estée Török M, Parkhill J and Peacock SJ

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Background: Horizontal transfer of mobile genetic elements (MGEs) that carry virulence and antimicrobial resistance genes mediates the evolution of methicillin-resistant Staphylococcus aureus, and the emergence of new MRSA clones. Most MRSA lineages show an association with specific MGEs and the evolution of MGE composition following clonal expansion has not been widely studied.

    Results: We investigated the genomes of 1193 S. aureus bloodstream isolates, 1169 of which were MRSA, collected in the UK and the Republic of Ireland between 2001 and 2010. The majority of isolates belonged to clonal complex (CC)22 (n = 923), which contained diverse MGEs including elements that were found in other MRSA lineages. Several MGEs showed variable distribution across the CC22 phylogeny, including two antimicrobial resistance plasmids (pWBG751-like and SAP078A-like, carrying erythromycin and heavy metal resistance genes, respectively), a pathogenicity island carrying the enterotoxin C gene and two phage types Sa1int and Sa6int. Multiple gains and losses of these five MGEs were identified in the CC22 phylogeny using ancestral state reconstruction. Analysis of the temporal distribution of the five MGEs between 2001 and 2010 revealed an unexpected reduction in prevalence of the two plasmids and the pathogenicity island, and an increase in the two phage types. This occurred across the lineage and was not correlated with changes in the relative prevalence of CC22, or of any sub-lineages within it.

    Conclusions: Ancestral state reconstruction coupled with temporal trend analysis demonstrated that epidemic MRSA CC22 has an evolving MGE composition, and indicates that this important MRSA lineage has continued to adapt to changing selective pressure since its emergence.

    Funded by: Medical Research Council: G1000803; Wellcome Trust

    BMC genomics 2017;18;1;684

  • The secondary resistome of multidrug-resistant Klebsiella pneumoniae.

    Jana B, Cain AK, Doerrler WT, Boinett CJ, Fookes MC, Parkhill J and Guardabassi L

    Department of Veterinary Disease Biology, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark.

    Klebsiella pneumoniae causes severe lung and bloodstream infections that are difficult to treat due to multidrug resistance. We hypothesized that antimicrobial resistance can be reversed by targeting chromosomal non-essential genes that are not responsible for acquired resistance but essential for resistant bacteria under therapeutic concentrations of antimicrobials. Conditional essentiality of individual genes to antimicrobial resistance was evaluated in an epidemic multidrug-resistant clone of K. pneumoniae (ST258). We constructed a high-density transposon mutant library of >430,000 unique Tn5 insertions and measured mutant depletion upon exposure to three clinically relevant antimicrobials (colistin, imipenem or ciprofloxacin) by Transposon Directed Insertion-site Sequencing (TraDIS). Using this high-throughput approach, we defined three sets of chromosomal non-essential genes essential for growth during exposure to colistin (n = 35), imipenem (n = 1) or ciprofloxacin (n = 1) in addition to known resistance determinants, collectively termed the "secondary resistome". As proof of principle, we demonstrated that inactivation of a non-essential gene not previously found linked to colistin resistance (dedA) restored colistin susceptibility by reducing the minimum inhibitory concentration from 8 to 0.5 μg/ml, 4-fold below the susceptibility breakpoint (S ≤ 2 μg/ml). This finding suggests that the secondary resistome is a potential target for developing antimicrobial "helper" drugs that restore the efficacy of existing antimicrobials.

    Scientific reports 2017;7;42483

  • Genetic variation and gene expression across multiple tissues and developmental stages in a nonhuman primate.

    Jasinska AJ, Zelaya I, Service SK, Peterson CB, Cantor RM, Choi OW, DeYoung J, Eskin E, Fairbanks LA, Fears S, Furterer AE, Huang YS, Ramensky V, Schmitt CA, Svardal H, Jorgensen MJ, Kaplan JR, Villar D, Aken BL, Flicek P, Nag R, Wong ES, Blangero J, Dyer TD, Bogomolov M, Benjamini Y, Weinstock GM, Dewar K, Sabatti C, Wilson RK, Jentsch JD, Warren W, Coppola G, Woods RP and Freimer NB

    Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA.

    By analyzing multitissue gene expression and genome-wide genetic variation data in samples from a vervet monkey pedigree, we generated a transcriptome resource and produced the first catalog of expression quantitative trait loci (eQTLs) in a nonhuman primate model. This catalog contains more genome-wide significant eQTLs per sample than comparable human resources and identifies sex- and age-related expression patterns. Findings include a master regulatory locus that likely has a role in immune function and a locus regulating hippocampal long noncoding RNAs (lncRNAs), whose expression correlates with hippocampal volume. This resource will facilitate genetic investigation of quantitative traits, including brain and behavioral phenotypes relevant to neuropsychiatric disorders.

    Nature genetics 2017

  • Laser Capture and Deep Sequencing Reveals the Transcriptomic Programmes Regulating the Onset of Pancreas and Liver Differentiation in Human Embryos.

    Jennings RE, Berry AA, Gerrard DT, Wearne SJ, Strutt J, Withey S, Chhatriwala M, Piper Hanley K, Vallier L, Bobola N and Hanley NA

    Division of Diabetes, Endocrinology & Gastroenterology, Faculty of Biology, Medicine & Health, AV Hill Building, University of Manchester, Oxford Road, Manchester M13 9PT, UK; Endocrinology Department, Manchester University NHS Foundation Trust, Grafton Street, Manchester M13 9WU, UK.

    To interrogate the alternative fates of pancreas and liver in the earliest stages of human organogenesis, we developed laser capture, RNA amplification, and computational analysis of deep sequencing. Pancreas-enriched gene expression was less conserved between human and mouse than for liver. The dorsal pancreatic bud was enriched for components of Notch, Wnt, BMP, and FGF signaling, almost all genes known to cause pancreatic agenesis or hypoplasia, and over 30 unexplored transcription factors. SOX9 and RORA were imputed as key regulators in pancreas compared with EP300, HNF4A, and FOXA family members in liver. Analyses implied that current in vitro human stem cell differentiation follows a dorsal rather than a ventral pancreatic program and pointed to additional factors for hepatic differentiation. In summary, we provide the transcriptional codes regulating the start of human liver and pancreas development to facilitate stem cell research and clinical interpretation without inter-species extrapolation.

    Funded by: Arthritis Research UK; British Heart Foundation; Medical Research Council: G1100420, MC_PC_12009, MR/J003352/1, MR/L009986/1; Wellcome Trust: 088566, 105610/Z/14/Z

    Stem cell reports 2017;9;5;1387-1394

  • Crosstalk between PKA and PKG controls pH-dependent host cell egress of Toxoplasma gondii.

    Jia Y, Marq JB, Bisio H, Jacot D, Mueller C, Yu L, Choudhary J, Brochet M and Soldati-Favre D

    Department of Microbiology and Molecular Medicine, CMU, University of Geneva, Geneva 4, Switzerland.

    Toxoplasma gondii encodes three protein kinase A catalytic (PKAc1-3) and one regulatory (PKAr) subunits to integrate cAMP-dependent signals. Here, we show that inactive PKAc1 is maintained at the parasite pellicle by interacting with acylated PKAr. Either a conditional knockdown of PKAr or the overexpression of PKAc1 blocks parasite division. Conversely, down-regulation of PKAc1 or stabilisation of a dominant-negative PKAr isoform that does not bind cAMP triggers premature parasite egress from infected cells followed by serial invasion attempts leading to host cell lysis. This untimely egress depends on host cell acidification. A phosphoproteome analysis suggested the interplay between cAMP and cGMP signalling as PKAc1 inactivation changes the phosphorylation profile of a putative cGMP-phosphodiesterase. Concordantly, inhibition of the cGMP-dependent protein kinase G (PKG) blocks egress induced by PKAc1 inactivation or environmental acidification, while a cGMP-phosphodiesterase inhibitor circumvents egress repression by PKAc1 or pH neutralisation. This indicates that pH and PKAc1 act as balancing regulators of cGMP metabolism to control egress. These results reveal a crosstalk between PKA and PKG pathways to govern egress in T. gondii.

    The EMBO journal 2017

  • Human Y-chromosome variation in the genome-sequencing era.

    Jobling MA and Tyler-Smith C

    Department of Genetics & Genome Biology, University of Leicester, University Road, Leicester LE1 7RH, UK.

    The properties of the human Y chromosome - namely, male specificity, haploidy and escape from crossing over - make it an unusual component of the genome, and have led to its genetic variation becoming a key part of studies of human evolution, population history, genealogy, forensics and male medical genetics. Next-generation sequencing (NGS) technologies have driven recent progress in these areas. In particular, NGS has yielded direct estimates of mutation rates, and an unbiased and calibrated molecular phylogeny that has unprecedented detail. Moreover, the availability of direct-to-consumer NGS services is fuelling a rise of 'citizen scientists', whose interest in resequencing their own Y chromosomes is generating a wealth of new data.

    Nature reviews. Genetics 2017

  • The Type III Secretion System Effector SptP of Salmonella enterica Serovar Typhi.

    Johnson R, Byrne A, Berger CN, Klemm E, Crepin VF, Dougan G and Frankel G

    MRC Centre for Molecular Bacteriology and Infection, Department of Life Sciences, Imperial College London, London, United Kingdom.

    Strains of the various Salmonella enterica serovars cause gastroenteritis or typhoid fever in humans, with virulence depending on the action of two type III secretion systems (Salmonella pathogenicity island 1 [SPI-1] and SPI-2). SptP is a Salmonella SPI-1 effector, involved in mediating recovery of the host cytoskeleton postinfection. SptP requires a chaperone, SicP, for stability and secretion. SptP has 94% identity between S. enterica serovar Typhimurium and S. Typhi; direct comparison of the protein sequences revealed that S. Typhi SptP has numerous amino acid changes within its chaperone-binding domain. Subsequent comparison of ΔsptP S. Typhi and S. Typhimurium strains demonstrated that, unlike SptP in S. Typhimurium, SptP in S. Typhi was not involved in invasion or cytoskeletal recovery postinfection. Investigation of whether the observed amino acid changes within SptP of S. Typhi affected its function revealed that S. Typhi SptP was unable to complement S. Typhimurium ΔsptP due to an absence of secretion. We further demonstrated that while S. Typhimurium SptP is stable intracellularly within S. Typhi, S. Typhi SptP is unstable, although stability could be recovered following replacement of the chaperone-binding domain with that of S. Typhimurium. Direct assessment of the strength of the interaction between SptP and SicP of both serovars via bacterial two-hybrid analysis demonstrated that S. Typhi SptP has a significantly weaker interaction with SicP than the equivalent proteins in S. Typhimurium. Taken together, our results suggest that changes within the chaperone-binding domain of SptP in S. Typhi hinder binding to its chaperone, resulting in instability, preventing translocation, and therefore restricting the intracellular activity of this effector.

    Importance: Studies investigating Salmonella pathogenesis typically rely on Salmonella Typhimurium, even though Salmonella Typhi causes the more severe disease in humans. As such, an understanding of S. Typhi pathogenesis is lacking. Differences within the type III secretion system effector SptP between typhoidal and nontyphoidal serovars led us to characterize this effector within S. Typhi. Our results suggest that SptP is not translocated from typhoidal serovars, even though the loss of sptP results in virulence defects in S. Typhimurium. Although SptP is just one effector, our results exemplify that the behavior of these serovars is significantly different and genes identified to be important for S. Typhimurium virulence may not translate to S. Typhi.

    Funded by: Medical Research Council: MR/J006874/1, MR/K019007/1

    Journal of bacteriology 2017;199;4

  • Comparison of Salmonella enterica serovars Typhi and Typhimurium reveals typhoidal-specific responses to bile.

    Johnson R, Ravenhall M, Pickard D, Dougan G, Byrne A and Frankel G

    MRC Centre for Molecular Bacteriology and Infection, Department of Life Sciences, Imperial College London, London, United Kingdom.

    Salmonella enterica serovars Typhi and Typhimurium cause typhoid fever and gastroenteritis respectively. A unique feature of typhoid infection is asymptomatic carriage within the gallbladder, which is linked with S. Typhi transmission. Despite this, S. Typhi responses to bile have been poorly studied. RNA-Seq of S. Typhi Ty2 and a clinical S. Typhi isolate belonging to the globally dominant H58 lineage (129-0238), as well as S. Typhimurium 14028, revealed that 249, 389 and 453 genes respectively were differentially expressed in the presence of 3% bile compared to control cultures lacking bile. fad genes, the actP-acs operon, and putative sialic acid uptake and metabolism genes (t1787-t1790) were upregulated in all strains following bile exposure, which may represent adaptation to the small intestine environment. Genes within the Salmonella pathogenicity island 1 (SPI-1), encoding a type III secretion system (T3SS), and motility genes were significantly upregulated in both S. Typhi strains in bile, but downregulated in S. Typhimurium. Western blots of the SPI-1 proteins SipC, SipD, SopB and SopE validated the gene expression data. Consistent with this, bile significantly increased S. Typhi HeLa cell invasion whilst S. Typhimurium invasion was significantly repressed. Protein stability assays demonstrated that in S. Typhi the half-life of HilD, the dominant regulator of SPI-1, is three times longer in the presence of bile; this increase in stability was independent of the acetyltransferase Pat. Overall, we found that S. Typhi exhibits a specific response to bile, especially with regards to virulence gene expression, which could impact pathogenesis and transmission.

    Infection and immunity 2017

  • Somatic mutations reveal asymmetric cellular dynamics in the early human embryo.

    Ju YS, Martincorena I, Gerstung M, Petljak M, Alexandrov LB, Rahbari R, Wedge DC, Davies HR, Ramakrishna M, Fullam A, Martin S, Alder C, Patel N, Gamble S, O'Meara S, Giri DD, Sauer T, Pinder SE, Purdie CA, Borg Å, Stunnenberg H, van de Vijver M, Tan BK, Caldas C, Tutt A, Ueno NT, van 't Veer LJ, Martens JW, Sotiriou C, Knappskog S, Span PN, Lakhani SR, Eyfjörd JE, Børresen-Dale AL, Richardson A, Thompson AM, Viari A, Hurles ME, Nik-Zainal S, Campbell PJ and Stratton MR

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    Somatic cells acquire mutations throughout the course of an individual's life. Mutations occurring early in embryogenesis are often present in a substantial proportion of, but not all, cells in postnatal humans and thus have particular characteristics and effects. Depending on their location in the genome and the proportion of cells they are present in, these mosaic mutations can cause a wide range of genetic disease syndromes and predispose carriers to cancer. They have a high chance of being transmitted to offspring as de novo germline mutations and, in principle, can provide insights into early human embryonic cell lineages and their contributions to adult tissues. Although it is known that gross chromosomal abnormalities are remarkably common in early human embryos, our understanding of early embryonic somatic mutations is very limited. Here we use whole-genome sequences of normal blood from 241 adults to identify 163 early embryonic mutations. We estimate that approximately three base substitution mutations occur per cell per cell-doubling event in early human embryogenesis and these are mainly attributable to two known mutational signatures. We used the mutations to reconstruct developmental lineages of adult cells and demonstrate that the two daughter cells of many early embryonic cell-doubling events contribute asymmetrically to adult blood at an approximately 2:1 ratio. This study therefore provides insights into the mutation rates, mutational processes and developmental outcomes of cell dynamics that operate during early human embryogenesis.

    Funded by: Wellcome Trust: 077012/Z/05/Z

    Nature 2017;543;7647;714-718

  • Advances in the generation of bioengineered bile ducts.

    Justin AW, Saeb-Parsy K, Markaki AE, Vallier L and Sampaziotis F

    Department of Engineering, University of Cambridge, Cambridge, UK.

    The generation of bioengineered biliary tissue could contribute to the management of some of the most impactful cholangiopathies associated with liver transplantation, such as biliary atresia or ischemic cholangiopathy. Recent advances in tissue engineering and in vitro cholangiocyte culture have made the achievement of this goal possible. Here we provide an overview of these developments and review the progress towards the generation and transplantation of bioengineered bile ducts.

    Biochimica et biophysica acta 2017

  • Genetic Characterization of Vibrio cholerae O1 isolates from outbreaks between 2011 and 2015 in Tanzania.

    Kachwamba Y, Mohammed AA, Lukupulo H, Urio L, Majigo M, Mosha F, Matonya M, Kishimba R, Mghamba J, Lusekelo J, Nyanga S, Almeida M, Li S, Domman D, Massele SY and Stine OC

    Muhimbili University of Health and Allied Sciences, Dar es Salaam, United Republic of Tanzania.

    Background: Cholera outbreaks have occurred in Tanzania since 1974. To date, the genetic epidemiology of these outbreaks has not been assessed.

    Methods: 96 Vibrio cholerae O1 isolates from five regions were characterized, and their genetic relatedness assessed using multi-locus variable-number tandem-repeat analysis (MLVA) and whole genome sequencing (WGS).

    Results: Of the 48 MLVA genotypes observed, 3 were genetically unrelated to any others, while the remaining 45 genotypes separated into three MLVA clonal complexes (CCs) - each comprised of genotypes differing by a single allelic change. In Kigoma, two separate outbreaks, 4 months apart (January and May, 2015), were each caused by genetically distinct strains by MLVA and WGS. Remarkably, one MLVA CC contained isolates from both the May outbreak and ones from the 2011/2012 outbreak in Dar-es-Salaam. However, WGS revealed the isolates from the two outbreaks to be distinct clades. The outbreak that started in August 2015 in Dar-es-Salaam and spread to Morogoro, Singida and Mara was comprised of a single MLVA CC and WGS clade. Isolates from within an outbreak were closely related, differing at fewer than 5 nucleotides. All isolates were part of the 3rd wave of the 7th pandemic and were found in four clades related to isolates from Kenya and Asia.

    Conclusions: We conclude that genetically related V. cholerae cluster in outbreaks, and distinct strains circulate simultaneously.

    BMC infectious diseases 2017;17;1;157

  • Impact of insecticide resistance in Anopheles arabiensis on malaria incidence and prevalence in Sudan and the costs of mitigation.

    Kafy HT, Ismail BA, Mnzava AP, Lines J, Abdin MSE, Eltaher JS, Banaga AO, West P, Bradley J, Cook J, Thomas B, Subramaniam K, Hemingway J, Knox TB, Malik EM, Yukich JO, Donnelly MJ and Kleinschmidt I

    Vector Unit, Ministry of Health, Khartoum, Sudan.

    Insecticide-based interventions have contributed to ∼78% of the reduction in the malaria burden in sub-Saharan Africa since 2000. Insecticide resistance in malaria vectors could presage a catastrophic rebound in disease incidence and mortality. A major impediment to the implementation of insecticide resistance management strategies is that evidence of the impact of resistance on malaria disease burden is limited. A cluster randomized trial was conducted in Sudan with pyrethroid-resistant and carbamate-susceptible malaria vectors. Clusters were randomly allocated to receive either long-lasting insecticidal nets (LLINs) alone or LLINs in combination with indoor residual spraying (IRS) with a pyrethroid (deltamethrin) insecticide in the first year and a carbamate (bendiocarb) insecticide in the two subsequent years. Malaria incidence was monitored for 3 y through active case detection in cohorts of children aged 1 to <10 y. When deltamethrin was used for IRS, incidence rates in the LLIN + IRS arm and the LLIN-only arm were similar, with the IRS providing no additional protection [incidence rate ratio (IRR) = 1.0 (95% confidence interval [CI]: 0.36-3.0; P = 0.96)]. When bendiocarb was used for IRS, there was some evidence of additional protection [interaction IRR = 0.55 (95% CI: 0.40-0.76; P < 0.001)]. In conclusion, pyrethroid resistance may have had an impact on pyrethroid-based IRS. The study was not designed to assess whether resistance had an impact on LLINs. These data alone should not be used as the basis for any policy change in vector control interventions.

    Proceedings of the National Academy of Sciences of the United States of America 2017

  • Systematic longitudinal survey of invasive Escherichia coli in England demonstrates a stable population structure only transiently disturbed by the emergence of ST131.

    Kallonen T, Brodrick HJ, Harris SR, Corander J, Brown NM, Martin V, Peacock SJ and Parkhill J

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom.

    Escherichia coli associated with urinary tract infections and bacteremia has been intensively investigated, including recent work focusing on the virulent, globally disseminated, multidrug-resistant lineage ST131. To contextualize ST131 within the broader E. coli population associated with disease, we used genomics to analyze a systematic 11-yr hospital-based survey of E. coli associated with bacteremia using isolates collected from across England by the British Society for Antimicrobial Chemotherapy and from the Cambridge University Hospitals NHS Foundation Trust. Population dynamics analysis of the most successful lineages identified the emergence of ST131 and ST69 and their establishment as two of the five most common lineages along with ST73, ST95, and ST12. The most frequently identified lineage was ST73. Compared to ST131, ST73 was susceptible to most antibiotics, indicating that multidrug resistance was not the dominant reason for prevalence of E. coli lineages in this population. Temporal phylogenetic analysis of the emergence of ST69 and ST131 identified differences in the dynamics of emergence and showed that expansion of ST131 in this population was not driven by sequential emergence of increasingly resistant subclades. We showed that over time, the E. coli population was only transiently disturbed by the introduction of new lineages before a new equilibrium was rapidly achieved. Together, these findings suggest that the frequency of E. coli lineages in invasive disease is driven by negative frequency-dependent selection occurring outside of the hospital, most probably in the commensal niche, and that drug resistance is not a primary determinant of success in this niche.

    Funded by: Wellcome Trust

    Genome research 2017

  • WD40-repeat 47, a microtubule-associated protein, is essential for brain development and autophagy.

    Kannan M, Bayam E, Wagner C, Rinaldi B, Kretz PF, Tilly P, Roos M, McGillewie L, Bär S, Minocha S, Chevalier C, Po C, Sanger Mouse Genetics Project, Chelly J, Mandel JL, Borgatti R, Piton A, Kinnear C, Loos B, Adams DJ, Hérault Y, Collins SC, Friant S, Godin JD and Yalcin B

    Department of Translational Medicine and Neurogenetics, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 67404 Illkirch, France.

    The family of WD40-repeat (WDR) proteins is one of the largest in eukaryotes, but little is known about their function in brain development. Among 26 WDR genes assessed, we found 7 displaying a major impact in neuronal morphology when inactivated in mice. Remarkably, all seven genes showed corpus callosum defects, including thicker (Atg16l1, Coro1c, Dmxl2, and Herc1), thinner (Kif21b and Wdr89), or absent corpus callosum (Wdr47), revealing a common role for WDR genes in brain connectivity. We focused on the poorly studied WDR47 protein sharing structural homology with LIS1, which causes lissencephaly. In a dosage-dependent manner, mice lacking Wdr47 showed lethality, extensive fiber defects, microcephaly, thinner cortices, and sensory motor gating abnormalities. We showed that WDR47 shares functional characteristics with LIS1 and participates in key microtubule-mediated processes, including neural stem cell proliferation, radial migration, and growth cone dynamics. In absence of WDR47, the exhaustion of late cortical progenitors and the consequent decrease of neurogenesis together with the impaired survival of late-born neurons are likely yielding to the worsening of the microcephaly phenotype postnatally. Interestingly, the WDR47-specific C-terminal to LisH (CTLH) domain was associated with functions in autophagy described in mammals. Silencing WDR47 in hypothalamic GT1-7 neuronal cells and yeast models independently recapitulated these findings, showing conserved mechanisms. Finally, our data identified superior cervical ganglion-10 (SCG10) as an interacting partner of WDR47. Taken together, these results provide a starting point for studying the implications of WDR proteins in neuronal regulation of microtubules and autophagy.

    Proceedings of the National Academy of Sciences of the United States of America 2017;114;44;E9308-E9317

  • Flipping between Polycomb repressed and active transcriptional states introduces noise in gene expression.

    Kar G, Kim JK, Kolodziejczyk AA, Natarajan KN, Torlai Triglia E, Mifsud B, Elderkin S, Marioni JC, Pombo A and Teichmann SA

    European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

    Polycomb repressive complexes (PRCs) are important histone modifiers, which silence gene expression; yet, there exists a subset of PRC-bound genes actively transcribed by RNA polymerase II (RNAPII). It is likely that the role of Polycomb repressive complex is to dampen expression of these PRC-active genes. However, it is unclear how this flipping between chromatin states alters the kinetics of transcription. Here, we integrate histone modifications and RNAPII states derived from bulk ChIP-seq data with single-cell RNA-sequencing data. We find that Polycomb repressive complex-active genes have greater cell-to-cell variation in expression than active genes, and these results are validated by knockout experiments. We also show that PRC-active genes are clustered on chromosomes in both two and three dimensions, and interactions with active enhancers promote a stabilization of gene expression noise. These findings provide new insights into how chromatin regulation modulates stochastic gene expression and transcriptional bursting, with implications for regulation of pluripotency and development. Polycomb repressive complexes modify histones but it is unclear how changes in chromatin states alter kinetics of transcription. Here, the authors use single-cell RNAseq and ChIPseq to find that actively transcribed genes with Polycomb marks have greater cell-to-cell variation in expression.

    Nature communications 2017;8;1;36

  • Prevalence of sexual dimorphism in mammalian phenotypic traits.

    Karp NA, Mason J, Beaudet AL, Benjamini Y, Bower L, Braun RE, Brown SDM, Chesler EJ, Dickinson ME, Flenniken AM, Fuchs H, Angelis MH, Gao X, Guo S, Greenaway S, Heller R, Herault Y, Justice MJ, Kurbatova N, Lelliott CJ, Lloyd KCK, Mallon AM, Mank JE, Masuya H, McKerlie C, Meehan TF, Mott RF, Murray SA, Parkinson H, Ramirez-Solis R, Santos L, Seavitt JR, Smedley D, Sorg T, Speak AO, Steel KP, Svenson KL, International Mouse Phenotyping Consortium, Wakana S, West D, Wells S, Westerberg H, Yaacoby S and White JK

    Mouse Informatics Group, The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

    The role of sex in biomedical studies has often been overlooked, despite evidence of sexually dimorphic effects in some biological studies. Here, we used high-throughput phenotype data from 14,250 wildtype and 40,192 mutant mice (representing 2,186 knockout lines), analysed for up to 234 traits, and found a large proportion of mammalian traits both in wildtype and mutants are influenced by sex. This result has implications for interpreting disease phenotypes in animal models and humans.

    Funded by: NHGRI NIH HHS: UM1 HG006348; NIH HHS: U42 OD011185, UM1 OD023222

    Nature communications 2017;8;15475

  • Insertional mutagenesis identifies drivers of a novel oncogenic pathway in invasive lobular breast carcinoma.

    Kas SM, de Ruiter JR, Schipper K, Annunziato S, Schut E, Klarenbeek S, Drenth AP, van der Burg E, Klijn C, Ten Hoeve JJ, Adams DJ, Koudijs MJ, Wesseling J, Nethe M, Wessels LFA and Jonkers J

    Division of Molecular Pathology, The Netherlands Cancer Institute, Amsterdam, the Netherlands.

    Invasive lobular carcinoma (ILC) is the second most common breast cancer subtype and accounts for 8-14% of all cases. Although the majority of human ILCs are characterized by the functional loss of E-cadherin (encoded by CDH1), inactivation of Cdh1 does not predispose mice to develop mammary tumors, implying that mutations in additional genes are required for ILC formation in mice. To identify these genes, we performed an insertional mutagenesis screen using the Sleeping Beauty transposon system in mice with mammary-specific inactivation of Cdh1. These mice developed multiple independent mammary tumors of which the majority resembled human ILC in terms of morphology and gene expression. Recurrent and mutually exclusive transposon insertions were identified in Myh9, Ppp1r12a, Ppp1r12b and Trp53bp2, whose products have been implicated in the regulation of the actin cytoskeleton. Notably, MYH9, PPP1R12B and TP53BP2 were also frequently aberrated in human ILC, highlighting these genes as drivers of a novel oncogenic pathway underlying ILC development.

    Funded by: Cancer Research UK: 13031

    Nature genetics 2017;49;8;1219-1230

  • Single-cell epigenomics: Recording the past and predicting the future.

    Kelsey G, Stegle O and Reik W

    Epigenetics Programme, Babraham Institute, Cambridge CB22 3AT, UK.

    Single-cell multi-omics has recently emerged as a powerful technology by which different layers of genomic output (and hence cell identity and function) can be recorded simultaneously. Integrating various components of the epigenome into multi-omics measurements allows for studying cellular heterogeneity at different time scales and for discovering new layers of molecular connectivity between the genome and its functional output. Measurements that are increasingly available range from those that identify transcription factor occupancy and initiation of transcription to long-lasting and heritable epigenetic marks such as DNA methylation. Together with techniques in which cell lineage is recorded, this multilayered information will provide insights into a cell's past history and its future potential. This will allow new levels of understanding of cell fate decisions, identity, and function in normal development, physiology, and disease.

    Science (New York, N.Y.) 2017;358;6359;69-75

  • Identification of 153 new loci associated with heel bone mineral density and functional involvement of GPC6 in osteoporosis.

    Kemp JP, Morris JA, Medina-Gomez C, Forgetta V, Warrington NM, Youlten SE, Zheng J, Gregson CL, Grundberg E, Trajanoska K, Logan JG, Pollard AS, Sparkes PC, Ghirardello EJ, Allen R, Leitch VD, Butterfield NC, Komla-Ebri D, Adoum AT, Curry KF, White JK, Kussy F, Greenlaw KM, Xu C, Harvey NC, Cooper C, Adams DJ, Greenwood CMT, Maurano MT, Kaptoge S, Rivadeneira F, Tobias JH, Croucher PI, Ackert-Bicknell CL, Bassett JHD, Williams GR, Richards JB and Evans DM

    University of Queensland Diamantina Institute, Translational Research Institute, Brisbane, Queensland, Australia.

    Osteoporosis is a common disease diagnosed primarily by measurement of bone mineral density (BMD). We undertook a genome-wide association study (GWAS) in 142,487 individuals from the UK Biobank to identify loci associated with BMD as estimated by quantitative ultrasound of the heel. We identified 307 conditionally independent single-nucleotide polymorphisms (SNPs) that attained genome-wide significance at 203 loci, explaining approximately 12% of the phenotypic variance. These included 153 previously unreported loci, and several rare variants with large effect sizes. To investigate the underlying mechanisms, we undertook (1) bioinformatic, functional genomic annotation and human osteoblast expression studies; (2) gene-function prediction; (3) skeletal phenotyping of 120 knockout mice with deletions of genes adjacent to lead independent SNPs; and (4) analysis of gene expression in mouse osteoblasts, osteocytes and osteoclasts. The results implicate GPC6 as a novel determinant of BMD, and also identify abnormal skeletal phenotypes in knockout mice associated with a further 100 prioritized genes.

    Funded by: Arthritis Research UK: 17702, 21231; British Heart Foundation: RG/08/014/24067, RG/13/13/30194; Department of Health: HTA/10/33/04; Medical Research Council: G0400491, MC_QA137853, MC_U147585819, MC_U147585824, MC_U147585827, MC_UP_A620_1014, MC_UU_12011/1, MC_UU_12013/4, MR/L003120/1; Wellcome Trust: 094134, 101123WILLIAMS

    Nature genetics 2017;49;10;1468-1475

  • Fine-Scale Genetic Structure in Finland.

    Kerminen S, Havulinna AS, Hellenthal G, Martin AR, Sarin AP, Perola M, Palotie A, Salomaa V, Daly MJ, Ripatti S and Pirinen M

    Institute for Molecular Medicine Finland, University of Helsinki, 00014, Finland.

    Coupling dense genotype data with new computational methods offers unprecedented opportunities for individual-level ancestry estimation once geographically precisely defined reference data sets become available. We study such a reference data set for Finland containing 2376 such individuals from the FINRISK Study survey of 1997 both of whose parents were born close to each other. This sampling strategy focuses on the population structure present in Finland before the 1950s. By using the recent haplotype-based methods ChromoPainter (CP) and FineSTRUCTURE (FS) we reveal a highly geographically clustered genetic structure in Finland and report its connections to the settlement history as well as to the current dialectal regions of the Finnish language. The main genetic division within Finland shows striking concordance with the 1323 borderline of the treaty of Nöteborg. In general, we detect genetic substructure throughout the country, which reflects stronger regional genetic differences in Finland compared to, for example, the UK, which in a similar analysis was dominated by a single unstructured population. We expect that similar population genetic reference data sets will become available for many more populations in the near future with important applications, for example, in forensic genetics and in genetic association studies. With this in mind, we report those extensions of the CP + FS approach that we found most useful in our analyses of the Finnish data.

    G3 (Bethesda, Md.) 2017;7;10;3459-3468

  • Clinical features associated with CTNNB1 de novo loss of function mutations in ten individuals.

    Kharbanda M, Pilz DT, Tomkins S, Chandler K, Saggar A, Fryer A, McKay V, Louro P, Smith JC, Burn J, Kini U, De Burca A, FitzPatrick DR, Kinning E and DDD Study

    West of Scotland Clinical Genetics Service, Level 2A Laboratory Medicine Building, Queen Elizabeth University Hospital, Glasgow, UK.

    Loss of function mutations in CTNNB1 have been reported in individuals with intellectual disability [MIM #615075] associated with peripheral spasticity, microcephaly and central hypotonia, suggesting a recognisable phenotype associated with haploinsufficiency for this gene. Trio based whole exome sequencing via the Deciphering Developmental Disorders (DDD) study has identified eleven further individuals with de novo loss of function mutations in CTNNB1. Here we report detailed phenotypic information on ten of these. We confirm the features that have been previously described and further delineate the skin and hair findings, including fair skin and fair and sparse hair with unusual patterning.

    Funded by: Medical Research Council: MC_PC_U127561093; Wellcome Trust

    European journal of medical genetics 2017;60;2;130-135

  • Adults with suspected central nervous system infection: A prospective study of diagnostic accuracy.

    Khatib U, van de Beek D, Lees JA and Brouwer MC

    Department of Neurology, Center of Infection and Immunity Amsterdam (CINIMA), Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands.

    Objectives: To study the diagnostic accuracy of clinical and laboratory features in the diagnosis of central nervous system (CNS) infection and bacterial meningitis.

    Methods: We included consecutive adult episodes with suspected CNS infection who underwent cerebrospinal fluid (CSF) examination. The reference standard was the diagnosis classified into five categories: 1) CNS infection; 2) CNS inflammation without infection; 3) other neurological disorder; 4) non-neurological infection; and 5) other systemic disorder.

    Results: Between 2012 and 2015, 363 episodes of suspected CNS infection were included. CSF examination showed leucocyte count >5/mm³ in 47% of episodes. Overall, 89 of 363 episodes were categorized as CNS infection (25%; most commonly viral meningitis [7%], bacterial meningitis [7%], and viral encephalitis [4%]), 36 (10%) episodes as CNS inflammatory disorder, 111 (31%) as systemic infection, 119 (33%) as other neurological disorder, and 8 (2%) as other systemic disorders. Diagnostic accuracy of individual clinical characteristics and blood tests for the diagnosis of CNS infection or bacterial meningitis was low. CSF leucocytosis differentiated best between bacterial meningitis and other diagnoses (area under the curve [AUC] 0.95) or any neurological infection versus other diagnoses (AUC 0.93).

    Conclusions: Clinical characteristics fail to differentiate between neurological infections and other diagnoses, and CSF analysis is the main contributor to the final diagnosis.

    Funded by: Medical Research Council: 1365620; Wellcome Trust: 098051

    The Journal of infection 2017;74;1;1-9

  • Common genetic variation drives molecular heterogeneity in human iPSCs.

    Kilpinen H, Goncalves A, Leha A, Afzal V, Alasoo K, Ashford S, Bala S, Bensaddek D, Casale FP, Culley OJ, Danecek P, Faulconbridge A, Harrison PW, Kathuria A, McCarthy D, McCarthy SA, Meleckyte R, Memari Y, Moens N, Soares F, Mann A, Streeter I, Agu CA, Alderton A, Nelson R, Harper S, Patel M, White A, Patel SR, Clarke L, Halai R, Kirton CM, Kolb-Kokocinski A, Beales P, Birney E, Danovi D, Lamond AI, Ouwehand WH, Vallier L, Watt FM, Durbin R, Stegle O and Gaffney DJ

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

    Technology utilizing human induced pluripotent stem cells (iPS cells) has enormous potential to provide improved cellular models of human disease. However, variable genetic and phenotypic characterization of many existing iPS cell lines limits their potential use for research and therapy. Here we describe the systematic generation, genotyping and phenotyping of 711 iPS cell lines derived from 301 healthy individuals by the Human Induced Pluripotent Stem Cells Initiative. Our study outlines the major sources of genetic and phenotypic variation in iPS cells and establishes their suitability as models of complex human traits and cancer. Through genome-wide profiling we find that 5-46% of the variation in different iPS cell phenotypes, including differentiation capacity and cellular morphology, arises from differences between individuals. Additionally, we assess the phenotypic consequences of genomic copy-number alterations that are repeatedly observed in iPS cells. In addition, we present a comprehensive map of common regulatory variants affecting the transcriptome of human pluripotent cells.

    Funded by: Medical Research Council: G0801843, MC_PC_12009, MC_PC_12026; Wellcome Trust: WT090851

    Nature 2017;546;7658;370-375

  • Detection of structural mosaicism from targeted and whole-genome sequencing data.

    King DA, Sifrim A, Fitzgerald TW, Rahbari R, Hobson E, Homfray T, Mansour S, Mehta SG, Shehla M, Tomkins SE, Vasudevan PC, Hurles ME and Deciphering Developmental Disorders Study

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom.

    Structural mosaic abnormalities are large post-zygotic mutations present in a subset of cells and have been implicated in developmental disorders and cancer. Such mutations have been conventionally assessed in clinical diagnostics using cytogenetic or microarray testing. Modern disease studies rely heavily on exome sequencing, yet an adequate method for the detection of structural mosaicism using targeted sequencing data is lacking. Here, we present a method, called MrMosaic, to detect structural mosaic abnormalities using deviations in allele fraction and read coverage from next-generation sequencing data. Whole-exome sequencing (WES) and whole-genome sequencing (WGS) simulations were used to calculate detection performance across a range of mosaic event sizes, types, clonalities, and sequencing depths. The tool was applied to 4911 patients with undiagnosed developmental disorders, and 11 events among nine patients were detected. For eight of these 11 events, mosaicism was observed in saliva but not blood, suggesting that assaying blood alone would miss a large fraction, possibly >50%, of mosaic diagnostic chromosomal rearrangements.

    Funded by: Wellcome Trust: WT098051

    Genome research 2017;27;10;1704-1714

  • Proliferation Drives Aging-Related Functional Decline in a Subpopulation of the Hematopoietic Stem Cell Compartment.

    Kirschner K, Chandra T, Kiselev V, Flores-Santa Cruz D, Macaulay IC, Park HJ, Li J, Kent DG, Kumar R, Pask DC, Hamilton TL, Hemberg M, Reik W and Green AR

    Cambridge Institute for Medical Research, University of Cambridge, Cambridge, Cambridgeshire CB2 0XY, UK; Department of Haematology, University of Cambridge, Cambridge, Cambridgeshire CB2 0XY, UK; Stem Cell Institute, University of Cambridge, Cambridge, Cambridgeshire CB2 0XY, UK; Institute for Cancer Sciences, University of Glasgow, Glasgow, Lanarkshire G61 1BD, UK.

    Aging of the hematopoietic stem cell (HSC) compartment is characterized by lineage bias and reduced stem cell function, the molecular basis of which is largely unknown. Using single-cell transcriptomics, we identified a distinct subpopulation of old HSCs carrying a p53 signature indicative of stem cell decline alongside pro-proliferative JAK/STAT signaling. To investigate the relationship between JAK/STAT and p53 signaling, we challenged HSCs with a constitutively active form of JAK2 (V617F) and observed an expansion of the p53-positive subpopulation in old mice. Our results reveal cellular heterogeneity in the onset of HSC aging and implicate a role for JAK2V617F-driven proliferation in the p53-mediated functional decline of old HSCs.

    Funded by: Medical Research Council: MC_PC_12009

    Cell reports 2017;19;8;1503-1511

  • SC3: consensus clustering of single-cell RNA-seq data.

    Kiselev VY, Kirschner K, Schaub MT, Andrews T, Yiu A, Chandra T, Natarajan KN, Reik W, Barahona M, Green AR and Hemberg M

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.

    Single-cell RNA-seq enables the quantitative characterization of cell types based on global transcriptome profiles. We present single-cell consensus clustering (SC3), a user-friendly tool for unsupervised clustering, which achieves high accuracy and robustness by combining multiple clustering solutions through a consensus approach. We demonstrate that SC3 is capable of identifying subclones from the transcriptomes of neoplastic cells collected from patients.

    Funded by: Medical Research Council: MC_PC_12009; Wellcome Trust: 098051

    Nature methods 2017;14;5;483-486
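
    The consensus idea described in the abstract above can be illustrated with a minimal sketch. This is not SC3's implementation: the function names, the 0.5 threshold, and the connected-components grouping are assumptions chosen for illustration only. It shows how several clustering solutions for the same cells can be combined into a pairwise co-clustering matrix and then into one set of consensus labels.

```python
def consensus_matrix(labelings):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    clusterings in which cells i and j were assigned to the same cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    total = len(labelings)
    return [[c / total for c in row] for row in counts]


def consensus_clusters(matrix, threshold=0.5):
    """Group cells linked by consensus >= threshold (connected components).
    Hypothetical simplification used here only to keep the sketch short."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three hypothetical clustering solutions for four cells.
solutions = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
matrix = consensus_matrix(solutions)
labels = consensus_clusters(matrix)
```

    Because cluster identifiers are arbitrary within each solution, the sketch compares only whether two cells share a label, never the label values themselves.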

  • Preterm Infant-Associated Clostridium tertium, Clostridium cadaveris, and Clostridium paraputrificum Strains: Genomic and Evolutionary Insights.

    Kiu R, Caim S, Alcon-Giner C, Belteki G, Clarke P, Pickard D, Dougan G and Hall LJ

    The Gut Health and Food Safety Programme, Quadram Institute Bioscience, Norwich Research Park, Norwich, United Kingdom.

    Clostridium species (particularly Clostridium difficile, Clostridium botulinum, Clostridium tetani and Clostridium perfringens) are associated with a range of human and animal diseases. Several other species including Clostridium tertium, Clostridium cadaveris, and Clostridium paraputrificum have also been linked with sporadic human infections, however there is very limited, or in some cases, no genomic information publicly available. Thus, we isolated one C. tertium strain, one C. cadaveris strain and three C. paraputrificum strains from preterm infants residing within neonatal intensive care units and performed Whole Genome Sequencing (WGS) using Illumina HiSeq. In this report, we announce the open availability of the draft genomes: C. tertium LH009, C. cadaveris LH052, C. paraputrificum LH025, C. paraputrificum LH058, and C. paraputrificum LH141. These genomes were checked for contamination in silico to ensure purity, and we confirmed species identity and phylogeny using both 16S rRNA gene sequences (from PCR and in silico) and WGS-based approaches. Average Nucleotide Identity (ANI) was used to differentiate genomes from their closest relatives to further confirm speciation boundaries. We also analysed the genomes for virulence-related factors and antimicrobial resistance genes, and detected presence of tetracycline and methicillin resistance, and potentially harmful enzymes, including multiple phospholipases and toxins. The availability of genomic data in open databases, in tandem with our initial insights into the genomic content and virulence traits of these pathogenic Clostridium species, should enable the scientific community to further investigate the disease-causing mechanisms of these bacteria with a view to enhancing clinical diagnosis and treatment.

    Genome biology and evolution 2017;9;10;2707-2714

  • The Influence of HIV on The Evolution of Mycobacterium tuberculosis.

    Koch A, Brites D, Stucki D, Evans JC, Seldon R, Heekes A, Mulder N, Nicol M, Oni T, Warner DF, Mizrahi V, Parkhill J, Gagneux S, Martin DP and Wilkinson RJ

    Wellcome Centre for Infectious Disease Research in Africa, Institute of Infectious Disease and Molecular Medicine, and Department of Medicine, University of Cape Town, South Africa.

    HIV significantly affects the immunological environment during tuberculosis co-infection, and therefore may influence the selective landscape upon which M. tuberculosis evolves. To test this hypothesis whole genome sequences were determined for 169 South African M. tuberculosis strains from HIV-1 co-infected and uninfected individuals and analysed using two Bayesian codon-model based selection analysis approaches: FUBAR which was used to detect persistent positive and negative selection (selection respectively favouring and disfavouring nonsynonymous substitutions); and MEDS which was used to detect episodic directional selection specifically favouring nonsynonymous substitutions within HIV-1 infected individuals. Among the 25,251 polymorphic codon sites analysed, FUBAR revealed that 189-fold more were detectably evolving under persistent negative selection than were evolving under persistent positive selection. Three specific codon sites within the genes celA2b, katG and cyp138 were identified by MEDS as displaying significant evidence of evolving under directional selection influenced by HIV-1 co-infection. All three genes encode proteins that may indirectly interact with human proteins that, in turn, interact functionally with HIV proteins. Unexpectedly, epitope encoding regions were enriched for sites displaying weak evidence of directional selection influenced by HIV-1. Although the low degree of genetic diversity observed in our M. tuberculosis dataset means that these results should be interpreted carefully, the effects of HIV-1 on epitope evolution in M. tuberculosis may have implications for the design of M. tuberculosis vaccines that are intended for use in populations with high HIV-1 infection rates.

    Molecular biology and evolution 2017

  • Open Targets: a platform for therapeutic target identification and validation.

    Koscielny G, An P, Carvalho-Silva D, Cham JA, Fumis L, Gasparyan R, Hasan S, Karamanis N, Maguire M, Papa E, Pierleoni A, Pignatelli M, Platt T, Rowland F, Wankar P, Bento AP, Burdett T, Fabregat A, Forbes S, Gaulton A, Gonzalez CY, Hermjakob H, Hersey A, Jupe S, Kafkas Ş, Keays M, Leroy C, Lopez FJ, Magarinos MP, Malone J, McEntyre J, Munoz-Pomer Fuentes A, O'Donovan C, Papatheodorou I, Parkinson H, Palka B, Paschall J, Petryszak R, Pratanwanich N, Sarntivijal S, Saunders G, Sidiropoulos K, Smith T, Sondka Z, Stegle O, Tang YA, Turner E, Vaughan B, Vrousgou O, Watkins X, Martin MJ, Sanseau P, Vamathevan J, Birney E, Barrett J and Dunham I

    Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK

    We have designed and developed a data integration and visualization platform that provides evidence about the association of known and potential drug targets with diseases. The platform is designed to support identification and prioritization of biological targets for follow-up. Each drug target is linked to a disease using integrated genome-wide data from a broad range of data sources. The platform provides either a target-centric workflow to identify diseases that may be associated with a specific target, or a disease-centric workflow to identify targets that may be associated with a specific disease. Users can easily transition between these target- and disease-centric workflows. The Open Targets Validation Platform is accessible at

    Nucleic acids research 2017;45;D1;D985-D994

  • Dynamics of Indel Profiles Induced by Various CRISPR/Cas9 Delivery Methods.

    Kosicki M, Rajan SS, Lorenzetti FC, Wandall HH, Narimatsu Y, Metzakopian E and Bennett EP

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom.

    The introduction of CRISPR/Cas9 gene editing in mammalian cells is a scientific breakthrough, which has greatly affected basic research and gene therapy. The simplicity and general access to CRISPR/Cas9 reagents has in an unprecedented manner "democratized" gene targeting in biomedical research, enabling genetic engineering of any gene in any cell, tissue, organ, and organism. The ability for fast, precise, and efficient profiling of the double-stranded break induced insertions and deletions (indels), mediated by any of the available programmable nucleases, is paramount to any given gene targeting approach. In this study we review the most commonly used indel detection methods and, using a robust, sensitive, and cost-efficient Indel Detection by Amplicon Analysis method, we have investigated the impact of the most commonly used CRISPR/Cas9 delivery formats, including lentivirus transduction, plasmid lipofection, and ribonucleoprotein (RNP) electroporation, on the dynamics of indel profile formation. We observe rapid indel formation using RNP electroporation, especially with synthetic stabilized gRNA, as well as long-term decline in overall indel frequency with lipofectamine-based, plasmid transfection methods. Most methods reach peak editing on day 2-3 postdelivery. Furthermore, we find a relative increase in the frequency of larger indels (>6 bp) under conditions of persistent editing using stably integrated lentiviral gRNA and Cas9 vectors.

    Progress in molecular biology and translational science 2017;152;49-67

  • Benzalkonium tolerance genes and outcome in Listeria monocytogenes meningitis.

    Kremer PH, Lees JA, Koopmans MM, Ferwerda B, Arends AW, Feller MM, Schipper K, Valls Seron M, van der Ende A, Brouwer MC, van de Beek D and Bentley SD

    Department of Neurology, Centre for Infection and Immunity Amsterdam (CINIMA), Academic Medical Centre, University of Amsterdam, Amsterdam, The Netherlands.

    Objectives: Listeria monocytogenes is a food-borne pathogen that can cause meningitis. The listerial genotype ST6 has been linked to increasing rates of unfavourable outcome over time. We investigated listerial genetic variation and the relation with clinical outcome in meningitis.

    Methods: We sequenced 96 isolates from adults with listerial meningitis included in two prospective nationwide cohort studies by whole genome sequencing, and evaluated associations between bacterial genetic variation and clinical outcome. We validated these results by screening listerial genotypes of 445 cerebrospinal fluid and blood isolates from patients over a 30-year period from the Dutch national surveillance cohort.

    Results: We identified a bacteriophage, phiLMST6, co-occurring with a novel plasmid, pLMST6, in ST6 isolates to be associated with unfavourable outcome in patients (p = 2.83e-05). The plasmid carries a benzalkonium chloride tolerance gene, emrC, conferring decreased susceptibility to disinfectants used in the food-processing industry. Isolates harbouring emrC were growth inhibited at higher levels of benzalkonium chloride (median 60 mg/L versus 15 mg/L; p <0.001), and had higher MICs for amoxicillin and gentamicin compared with isolates without emrC (both p <0.001). Transformation of pLMST6 into naive strains led to benzalkonium chloride tolerance and higher MICs for gentamicin.

    Conclusions: These results show that a novel plasmid, carrying the efflux transporter emrC, is associated with increased incidence of ST6 listerial meningitis in the Netherlands. Suggesting increased disease severity, our findings warrant consideration of disinfectants used in the food-processing industry that select for resistance mechanisms and may, inadvertently, lead to increased risk of poor disease outcome.

    Funded by: European Research Council: 281156

    Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases 2017;23;4;265.e1-265.e7

  • Epigenetic and Genetic Contributions to Adaptation in Chlamydomonas.

    Kronholm I, Bassett A, Baulcombe D and Collins S

    Department of Biological and Environmental Science, Centre of Excellence in Biological Interactions, University of Jyväskylä, Jyväskylä, Finland.

    Epigenetic modifications, such as DNA methylation or histone modifications, can be transmitted between cellular or organismal generations. However, there are no experiments measuring their role in adaptation, so here we use experimental evolution to investigate how epigenetic variation can contribute to adaptation. We manipulated DNA methylation and histone acetylation in the unicellular green alga Chlamydomonas reinhardtii both genetically and chemically to change the amount of epigenetic variation generated or transmitted in adapting populations in three different environments (salt stress, phosphate starvation, and high CO2) for two hundred asexual generations. We find that reducing the amount of epigenetic variation available to populations can reduce adaptation in environments where it otherwise happens. From genomic and epigenomic sequences from a subset of the populations, we see changes in methylation patterns between the evolved populations over-represented in some functional categories of genes, which is consistent with some of these differences being adaptive. Based on whole genome sequencing of evolved clones, the majority of DNA methylation changes do not appear to be linked to cis-acting genetic mutations. Our results show that transgenerational epigenetic effects play a role in adaptive evolution, and suggest that the relationship between changes in methylation patterns and differences in evolutionary outcomes, at least for quantitative traits such as cell division rates, is complex.

    Molecular biology and evolution 2017;34;9;2285-2306

  • Risks of Breast, Ovarian, and Contralateral Breast Cancer for BRCA1 and BRCA2 Mutation Carriers.

    Kuchenbaecker KB, Hopper JL, Barnes DR, Phillips KA, Mooij TM, Roos-Blom MJ, Jervis S, van Leeuwen FE, Milne RL, Andrieu N, Goldgar DE, Terry MB, Rookus MA, Easton DF, Antoniou AC, BRCA1 and BRCA2 Cohort Consortium, McGuffog L, Evans DG, Barrowdale D, Frost D, Adlard J, Ong KR, Izatt L, Tischkowitz M, Eeles R, Davidson R, Hodgson S, Ellis S, Nogues C, Lasset C, Stoppa-Lyonnet D, Fricker JP, Faivre L, Berthet P, Hooning MJ, van der Kolk LE, Kets CM, Adank MA, John EM, Chung WK, Andrulis IL, Southey M, Daly MB, Buys SS, Osorio A, Engel C, Kast K, Schmutzler RK, Caldes T, Jakubowska A, Simard J, Friedlander ML, McLachlan SA, Machackova E, Foretova L, Tan YY, Singer CF, Olah E, Gerdes AM, Arver B and Olsson H

    Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, England; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, England.

    Importance: The clinical management of BRCA1 and BRCA2 mutation carriers requires accurate, prospective cancer risk estimates.

    Objectives: To estimate age-specific risks of breast, ovarian, and contralateral breast cancer for mutation carriers and to evaluate risk modification by family cancer history and mutation location.

    Design, setting, and participants: Prospective cohort study of 6036 BRCA1 and 3820 BRCA2 female carriers (5046 unaffected and 4810 with breast or ovarian cancer or both at baseline) recruited in 1997-2011 through the International BRCA1/2 Carrier Cohort Study, the Breast Cancer Family Registry and the Kathleen Cuningham Foundation Consortium for Research into Familial Breast Cancer, with ascertainment through family clinics (94%) and population-based studies (6%). The majority were from large national studies in the United Kingdom (EMBRACE), the Netherlands (HEBON), and France (GENEPSO). Follow-up ended December 2013; median follow-up was 5 years.

    Exposures: BRCA1/2 mutations, family cancer history, and mutation location.

    Main outcomes and measures: Annual incidences, standardized incidence ratios, and cumulative risks of breast, ovarian, and contralateral breast cancer.

    Results: Among 3886 women (median age, 38 years; interquartile range [IQR], 30-46 years) eligible for the breast cancer analysis, 5066 women (median age, 38 years; IQR, 31-47 years) eligible for the ovarian cancer analysis, and 2213 women (median age, 47 years; IQR, 40-55 years) eligible for the contralateral breast cancer analysis, 426 were diagnosed with breast cancer, 109 with ovarian cancer, and 245 with contralateral breast cancer during follow-up. The cumulative breast cancer risk to age 80 years was 72% (95% CI, 65%-79%) for BRCA1 and 69% (95% CI, 61%-77%) for BRCA2 carriers. Breast cancer incidences increased rapidly in early adulthood until ages 30 to 40 years for BRCA1 and until ages 40 to 50 years for BRCA2 carriers, then remained at a similar, constant incidence (20-30 per 1000 person-years) until age 80 years. The cumulative ovarian cancer risk to age 80 years was 44% (95% CI, 36%-53%) for BRCA1 and 17% (95% CI, 11%-25%) for BRCA2 carriers. For contralateral breast cancer, the cumulative risk 20 years after breast cancer diagnosis was 40% (95% CI, 35%-45%) for BRCA1 and 26% (95% CI, 20%-33%) for BRCA2 carriers (hazard ratio [HR] for comparing BRCA2 vs BRCA1, 0.62; 95% CI, 0.47-0.82; P=.001 for difference). Breast cancer risk increased with increasing number of first- and second-degree relatives diagnosed as having breast cancer for both BRCA1 (HR for ≥2 vs 0 affected relatives, 1.99; 95% CI, 1.41-2.82; P<.001 for trend) and BRCA2 carriers (HR, 1.91; 95% CI, 1.08-3.37; P=.02 for trend). Breast cancer risk was higher if mutations were located outside vs within the regions bounded by positions c.2282-c.4071 in BRCA1 (HR, 1.46; 95% CI, 1.11-1.93; P=.007) and c.2831-c.6401 in BRCA2 (HR, 1.93; 95% CI, 1.36-2.74; P<.001).

    Conclusions and relevance: These findings provide estimates of cancer risk based on BRCA1 and BRCA2 mutation carrier status using prospective data collection and demonstrate the potential importance of family history and mutation location in risk assessment.

    Funded by: Cancer Research UK: 11174, 12677; NCI NIH HHS: R01 CA159868, UM1 CA164920

    JAMA 2017;317;23;2402-2416

  • Combined immunodeficiency with severe inflammation and allergy caused by ARPC1B deficiency.

    Kuijpers TW, Tool ATJ, van der Bijl I, de Boer M, van Houdt M, de Cuyper IM, Roos D, van Alphen F, van Leeuwen K, Cambridge EL, Arends MJ, Dougan G, Clare S, Ramirez-Solis R, Pals ST, Adams DJ, Meijer AB and van den Berg TK

    Department of Pediatric Hematology, Immunology and Infectious Diseases, Emma Children's Hospital, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands; Department of Blood Cell Research, Sanquin Research, University of Amsterdam, Amsterdam, The Netherlands.

    Funded by: Cancer Research UK: 13031

    The Journal of allergy and clinical immunology 2017;140;1;273-277.e10

  • Profiling invasive Plasmodium falciparum merozoites using an integrated omics approach.

    Kumar K, Srinivasan P, Nold MJ, Moch JK, Reiter K, Sturdevant D, Otto TD, Squires RB, Herrera R, Nagarajan V, Rayner JC, Porcella SF, Geromanos SJ, Haynes JD and Narum DL

    Laboratory of Malaria Immunology and Vaccinology, NIAID, NIH, Rockville, MD, USA.

    The symptoms of malaria are brought about by blood-stage parasites, which are established when merozoites invade human erythrocytes. Our understanding of the molecular events that underpin erythrocyte invasion remains hampered by the short period of time that merozoites are invasive. To address this challenge, a Plasmodium falciparum gamma-irradiated long-lived merozoite (LLM) line was developed and investigated. Purified LLMs invaded erythrocytes by an increase of 10-300 fold compared to wild-type (WT) merozoites. Using an integrated omics approach, we investigated the basis for the phenotypic difference. Only a few single nucleotide polymorphisms within the P. falciparum genome were identified and only marginal differences were observed in the merozoite transcriptomes. By contrast, using label-free quantitative mass-spectrometry, a significant change in protein abundance was noted, of which 200 were proteins of unknown function. We determined the relative molar abundance of over 1100 proteins in LLMs and further characterized the major merozoite surface protein complex. A unique processed MSP1 intermediate was identified in LLM but not observed in WT, suggesting that delayed processing may be important for the observed phenotype. This integrated approach has demonstrated the significant role of the merozoite proteome during erythrocyte invasion, while identifying numerous unknown proteins likely to be involved in invasion.

    Scientific reports 2017;7;1;17146

  • Genome watch: Microbiota shuns the modern world.

    Kumar N and Forster SC

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Nature reviews. Microbiology 2017;15;12;710

  • The dental phenotype of hairless dogs with FOXI3 haploinsufficiency.

    Kupczik K, Cagan A, Brauer S and Fischer MS

    Max Planck Weizmann Center for Integrative Archaeology and Anthropology, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103, Leipzig, Germany.

    Hairless dog breeds show a form of ectodermal dysplasia characterised by a lack of hair and abnormal tooth morphology. This has been attributed to a semi-dominant 7-base-pair duplication in the first exon of the forkhead box I3 gene (FOXI3) shared by all three breeds. Here, we identified this FOXI3 variant in a historical museum sample of pedigreed hairless dog skulls by using ancient DNA extraction and present the associated dental phenotype. Unlike in the coated wild type dogs, the hairless dogs were characterised in both the mandibular and maxillary dentition by a loss of the permanent canines, premolars and to some extent incisors. In addition, the deciduous fourth premolars and permanent first and second molars consistently lacked the distal and lingual cusps; this resulted in only a single enlarged cusp in the basin-like heel (talonid in lower molars, talon in upper molars). This molar phenotype is also found among several living and fossil carnivorans and the extinct order Creodonta in which it is associated with hypercarnivory. We therefore suggest that FOXI3 may generally be involved in dental (cusp) development within and across mammalian lineages including the hominids which are known to exhibit marked variability in the presence of lingual cusps.

    Funded by: Wellcome Trust

    Scientific reports 2017;7;1;5459

  • Predicting evolution.

    Lässig M, Mustonen V and Walczak AM

    Institute of Theoretical Physics, University of Cologne, 50937 Cologne, Germany.

    The face of evolutionary biology is changing: from reconstructing and analysing the past to predicting future evolutionary processes. Recent developments include prediction of reproducible patterns in parallel evolution experiments, forecasting the future of individual populations using data from their past, and controlled manipulation of evolutionary dynamics. Here we undertake a synthesis of central concepts for evolutionary predictions, based on examples of microbial and viral systems, cancer cell populations, and immune receptor repertoires. These systems have strikingly similar evolutionary dynamics driven by the competition of clades within a population. These dynamics are the basis for models that predict the evolution of clade frequencies, as well as broad genetic and phenotypic changes. Moreover, there are strong links between prediction and control, which are important for interventions such as vaccine or therapy design. All of these are key elements of what may become a predictive theory of evolution.

    Nature ecology & evolution 2017;1;3;77

  • Single-cell RNA-seq and computational analysis using temporal mixture modelling resolves Th1/Tfh fate bifurcation in malaria.

    Lönnberg T, Svensson V, James KR, Fernandez-Ruiz D, Sebina I, Montandon R, Soon MS, Fogg LG, Nair AS, Liligeto U, Stubbington MJ, Ly LH, Bagger FO, Zwiessele M, Lawrence ND, Souza-Fonseca-Guimaraes F, Bunn PT, Engwerda CR, Heath WR, Billker O, Stegle O, Haque A and Teichmann SA

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.

    Differentiation of naïve CD4+ T cells into functionally distinct T helper subsets is crucial for the orchestration of immune responses. Due to extensive heterogeneity and multiple overlapping transcriptional programs in differentiating T cell populations, this process has remained a challenge for systematic dissection in vivo. By using single-cell transcriptomics and computational analysis using a temporal mixture of Gaussian processes model, termed GPfates, we reconstructed the developmental trajectories of Th1 and Tfh cells during blood-stage Plasmodium infection in mice. By tracking clonality using endogenous TCR sequences, we first demonstrated that Th1/Tfh bifurcation had occurred at both population and single-clone levels. Next, we identified genes whose expression was associated with Th1 or Tfh fates, and demonstrated a T-cell intrinsic role for Galectin-1 in supporting Th1 differentiation. We also revealed the close molecular relationship between Th1 and IL-10-producing Tr1 cells in this infection. Th1 and Tfh fates emerged from a highly proliferative precursor that upregulated aerobic glycolysis and accelerated cell cycling as cytokine expression began. Dynamic gene expression of chemokine receptors around bifurcation predicted roles for cell-cell interactions in driving Th1/Tfh fates. In particular, we found that precursor Th cells were coached towards a Th1 but not a Tfh fate by inflammatory monocytes. Thus, by integrating genomic and computational approaches, our study has provided two unique resources: a database, which facilitates discovery of novel factors controlling Th1/Tfh fate commitment, and more generally, GPfates, a modelling framework for characterizing cell differentiation towards multiple fates.

    Funded by: European Research Council: 260507; Wellcome Trust: 098051

    Science immunology 2017;2;9

  • High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing.

    Lagarde J, Uszczynska-Ratajczak B, Carbonell S, Pérez-Lluch S, Abad A, Davis C, Gingeras TR, Frankish A, Harrow J, Guigo R and Johnson R

    Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain.

    Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. Here we present an experimental reannotation of the GENCODE intergenic lncRNA populations in matched human and mouse tissues that resulted in novel transcript models for 3,574 and 561 gene loci, respectively. CLS approximately doubled the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enabled us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a long-standing bottleneck in transcriptome annotation and generates manual-quality full-length transcript models at high-throughput scales.

    Funded by: NHGRI NIH HHS: U41 HG007000, U41 HG007234, U54 HG007004; NIMH NIH HHS: R01 MH101814

    Nature genetics 2017;49;12;1731-1740

  • Heterogeneity of hypothalamic pro-opiomelanocortin-expressing neurons revealed by single-cell RNA sequencing.

    Lam BYH, Cimino I, Polex-Wolf J, Nicole Kohnke S, Rimmington D, Iyemere V, Heeley N, Cossetti C, Schulte R, Saraiva LR, Logan DW, Blouet C, O'Rahilly S, Coll AP and Yeo GSH

    MRC Metabolic Diseases Unit, University of Cambridge Metabolic Research Laboratories, Wellcome Trust-MRC Institute of Metabolic Science, Addenbrooke's Hospital, Cambridge CB2 0QQ, UK.

    Objective: Arcuate proopiomelanocortin (POMC) neurons are critical nodes in the control of body weight. Often characterized simply as direct targets for leptin, recent data suggest a more complex architecture.

    Methods: Using single cell RNA sequencing, we have generated an atlas of gene expression in murine POMC neurons.

    Results: Of 163 neurons, 118 expressed high levels of Pomc with little/no Agrp expression and were considered "canonical" POMC neurons (P(+)). The other 45/163 expressed low levels of Pomc and high levels of Agrp (A(+)P+). Unbiased clustering analysis of P(+) neurons revealed four different classes, each with distinct cell surface receptor gene expression profiles. Further, only 12% (14/118) of P(+) neurons expressed the leptin receptor (Lepr) compared with 58% (26/45) of A(+)P+ neurons. In contrast, the insulin receptor (Insr) was expressed at similar frequency on P(+) and A(+)P+ neurons (64% and 55%, respectively).

    Conclusion: These data reveal arcuate POMC neurons to be a highly heterogeneous population. Accession Numbers: GSE92707.

    Molecular metabolism 2017;6;5;383-392

  • Large-Scale Cognitive GWAS Meta-Analysis Reveals Tissue-Specific Neural Expression and Potential Nootropic Drug Targets.

    Lam M, Trampush JW, Yu J, Knowles E, Davies G, Liewald DC, Starr JM, Djurovic S, Melle I, Sundet K, Christoforou A, Reinvang I, DeRosse P, Lundervold AJ, Steen VM, Espeseth T, Räikkönen K, Widen E, Palotie A, Eriksson JG, Giegling I, Konte B, Roussos P, Giakoumaki S, Burdick KE, Payton A, Ollier W, Chiba-Falek O, Attix DK, Need AC, Cirulli ET, Voineskos AN, Stefanis NC, Avramopoulos D, Hatzimanolis A, Arking DE, Smyrnis N, Bilder RM, Freimer NA, Cannon TD, London E, Poldrack RA, Sabb FW, Congdon E, Conley ED, Scult MA, Dickinson D, Straub RE, Donohoe G, Morris D, Corvin A, Gill M, Hariri AR, Weinberger DR, Pendleton N, Bitsios P, Rujescu D, Lahti J, Le Hellard S, Keller MC, Andreassen OA, Deary IJ, Glahn DC, Malhotra AK and Lencz T

    Institute of Mental Health, Singapore, Singapore.

    Here, we present a large (n = 107,207) genome-wide association study (GWAS) of general cognitive ability ("g"), further enhanced by combining results with a large-scale GWAS of educational attainment. We identified 70 independent genomic loci associated with general cognitive ability. Results showed significant enrichment for genes causing Mendelian disorders with an intellectual disability phenotype. Competitive pathway analysis implicated the biological processes of neurogenesis and synaptic regulation, as well as the gene targets of two pharmacologic agents: cinnarizine, a T-type calcium channel blocker, and LY97241, a potassium channel inhibitor. Transcriptome-wide and epigenome-wide analysis revealed that the implicated loci were enriched for genes expressed across all brain regions (most strongly in the cerebellum). Enrichment was exclusive to genes expressed in neurons but not oligodendrocytes or astrocytes. Finally, we report genetic correlations between cognitive ability and disparate phenotypes including psychiatric disorders, several autoimmune disorders, longevity, and maternal age at first birth.

    Cell reports 2017;21;9;2597-2613

  • The Ageing Brain: Effects on DNA Repair and DNA Methylation in Mice.

    Langie SA, Cameron KM, Ficz G, Oxley D, Tomaszewski B, Gorniak JP, Maas LM, Godschalk RW, van Schooten FJ, Reik W, von Zglinicki T and Mathers JC

    Centre for Ageing and Vitality, Human Nutrition Research Centre, Institute of Cellular Medicine, Newcastle University, Campus for Ageing and Vitality, Newcastle upon Tyne NE4 5PL, UK.

    Base excision repair (BER) may become less effective with ageing resulting in accumulation of DNA lesions, genome instability and altered gene expression that contribute to age-related degenerative diseases. The brain is particularly vulnerable to the accumulation of DNA lesions; hence, proper functioning of DNA repair mechanisms is important for neuronal survival. Although the mechanism of age-related decline in DNA repair capacity is unknown, growing evidence suggests that epigenetic events (e.g., DNA methylation) contribute to the ageing process and may be functionally important through the regulation of the expression of DNA repair genes. We hypothesize that epigenetic mechanisms are involved in mediating the age-related decline in BER in the brain. Brains from male mice were isolated at 3-32 months of age. Pyrosequencing analyses revealed significantly increased Ogg1 methylation with ageing, which correlated inversely with Ogg1 expression. The reduced Ogg1 expression correlated with enhanced expression of methyl-CpG binding protein 2 and ten-eleven translocation enzyme 2. A significant inverse correlation between Neil1 methylation at CpG-site2 and expression was also observed. BER activity was significantly reduced and associated with increased 8-oxo-7,8-dihydro-2'-deoxyguanosine levels. These data indicate that Ogg1 and Neil1 expression can be epigenetically regulated, which may mediate the effects of ageing on DNA repair in the brain.

    Genes 2017;8;2

  • Obesity-associated gene TMEM18 has a role in the central control of appetite and body weight regulation.

    Larder R, Sim MFM, Gulati P, Antrobus R, Tung YCL, Rimmington D, Ayuso E, Polex-Wolf J, Lam BYH, Dias C, Logan DW, Virtue S, Bosch F, Yeo GSH, Saudek V, O'Rahilly S and Coll AP

    University of Cambridge Metabolic Research Laboratories, Level 4, Wellcome Trust-Medical Research Council Institute of Metabolic Science, Addenbrooke's Hospital, Cambridge CB2 0QQ, United Kingdom.

    An intergenic region of human chromosome 2 (2p25.3) harbors genetic variants which are among those most strongly and reproducibly associated with obesity. The gene closest to these variants is TMEM18, although the molecular mechanisms mediating these effects remain entirely unknown. Tmem18 expression in the murine hypothalamic paraventricular nucleus (PVN) was altered by changes in nutritional state. Germline loss of Tmem18 in mice resulted in increased body weight, which was exacerbated by high fat diet and driven by increased food intake. Selective overexpression of Tmem18 in the PVN of wild-type mice reduced food intake and also increased energy expenditure. We provide evidence that TMEM18 has four, not three, transmembrane domains and that it physically interacts with key components of the nuclear pore complex. Our data support the hypothesis that TMEM18 itself, acting within the central nervous system, is a plausible mediator of the impact of adjacent genetic variation on human adiposity.

    Funded by: Medical Research Council: G0900554, MC_UU_12012/1, MC_UU_12012/5; Wellcome Trust: 100574/Z/12/Z, 100679/Z/12/Z, WT098051

    Proceedings of the National Academy of Sciences of the United States of America 2017;114;35;9421-9426

  • The Genetic Legacy of the Indian Ocean Slave Trade: Recent Admixture and Post-admixture Selection in the Makranis of Pakistan.

    Laso-Jadart R, Harmant C, Quach H, Zidane N, Tyler-Smith C, Mehdi Q, Ayub Q, Quintana-Murci L and Patin E

    Unit of Human Evolutionary Genetics, Department of Genomes & Genetics, Institut Pasteur, Paris 75015, France; Centre National de la Recherche Scientifique URA3012, Paris 75015, France; Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, Paris 75015, France.

    From the eighth century onward, the Indian Ocean was the scene of extensive trade of sub-Saharan African slaves via sea routes controlled by Muslim Arab and Swahili traders. Several populations in present-day Pakistan and India are thought to be the descendants of such slaves, yet their history of admixture and natural selection remains largely undefined. Here, we studied the genome-wide diversity of the African-descent Makranis, who reside on the Arabian Sea coast of Pakistan, as well as that of four neighboring Pakistani populations, to investigate the genetic legacy, population dynamics, and tempo of the Indian Ocean slave trade. We show that the Makranis are the result of an admixture event between local Baluch tribes and Bantu-speaking populations from eastern or southeastern Africa; we dated this event to ∼300 years ago during the Omani Empire domination. Levels of parental relatedness, measured through runs of homozygosity, were found to be similar across Pakistani populations, suggesting that the Makranis rapidly adopted the traditional practice of endogamous marriages. Finally, we searched for signatures of post-admixture selection at traits evolving under positive selection, including skin color, lactase persistence, and resistance to malaria. We demonstrate that the African-specific Duffy-null blood group, believed to confer resistance against Plasmodium vivax infection, was recently introduced to Pakistan through the slave trade and evolved adaptively in this P. vivax malaria-endemic region. Our study reconstructs the genetic and adaptive history of a neglected episode of the African Diaspora and illustrates the impact of recent admixture on the diffusion of adaptive traits across human populations.

    American journal of human genetics 2017

  • Sources of error in measurement of minimal residual disease in childhood acute lymphoblastic leukemia.

    Latham S, Hughes E, Budgen B, Mechinaud F, Crock C, Ekert H, Campbell P and Morley A

    Department of Haematology and Genetic Pathology, Flinders University and Medical Centre, Bedford Park, SA, Australia.

    Introduction: The level of minimal residual disease (MRD) in marrow predicts outcome and guides treatment in childhood acute lymphoblastic leukemia (ALL) but accurate prediction depends on accurate measurement.

    Methods: Forty-one children with ALL were studied at the end of induction. Two samples were obtained from each iliac spine and each sample was assayed twice. Assay, sample and side-to-side variation were quantified by analysis of variance and presumptively incorrect decisions related to high-risk disease were determined using the result from each MRD assay, the mean MRD in the patient as the measure of the true value, and each of 3 different MRD cut-off levels which have been used for making decisions on treatment.

    Results: Variation between assays, samples and sides each differed significantly from zero and the overall standard deviation for a single MRD estimation was 0.60 logs. Multifocal residual disease seemed to be at least partly responsible for the variation between samples. Decision errors occurred at a frequency of 13-14% when the mean patient MRD was between 10^-2 and 10^-5. Decision errors were observed only for an MRD result within 1 log of the cut-off value used for assessing high risk. Depending on the cut-off used, 31-40% of MRD results were within 1 log of the cut-off value and 21-16% of such results would have resulted in a decision error.

    Conclusion: When the result obtained for the level of MRD is within 1 log of the cut-off value used for making decisions, variation in the assay and/or sampling may result in a misleading assessment of the true level of marrow MRD. This may lead to an incorrect decision on treatment.

    PloS one 2017;12;10;e0185556

  • Genome-wide association study identifies distinct genetic contributions to prognosis and susceptibility in Crohn's disease.

    Lee JC, Biasci D, Roberts R, Gearry RB, Mansfield JC, Ahmad T, Prescott NJ, Satsangi J, Wilson DC, Jostins L, Anderson CA, UK IBD Genetics Consortium, Traherne JA, Lyons PA, Parkes M and Smith KG

    Department of Medicine, University of Cambridge School of Clinical Medicine, Addenbrooke's Hospital, Cambridge, UK.

    For most immune-mediated diseases, the main determinant of patient well-being is not the diagnosis itself but instead the course that the disease takes over time (prognosis). Prognosis may vary substantially between patients for reasons that are poorly understood. Familial studies support a genetic contribution to prognosis, but little evidence has been found for a proposed association between prognosis and the burden of susceptibility variants. To better characterize how genetic variation influences disease prognosis, we performed a within-cases genome-wide association study in two cohorts of patients with Crohn's disease. We identified four genome-wide significant loci, none of which showed any association with disease susceptibility. Conversely, the aggregated effect of all 170 disease susceptibility loci was not associated with disease prognosis. Together, these data suggest that the genetic contribution to prognosis in Crohn's disease is largely independent of the contribution to disease susceptibility and point to a biology of prognosis that could provide new therapeutic opportunities.

    Funded by: Chief Scientist Office: ETM/137; Department of Health: NIHR-RP-R3-12-026; Medical Research Council: G0600329, G0800759, MC_UU_12010/7, MR/L019027/1; Wellcome Trust

    Nature genetics 2017;49;2;262-268

  • Complex chromosomal rearrangements by single catastrophic pathogenesis in NUT midline carcinoma.

    Lee JK, Louzada S, An Y, Kim SY, Kim S, Youk J, Park S, Koo SH, Keam B, Jeon YK, Ku JL, Yang F, Kim TM and Ju YS

    Korea Advanced Institute of Science and Technology, Graduate School of Medical Science and Engineering, Daejeon, South Korea.

    Background: Nuclear protein in testis (NUT) midline carcinoma (NMC) is a rare aggressive malignancy often occurring in the tissues of midline anatomical structures. Except for the pathognomonic BRD3/4-NUT rearrangement, the comprehensive landscape of genomic alterations in NMCs has been unexplored.

    Patients and methods: We investigated three NMC cases, including two newly diagnosed NMC patients in Seoul National University Hospital, and a previously reported cell line (Ty-82). Whole-genome and transcriptome sequencing were carried out for these cases, and findings were validated by multiplex fluorescence in situ hybridization and using individual fluorescence probes.

    Results: Here, we present the first integrative analysis of whole-genome sequencing, transcriptome sequencing and cytogenetic characterization of NUT midline carcinomas. By whole-genome sequencing, we identified a remarkably similar pattern of highly complex genomic rearrangements (previously denominated as chromoplexy) involving the BRD3/4-NUT oncogenic rearrangements in two newly diagnosed NMC cases. Transcriptome sequencing revealed that these complex rearrangements were transcribed as very simple BRD3/4-NUT fusion transcripts. In Ty-82 cells, we also identified a complex genomic rearrangement involving the BRD4-NUT rearrangement underlying the simple t(15;19) karyotype. Careful inspections of rearrangement breakpoints indicated that these rearrangements were likely attributable to single catastrophic events. Although the NMC genomes had >3000 somatic point mutations, canonical oncogenes or tumor suppressor genes were rarely affected, indicating that they were largely passenger events. Mutational signature analysis showed predominant molecular clock-like signatures in all three cases (accounting for 54%-75% of all base substitutions), suggesting that NMCs may arise from actively proliferating normal cells.

    Conclusion: Taken together, our findings suggest that a single catastrophic event in proliferating normal cells could be sufficient for neoplastic transformation into NMCs.

    Annals of oncology : official journal of the European Society for Medical Oncology 2017;28;4;890-897

  • WormBase 2017: molting into a new stage.

    Lee RYN, Howe KL, Harris TW, Arnaboldi V, Cain S, Chan J, Chen WJ, Davis P, Gao S, Grove C, Kishore R, Muller HM, Nakamura C, Nuin P, Paulini M, Raciti D, Rodgers F, Russell M, Schindelman G, Tuli MA, Van Auken K, Wang Q, Williams G, Wright A, Yook K, Berriman M, Kersey P, Schedl T, Stein L and Sternberg PW

    Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA.

    WormBase is an important knowledge resource for biomedical researchers worldwide. To accommodate the ever increasing amount and complexity of research data, WormBase continues to advance its practices on data acquisition, curation and retrieval to most effectively deliver comprehensive knowledge about Caenorhabditis elegans, and genomic information about other nematodes and parasitic flatworms. Recent notable enhancements include user-directed submission of data, such as micropublication; genomic data curation and presentation, including additional genomes and JBrowse, respectively; new query tools, such as SimpleMine, Gene Enrichment Analysis; new data displays, such as the Person Lineage browser and the Summary of Ontology-based Annotations. Anticipating more rapid data growth ahead, WormBase continues the process of migrating to a cutting-edge database technology to achieve better stability, scalability, reproducibility and a faster response time. To better serve the broader research community, WormBase, with five other Model Organism Databases and The Gene Ontology project, have begun to collaborate formally as the Alliance of Genome Resources.

    Nucleic acids research 2017

  • Within-Host Sampling of a Natural Population Shows Signs of Selection on Pde1 during Bacterial Meningitis.

    Lees JA, Brouwer M, van der Ende A, Parkhill J, van de Beek D and Bentley SD

    Wellcome Trust Sanger Institute, Hinxton, United Kingdom.

    Funded by: Medical Research Council: 1365620; Wellcome Trust: 098051

    Infection and immunity 2017;85;3

  • Genome-wide identification of lineage and locus specific variation associated with pneumococcal carriage duration.

    Lees JA, Croucher NJ, Goldblatt D, Nosten F, Parkhill J, Turner C, Turner P and Bentley SD

    Infection Genomics, Wellcome Trust Sanger Institute, Hinxton, United Kingdom.

    Streptococcus pneumoniae is a leading cause of invasive disease in infants, especially in low-income settings. Asymptomatic carriage in the nasopharynx is a prerequisite for disease, but variability in its duration is currently only understood at the serotype level. Here we developed a model to calculate the duration of carriage episodes from longitudinal swab data, and combined these results with whole genome sequence data. We estimated that pneumococcal genomic variation accounted for 63% of the phenotype variation, whereas the host traits considered here (age and previous carriage) accounted for less than 5%. We further partitioned this heritability into both lineage and locus effects, and quantified the amount attributable to the largest sources of variation in carriage duration: serotype (17%), drug-resistance (9%) and other significant locus effects (7%). A pan-genome-wide association study identified prophage sequences as being associated with decreased carriage duration independent of serotype, potentially by disruption of the competence mechanism. These findings support theoretical models of pneumococcal competition and antibiotic resistance.

    Funded by: Medical Research Council: 1365620; Wellcome Trust: 083735/Z/07/Z, 098051, 104169/Z/14/Z

    eLife 2017;6

  • Large scale genomic analysis shows no evidence for pathogen adaptation between the blood and cerebrospinal fluid niches during bacterial meningitis.

    Lees JA, Kremer PHC, Manso AS, Croucher NJ, Ferwerda B, Serón MV, Oggioni MR, Parkhill J, Brouwer MC, van der Ende A, van de Beek D and Bentley SD

    Pathogen Genomics, Wellcome Trust Sanger Institute, Hinxton, UK.

    Recent studies have provided evidence for rapid pathogen genome diversification, some of which could potentially affect the course of disease. We have previously described such variation seen between isolates infecting the blood and cerebrospinal fluid (CSF) of a single patient during a case of bacterial meningitis. Here, we performed whole-genome sequencing of paired isolates from the blood and CSF of 869 meningitis patients to determine whether such variation frequently occurs between these two niches in cases of bacterial meningitis. Using a combination of reference-free variant calling approaches, we show that no genetic adaptation occurs in either invaded niche during bacterial meningitis for two major pathogen species, Streptococcus pneumoniae and Neisseria meningitidis. This study therefore shows that the bacteria capable of causing meningitis are already able to do this upon entering the blood, and no further sequence change is necessary to cross the blood-brain barrier. Our findings place the focus back on bacterial evolution between nasopharyngeal carriage and invasion, or diversity of the host, as likely mechanisms for determining invasiveness.

    Funded by: European Research Council: 281156; Medical Research Council: 1365620, MR/M003078/1; Wellcome Trust: 098051, 104169/Z/14/Z

    Microbial genomics 2017;3;1;e000103

  • Stronger together.

    Lees JA, Tonkin-Hill G and Bentley SD

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Nature reviews. Microbiology 2017;15;9;516

  • Evolutionary insights from wild vervet genomes.

    Leffler EM

    Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK, and the Wellcome Trust Sanger Institute, Hinxton, UK.

    A new study reports genome-wide variation in 163 vervet monkeys from across their taxonomic and geographic ranges. The analysis suggests a complex history of admixture and identifies signals of repeated evolutionary selection, some of which may be linked to response to simian immunodeficiency virus.

    Nature genetics 2017;49;12;1671-1672

  • Resistance to malaria through structural variation of red blood cell invasion receptors.

    Leffler EM, Band G, Busby GBJ, Kivinen K, Le QS, Clarke GM, Bojang KA, Conway DJ, Jallow M, Sisay-Joof F, Bougouma EC, Mangano VD, Modiano D, Sirima SB, Achidi E, Apinjoh TO, Marsh K, Ndila CM, Peshu N, Williams TN, Drakeley C, Manjurano A, Reyburn H, Riley E, Kachala D, Molyneux M, Nyirongo V, Taylor T, Thornton N, Tilley L, Grimsley S, Drury E, Stalker J, Cornelius V, Hubbart C, Jeffreys AE, Rowlands K, Rockett KA, Spencer CCA, Kwiatkowski DP and Malaria Genomic Epidemiology Network

    Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK.

    The malaria parasite Plasmodium falciparum invades human red blood cells via interactions between host and parasite surface proteins. By analyzing genome sequence data from human populations, including 1269 individuals from sub-Saharan Africa, we identify a diverse array of large copy number variants affecting the host invasion receptor genes GYPA and GYPB. We find that a nearby association with severe malaria is explained by a complex structural rearrangement involving the loss of GYPB and gain of two GYPB-A hybrid genes, which encode a serologically distinct blood group antigen known as Dantu. This variant reduces the risk of severe malaria by 40% and has recently risen in frequency in parts of Kenya, yet it appears to be absent from west Africa. These findings link structural variation of red blood cell invasion receptors with natural resistance to severe malaria.

    Science (New York, N.Y.) 2017

  • Role of HIV-specific CD8+ T cells in pediatric HIV cure strategies after widespread early viral escape.

    Leitman EM, Thobakgale CF, Adland E, Ansari MA, Raghwani J, Prendergast AJ, Tudor-Williams G, Kiepiela P, Hemelaar J, Brener J, Tsai MH, Mori M, Riddell L, Luzzi G, Jooste P, Ndung'u T, Walker BD, Pybus OG, Kellam P, Naranbhai V, Matthews PC, Gall A and Goulder PJR

    Department of Paediatrics, University of Oxford, Oxford, England, UK.

    Recent studies have suggested greater HIV cure potential among infected children than adults. A major obstacle to HIV eradication in adults is that the viral reservoir is largely comprised of HIV-specific cytotoxic T lymphocyte (CTL) escape variants. We here evaluate the potential for CTL in HIV-infected slow-progressor children to play an effective role in "shock-and-kill" cure strategies. Two distinct subgroups of children were identified on the basis of viral load. Unexpectedly, in both groups, as in adults, HIV-specific CTL drove the selection of escape variants across a range of epitopes within the first weeks of infection. However, in HIV-infected children, but not adults, de novo autologous variant-specific CTL responses were generated, enabling the pediatric immune system to "corner" the virus. Thus, even when escape variants are selected in early infection, the capacity in children to generate variant-specific anti-HIV CTL responses maintains the potential for CTL to contribute to effective shock-and-kill cure strategies in pediatric HIV infection.

    Funded by: Wellcome Trust

    The Journal of experimental medicine 2017;214;11;3239-3261

  • Chromosomal breaks at FRA18C: association with reduced DOK6 expression, altered oncogenic signaling and increased gastric cancer survival.

    Leong SH, Lwin KM, Lee SS, Ng WH, Ng KM, Tan SY, Ng BL, Carter NP, Tang C and Lian Kon O

    Division of Medical Sciences, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, 11 Hospital Drive, Singapore, 169610 Singapore.

    Chromosomal rearrangements are common in cancer. More than 50% occur in common fragile sites and disrupt tumor suppressors. However, such rearrangements are not known in gastric cancer. Here we report recurrent 18q2 breakpoints in 6 of 17 gastric cancer cell lines. The rearranged chromosome 18, t(9;18), in MKN7 cells was flow sorted and identified by reverse chromosome painting. High-resolution tiling array hybridization mapped breakpoints to DOK6 (docking protein 6) intron 4 in FRA18C (18q22.2) and an intergenic region in 9q22.2. The same rearrangement was detected by FISH in 22% of 99 primary gastric cancers. Intron 4 truncation was associated with reduced DOK6 transcription. Analysis of The Cancer Genome Atlas stomach adenocarcinoma cohort showed significant correlation of DOK6 expression with histological and molecular phenotypes. Multiple oncogenic signaling pathways (gastrin-CREB, NGF-neurotrophin, PDGF, EGFR, ERK, ERBB4, FGFR1, RAS, VEGFR2 and RAF/MAP kinase) known to be active in aggressive gastric cancers were strikingly diminished in gastric cancers with low DOK6 expression. Median survival of patients with low DOK6-expressing tumors was 2100 days compared with 533 days in patients with high DOK6-expressing tumors (log-rank P = 0.0027). The level of DOK6 expression in tumors predicted patient survival independent of TNM stage. These findings point to new functions of human DOK6 as an adaptor that interacts with diverse molecular components of signaling pathways. Our data suggest that DOK6 expression is an integrated biomarker of multiple oncogenic signals in gastric cancer and identify FRA18C as a new cancer-associated fragile site.

    Funded by: Wellcome Trust

    NPJ precision oncology 2017;1;1;9

  • Skeletal Site-specific Changes in Bone Mass in a Genetic Mouse Model for Human 15q11-13 Duplication Seen in Autism.

    Lewis KE, Sharan K, Takumi T and Yadav VK

    Department of Mouse and Zebrafish Genetics, Wellcome Trust Sanger Institute, Cambridge, CB10 1SA, United Kingdom.

    Children suffering from autism have been reported to have low bone mineral density and increased risk for fracture, yet the cellular origin of the bone phenotype remains unknown. Here, to characterize the bone phenotype, we have utilized a mouse model of autism that duplicates a 6.3 Mb region of chromosome 7 (Dp/+) corresponding to a region of chromosome 15q11-13, duplication of which is recurrent in humans. Paternally inherited Dp/+ (patDp/+) mice showed expected increases in gene expression in bone, and normal postnatal growth and body weight acquisition compared to the littermate controls. Four-week-old patDp/+ mice develop a low bone mass phenotype in the appendicular but not the axial skeleton compared to the littermate controls. This low bone mass in the mutant mice was secondary to a decrease in the number of osteoblasts and bone formation rate while the osteoclasts remained relatively unaffected. Further in vitro cell culture experiments and gene expression analysis revealed a major defect in the proliferation, differentiation and mineralization abilities of patDp/+ osteoblasts while osteoclast differentiation remained unchanged compared to controls. This study therefore characterizes the structural and cellular bone phenotype in a mouse model of autism that can be further utilized to investigate therapeutic avenues to treat bone fractures in children with autism.

    Scientific reports 2017;7;1;9902

  • A lncRNA fine tunes the dynamics of a cell state transition involving Lin28, let-7 and de novo DNA methylation.

    Li MA, Amaral PP, Cheung P, Bergmann JH, Kinoshita M, Kalkan T, Ralser M, Robson S, Meyenn FV, Paramor M, Yang F, Chen C, Nichols J, Spector DL, Kouzarides T, He L and Smith A

    Wellcome Trust - Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom.

    Execution of pluripotency requires progression from the naïve status represented by mouse embryonic stem cells (ESCs) to a state capacitated for lineage specification. This transition is coordinated at multiple levels. Non-coding RNAs may contribute to this regulatory orchestra. We identified a rodent-specific long non-coding RNA (lncRNA) linc1281, hereafter Ephemeron (Eprn), that modulates the dynamics of exit from naïve pluripotency. Eprn deletion delays the extinction of ESC identity, an effect associated with perduring Nanog expression. In the absence of Eprn, Lin28a expression is reduced which results in persistence of let-7 microRNAs, and the up-regulation of de novo methyltransferases Dnmt3a/b is delayed. Dnmt3a/b deletion retards ES cell transition, correlating with delayed Nanog promoter methylation and phenocopying loss of Eprn or Lin28a. The connection from lncRNA to miRNA and DNA methylation facilitates the acute extinction of naïve pluripotency, a pre-requisite for rapid progression from preimplantation epiblast to gastrulation in rodents. Eprn illustrates how lncRNAs may introduce species-specific network modulations.

    Funded by: NCI NIH HHS: P01 CA013106, R01 CA139067, R21 CA175560; Wellcome Trust

    eLife 2017;6

  • CD215+ Myeloid Cells Respond to Interleukin 15 Stimulation and Promote Tumor Progression.

    Lin S, Huang G, Xiao Y, Sun W, Jiang Y, Deng Q, Peng M, Wei X, Ye W, Li B, Lin S, Wang S, Wu Q, Liang Q, Li Y, Zhang X, Wu Y, Liu P, Pei D, Yu F, Wen Z, Yao Y, Wu D and Li P

    Key Laboratory of Regenerative Biology, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China.

    Interleukin 15 (IL-15) regulates the development, survival, and functions of multiple innate and adaptive immune cells and plays a dual role in promoting both tumor cell growth and antitumor immunity. Here, we demonstrated that the in vivo injection of recombinant human IL-15 (200 µg/kg) or murine IL-15 (3 µg/kg) to tumor-bearing NOD-SCID-IL2Rg-/- (NSI) mice resulted in increased tumor progression and CD45+ CD11b+ Gr-1+ CD215+ cell expansion in the tumors and spleen. In a B16F10-bearing C57BL/6 mouse model, we found that murine IL-15 has an antitumoral effect owing to the activation and expansion of CD8+ T cells with murine IL-15 treatment, but no enhanced or reduced tumor growth was observed in mice when human IL-15 was used. However, both murine and human IL-15 promote CD45+ CD11b+ Gr-1+ CD215+ cell expansion. In xenograft tumor models, CD215+ myeloid cells, but not CD215- cells, responded to human IL-15 stimulation and promoted tumor growth. Furthermore, we found that human IL-15 mediated insulin-like growth factor-1 (IGF-1) production in CD215+ myeloid cells and blocking IGF-1 reduced the tumor-promoting effect of IL-15. Finally, we observed that higher IGF-1 expression is an indicator of poor prognosis among lung adenocarcinoma patients. These findings provide evidence that IL-15 may promote tumor cell progression via CD215+ myeloid cells, and IGF-1 may be an important candidate through which IL-15 facilitates tumor growth.

    Frontiers in immunology 2017;8;1713

  • Ancient individuals from the North American Northwest Coast reveal 10,000 years of regional genetic continuity.

    Lindo J, Achilli A, Perego UA, Archer D, Valdiosera C, Petzelt B, Mitchell J, Worl R, Dixon EJ, Fifield TE, Rasmussen M, Willerslev E, Cybulski JS, Kemp BM, DeGiorgio M and Malhi RS

    Department of Human Genetics, University of Chicago, Chicago, IL 60637.

    Recent genomic studies of both ancient and modern indigenous people of the Americas have shed light on the demographic processes involved during the first peopling. The Pacific Northwest Coast proves an intriguing focus for these studies because of its association with coastal migration models and genetic ancestral patterns that are difficult to reconcile with modern DNA alone. Here, we report the low-coverage genome sequence of an ancient individual known as "Shuká Ḵáa" ("Man Ahead of Us") recovered from the On Your Knees Cave (OYKC) in southeastern Alaska (archaeological site 49-PET-408). The human remains date to ∼10,300 calendar (cal) y B.P. We also analyze low-coverage genomes of three more recent individuals from the nearby coast of British Columbia dating from ∼6,075 to 1,750 cal y B.P. From the resulting time series of genetic data, we show that the Pacific Northwest Coast exhibits genetic continuity for at least the past 10,300 cal y B.P. We also infer that population structure existed in the late Pleistocene of North America with Shuká Ḵáa on a different ancestral line compared with other North American individuals from the late Pleistocene or early Holocene (i.e., Anzick-1 and Kennewick Man). Despite regional shifts in mtDNA haplogroups, we conclude from individuals sampled through time that people of the northern Northwest Coast belong to an early genetic lineage that may stem from a late Pleistocene coastal migration into the Americas.

    Proceedings of the National Academy of Sciences of the United States of America 2017;114;16;4093-4098

  • Reply to Gatesy and Springer: Claims of homology errors and zombie lineages do not compromise the dating of placental diversification.

    Liu L, Zhang J, Rheindt FE, Lei F, Qu Y, Wang Y, Zhang Y, Sullivan C, Nie W, Wang J, Yang F, Chen J, Edwards SV, Meng J and Wu S

    Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, School of Life Sciences, Jiangsu Normal University, Xuzhou 221116, Jiangsu, China.

    Proceedings of the National Academy of Sciences of the United States of America 2017;114;45;E9433-E9434

  • An Organismal CNV Mutator Phenotype Restricted to Early Human Development.

    Liu P, Yuan B, Carvalho CM, Wuster A, Walter K, Zhang L, Gambin T, Chong Z, Campbell IM, Coban Akdemir Z, Gelowani V, Writzl K, Bacino CA, Lindsay SJ, Withers M, Gonzaga-Jauregui C, Wiszniewska J, Scull J, Stankiewicz P, Jhangiani SN, Muzny DM, Zhang F, Chen K, Gibbs RA, Rautenstrauss B, Cheung SW, Smith J, Breman A, Shaw CA, Patel A, Hurles ME and Lupski JR

    Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Baylor Genetics, Houston, TX 77021, USA.

    De novo copy number variants (dnCNVs) arising at multiple loci in a personal genome have usually been considered to reflect cancer somatic genomic instabilities. We describe a multiple dnCNV (MdnCNV) phenomenon in which individuals with genomic disorders carry five to ten constitutional dnCNVs. These CNVs originate from independent formation incidences, are predominantly tandem duplications or complex gains, exhibit breakpoint junction features reminiscent of replicative repair, and show increased de novo point mutations flanking the rearrangement junctions. The active CNV mutation shower appears to be restricted to a transient perizygotic period. We propose that a defect in the CNV formation process is responsible for the "CNV-mutator state," and this state is dampened after early embryogenesis. The constitutional MdnCNV phenomenon resembles chromosomal instability in various cancers. Investigations of this phenomenon may provide unique access to understanding genomic disorders, structural variant mutagenesis, human evolution, and cancer biology.

    Cell 2017;168;5;830-842.e7

  • Human evolution: a tale from ancient genomes.

    Llamas B, Willerslev E and Orlando L

    Australian Centre for ADNA, School of Biological Sciences, University of Adelaide, Adelaide, South Australia 5005, Australia.

    The field of human ancient DNA (aDNA) has moved from mitochondrial sequencing that suffered from contamination and provided limited biological insights, to become a fully genomic discipline that is changing our conception of human history. Recent successes include the sequencing of extinct hominins, and true population genomic studies of Bronze Age populations. Among the emerging areas of aDNA research, the analysis of past epigenomes is set to provide more new insights into human adaptation and disease susceptibility through time. Starting as a mere curiosity, ancient human genetics has become a major player in the understanding of our evolutionary history. This article is part of the themed issue 'Evo-devo in the genomics era, and the origins of morphological diversity'.

    Philosophical transactions of the Royal Society of London. Series B, Biological sciences 2017;372;1713

  • Asymptomatic Plasmodium vivax infections induce robust IgG responses to multiple blood-stage proteins in a low-transmission region of western Thailand.

    Longley RJ, França CT, White MT, Kumpitak C, Sa-Angchai P, Gruszczyk J, Hostetler JB, Yadava A, King CL, Fairhurst RM, Rayner JC, Tham WH, Nguitragool W, Sattabongkot J and Mueller I

    Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia.

    Background: Thailand is aiming to eliminate malaria by the year 2024. Plasmodium vivax has now become the dominant species causing malaria within the country, and a high proportion of infections are asymptomatic. A better understanding of antibody dynamics to P. vivax antigens in a low-transmission setting, where acquired immune responses are poorly characterized, will be pivotal for developing new strategies for elimination, such as improved surveillance methods and vaccines. The objective of this study was to characterize total IgG antibody levels to 11 key P. vivax proteins in a village of western Thailand.

    Methods: Plasma samples from 546 volunteers enrolled in a cross-sectional survey conducted in 2012 in Kanchanaburi Province were utilized. Total IgG levels to 11 different proteins known or predicted to be involved in reticulocyte binding or invasion (ARP, GAMA, P41, P12, PVX_081550, and five members of the PvRBP family), as well as the leading pre-erythrocytic vaccine candidate (CSP) were measured using a multiplexed bead-based assay. Associations between IgG levels and infection status, age, and spatial location were explored.

    Results: Individuals from a low-transmission region of western Thailand reacted to all 11 P. vivax recombinant proteins. Significantly greater IgG levels were observed in the presence of a current P. vivax infection, despite all infected individuals being asymptomatic. IgG levels were also higher in adults (18 years and older) than in children. For most of the proteins, higher IgG levels were observed in individuals living closer to the Myanmar border and further away from local health services.

    Conclusions: Robust IgG responses were observed to most proteins and IgG levels correlated with surrogates of exposure, suggesting these antigens may serve as potential biomarkers of exposure, immunity, or both.

    Funded by: Medical Research Council: MR/J002283/1; NIAID NIH HHS: R01 AI104822; Wellcome Trust: 098051, 101073/Z/13Z

    Malaria journal 2017;16;1;178

  • Integrative genomic analysis implicates limited peripheral adipose storage capacity in the pathogenesis of human insulin resistance.

    Lotta LA, Gulati P, Day FR, Payne F, Ongen H, van de Bunt M, Gaulton KJ, Eicher JD, Sharp SJ, Luan J, De Lucia Rolfe E, Stewart ID, Wheeler E, Willems SM, Adams C, Yaghootkar H, EPIC-InterAct Consortium, Cambridge FPLD1 Consortium, Forouhi NG, Khaw KT, Johnson AD, Semple RK, Frayling T, Perry JR, Dermitzakis E, McCarthy MI, Barroso I, Wareham NJ, Savage DB, Langenberg C, O'Rahilly S and Scott RA

    MRC Epidemiology Unit, University of Cambridge, Cambridge, UK.

    Insulin resistance is a key mediator of obesity-related cardiometabolic disease, yet the mechanisms underlying this link remain obscure. Using an integrative genomic approach, we identify 53 genomic regions associated with insulin resistance phenotypes (higher fasting insulin levels adjusted for BMI, lower HDL cholesterol levels and higher triglyceride levels) and provide evidence that their link with higher cardiometabolic risk is underpinned by an association with lower adipose mass in peripheral compartments. Using these 53 loci, we show a polygenic contribution to familial partial lipodystrophy type 1, a severe form of insulin resistance, and highlight shared molecular mechanisms in common/mild and rare/severe insulin resistance. Population-level genetic analyses combined with experiments in cellular models implicate CCDC92, DNAH10 and L3MBTL3 as previously unrecognized molecules influencing adipocyte differentiation. Our findings support the notion that limited storage capacity of peripheral adipose tissue is an important etiological component in insulin-resistant cardiometabolic disease and highlight genes and mechanisms underpinning this link.

    Funded by: Cancer Research UK: 14136; Medical Research Council: G0401527, G1000143, MC_PC_13046, MC_PC_13048, MC_QA137853, MC_UU_12012/5, MC_UU_12015/1, MR/L00002/1, MR/N01104X/1; Wellcome Trust: 100574/Z/12/Z, 090532, 095515/Z/11/Z, 098381, 107064, WT098051

    Nature genetics 2017;49;1;17-26

  • Reclassification of the specialized metabolite producer Pseudomonas mesoacidophila ATCC 31433 as a member of the Burkholderia cepacia complex.

    Loveridge EJ, Jones C, Bull MJ, Moody SC, Kahl MW, Khan Z, Neilson L, Tomeva M, Adams SE, Wood AC, Rodriguez-Martin D, Pinel I, Parkhill J, Mahenthiralingam E and Crosby J

    Department of Chemistry, Swansea University, Singleton Park, Swansea SA2 8PP, UK

    Pseudomonas mesoacidophila ATCC 31433 is a Gram-negative bacterium, first isolated from Japanese soil samples, which produces the monobactam isosulfazecin and the β-lactam potentiating bulgecins. To characterize the biosynthetic potential of P. mesoacidophila ATCC 31433 its complete genome was determined using single molecule real time DNA sequence analysis. The 7.8 Mb genome comprised four replicons: three chromosomal, each encoding ribosomal RNA, and one plasmid. Phylogenetic analysis demonstrated that P. mesoacidophila ATCC 31433 was mis-classified at the time of its deposition and is a member of the Burkholderia cepacia complex, most closely related to B. ubonensis. The sequenced genome shows considerable additional biosynthetic potential, with known gene clusters for malleilactone, ornibactin, isosulfazecin, alkylhydroxyquinoline and pyrrolnitrin biosynthesis, and several uncharacterized biosynthetic gene clusters for PKS, NRPS and other metabolites, identified. Furthermore P. mesoacidophila ATCC 31433 harbours many genes associated with environmental resilience and antibiotic resistance, and was resistant to a range of antibiotics and metal ions. In summary, this bioactive strain should be designated B. cepacia complex strain ATCC 31433 pending further detailed taxonomic characterization. IMPORTANCE: This work reports the complete genome sequence of Pseudomonas mesoacidophila ATCC 31433, a known producer of bioactive compounds. A large number of both known and novel biosynthetic gene clusters were identified, indicating that P. mesoacidophila ATCC 31433 is an untapped resource for discovery of novel bioactive compounds. Phylogenetic analysis demonstrated that P. mesoacidophila ATCC 31433 is in fact a member of the Burkholderia cepacia complex, most closely related to the species B. ubonensis. Further investigation of the classification and biosynthetic potential of P. mesoacidophila ATCC 31433 is warranted.

    Journal of bacteriology 2017

  • Applications of CRISPR Genome Editing Technology in Drug Target Identification and Validation.

    Lu Q, Livi GP, Modha S, Yusa K, Macarrón R and Dow DJ

    Target Sciences, GlaxoSmithKline R&D, 1250 South Collegeville Road, Collegeville, PA 19426, USA.

    Introduction: The analysis of pharmaceutical industry data indicates that the major reason for drug candidates failing in late stage clinical development is lack of efficacy, with a high proportion of these due to erroneous hypotheses about target to disease linkage. More than ever, there is a requirement to better understand potential new drug targets and their role in disease biology in order to reduce attrition in drug development. Genome editing technology enables precise modification of individual protein coding genes, as well as noncoding regulatory sequences, enabling the elucidation of functional effects in human disease relevant cellular systems. Areas covered: This article outlines applications of CRISPR genome editing technology in target identification and target validation studies. Expert opinion: Applications of CRISPR technology in target validation studies are in evidence and gaining momentum. Whilst technical challenges remain, we are on the cusp of CRISPR being applied in complex cell systems such as iPS derived differentiated cells and stem cell derived organoids. In the meantime, our experience to date suggests that precise genome editing of putative targets in primary cell systems is possible, offering more human disease relevant systems than conventional cell lines.

    Expert opinion on drug discovery 2017

  • A gene expression atlas of adult Schistosoma mansoni and their gonads.

    Lu Z, Sessler F, Holroyd N, Hahnel S, Quack T, Berriman M and Grevelding CG

    BFS, Institute of Parasitology, Justus Liebig University, 35392 Giessen, Germany.

    RNA-Seq has proven excellent at providing information about the regulation and transcript levels of genes. We used this method for profiling genes in the flatworm Schistosoma mansoni. This parasite causes schistosomiasis, an infectious disease of global importance for humans and animals. The pathology of schistosomiasis is associated with the eggs, which are synthesized as a final consequence of male and female adults pairing. The male induces processes in the female that lead to the full development of its gonads as a prerequisite for egg production. Unpaired females remain sexually immature. Based on an organ-isolation method we obtained gonad tissue for RNA extraction from paired and unpaired schistosomes, with whole adults included as controls. From a total of 23 samples, we used high-throughput cDNA sequencing (RNA-Seq) on the Illumina platform to profile gene expression between genders and tissues, with and without pairing influence. The data obtained provide a wealth of information on the reproduction biology of schistosomes and a rich resource for exploitation through basic and applied research activities.

    Funded by: Wellcome Trust

    Scientific data 2017;4;170118

  • Sharing of carbapenemase-encoding plasmids between Enterobacteriaceae in UK sewage uncovered by MinION sequencing.

    Ludden C, Reuter S, Judge K, Gouliouris T, Blane B, Coll F, Naydenova P, Hunt M, Tracey A, Hopkins KL, Brown NM, Woodford N, Parkhill J and Peacock SJ

    London School of Hygiene and Tropical Medicine, London, UK.

    Dissemination of carbapenem resistance among pathogenic Gram-negative bacteria is a looming medical emergency. Efficient spread of resistance within and between bacterial species is facilitated by mobile genetic elements. We hypothesized that wastewater contributes to the dissemination of carbapenemase-producing Enterobacteriaceae (CPE), and studied this through a cross-sectional observational study of wastewater in the East of England. We isolated clinically relevant species of CPE in untreated and treated wastewater, confirming that waste treatment does not prevent release of CPE into the environment. We observed that CPE-positive plants were restricted to those in direct receipt of hospital waste, suggesting that hospital effluent may play a role in disseminating carbapenem resistance. We postulated that plasmids carrying carbapenemase genes were exchanged between bacterial hosts in sewage, and used short-read (Illumina) and long-read (MinION) technologies to characterize plasmids encoding resistance to antimicrobials and heavy metals. We demonstrated that different CPE species (Enterobacter kobei and Raoultella ornithinolytica) isolated from wastewater from the same treatment plant shared two plasmids of 63 and 280 kb. The former plasmid conferred resistance to carbapenems (blaOXA-48), and the latter to numerous drug classes and heavy metals. We also report the complete genome sequence for Enterobacter kobei. Small, portable sequencing instruments such as the MinION have the potential to improve the quality of information gathered on antimicrobial resistance in the environment.

    Funded by: Department of Health; Wellcome Trust: 103387/Z/13/Z, 110243/Z/15/Z, WT098600

    Microbial genomics 2017;3;7;e000114

  • Overcoming confounding plate effects in differential expression analyses of single-cell RNA-seq data.

    Lun ATL and Marioni JC

    Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK.

    An increasing number of studies are using single-cell RNA-sequencing (scRNA-seq) to characterize the gene expression profiles of individual cells. One common analysis applied to scRNA-seq data involves detecting differentially expressed (DE) genes between cells in different biological groups. However, many experiments are designed such that the cells to be compared are processed in separate plates or chips, meaning that the groupings are confounded with systematic plate effects. This confounding aspect is frequently ignored in DE analyses of scRNA-seq data. In this article, we demonstrate that failing to consider plate effects in the statistical model results in loss of type I error control. A solution is proposed whereby counts are summed from all cells in each plate and the count sums for all plates are used in the DE analysis. This restores type I error control in the presence of plate effects without compromising detection power in simulated data. Summation is also robust to varying numbers and library sizes of cells on each plate. Similar results are observed in DE analyses of real data where the use of count sums instead of single-cell counts improves specificity and the ranking of relevant genes. This suggests that summation can assist in maintaining statistical rigour in DE analyses of scRNA-seq data with plate effects.

    Biostatistics (Oxford, England) 2017;18;3;451-464
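
    The summation step described in the abstract above can be illustrated with a minimal sketch. It assumes a toy layout of one count list per cell and one plate label per cell; the function name `sum_counts_by_plate` and the data are hypothetical and are not taken from the paper or from any package. Only the summation is shown; the per-plate sums would then be passed to a standard DE method in place of the single-cell counts.

```python
def sum_counts_by_plate(counts, plates):
    """Sum per-cell count vectors into one count vector per plate.

    counts: list of per-cell lists, one count per gene (hypothetical layout).
    plates: list of plate labels, one per cell, in the same order as counts.
    """
    if len(counts) != len(plates):
        raise ValueError("need exactly one plate label per cell")
    sums = {}
    for cell_counts, plate in zip(counts, plates):
        if plate not in sums:
            # First cell seen on this plate: start from zeros.
            sums[plate] = [0] * len(cell_counts)
        elif len(sums[plate]) != len(cell_counts):
            raise ValueError("all cells must cover the same genes")
        # Add this cell's counts gene by gene.
        sums[plate] = [a + b for a, b in zip(sums[plate], cell_counts)]
    return sums


# Toy data: 4 cells x 3 genes, processed on two plates.
cell_counts = [[1, 0, 5], [2, 1, 0], [0, 3, 4], [1, 1, 1]]
plate_labels = ["plate1", "plate1", "plate2", "plate2"]
plate_sums = sum_counts_by_plate(cell_counts, plate_labels)
print(plate_sums)
```

    Each plate then contributes a single observation per gene, so the unit of replication becomes the plate rather than the cell.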

  • Assessing the reliability of spike-in normalization for analyses of single-cell RNA sequencing data.

    Lun ATL, Calero-Nieto FJ, Haim-Vilmovsky L, Göttgens B and Marioni JC

    Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge CB2 0RE, United Kingdom.

    By profiling the transcriptomes of individual cells, single-cell RNA sequencing provides unparalleled resolution to study cellular heterogeneity. However, this comes at the cost of high technical noise, including cell-specific biases in capture efficiency and library generation. One strategy for removing these biases is to add a constant amount of spike-in RNA to each cell and to scale the observed expression values so that the coverage of spike-in transcripts is constant across cells. This approach has previously been criticized as its accuracy depends on the precise addition of spike-in RNA to each sample. Here, we perform mixture experiments using two different sets of spike-in RNA to quantify the variance in the amount of spike-in RNA added to each well in a plate-based protocol. We also obtain an upper bound on the variance due to differences in behavior between the two spike-in sets. We demonstrate that both factors are small contributors to the total technical variance and have only minor effects on downstream analyses, such as detection of highly variable genes and clustering. Our results suggest that scaling normalization using spike-in transcripts is reliable enough for routine use in single-cell RNA sequencing data analyses.

    Genome research 2017
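
    The scaling idea described in the abstract above (scale each cell so that spike-in coverage is constant across cells) can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' implementation: the function names and toy numbers are hypothetical, and each cell's factor is taken as its total spike-in count divided by the mean total across cells.

```python
def spike_in_size_factors(spike_totals):
    """Return one scaling factor per cell, centred so the factors average to 1.

    spike_totals: total spike-in count observed in each cell (hypothetical input).
    """
    if not spike_totals or any(t <= 0 for t in spike_totals):
        raise ValueError("every cell needs a positive spike-in total")
    mean_total = sum(spike_totals) / len(spike_totals)
    return [t / mean_total for t in spike_totals]


def scale_counts(counts, size_factors):
    """Divide each cell's gene counts by that cell's size factor."""
    return [[c / sf for c in cell] for cell, sf in zip(counts, size_factors)]


# Toy data: three cells that differ only in capture efficiency.
spike_totals = [50, 100, 150]
gene_counts = [[10, 4], [20, 8], [30, 12]]

size_factors = spike_in_size_factors(spike_totals)
normalized = scale_counts(gene_counts, size_factors)
# After scaling, spike-in coverage is the same in every cell.
scaled_spikes = [t / sf for t, sf in zip(spike_totals, size_factors)]
print(size_factors, normalized, scaled_spikes)
```

    Because the same amount of spike-in RNA is assumed to have been added to every cell, any difference in spike-in totals is treated as technical, and dividing by the factor removes it from the endogenous genes as well.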

  • Testing for differential abundance in mass cytometry data.

    Lun ATL, Richard AC and Marioni JC

    Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK.

    When comparing biological conditions using mass cytometry data, a key challenge is to identify cellular populations that change in abundance. Here, we present a computational strategy for detecting 'differentially abundant' populations by assigning cells to hyperspheres, testing for significant differences between conditions and controlling the spatial false discovery rate. Our method outperforms other approaches in simulations and finds novel patterns of differential abundance in real data.

    Nature methods 2017

  • Exploring the genetic architecture of inflammatory bowel disease by whole-genome sequencing identifies association at ADCY7.

    Luo Y, de Lange KM, Jostins L, Moutsianas L, Randall J, Kennedy NA, Lamb CA, McCarthy S, Ahmad T, Edwards C, Serra EG, Hart A, Hawkey C, Mansfield JC, Mowat C, Newman WG, Nichols S, Pollard M, Satsangi J, Simmons A, Tremelling M, Uhlig H, Wilson DC, Lee JC, Prescott NJ, Lees CW, Mathew CG, Parkes M, Barrett JC and Anderson CA

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK.

    To further resolve the genetic architecture of the inflammatory bowel diseases ulcerative colitis and Crohn's disease, we sequenced the whole genomes of 4,280 patients at low coverage and compared them to 3,652 previously sequenced population controls across 73.5 million variants. We then imputed from these sequences into new and existing genome-wide association study cohorts and tested for association at ∼12 million variants in a total of 16,432 cases and 18,843 controls. We discovered a 0.6% frequency missense variant in ADCY7 that doubles the risk of ulcerative colitis. Despite good statistical power, we did not identify any other new low-frequency risk variants and found that such variants explained little heritability. We detected a burden of very rare, damaging missense variants in known Crohn's disease risk genes, suggesting that more comprehensive sequencing studies will continue to improve understanding of the biology of complex diseases.

    Funded by: Chief Scientist Office: CZB/4/540, ETM/137, ETM/75; Department of Health: NIHR-RP-R3-12-026; Medical Research Council: G0600329, G0800675, G0800759, MC_UU_12010/7, MR/J00314X/1; Wellcome Trust

    Nature genetics 2017;49;2;186-192

  • Effects of long-term ethanol consumption and Aldh1b1 depletion on intestinal tumourigenesis in mice.

    Müller MF, Zhou Y, Adams DJ and Arends MJ

    University of Edinburgh, Division of Pathology, Centre for Comparative Pathology, Cancer Research UK Edinburgh Centre, Institute of Genetics & Molecular Medicine, Western General Hospital, Crewe Road South, Edinburgh, EH4 2XR, UK.

    Ethanol and its metabolite acetaldehyde have been classified as carcinogens for the upper aerodigestive tract, liver, breast, and colorectum. Whereas mechanisms related to oxidative stress and Cyp2e1 induction seem to prevail in the liver, and acetaldehyde has been proposed to play a crucial role in the upper aerodigestive tract, pathological mechanisms in the colorectum have not yet been clarified. Moreover, all evidence for a pro-carcinogenic role of ethanol in colorectal cancer is derived from correlations observed in epidemiological studies or from rodent studies with additional carcinogen application or tumour suppressor gene inactivation. In the current study, wild-type mice and mice with depletion of aldehyde dehydrogenase 1b1 (Aldh1b1), an enzyme which has been proposed to play an important role in acetaldehyde detoxification in the intestines, received ethanol in drinking water for 1 year. Long-term ethanol consumption led to intestinal tumour development in wild-type and Aldh1b1-depleted mice, but no intestinal tumours were observed in water-treated controls. Moreover, a significant increase in DNA damage was detected in the large intestinal epithelium of ethanol-treated mice of both genotypes compared with the respective water-treated groups, along with increased proliferation of the small and large intestinal epithelium. Aldh1b1 depletion led to increased plasma acetaldehyde levels in ethanol-treated mice, to a significant aggravation of ethanol-induced intestinal hyperproliferation, and to more advanced features of intestinal tumours, but it did not affect intestinal tumour incidence. 
    These data indicate that ethanol consumption can initiate intestinal tumourigenesis without any additional carcinogen treatment or tumour suppressor gene inactivation, and we provide evidence for a role of Aldh1b1 in protection of the intestines from ethanol-induced damage, as well as for both carcinogenic and tumour-promoting functions of acetaldehyde, including increased progression of ethanol-induced tumours.

    Funded by: Cancer Research UK: 13031

    The Journal of pathology 2017;241;5;649-660

  • Single-Cell Multiomics: Multiple Measurements from Single Cells.

    Macaulay IC, Ponting CP and Voet T

    Earlham Institute, Norwich Research Park, Norwich NR4 7UH, UK.

    Single-cell sequencing provides information that is not confounded by genotypic or phenotypic heterogeneity of bulk samples. Sequencing of one molecular type (RNA, methylated DNA or open chromatin) in a single cell, furthermore, provides insights into the cell's phenotype and links to its genotype. Nevertheless, only by taking measurements of these phenotypes and genotypes from the same single cells can such inferences be made unambiguously. In this review, we survey the first experimental approaches that assay, in parallel, multiple molecular types from the same single cell, before considering the challenges and opportunities afforded by these and future technologies.

    Funded by: Biotechnology and Biological Sciences Research Council; Medical Research Council; Wellcome Trust

    Trends in genetics : TIG 2017;33;2;155-168

  • fCCAC: functional canonical correlation analysis to evaluate covariance between nucleic acid sequencing datasets.

    Madrigal P

    Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0SZ, UK.

    Summary: Computational evaluation of variability across DNA or RNA sequencing datasets is a crucial step in genomic science, as it allows both to evaluate reproducibility of biological or technical replicates, and to compare different datasets to identify their potential correlations. Here we present fCCAC, an application of functional canonical correlation analysis to assess covariance of nucleic acid sequencing datasets such as chromatin immunoprecipitation followed by deep sequencing (ChIP-seq). We show how this method differs from other measures of correlation, and exemplify how it can reveal shared covariance between histone modifications and DNA binding proteins, such as the relationship between the H3K4me3 chromatin mark and its epigenetic writers and readers.

    Availability and implementation: An R/Bioconductor package is available.


    Supplementary information: Supplementary data are available at Bioinformatics online.

    Funded by: Medical Research Council: MC_PC_12009; Wellcome Trust

    Bioinformatics (Oxford, England) 2017;33;5;746-748

  • The AURORA pilot study for molecular screening of patients with advanced breast cancer: a study of the Breast International Group.

    Maetens M, Brown D, Irrthum A, Aftimos P, Viale G, Loibl S, Laes JF, Campbell PJ, Thompson A, Cortes J, Seiler S, Vinnicombe S, Oliveira M, Rothé F, Bareche Y, Fumagalli D, Zardavas D, Desmedt C, Piccart M, Loi S and Sotiriou C

    J.-C. Heuson Breast Cancer Translational Research Laboratory, Institut Jules Bordet, Université Libre de Bruxelles, Brussels, Belgium.

    Several studies have demonstrated the feasibility of molecular screening of tumour samples for matching patients with cancer to targeted therapies. However, most of them have been carried out at institutional or national level. Herein, we report on the pilot phase of AURORA (NCT02102165), a European multinational collaborative molecular screening initiative for advanced breast cancer patients. Forty-one patients were prospectively enrolled at four participating centres across Europe. Metastatic tumours were biopsied and profiled using an Ion Torrent sequencing platform at a central facility. Sequencing results were obtained for 63% of the patients in real-time with variable turnaround time stemming from delays between patient consent and biopsy. At least one clinically actionable mutation was identified in 73% of patients. We used the Illumina sequencing technology for orthogonal validation and achieved an average of 66% concordance of substitution calls per patient. Additionally, copy number aberrations inferred from the Ion Torrent sequencing were compared to single nucleotide polymorphism arrays and found to be 59% concordant on average. Although this study demonstrates that powerful next generation genomic techniques are logistically ready for international molecular screening programs in routine clinical settings, technical challenges remain to be addressed in order to ensure the accuracy and clinical utility of the genomic data.

    NPJ breast cancer 2017;3;23

  • Nutrient sensing modulates malaria parasite virulence.

    Mancio-Silva L, Slavic K, Grilo Ruivo MT, Grosso AR, Modrzynska KK, Vera IM, Sales-Dias J, Gomes AR, MacPherson CR, Crozet P, Adamo M, Baena-Gonzalez E, Tewari R, Llinás M, Billker O and Mota MM

    Instituto de Medicina Molecular, Faculdade de Medicina da Universidade de Lisboa, 1649-028 Lisboa, Portugal.

    The lifestyle of intracellular pathogens, such as malaria parasites, is intimately connected to that of their host, primarily for nutrient supply. Nutrients act not only as primary sources of energy but also as regulators of gene expression, metabolism and growth, through various signalling networks that enable cells to sense and adapt to varying environmental conditions. Canonical nutrient-sensing pathways are presumed to be absent from the causative agent of malaria, Plasmodium, thus raising the question of whether these parasites can sense and cope with fluctuations in host nutrient levels. Here we show that Plasmodium blood-stage parasites actively respond to host dietary calorie alterations through rearrangement of their transcriptome accompanied by substantial adjustment of their multiplication rate. A kinome analysis combined with chemical and genetic approaches identified KIN as a critical regulator that mediates sensing of nutrients and controls a transcriptional response to the host nutritional status. KIN shares homology with SNF1/AMPKα, and yeast complementation studies suggest that it is part of a functionally conserved cellular energy-sensing pathway. Overall, these findings reveal a key parasite nutrient-sensing mechanism that is critical for modulating parasite replication and virulence.

    Funded by: European Research Council: 311502; Medical Research Council: G0900109, MR/K011782/1; NIAID NIH HHS: F32 AI104252; NIH HHS: DP2 OD001315; Wellcome Trust

    Nature 2017;547;7662;213-216

  • Visualization and Quantification of Browning Using a Ucp1-2A-Luciferase Knock-in Mouse Model.

    Mao L, Nie B, Nie T, Hui X, Gao X, Lin X, Liu X, Xu Y, Tang X, Yuan R, Li K, Li P, Ding K, Wang Y, Xu A, Fei J, Han W, Liu P, Madsen L, Kristiansen K, Zhou Z, Ding S and Wu D

    CAS Key Laboratory of Regenerative Biology, Joint School of Life Sciences, Guangzhou Medical University, and Guangzhou Institute of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China.

    Both mammals and adult humans possess classic brown adipocytes and beige adipocytes, and the amount and activity of these adipocytes are considered key factors in combating obesity and its associated metabolic diseases. Uncoupling protein 1 (Ucp1) is the functional marker of both brown and beige adipocytes. To facilitate a reliable, easy, and sensitive measurement of Ucp1 expression both in vivo and in vitro, we generated a Ucp1-2A-luciferase knock-in mouse by deleting the stop codon for the mouse Ucp1 gene and replacing it with a 2A peptide. This peptide was followed by the luciferase coding sequence to recapitulate the expression of the Ucp1 gene at the transcriptional and translational levels. With this mouse, we discovered a cold-sensitive brown/beige adipose depot underneath the skin of the ears, which we named uBAT. Because of the sensitivity and high dynamic range of luciferase activity, the Ucp1-2A-luciferase mouse is useful for both in vitro quantitative determination and in vivo visualization of nonshivering thermogenesis. With the use of this model, we identified and characterized axitinib, an oral small-molecule tyrosine kinase inhibitor, as an effective browning agent.

    Diabetes 2017;66;2;407-417

  • How Single-Cell Genomics Is Changing Evolutionary and Developmental Biology.

    Marioni JC and Arendt D

    Wellcome Genome Campus, EMBL-European Bioinformatics Institute, Cambridge CB10 1SD, United Kingdom.

    The recent flood of single-cell data not only boosts our knowledge of cells and cell types, but also provides new insight into development and evolution from a cellular perspective. For example, assaying the genomes of multiple cells during development reveals developmental lineage trees (the kinship lineage), whereas cellular transcriptomes inform us about the regulatory state of cells and their gradual restriction in potency (the Waddington lineage). Beyond that, the comparison of single-cell data across species allows evolutionary changes to be tracked at all stages of development from the zygote, via different kinds of stem cells, to the differentiating cells. We discuss recent insights into the evolution of stem cells and initial attempts to reconstruct the evolutionary cell type tree of the mammalian forebrain, for example, by the comparative analysis of neuron types in the mesencephalic floor. These studies illustrate the immense potential of single-cell genomics to open up a new era in developmental and evolutionary research. Expected final online publication date for the Annual Review of Cell and Developmental Biology Volume 33 is October 6, 2017.

    Annual review of cell and developmental biology 2017

  • Diagnostics for yaws eradication: insights from direct next generation sequencing of cutaneous strains of Treponema pallidum.

    Marks M, Fookes M, Wagner J, Butcher R, Ghinai R, Sokana O, Sarkodie YA, Lukehart SA, Solomon AW, Mabey DCW and Thomson N

    Clinical Research Department, Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, London, United Kingdom.

    Background: Yaws-like chronic ulcers can be caused by Treponema pallidum subsp. pertenue, Haemophilus ducreyi, or other still-undefined bacteria. To permit accurate evaluation of yaws elimination efforts, programmatic use of molecular diagnostics is required. The accuracy and sensitivity of current tools remain unclear because our understanding of T. pallidum diversity is limited by the low number of sequenced genomes.

    Methods: We tested samples from patients with suspected yaws collected in previous studies in the Solomon Islands and Ghana. All samples were from patients whose lesions had previously tested negative using the current CDC diagnostic assay in widespread use. However, some of these patients had positive serological assays for yaws on blood. We used direct whole genome sequencing to identify T. p. subsp. pertenue strains missed by the current assay.

    Results: From 45 Solomon Islands and 27 Ghanaian samples, 11 were positive for T. pallidum DNA using the species-wide qPCR, from which we obtained 6 previously undetected T. p. subsp. pertenue whole genome sequences. These sequences show that Solomon Islands sequences represent distinct T. p. subsp. pertenue clades. These isolates were invisible to the CDC diagnostic PCR assay in widespread current use, due to sequence variation in the primer binding site.

    Conclusion: Our data double the number of published T. p. subsp. pertenue genomes. We show that Solomon Islands strains are undetectable by the PCR used in many studies and by health ministries. This assay is therefore not adequate for the eradication programme. Next-generation genome sequence data are essential for these efforts.

    Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 2017

  • An Unexpectedly Complex Architecture for Skin Pigmentation in Africans.

    Martin AR, Lin M, Granka JM, Myrick JW, Liu X, Sockell A, Atkinson EG, Werely CJ, Möller M, Sandhu MS, Kingsley DM, Hoal EG, Liu X, Daly MJ, Feldman MW, Gignoux CR, Bustamante CD and Henn BM

    Department of Genetics, Stanford University, Stanford, CA 94305, USA; Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02141, USA; Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA 02141, USA.

    Approximately 15 genes have been directly associated with skin pigmentation variation in humans, leading to its characterization as a relatively simple trait. However, by assembling a global survey of quantitative skin pigmentation phenotypes, we demonstrate that pigmentation is more complex than previously assumed, with genetic architecture varying by latitude. We investigate polygenicity in the KhoeSan populations indigenous to southern Africa who have considerably lighter skin than equatorial Africans. We demonstrate that skin pigmentation is highly heritable, but known pigmentation loci explain only a small fraction of the variance. Rather, baseline skin pigmentation is a complex, polygenic trait in the KhoeSan. Despite this, we identify canonical and non-canonical skin pigmentation loci, including near SLC24A5, TYRP1, SMARCA2/VLDLR, and SNX13, using a genome-wide association approach complemented by targeted resequencing. By considering diverse, under-studied African populations, we show how the architecture of skin pigmentation can vary across humans subject to different local evolutionary pressures.

    Cell 2017;171;6;1340-1353.e14

  • cuRRBS: simple and robust evaluation of enzyme combinations for reduced representation approaches.

    Martin-Herranz DE, Ribeiro AJM, Krueger F, Thornton JM, Reik W and Stubbs TM

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK.

    DNA methylation is an important epigenetic modification in many species that is critical for development, and implicated in ageing and many complex diseases, such as cancer. Many cost-effective genome-wide analyses of DNA modifications rely on restriction enzymes capable of digesting genomic DNA at defined sequence motifs. There are hundreds of restriction enzyme families but few are used to date, because no tool is available for the systematic evaluation of restriction enzyme combinations that can enrich for certain sites of interest in a genome. Herein, we present customised Reduced Representation Bisulfite Sequencing (cuRRBS), a novel and easy-to-use computational method that solves this problem. By computing the optimal enzymatic digestions and size selection steps required, cuRRBS generalises the traditional MspI-based Reduced Representation Bisulfite Sequencing (RRBS) protocol to all restriction enzyme combinations. In addition, cuRRBS estimates the fold-reduction in sequencing costs and provides a robustness value for the personalised RRBS protocol, allowing users to tailor the protocol to their experimental needs. Moreover, we show in silico that cuRRBS-defined restriction enzymes consistently out-perform MspI digestion in many biological systems, considering both CpG and CHG contexts. Finally, we have validated the accuracy of cuRRBS predictions for single and double enzyme digestions using two independent experimental datasets.

    Nucleic acids research 2017

  • Universal Patterns of Selection in Cancer and Somatic Tissues.

    Martincorena I, Raine KM, Gerstung M, Dawson KJ, Haase K, Van Loo P, Davies H, Stratton MR and Campbell PJ

    Wellcome Trust Sanger Institute, Hinxton CB10 1SA, Cambridgeshire, UK.

    Cancer develops as a result of somatic mutation and clonal selection, but quantitative measures of selection in cancer evolution are lacking. We adapted methods from molecular evolution and applied them to 7,664 tumors across 29 cancer types. Unlike species evolution, positive selection outweighs negative selection during cancer development. On average, <1 coding base substitution/tumor is lost through negative selection, with purifying selection almost absent outside homozygous loss of essential genes. This allows exome-wide enumeration of all driver coding mutations, including outside known cancer genes. On average, tumors carry ∼4 coding substitutions under positive selection, ranging from <1/tumor in thyroid and testicular cancers to >10/tumor in endometrial and colorectal cancers. Half of driver substitutions occur in yet-to-be-discovered cancer genes. With increasing mutation burden, numbers of driver mutations increase, but not linearly. We systematically catalog cancer genes and show that genes vary extensively in what proportion of mutations are drivers versus passengers.

    Funded by: Wellcome Trust

    Cell 2017;171;5;1029-1041.e21

  • Aging increases cell-to-cell transcriptional variability upon immune stimulation.

    Martinez-Jimenez CP, Eling N, Chen HC, Vallejos CA, Kolodziejczyk AA, Connor F, Stojic L, Rayner TF, Stubbington MJT, Teichmann SA, de la Roche M, Marioni JC and Odom DT

    University of Cambridge, Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, CB2 0RE, UK.

    Aging is characterized by progressive loss of physiological and cellular functions, but the molecular basis of this decline remains unclear. We explored how aging affects transcriptional dynamics using single-cell RNA sequencing of unstimulated and stimulated naïve and effector memory CD4+ T cells from young and old mice from two divergent species. In young animals, immunological activation drives a conserved transcriptomic switch, resulting in tightly controlled gene expression characterized by a strong up-regulation of a core activation program, coupled with a decrease in cell-to-cell variability. Aging perturbed the activation of this core program and increased expression heterogeneity across populations of cells in both species. These discoveries suggest that increased cell-to-cell transcriptional variability will be a hallmark feature of aging across most, if not all, mammalian tissues.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/I015914/3; Cancer Research UK: A15603, A22257; European Research Council: 615584, 646794; Medical Research Council: MC_UP_0801/1; Wellcome Trust: 098051, 107609, 202878ODOM

    Science (New York, N.Y.) 2017;355;6332;1433-1436

  • Detection and quantitation of copy number variation in the voltage-gated sodium channel gene of the mosquito Culex quinquefasciatus.

    Martins WFS, Subramaniam K, Steen K, Mawejje H, Liloglou T, Donnelly MJ and Wilding CS

    Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool, UK.

    Insecticide resistance is typically associated with alterations to the insecticidal target-site or with gene expression variation at loci involved in insecticide detoxification. In some species copy number variation (CNV) of target site loci (e.g. the Ace-1 target site of carbamate insecticides) or detoxification genes has been implicated in the resistance phenotype. We show that field-collected Ugandan Culex quinquefasciatus display CNV for the voltage-gated sodium channel gene (Vgsc), target-site of pyrethroid and organochlorine insecticides. In order to develop field-applicable diagnostics for Vgsc CN, and as a prelude to investigating the possible association of CN with insecticide resistance, three assays were compared for their accuracy in CN estimation in this species. The gold standard method is droplet digital PCR (ddPCR); however, the hardware is prohibitively expensive for widespread utility. Here, ddPCR was compared to quantitative PCR (qPCR) and pyrosequencing. Across all platforms, CNV was detected in ≈10% of mosquitoes, corresponding to three or four copies (per diploid genome). ddPCR and qPCR-Std-curve yielded similar predictions for Vgsc CN, indicating that the qPCR protocol developed here can be applied as a diagnostic assay, facilitating monitoring of Vgsc CN in wild populations and the elucidation of association between the Vgsc CN and insecticide resistance.

    Funded by: NIAID NIH HHS: R01 AI116811, U19 AI089674

    Scientific reports 2017;7;1;5821

  • Human Y chromosome copy number variation in the next generation sequencing era and beyond.

    Massaia A and Xue Y

    National Heart and Lung Institute, Imperial College London, London, SW7 2AZ, UK.

    The human Y chromosome provides a fertile ground for structural rearrangements owing to its haploidy and high content of repeated sequences. The methodologies used for copy number variation (CNV) studies have developed over the years. Low-throughput techniques based on direct observation of rearrangements were developed early on, and are still used, often to complement array-based or sequencing approaches which have limited power in regions with high repeat content and specifically in the presence of long, identical repeats, such as those found in human sex chromosomes. Some specific rearrangements have been investigated for decades; because of their effects on fertility, or their outstanding evolutionary features, the interest in these has not diminished. However, following the flourishing of large-scale genomics, several studies have investigated CNVs across the whole chromosome. These studies sometimes employ data generated within large genomic projects such as the DDD study or the 1000 Genomes Project, and often survey large samples of healthy individuals without any prior selection. Novel technologies based on sequencing long molecules, and combinations of technologies, promise to stimulate the study of Y-CNVs in the immediate future.

    Human genetics 2017

  • Temporal Tracking of Microglia Activation in Neurodegeneration at Single-Cell Resolution.

    Mathys H, Adaikkan C, Gao F, Young JZ, Manet E, Hemberg M, De Jager PL, Ransohoff RM, Regev A and Tsai LH

    Picower Institute for Learning and Memory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

    Microglia, the tissue-resident macrophages in the brain, are damage sensors that react to nearly any perturbation, including neurodegenerative diseases such as Alzheimer's disease (AD). Here, using single-cell RNA sequencing, we determined the transcriptome of more than 1,600 individual microglia cells isolated from the hippocampus of a mouse model of severe neurodegeneration with AD-like phenotypes and of control mice at multiple time points during progression of neurodegeneration. In this neurodegeneration model, we discovered two molecularly distinct reactive microglia phenotypes that are typified by modules of co-regulated type I and type II interferon response genes, respectively. Furthermore, our work identified previously unobserved heterogeneity in the response of microglia to neurodegeneration, discovered disease stage-specific microglia cell states, revealed the trajectory of cellular reprogramming of microglia in response to neurodegeneration, and uncovered the underlying transcriptional programs.

    Cell reports 2017;21;2;366-380

  • Pro-inflammatory fatty acid profile and colorectal cancer risk: A Mendelian randomisation analysis.

    May-Wilson S, Sud A, Law PJ, Palin K, Tuupanen S, Gylfe A, Hänninen UA, Cajuso T, Tanskanen T, Kondelin J, Kaasinen E, Sarin AP, Eriksson JG, Rissanen H, Knekt P, Pukkala E, Jousilahti P, Salomaa V, Ripatti S, Palotie A, Renkonen-Sinisalo L, Lepistö A, Böhm J, Mecklin JP, Al-Tassan NA, Palles C, Farrington SM, Timofeeva MN, Meyer BF, Wakil SM, Campbell H, Smith CG, Idziaszczyk S, Maughan TS, Fisher D, Kerr R, Kerr D, Passarelli MN, Figueiredo JC, Buchanan DD, Win AK, Hopper JL, Jenkins MA, Lindor NM, Newcomb PA, Gallinger S, Conti D, Schumacher F, Casey G, Aaltonen LA, Cheadle JP, Tomlinson IP, Dunlop MG and Houlston RS

    Division of Genetics and Epidemiology, The Institute of Cancer Research, London, SW7 3RP, UK.

    Background: While dietary fat has been established as a risk factor for colorectal cancer (CRC), associations between fatty acids (FAs) and CRC have been inconsistent. Using Mendelian randomisation (MR), we sought to evaluate associations between polyunsaturated (PUFA), monounsaturated (MUFA) and saturated FAs (SFAs) and CRC risk.

    Methods: We analysed genotype data on 9254 CRC cases and 18,386 controls of European ancestry. Externally weighted polygenic risk scores were generated and used to evaluate associations with CRC per one standard deviation increase in genetically defined plasma FA levels.

    Results: Risk reduction was observed for oleic and palmitoleic MUFAs (OR_OA = 0.77, 95% CI: 0.65-0.92, P = 3.9 × 10^-3; OR_POA = 0.36, 95% CI: 0.15-0.84, P = 0.018). PUFAs linoleic and arachidonic acid had negative and positive associations with CRC respectively (OR_LA = 0.95, 95% CI: 0.93-0.98, P = 3.7 × 10^-4; OR_AA = 1.05, 95% CI: 1.02-1.07, P = 1.7 × 10^-4). The SFA stearic acid was associated with increased CRC risk (OR_SA = 1.17, 95% CI: 1.01-1.35, P = 0.041).

    Conclusion: Results from our analysis are broadly consistent with a pro-inflammatory FA profile having a detrimental effect in terms of CRC risk.

    European journal of cancer (Oxford, England : 1990) 2017;84;228-238

  • The BEACH-domain containing protein, Nbeal2, interacts with Dock7, Sec16a and Vac14.

    Mayer L, Jasztal M, Pardo M, Aguera de Haro S, Collins J, Bariana TK, Smethurst PA, Grassi L, Petersen R, Nurden P, Favier R, Yu L, Meacham S, Astle WJ, Choudhary J, Yue WW, Ouwehand WH and Guerrero JA

    Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom.

    Mutations in NBEAL2, the gene encoding the scaffolding protein Nbeal2, cause Gray Platelet Syndrome (GPS), a rare recessive bleeding disorder characterized by platelets lacking α-granules and progressive marrow fibrosis. We present here the interactome of Nbeal2 with additional validation by reverse immunoprecipitation of Dock7, Sec16a and Vac14 as interactors of Nbeal2. We show that GPS-causing mutations in its BEACH domain have profound and possible effects on the interaction with Dock7 and Vac14, respectively. Proximity ligation assays show that these two proteins are physically proximal to Nbeal2 in human megakaryocytes. In addition, we demonstrate that Nbeal2 is primarily localized in the cytoplasm and Dock7 on the membrane of or in α-granules. Interestingly, platelets from GPS cases and Nbeal2-/- mice are almost devoid of Dock7, resulting in a profound dysregulation of its signaling pathway, leading to defective actin polymerization, platelet activation and shape change. This study shows for the first time proteins interacting with Nbeal2 and points to the dysregulation of the canonical signaling pathway of Dock7 as a possible cause of the aberrant formation of platelets in GPS cases and Nbeal2-deficient mice.

    Blood 2017

  • Draft Genome Sequence of an IMP-7-Producing Pseudomonas aeruginosa Bloodstream Infection Isolate from Australia.

    McCarthy KL, Jennison A, Wailan AM and Paterson DL

    University of Queensland, Centre for Clinical Research, Brisbane, Queensland, Australia

    IMP-7 is one of the two IMP-type carbapenemases that in Pseudomonas aeruginosa are not limited to a geographic area, but it has not been previously reported in the Australian setting. We report here the draft genome sequence of an Australian P. aeruginosa bloodstream infection isolate that contains IMP-7.

    Genome announcements 2017;5;27

  • Draft Genome Sequences of Two Pseudomonas aeruginosa Bloodstream Infection Isolates Associated with Rapid Patient Death.

    McCarthy KL, Jennison AV, Wailan AM and Paterson DL

    The University of Queensland, UQ Centre for Clinical Research, Brisbane, Queensland, Australia

    The morbidity and mortality associated with Pseudomonas aeruginosa bloodstream infections are significant. New strategies are required to treat such infections. We report here the draft genome sequences of two antibiotic-sensitive P. aeruginosa bloodstream infection isolates that were associated with rapid death in nonneutropenic patients.

    Genome announcements 2017;5;33

  • Sibling recurrence of total anomalous pulmonary venous drainage.

    McDermott JH, DDD Study and Clayton-Smith J

    Manchester Centre for Genomic Medicine, St. Mary's Hospital, Central Manchester Foundation Trust, Oxford Road, Manchester, M13 9WL, United Kingdom.

    Many childhood syndromic disorders are associated with congenital heart defects, but few present specifically with total anomalous pulmonary venous drainage (TAPVD). Here, we report two siblings presenting with TAPVD, tracheo-oesophageal fistula and dysmorphic features in the neonatal period. Careful examination of the mother revealed subtle facial asymmetry and a pre-auricular tag, suggesting a potential variable expression of a dominant disorder. Whole exome sequencing identified a pathogenic heterozygous mutation in EFTUD2, a gene normally associated with mandibulofacial dysostosis Guion-Almeida type (MFDGA), in both siblings and the mother. This is the first report of TAPVD occurring as part of the MFDGA phenotype. It serves to highlight the importance of modern sequencing panels in identifying causative mutations for heterogeneous syndromes such as MFDGA and familial congenital heart defects whilst emphasising the relevance of variable expression when counselling parents.

    European journal of medical genetics 2017

  • Variants in the fetal genome near FLT1 are associated with risk of preeclampsia.

    McGinnis R, Steinthorsdottir V, Williams NO, Thorleifsson G, Shooter S, Hjartardottir S, Bumpstead S, Stefansdottir L, Hildyard L, Sigurdsson JK, Kemp JP, Silva GB, Thomsen LCV, Jääskeläinen T, Kajantie E, Chappell S, Kalsheker N, Moffett A, Hiby S, Lee WK, Padmanabhan S, Simpson NAB, Dolby VA, Staines-Urias E, Engel SM, Haugan A, Trogstad L, Svyatova G, Zakhidova N, Najmutdinova D, FINNPEC Consortium, GOPEC Consortium, Dominiczak AF, Gjessing HK, Casas JP, Dudbridge F, Walker JJ, Pipkin FB, Thorsteinsdottir U, Geirsson RT, Lawlor DA, Iversen AC, Magnus P, Laivuori H, Stefansson K and Morgan L

    Wellcome Trust Sanger Institute, Cambridge, UK.

    Preeclampsia, which affects approximately 5% of pregnancies, is a leading cause of maternal and perinatal death. The causes of preeclampsia remain unclear, but there is evidence for inherited susceptibility. Genome-wide association studies (GWAS) have not identified maternal sequence variants of genome-wide significance that replicate in independent data sets. We report the first GWAS of offspring from preeclamptic pregnancies and discovery of the first genome-wide significant susceptibility locus (rs4769613; P = 5.4 × 10^-11) in 4,380 cases and 310,238 controls. This locus is near the FLT1 gene encoding Fms-like tyrosine kinase 1, providing biological support, as a placental isoform of this protein (sFlt-1) is implicated in the pathology of preeclampsia. The association was strongest in offspring from pregnancies in which preeclampsia developed during late gestation and offspring birth weights exceeded the tenth centile. An additional nearby variant, rs12050029, associated with preeclampsia independently of rs4769613. The newly discovered locus may enhance understanding of the pathophysiology of preeclampsia and its subtypes.

    Funded by: Medical Research Council: MC_PC_15018, MC_UU_12013/5; NICHD NIH HHS: R01 HD058008

    Nature genetics 2017;49;8;1255-1260

  • <i>JAK2</i> V617F hematopoietic clones are present several years prior to MPN diagnosis and follow different expansion kinetics.

    McKerrell T, Park N, Chi J, Collord G, Moreno T, Ponstingl H, Dias J, Gerasimou P, Melanthiou K, Prokopiou C, Antoniades M, Varela I, Costeas PA and Vassiliou GS

    The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom.

    Funded by: Medical Research Council: MC_PC_12009

    Blood advances 2017;1;14;968-971

  • Pro-death NMDA receptor signaling is promoted by the GluN2B C-terminus independently of DAPK1.

    McQueen J, Ryan TJ, McKay S, Marwick KF, Baxter PS, Carpanini SM, Wishart TM, Gillingwater TH, Manson JC, Wyllie DJ, Grant SG, McColl B, Komiyama N and Hardingham GE

    UK Dementia Research Institute, Edinburgh Medical School, University of Edinburgh, Edinburgh, United Kingdom.

    Aberrant NMDA receptor (NMDAR) activity contributes to several neurological disorders, but direct antagonism is poorly tolerated therapeutically. The GluN2B cytoplasmic C-terminal domain (CTD) represents an alternative therapeutic target since it potentiates excitotoxic signaling. The key GluN2B CTD-centred event in excitotoxicity is proposed to involve its phosphorylation at Ser-1303 by DAPK1, which is blocked by a neuroprotective cell-permeable peptide mimetic of the region. Contrary to this model, we find that excitotoxicity can proceed without increased Ser-1303 phosphorylation, and is unaffected by DAPK1 deficiency in vitro or following ischemia in vivo. Pharmacological analysis of the aforementioned neuroprotective peptide revealed that it acts in a sequence-independent manner as an open-channel NMDAR antagonist at or near the Mg<sup>2+</sup> site, due to its high net positive charge. Thus, GluN2B-driven excitotoxic signaling can proceed independently of DAPK1 or altered Ser-1303 phosphorylation.

    eLife 2017;6

  • Disease model discovery from 3,328 gene knockouts by The International Mouse Phenotyping Consortium.

    Meehan TF, Conte N, West DB, Jacobsen JO, Mason J, Warren J, Chen CK, Tudose I, Relac M, Matthews P, Karp N, Santos L, Fiegel T, Ring N, Westerberg H, Greenaway S, Sneddon D, Morgan H, Codner GF, Stewart ME, Brown J, Horner N, International Mouse Phenotyping Consortium, Haendel M, Washington N, Mungall CJ, Reynolds CL, Gallegos J, Gailus-Durner V, Sorg T, Pavlovic G, Bower LR, Moore M, Morse I, Gao X, Tocchini-Valentini GP, Obata Y, Cho SY, Seong JK, Seavitt J, Beaudet AL, Dickinson ME, Herault Y, Wurst W, de Angelis MH, Lloyd KCK, Flenniken AM, Nutter LMJ, Newbigging S, McKerlie C, Justice MJ, Murray SA, Svenson KL, Braun RE, White JK, Bradley A, Flicek P, Wells S, Skarnes WC, Adams DJ, Parkinson H, Mallon AM, Brown SDM and Smedley D

    European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK.

    Although next-generation sequencing has revolutionized the ability to associate variants with human diseases, diagnostic rates and development of new therapies are still limited by a lack of knowledge of the functions and pathobiological mechanisms of most genes. To address this challenge, the International Mouse Phenotyping Consortium is creating a genome- and phenome-wide catalog of gene function by characterizing new knockout-mouse strains across diverse biological systems through a broad set of standardized phenotyping tests. All mice will be readily available to the biomedical community. Analyzing the first 3,328 genes identified models for 360 diseases, including the first models, to our knowledge, for type C Bernard-Soulier, Bardet-Biedl-5 and Gordon Holmes syndromes. 90% of our phenotype annotations were novel, providing functional evidence for 1,092 genes and candidates in genetically uncharacterized diseases including arrhythmogenic right ventricular dysplasia 3. Finally, we describe our role in variant functional validation with The 100,000 Genomes Project and others.

    Funded by: Medical Research Council: MC_U142684171, MC_U142684172; NCI NIH HHS: P30 CA034196; NHGRI NIH HHS: U54 HG006332, U54 HG006348, U54 HG006364, U54 HG006370, UM1 HG006348, UM1 HG006370; NIH HHS: R24 OD011883, U42 OD011174, U42 OD011175, U42 OD011185, U42 OD012210, UM1 OD023221, UM1 OD023222; Wellcome Trust

    Nature genetics 2017;49;8;1231-1238

  • The Typhoid Vaccine Acceleration Consortium (TyVAC): Vaccine effectiveness study designs: Accelerating the introduction of typhoid conjugate vaccines and reducing the global burden of enteric fever. Report from a meeting held on 26-27 October 2016, Oxford, UK.

    Meiring JE, Gibani M and TyVAC Consortium Meeting Group

    Oxford Vaccine Group, Department of Paediatrics, University of Oxford, and the NIHR Oxford Biomedical Research Centre, Oxford, United Kingdom.

    Typhoid fever is estimated to cause between 11.9 and 26.9 million infections globally each year with 129,000-216,510 deaths. Access to improved water sources has reduced disease incidence in parts of the world but the use of efficacious vaccines is seen as an important public health tool for countries with a high disease burden. A new generation of Vi typhoid conjugate vaccines (TCVs), licensed for use in young children and expected to provide longer lasting protection than previous vaccines, is now available. The WHO Strategic Advisory Group of Experts on Immunization (SAGE) has convened a working group to review the evidence on TCVs and produce an updated WHO position paper for all typhoid vaccines in 2018 that will inform Gavi, the Vaccine Alliance's future vaccine investment strategies for TCVs. The Typhoid Vaccine Acceleration Consortium (TyVAC) has been formed through a $36.9 million funding program from the Bill & Melinda Gates Foundation to accelerate the introduction of TCVs into Gavi-eligible countries. In October 2016, a meeting was held to initiate planning of TCV effectiveness studies that will provide the data required by policy makers and stakeholders to support decisions on TCV use in countries with a high typhoid burden. Discussion topics included (1) the latest evidence and data gaps in typhoid epidemiology; (2) WHO and Gavi methods and data requirements; (3) data on TCV efficacy; (4) cost effectiveness analysis for TCVs from mathematical models; (5) TCV delivery and effectiveness study design. Specifically, participants were asked to comment on study design in 3 sites for which population-based typhoid surveillance is underway. The conclusion of the meeting was that country-level decision making would best be informed by the respective selected sites in Africa and Asia vaccinating children aged from 9 months to 15 years, employing either an individual or cluster randomized design with design influenced by population characteristics, transmission dynamics, and statistical considerations.

    Vaccine 2017

  • Dietary (Poly)phenols, Brown Adipose Tissue Activation, and Energy Expenditure: A Narrative Review.

    Mele L, Bidault G, Mena P, Crozier A, Brighenti F, Vidal-Puig A and Del Rio D

    Laboratory of Phytochemicals in Physiology, Department of Food and Drugs, University of Parma, Parma, Italy.

    The incidence of overweight and obesity has reached epidemic proportions, making the control of body weight and its complications a primary health problem. Diet has long played a first-line role in preventing and managing obesity. However, beyond the obvious strategy of restricting caloric intake, growing evidence supports the specific antiobesity effects of some food-derived components, particularly (poly)phenolic compounds. The relatively new rediscovery of active brown adipose tissue in adult humans has generated interest in this tissue as a novel and viable target for stimulating energy expenditure and controlling body weight by promoting energy dissipation. This review critically discusses the evidence supporting the concept that the antiobesity effects ascribed to (poly)phenols might be dependent on their capacity to promote energy dissipation by activating brown adipose tissue. Although discrepancies exist in the literature, most in vivo studies with rodents strongly support the role of some (poly)phenol classes, particularly flavan-3-ols and resveratrol, in promoting energy expenditure. Some human data currently are available and most are consistent with studies in rodents. Further investigation of effects in humans is warranted.

    Advances in nutrition (Bethesda, Md.) 2017;8;5;694-704

  • Rapid detection and evolutionary analysis of Legionella pneumophila serogroup 1 sequence type 47.

    Mentasti M, Cassier P, David S, Ginevra C, Gomez-Valero L, Underwood A, Afshar B, Etienne J, Parkhill J, Chalker V, Buchrieser C, Harrison TG, Jarraud S and ESCMID Study Group for Legionella Infections (ESGLI)

    Public Health England, London, UK.

    Objectives: Legionella pneumophila serogroup 1 (Lp1) sequence type 47 is the leading cause of legionellosis in north-western Europe, but, surprisingly, it is rarely isolated from environmental samples. Comparative genomics was applied to develop a PCR assay and to better understand the evolution of this strain.

    Methods: Comparative analysis of 36 genomes representative of the Lp species was used to identify specific PCR targets, which were then evaluated in silico on 545 sequenced genomes and in vitro on 436 Legionella strains, 106 respiratory samples, and three environmental samples from proven ST47 sources. Phylogenetic analyses were performed to understand the evolution of ST47.

    Results: The gene LPO_1073 was characterized as being 100% conserved in all 129 ST47 genomes analysed. A real-time PCR designed to detect LPO_1073 was positive for all 110 ST47 strains tested and agreed with culture and typing results previously obtained for 106 respiratory samples. The three environmental samples were also positive. Surprisingly, 26 of the 44 ST109 strains tested among 342 non-ST47 strains scored positive for LPO_1073. SNP-based phylogenetic analysis was undertaken to understand this result: the PCR-positive ST109 genomes were almost identical to ST47 genomes, with the exception of a recombined region probably acquired by ST47 from a ST62(-like) strain.

    Conclusion: The genomic analysis allowed the design of a highly specific PCR assay for rapid detection of ST47 strains. Furthermore, it allowed us to uncover the evolution of ST47 strains from ST109 by homologous recombination with ST62. We hypothesize that this recombination generated the leading cause of legionellosis in north-western Europe.

    Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases 2017;23;4;264.e1-264.e9

  • Identification and characterization of the novel colonization factor CS30 based on whole genome sequencing in enterotoxigenic Escherichia coli (ETEC).

    Mentzer AV, Tobias J, Wiklund G, Nordqvist S, Aslett M, Dougan G, Sjöling Å and Svennerholm AM

    Department of Microbiology and Immunology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden.

    The ability to colonize the small intestine is essential for enterotoxigenic Escherichia coli (ETEC) to cause diarrhea. Although 22 antigenically different colonization factors (CFs) have been identified and characterized in ETEC, at least 30% of clinical ETEC isolates lack known CFs. Ninety-four whole genome sequenced "CF negative" isolates were searched for novel CFs using a reverse genetics approach followed by phenotypic analyses. We identified a novel CF, CS30, encoded by a set of seven genes, csmA-G, related to the human CF operon CS18 and the porcine CF operon 987P (F6). CS30 was shown to be thermo-regulated, expressed at 37 °C, but not at 20 °C, by SDS-PAGE and mass spectrometry analyses as well as electron microscopy imaging. Bacteria expressing CS30 were also shown to bind to differentiated human intestinal Caco-2 cells. The genes encoding CS30 were located on a plasmid (E873p3) together with the genes encoding LT and STp. PCR screening of ETEC isolates revealed that 8.6% (n = 13) of "CF negative" (n = 152) and 19.4% (n = 13) of "CF negative" LT + STp (n = 67) expressing isolates analyzed harbored CS30. Hence, we conclude that CS30 is common among "CF negative" LT + STp isolates and is associated with ETEC that cause diarrhea.

    Scientific reports 2017;7;1;12514

  • Pacritinib versus best available therapy for the treatment of myelofibrosis irrespective of baseline cytopenias (PERSIST-1): an international, randomised, phase 3 trial.

    Mesa RA, Vannucchi AM, Mead A, Egyed M, Szoke A, Suvorov A, Jakucs J, Perkins A, Prasad R, Mayer J, Demeter J, Ganly P, Singer JW, Zhou H, Dean JP, Te Boekhorst PA, Nangalia J, Kiladjian JJ and Harrison CN

    Division of Hematology and Medical Oncology, Mayo Clinic, Scottsdale, AZ, USA.

    Background: Available therapies for myelofibrosis can exacerbate cytopenias and are not indicated for patients with severe thrombocytopenia. Pacritinib, which inhibits both JAK2 and FLT3, induced spleen responses with limited myelosuppression in phase 1/2 trials. We aimed to assess the efficacy and safety of pacritinib versus best available therapy in patients with myelofibrosis irrespective of baseline cytopenias.

    Methods: This international, multicentre, randomised, phase 3 trial (PERSIST-1) was done at 67 sites in 12 countries. Patients with higher-risk myelofibrosis (with no exclusions for baseline anaemia or thrombocytopenia) were randomly assigned (2:1) to receive oral pacritinib 400 mg once daily or best available therapy (BAT) excluding JAK2 inhibitors until disease progression or unacceptable toxicity. Randomisation was stratified by risk category, platelet count, and region. Treatment assignments were known to investigators, site personnel, patients, clinical monitors, and pharmacovigilance personnel. The primary endpoint was spleen volume reduction (SVR) of 35% or more from baseline to week 24 in the intention-to-treat population as assessed by blinded, centrally reviewed MRI or CT. We did safety analyses in all randomised patients who received either treatment. Here we present the final data. This trial is registered with ClinicalTrials.gov, number NCT01773187.

    Findings: Between Jan 8, 2013, and Aug 1, 2014, 327 patients were randomly assigned to pacritinib (n=220) or BAT (n=107). Median follow-up was 23·2 months (IQR 14·8-28·7). At week 24, the primary endpoint of SVR of 35% or more was achieved by 42 (19%) patients in the pacritinib group versus five (5%) patients in the BAT group (p=0·0003). 90 patients in the BAT group crossed over to receive pacritinib at a median of 6·3 months (IQR 5·8-6·7). The most common grade 3-4 adverse events through week 24 were anaemia (n=37 [17%]), thrombocytopenia (n=26 [12%]), and diarrhoea (n=11 [5%]) in the pacritinib group, and anaemia (n=16 [15%]), thrombocytopenia (n=12 [11%]), dyspnoea (n=3 [3%]), and hypotension (n=3 [3%]) in the BAT group. The most common serious adverse events that occurred through week 24 were anaemia (10 [5%]), cardiac failure (5 [2%]), pyrexia (4 [2%]), and pneumonia (4 [2%]) with pacritinib, and anaemia (5 [5%]), sepsis (2 [2%]), and dyspnoea (2 [2%]) with BAT. Deaths due to adverse events were observed in 27 (12%) patients in the pacritinib group and 14 (13%) patients in the BAT group throughout the duration of the study.

    Interpretation: Pacritinib therapy was well tolerated and induced significant and sustained SVR and symptom reduction, even in patients with severe baseline cytopenias. Pacritinib could be a treatment option for patients with myelofibrosis, including those with baseline cytopenias for whom options are particularly limited.

    Funding: CTI BioPharma Corp.

    Funded by: NCI NIH HHS: P30 CA054174

    The Lancet. Haematology 2017;4;5;e225-e236

  • Nonenzymatic gluconeogenesis-like formation of fructose 1,6-bisphosphate in ice.

    Messner CB, Driscoll PC, Piedrafita G, De Volder MFL and Ralser M

    The Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London NW1 1AT, United Kingdom.

    The evolutionary origins of metabolism, in particular the emergence of the sugar phosphates that constitute glycolysis, the pentose phosphate pathway, and the RNA and DNA backbone, are largely unknown. In cells, a major source of glucose and the large sugar phosphates is gluconeogenesis. This ancient anabolic pathway (re-)builds carbon bonds as cleaved in glycolysis in an aldol condensation of the unstable catabolites glyceraldehyde 3-phosphate and dihydroxyacetone phosphate, forming the much more stable fructose 1,6-bisphosphate. We here report the discovery of a nonenzymatic counterpart to this reaction. The in-ice nonenzymatic aldol addition leads to the continuous accumulation of fructose 1,6-bisphosphate in a permanently frozen solution as followed over months. Moreover, the in-ice reaction is accelerated by simple amino acids, in particular glycine and lysine. Revealing that gluconeogenesis may be of nonenzymatic origin, our results shed light on how glucose anabolism could have emerged in early life forms. Furthermore, the amino acid acceleration of a key cellular anabolic reaction may indicate a link between prebiotic chemistry and the nature of the first metabolic enzymes.

    Proceedings of the National Academy of Sciences of the United States of America 2017;114;28;7403-7407

  • Enhancing the genome editing toolbox: genome wide CRISPR arrayed libraries.

    Metzakopian E, Strong A, Iyer V, Hodgkins A, Tzelepis K, Antunes L, Friedrich MJ, Kang Q, Davidson T, Lamberth J, Hoffmann C, Davis GD, Vassiliou GS, Skarnes WC and Bradley A

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    CRISPR-Cas9 technology has accelerated biological research, becoming routine for many laboratories. It is rapidly replacing conventional gene editing techniques and has high utility for both genome-wide and gene-focussed applications. Here we present the first individually cloned CRISPR-Cas9 genome-wide arrayed sgRNA libraries covering 17,166 human and 20,430 mouse genes at a complexity of 34,332 sgRNAs for human and 40,860 sgRNAs for the mouse genome. For flexibility in generating stable cell lines, the sgRNAs have been cloned in a lentivirus backbone containing PiggyBac transposase recognition elements together with fluorescent and drug selection markers. Over 95% of tested sgRNAs induced specific DNA cleavage as measured by CEL-1 assays. Furthermore, sgRNAs targeting GPI anchor protein pathway genes induced loss of function mutations in human and mouse cell lines measured by FLAER labelling. These arrayed libraries offer the prospect of performing screens on individual genes and combinations, as well as larger gene sets. They also facilitate rapid deconvolution of signals from genome-wide screens. This set of vectors provides an organized comprehensive gene editing toolbox of considerable scientific value.

    Funded by: Medical Research Council: MC_PC_12009; Wellcome Trust

    Scientific reports 2017;7;1;2244

  • Your DNA, Your Say.

    Middleton A

    Society and Ethics Research, Connecting Science, Wellcome Genome Campus, Cambridge.

    Genomic and medical data sharing is pivotal if the promise of genomic medicine is to be fully realised. Social scientists working in the genomics arena ask the public 'how is the technology working for you?' Empirical studies on attitudes, values and beliefs are incredibly valuable; they offer a voice from those who are, or will be, directly affected. This is paramount if personalised medicine is to be truly personal. An international attitude study, Your DNA, Your Say, uses film to provide background information and an online survey to gather public views on donating one's own personal DNA and medical data for use by others. In this paper the rationale for the project is introduced together with an overview of the survey and film design. The project has been translated into multiple languages and the results will be used in policy for the Global Alliance for Genomics and Health.

    The New bioethics : a multidisciplinary journal of biotechnology and the body 2017;23;1;74-80

  • Gender Differences in Global but Not Targeted Demethylation in iPSC Reprogramming.

    Milagre I, Stubbs TM, King MR, Spindel J, Santos F, Krueger F, Bachman M, Segonds-Pichon A, Balasubramanian S, Andrews SR, Dean W and Reik W

    Epigenetics Programme, The Babraham Institute, Cambridge CB22 3AT, UK.

    Global DNA demethylation is an integral part of reprogramming processes in vivo and in vitro, but whether it occurs in the derivation of induced pluripotent stem cells (iPSCs) is not known. Here, we show that iPSC reprogramming involves both global and targeted demethylation, which are separable mechanistically and by their biological outcomes. Cells at intermediate-late stages of reprogramming undergo transient genome-wide demethylation, which is more pronounced in female cells. Global demethylation requires activation-induced cytidine deaminase (AID)-mediated downregulation of UHRF1 protein, and abolishing demethylation leaves thousands of hypermethylated regions in the iPSC genome. Independently of AID and global demethylation, regulatory regions, particularly ESC enhancers and super-enhancers, are specifically targeted for hypomethylation in association with transcription of the pluripotency network. Our results show that global and targeted DNA demethylation are conserved and distinct reprogramming processes, presumably because of their respective roles in epigenetic memory erasure and in the establishment of cell identity.

    Cell reports 2017;18;5;1079-1089

  • Deciphering the distance to antibiotic resistance for the pneumococcus using genome sequencing data.

    Mobegi FM, Cremers AJ, de Jonge MI, Bentley SD, van Hijum SA and Zomer A

    Laboratory of Pediatric Infectious Diseases, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Nijmegen 6525 GA, The Netherlands.

    Advances in genome sequencing technologies and genome-wide association studies (GWAS) have provided unprecedented insights into the molecular basis of microbial phenotypes and enabled the identification of the underlying genetic variants in real populations. However, utilization of genome sequencing in clinical phenotyping of bacteria is challenging due to the lack of reliable and accurate approaches. Here, we report a method for predicting microbial resistance patterns using genome sequencing data. We analyzed whole genome sequences of 1,680 Streptococcus pneumoniae isolates from four independent populations using GWAS and identified probable hotspots of genetic variation which correlate with phenotypes of resistance to essential classes of antibiotics. With the premise that accumulation of putative resistance-conferring SNPs, potentially in combination with specific resistance genes, precedes full resistance, we retrogressively surveyed the hotspot loci and quantified the number of SNPs and/or genes which, if accumulated, would confer full resistance to an otherwise susceptible strain. We name this approach the 'distance to resistance'. It can be used to identify the creep towards complete antibiotic resistance in bacteria using genome sequencing. This approach serves as a basis for the development of future sequencing-based methods for predicting resistance profiles of bacterial strains in hospital microbiology and public health settings.

    Scientific reports 2017;7;42808

  • Parallel genome-wide screens identify synthetic viable interactions between the BLM helicase complex and Fanconi anemia.

    Moder M, Velimezi G, Owusu M, Mazouzi A, Wiedner M, Ferreira da Silva J, Robinson-Garcia L, Schischlik F, Slavkovsky R, Kralovics R, Schuster M, Bock C, Ideker T, Jackson SP, Menche J and Loizou JI

    CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, AKH BT 25.3, 1090, Vienna, Austria.

    Maintenance of genome integrity via repair of DNA damage is a key biological process required to suppress diseases, including Fanconi anemia (FA). We generated loss-of-function human haploid cells for FA complementation group C (FANCC), a gene encoding a component of the FA core complex, and used genome-wide CRISPR libraries as well as insertional mutagenesis to identify synthetic viable (genetic suppressor) interactions for FA. Here we show that loss of the BLM helicase complex suppresses FANCC phenotypes and we confirm this interaction in cells deficient for FA complementation group I and D2 (FANCI and FANCD2) that function as part of the FA I-D2 complex, indicating that this interaction is not limited to the FA core complex, hence demonstrating that systematic genome-wide screening approaches can be used to reveal genetic viable interactions for DNA repair defects.

    Nature communications 2017;8;1;1238

  • Single-Cell Landscape of Transcriptional Heterogeneity and Cell Fate Decisions during Mouse Early Gastrulation.

    Mohammed H, Hernando-Herraez I, Savino A, Scialdone A, Macaulay I, Mulas C, Chandra T, Voet T, Dean W, Nichols J, Marioni JC and Reik W

    Epigenetics Programme, Babraham Institute, Cambridge CB22 3AT, UK.

    The mouse inner cell mass (ICM) segregates into the epiblast and primitive endoderm (PrE) lineages coincident with implantation of the embryo. The epiblast subsequently undergoes considerable expansion of cell numbers prior to gastrulation. To investigate underlying regulatory principles, we performed systematic single-cell RNA sequencing (seq) of conceptuses from E3.5 to E6.5. The epiblast shows reactivation and subsequent inactivation of the X chromosome, with Zfp57 expression associated with reactivation and inactivation together with other candidate regulators. At E6.5, the transition from epiblast to primitive streak is linked with decreased expression of polycomb subunits, suggesting a key regulatory role. Notably, our analyses suggest elevated transcriptional noise at E3.5 and within the non-committed epiblast at E6.5, coinciding with exit from pluripotency. By contrast, E6.5 primitive streak cells become highly synchronized and exhibit a shortened G1 cell-cycle phase, consistent with accelerated proliferation. Our study systematically charts transcriptional noise and uncovers molecular processes associated with early lineage decisions.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/K010867/1; Medical Research Council: MC_PC_12009; Wellcome Trust: 095645/Z/11/Z, 105031/D/14/Z

    Cell reports 2017;20;5;1215-1228

  • Y-chromosomal sequences of diverse Indian populations and the ancestry of the Andamanese.

    Mondal M, Bergström A, Xue Y, Calafell F, Laayouni H, Casals F, Majumder PP, Tyler-Smith C and Bertranpetit J

    Institute of Evolutionary Biology (CSIC-UPF), Universitat Pompeu Fabra, Doctor Aiguader 88 (PRBB), 08003, Barcelona, Catalonia, Spain.

    We present 42 new Y-chromosomal sequences from diverse Indian tribal and non-tribal populations, including the Jarawa and Onge from the Andaman Islands, which are analysed within a calibrated Y-chromosomal phylogeny incorporating South Asian (in total 305 individuals) and worldwide (in total 1286 individuals) data from the 1000 Genomes Project. In contrast to the more ancient ancestry in the South than in the North that has been claimed, we detected very similar coalescence times within Northern and Southern non-tribal Indian populations. A closest neighbour analysis in the phylogeny showed that Indian populations have an affinity towards Southern European populations and that the time of divergence from these populations substantially predated the Indo-European migration into India, probably reflecting ancient shared ancestry rather than the Indo-European migration, which had little effect on Indian male lineages. Among the tribal populations, the Birhor (Austro-Asiatic-speaking) and Irula (Dravidian-speaking) are the nearest neighbours of South Asian non-tribal populations, with a common origin in the last few millennia. In contrast, the Riang (Tibeto-Burman-speaking) and Andamanese have their nearest neighbour lineages in East Asia. The Jarawa and Onge shared haplogroup D lineages with each other within the last ~7000 years, but had diverged from Japanese haplogroup D Y-chromosomes ~53000 years ago, most likely by a split from a shared ancestral population. This analysis suggests that Indian populations have complex ancestry which cannot be explained by a single expansion model.

    Funded by: Wellcome Trust: 098051

    Human genetics 2017;136;5;499-510

  • Evolution of the <i>Staphylococcus argenteus</i> ST2250 Clone in Northeastern Thailand Is Linked with the Acquisition of Livestock-Associated Staphylococcal Genes.

    Moradigaravand D, Jamrozy D, Mostowy R, Anderson A, Nickerson EK, Thaipadungpanit J, Wuthiekanun V, Limmathurotsakul D, Tandhavanant S, Wikraiphat C, Wongsuvan G, Teerawattanasook N, Jutrakul Y, Srisurat N, Chaimanee P, Eoin West T, Blane B, Parkhill J, Chantratita N and Peacock SJ

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom.

    <i>Staphylococcus argenteus</i> is a newly named species previously described as a divergent lineage of <i>Staphylococcus aureus</i> that has recently been shown to have a global distribution. Despite growing evidence of the clinical importance of this species, knowledge about its population epidemiology and genomic architecture is limited. We used whole-genome sequencing to evaluate and compare <i>S. aureus</i> (<i>n</i> = 251) and <i>S. argenteus</i> (<i>n</i> = 68) isolates from adults with staphylococcal sepsis at several hospitals in northeastern Thailand between 2006 and 2013. The majority (82%) of the <i>S. argenteus</i> isolates were of multilocus sequence type 2250 (ST2250). <i>S. aureus</i> was more diverse, although 43% of the isolates belonged to ST121. Bayesian analysis suggested an <i>S. argenteus</i> ST2250 substitution rate of 4.66 (95% confidence interval [CI], 3.12 to 6.38) mutations per genome per year, which was comparable to the <i>S. aureus</i> ST121 substitution rate of 4.07 (95% CI, 2.61 to 5.55). <i>S. argenteus</i> ST2250 emerged in Thailand an estimated 15 years ago, which contrasts with the <i>S. aureus</i> ST1, ST88, and ST121 clades that emerged around 100 to 150 years ago. Comparison of <i>S. argenteus</i> ST2250 genomes from Thailand and a global collection indicated a single introduction into Thailand, followed by transmission to local and more distant countries in Southeast Asia and further afield. <i>S. argenteus</i> and <i>S. aureus</i> shared around half of their core gene repertoire, indicating a high level of divergence and providing strong support for their classification as separate species. Several gene clusters were present in ST2250 isolates but absent from the other <i>S. argenteus</i> and <i>S. aureus</i> study isolates. These included multiple exotoxins and antibiotic resistance genes that have been linked previously with livestock-associated <i>S. aureus</i>, consistent with a livestock reservoir for <i>S. argenteus</i>. These genes appeared to be associated with plasmids and mobile genetic elements and may have contributed to the biological success of ST2250. <b>IMPORTANCE</b> In this study, we used whole-genome sequencing to understand the genome evolution and population structure of a systematic collection of ST2250 <i>S. argenteus</i> isolates. A newly identified ancestral species of <i>S. aureus</i>, <i>S. argenteus</i> has become increasingly known as a clinically important species that has been reported recently across various countries. Our results indicate that <i>S. argenteus</i> has spread at a relatively rapid pace over the past 2 decades across northeastern Thailand and acquired multiple exotoxin and antibiotic resistance genes that have been linked previously with livestock-associated <i>S. aureus</i>. Our findings highlight the clinical importance and potential pathogenicity of <i>S. argenteus</i> as a recently emerging pathogen.

    Funded by: Department of Health: HICF-T5-342; Wellcome Trust: 087769/Z/08/Z, WT098600

    mBio 2017;8;4

  • Evolution and Epidemiology of Multidrug-Resistant <i>Klebsiella pneumoniae</i> in the United Kingdom and Ireland.

    Moradigaravand D, Martin V, Peacock SJ and Parkhill J

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom.

    <i>Klebsiella pneumoniae</i> is a human commensal and opportunistic pathogen that has become a leading causative agent of hospital-based infections over the past few decades. The emergence and global expansion of hypervirulent and multidrug-resistant (MDR) clones of <i>K. pneumoniae</i> have been increasingly reported in community-acquired and nosocomial infections. Despite this, the population genomics and epidemiology of MDR <i>K. pneumoniae</i> at the national level are still poorly understood. To obtain insights into these, we analyzed a systematic large-scale collection of invasive MDR <i>K. pneumoniae</i> isolates from hospitals across the United Kingdom and Ireland. Using whole-genome phylogenetic analysis, we placed these in the context of previously sequenced <i>K. pneumoniae</i> populations from geographically diverse countries and identified their virulence and drug resistance determinants. Our results demonstrate that United Kingdom and Ireland MDR isolates are a highly diverse population drawn from across the global phylogenetic tree of <i>K. pneumoniae</i> and represent multiple recent international introductions that are mainly from Europe but in some cases from more distant countries. In addition, we identified novel genetic determinants underlying resistance to beta-lactams, gentamicin, ciprofloxacin, and tetracyclines, indicating that both increased virulence and resistance have emerged independently multiple times throughout the population. Our data show that MDR <i>K. pneumoniae</i> isolates in the United Kingdom and Ireland have multiple distinct origins and appear to be part of a globally circulating <i>K. pneumoniae</i> population. <b>IMPORTANCE</b> <i>Klebsiella pneumoniae</i> is a major human pathogen that has been implicated in infections in healthcare settings over the past few decades. Antimicrobial treatment of <i>K. pneumoniae</i> infections has become increasingly difficult as a consequence of the emergence and spread of strains that are resistant to multiple antimicrobials. To better understand the spread of resistant <i>K. pneumoniae</i>, we studied the genomes of a large-scale population of extensively antimicrobial-resistant <i>K. pneumoniae</i> in the United Kingdom and Ireland by utilizing the fine resolution that whole-genome sequencing of pathogen genomes provides. Our results indicate that the <i>K. pneumoniae</i> population is highly diverse and that, in some cases, resistant strains appear to have spread across the country over a few years. In addition, we found evidence that some strains have acquired antimicrobial resistance genes independently, presumably in response to antimicrobial treatment.

    Funded by: Department of Health: HICF-T5-342; Wellcome Trust: WT098600

    mBio 2017;8;1

  • Pneumococcal capsule synthesis locus cps as evolutionary hotspot with potential to generate novel serotypes by recombination.

    Mostowy RJ, Croucher NJ, De Maio N, Chewapreecha C, Salter SJ, Turner P, Aanensen DM, Bentley SD, Didelot X and Fraser C

    Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, W2 1PG London, UK.

    Diversity of the polysaccharide capsule in Streptococcus pneumoniae, the main surface antigen and the target of the currently used pneumococcal vaccines, constitutes a major obstacle in eliminating pneumococcal disease. Such diversity is genetically encoded by almost 100 variants of the capsule biosynthesis locus, cps. However, the evolutionary dynamics of the capsule remain incompletely understood. Here, using genetic data from 4,519 bacterial isolates, we found cps to be an evolutionary hotspot with elevated substitution and recombination rates. These rates were a consequence of relaxed purifying selection and positive, diversifying selection acting at this locus, supporting the hypothesis that the capsule has an increased potential to generate novel diversity compared to the rest of the genome. Diversifying selection was particularly evident in the region of wzd/wze genes, which are known to regulate capsule expression and hence the bacterium's ability to cause disease. Using a novel, capsule-centred approach, we analysed the evolutionary history of twelve major serogroups. Such analysis revealed their complex diversification scenarios, which were principally driven by recombination with other serogroups and other streptococci. Patterns of recombinational exchanges between serogroups could not be explained by serotype frequency alone, thus pointing to non-random associations between co-colonising serotypes. Finally, we discovered a previously unobserved mosaic serotype 39X, which was confirmed to carry a viable and structurally novel capsule. Adding to previous discoveries of other mosaic capsules in densely sampled collections, these results emphasise the strong adaptive potential of the bacterium by its ability to generate novel antigenic diversity by recombination.

    Molecular biology and evolution 2017

  • Optimisation of <i>ex vivo</i> memory B cell expansion/differentiation for interrogation of rare peripheral memory B cell subset responses.

    Muir L, McKay PF, Petrova VN, Klymenko OV, Kratochvil S, Pinder CL, Kellam P and Shattock RJ

    Department of Mucosal Infection and Immunity, Imperial College London, London, W2 1PG, UK.

    <i><b>Background</b>:</i> Human memory B cells play a vital role in the long-term protection of the host from pathogenic re-challenge. In recent years the importance of a number of different memory B cell subsets that can be formed in response to vaccination or infection has started to become clear. To study memory B cell responses, cells can be cultured <i>ex vivo,</i> allowing for an increase in cell number and activation of these quiescent cells, providing sufficient quantities of each memory subset to enable full investigation of functionality. However, despite numerous papers being published demonstrating bulk memory B cell culture, we could find no literature on optimised conditions for the study of memory B cell subsets, such as IgM<sup>+</sup> memory B cells. <i><b>Methods</b>:</i> Following a literature review, we carried out a large screen of memory B cell expansion conditions to identify the combination that induced the highest levels of memory B cell expansion. We subsequently used a novel Design of Experiments approach to finely tune the optimal memory B cell expansion and differentiation conditions for human memory B cell subsets. Finally, we characterised the resultant memory B cell subpopulations by IgH sequencing and flow cytometry. <i><b>Results</b>:</i> The application of specific optimised conditions induces multiple rounds of memory B cell proliferation equally across Ig isotypes and differentiation of memory B cells to antibody-secreting cells, and importantly does not alter the Ig genotype of the stimulated cells. <i><b>Conclusions</b>:</i> Overall, our data identify a memory B cell culture system that offers a robust platform for investigating the functionality of rare memory B cell subsets to infection and/or vaccination.

    Wellcome open research 2017;2;97

  • Artemisinin resistance without pfkelch13 mutations in Plasmodium falciparum isolates from Cambodia.

    Mukherjee A, Bopp S, Magistrado P, Wong W, Daniels R, Demas A, Schaffner S, Amaratunga C, Lim P, Dhorda M, Miotto O, Woodrow C, Ashley EA, Dondorp AM, White NJ, Wirth D, Fairhurst R and Volkman SK

    Department of Immunology and Infectious Disease, Harvard T.H. Chan School of Public Health, 665 Huntington Avenue, I-704, Boston, MA, 02115, USA.

    Background: Artemisinin resistance is associated with delayed parasite clearance half-life in vivo and correlates with ring-stage survival under dihydroartemisinin in vitro. Both phenotypes are associated with mutations in the PF3D7_1343700 pfkelch13 gene. Recent spread of artemisinin resistance and emerging piperaquine resistance in Southeast Asia show that artemisinin combination therapies, such as dihydroartemisinin-piperaquine, are losing clinical effectiveness, prompting investigation of drug resistance mechanisms and development of strategies to surmount emerging anti-malarial resistance.

    Methods: Sixty-eight parasite isolates with in vivo clearance data were obtained from two Tracking Resistance to Artemisinin Collaboration study sites in Cambodia, culture-adapted, and genotyped for pfkelch13 and other mutations including pfmdr1 copy number; the RSA0-3h survival rates and response to antimalarial drugs in vitro were measured for 36 of these isolates.

    Results: Among these 36 parasites one isolate demonstrated increased ring-stage survival for a PfKelch13 mutation (D584V, RSA0-3h = 8%), previously associated with slow clearance but not yet tested in vitro. Several parasites exhibited increased ring-stage survival, yet lack pfkelch13 mutations, and one isolate showed evidence for piperaquine resistance.

    Conclusions: This study of 68 culture-adapted Plasmodium falciparum clinical isolates from Cambodia with known clearance values, associated the D584V PfKelch13 mutation with increased ring-stage survival and identified parasites that lack pfkelch13 mutations yet exhibit increased ring-stage survival. These data suggest mutations other than those found in pfkelch13 may be involved in conferring artemisinin resistance in P. falciparum. Piperaquine resistance was also detected among the same Cambodian samples, consistent with reports of emerging piperaquine resistance in the field. These culture-adapted parasites permit further investigation of mechanisms of both artemisinin and piperaquine resistance and development of strategies to prevent or overcome anti-malarial resistance.

    Malaria journal 2017;16;1;195

  • Hemopoietic-specific Sf3b1-K700E knock-in mice display the splicing defect seen in human MDS but develop anemia without ring sideroblasts.

    Mupo A, Seiler M, Sathiaseelan V, Pance A, Yang Y, Agrawal AA, Iorio F, Bautista R, Pacharne S, Tzelepis K, Manes N, Wright P, Papaemmanuil E, Kent DG, Campbell PC, Buonamici S, Bolli N and Vassiliou GS

    Haematological Cancer Genetics, Wellcome Sanger Institute, Hinxton, Cambridge, UK.

    Heterozygous somatic mutations affecting the spliceosome gene SF3B1 drive age-related clonal hematopoiesis, myelodysplastic syndromes (MDS) and other neoplasms. To study their role in such disorders, we generated knock-in mice with hematopoietic-specific expression of Sf3b1-K700E, the commonest type of SF3B1 mutation in MDS. Sf3b1<sup>K700E/+</sup> animals had impaired erythropoiesis and progressive anemia without ringed sideroblasts, as well as reduced hematopoietic stem cell numbers and host-repopulating fitness. To understand the molecular basis of these observations, we analyzed global RNA splicing in Sf3b1<sup>K700E/+</sup> hematopoietic cells. Aberrant splicing was associated with the usage of cryptic 3' splice and branchpoint sites, as described for human SF3B1 mutants. However, we found little overlap between aberrantly spliced mRNAs in mouse versus human, suggesting that anemia may be a consequence of globally disrupted splicing. Furthermore, the murine orthologues of genes associated with ring sideroblasts in human MDS, including Abcb7 and Tmem14c, were not aberrantly spliced in Sf3b1<sup>K700E/+</sup> mice. Our findings demonstrate that, despite significant differences in affected transcripts, there is overlap in the phenotypes associated with SF3B1-K700E between human and mouse. Future studies should focus on understanding the basis of these similarities and differences as a means of deciphering the consequences of spliceosome gene mutations in MDS.

    Funded by: Medical Research Council: MC_PC_12009; Wellcome Trust: 095663, 098051

    Leukemia 2017;31;3;720-727

  • Genomic landscape of extended-spectrum β-lactamase resistance in Escherichia coli from an urban African setting.

    Musicha P, Feasey NA, Cain AK, Kallonen T, Chaguza C, Peno C, Khonga M, Thompson S, Gray KJ, Mather AE, Heyderman RS, Everett DB, Thomson NR and Msefula CL

    Malawi-Liverpool-Wellcome Trust Clinical Research Programme, Queen Elizabeth Central Hospital, Blantyre, Malawi.

    Objectives: Efforts to treat Escherichia coli infections are increasingly being compromised by the rapid, global spread of antimicrobial resistance (AMR). Whilst AMR in E. coli has been extensively investigated in resource-rich settings, in sub-Saharan Africa molecular patterns of AMR are not well described. In this study, we have begun to explore the population structure and molecular determinants of AMR amongst E. coli isolates from Malawi.

    Methods: Ninety-four E. coli isolates from patients admitted to Queen's Hospital, Malawi, were whole-genome sequenced. The isolates were selected on the basis of diversity of phenotypic resistance profiles and clinical source of isolation (blood, CSF and rectal swab). Sequence data were analysed using comparative genomics and phylogenetics.

    Results: Our results revealed the presence of five clades, which were strongly associated with E. coli phylogroups A, B1, B2, D and F. We identified 43 multilocus STs, of which ST131 (14.9%) and ST12 (9.6%) were the most common. We identified 25 AMR genes. The most common ESBL gene was bla CTX-M-15 and it was present in all five phylogroups and 11 STs, and most commonly detected in ST391 (4/4 isolates), ST648 (3/3 isolates) and ST131 [3/14 (21.4%) isolates].

    Conclusions: This study has revealed a high diversity of lineages associated with AMR, including ESBL and fluoroquinolone resistance, in Malawi. The data highlight the value of longitudinal bacteraemia surveillance coupled with detailed molecular epidemiology in all settings, including low-income settings, in describing the global epidemiology of ESBL resistance.

    Funded by: Wellcome Trust

    The Journal of antimicrobial chemotherapy 2017;72;6;1602-1609

  • Aboriginal Australian mitochondrial genome variation - an increased understanding of population antiquity and diversity.

    Nagle N, van Oven M, Wilcox S, van Holst Pellekaan S, Tyler-Smith C, Xue Y, Ballantyne KN, Wilcox L, Papac L, Cooke K, van Oorschot RA, McAllister P, Williams L, Kayser M, Mitchell RJ and Genographic Consortium

    Department of Biochemistry and Genetics, La Trobe Institute for Molecular Sciences, La Trobe University, Melbourne, Victoria, Australia.

    Aboriginal Australians represent one of the oldest continuous cultures outside Africa, with evidence indicating that their ancestors arrived in the ancient landmass of Sahul (present-day New Guinea and Australia) ~55 thousand years ago. Genetic studies, though limited, have demonstrated both the uniqueness and antiquity of Aboriginal Australian genomes. We have further resolved known Aboriginal Australian mitochondrial haplogroups and discovered novel indigenous lineages by sequencing the mitogenomes of 127 contemporary Aboriginal Australians. In particular, the more common haplogroups observed in our dataset included M42a, M42c, S, P5 and P12, followed by rarer haplogroups M15, M16, N13, O, P3, P6 and P8. We propose some major phylogenetic rearrangements, such as in haplogroup P where we delinked P4a and P4b and redefined them as P4 (New Guinean) and P11 (Australian), respectively. Haplogroup P2b was identified as a novel clade potentially restricted to Torres Strait Islanders. Nearly all Aboriginal Australian mitochondrial haplogroups detected appear to be ancient, with no evidence of later introgression during the Holocene. Our findings greatly increase knowledge about the geographic distribution and phylogenetic structure of mitochondrial lineages that have survived in contemporary descendants of Australia's first settlers.

    Funded by: Wellcome Trust: 098051

    Scientific reports 2017;7;43041

  • Role of alanine racemase mutations in Mycobacterium tuberculosis D-cycloserine resistance.

    Nakatani Y, Opel-Reading HK, Merker M, Machado D, Andres S, Kumar SS, Moradigaravand D, Coll F, Perdigão J, Portugal I, Schön T, Nair D, Devi KRU, Kohl TA, Beckert P, Clark TG, Maphalala G, Khumalo D, Diel R, Klaos K, Aung HL, Cook GM, Parkhill J, Peacock SJ, Swaminathan S, Viveiros M, Niemann S, Krause KL and Köser CU

    University of Otago, Department of Microbiology and Immunology, Otago School of Medical Sciences, Dunedin, New Zealand.

    Screening of more than 1,500 drug-resistant strains of Mycobacterium tuberculosis revealed evolutionary patterns characteristic of positive selection for three alanine racemase (Alr) mutations. We investigated these mutations using molecular modeling, in vitro MIC testing, as well as direct measurements of enzymatic activity, which demonstrated that these mutations likely confer resistance to D-cycloserine.

    Antimicrobial agents and chemotherapy 2017

  • Myeloproliferative neoplasms: from origins to outcomes.

    Nangalia J and Green AR

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom.

    Substantial progress has been made in our understanding of the pathogenetic basis of myeloproliferative neoplasms. The discovery of mutations in <i>JAK2</i> over a decade ago heralded a new age for patient care as a consequence of improved diagnosis and the development of therapeutic JAK inhibitors. The more recent identification of mutations in calreticulin brought with it a sense of completeness, with most patients with myeloproliferative neoplasm now having a biological basis for their excessive myeloproliferation. We are also beginning to understand the processes that lead to acquisition of somatic mutations and the factors that influence subsequent clonal expansion and emergence of disease. Extended genomic profiling has established a multitude of additional acquired mutations, particularly prevalent in myelofibrosis, where their presence carries prognostic implications. A major goal is to integrate genetic, clinical, and laboratory features to identify patients who share disease biology and clinical outcome, such that therapies, both existing and novel, can be better targeted.

    Funded by: Cancer Research UK; Medical Research Council: MC_PC_12009; Wellcome Trust

    Blood 2017;130;23;2475-2483

  • Estimating the human mutation rate from autozygous segments reveals population differences in human mutational processes.

    Narasimhan VM, Rahbari R, Scally A, Wuster A, Mason D, Xue Y, Wright J, Trembath RC, Maher ER, van Heel DA, Auton A, Hurles ME, Tyler-Smith C and Durbin R

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK.

    Heterozygous mutations within homozygous sequences descended from a recent common ancestor offer a way to ascertain de novo mutations across multiple generations. Using exome sequences from 3222 British-Pakistani individuals with high parental relatedness, we estimate a mutation rate of 1.45 ± 0.05 × 10<sup>-8</sup> per base pair per generation in autosomal coding sequence, with a corresponding non-crossover gene conversion rate of 8.75 ± 0.05 × 10<sup>-6</sup> per base pair per generation. This is at the lower end of exome mutation rates previously estimated in parent-offspring trios, suggesting that post-zygotic mutations contribute little to the human germ-line mutation rate. We find frequent recurrence of mutations at polymorphic CpG sites, and an increase in C to T mutations in a 5' CCG 3' to 5' CTG 3' context in the Pakistani population compared to Europeans, suggesting that mutational processes have evolved rapidly between human populations. Estimates of human mutation rates differ substantially based on the approach. Here, the authors present a multi-generational estimate from the autozygous segment in a non-European population that gives insight into the contribution of post-zygotic mutations and population-specific mutational processes.

    Funded by: Medical Research Council: MR/M009017/1

    Nature communications 2017;8;1;303

  • Single cell transcriptomics of pluripotent stem cells: reprogramming and differentiation.

    Natarajan KN, Teichmann SA and Kolodziejczyk AA

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Cambridge, UK; European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK.

    Single-cell transcriptomics serves as a powerful tool to identify cell states within populations of cells, and to dissect underlying heterogeneity at high resolution. Single-cell transcriptomics on pluripotent stem cells has provided new insights into cellular variation, subpopulation structures and the interplay of cell cycle with pluripotency. The single-cell perspective has helped to better understand gene regulation and regulatory networks during exit from pluripotency, cell-fate determination as well as molecular mechanisms driving cellular reprogramming of somatic cells to induced pluripotent stage. Here we review the recent progress and significant findings from application of single-cell technologies on pluripotent stem cells along with a brief outlook on new combinatorial single-cell approaches that further unravel pluripotent stem cell states.

    Current opinion in genetics & development 2017;46;66-76

  • Invasive disease caused simultaneously by dual serotypes of Streptococcus pneumoniae.

    Ndlangisa K, du Plessis M, Allam M, Wolter N, de Gouveia L, Klugman KP, Cohen C, Gladstone RA and von Gottberg A

    National Institute for Communicable Diseases (NICD), a division of the National Health Laboratory Service, Johannesburg, South Africa.

    There are at least 98 known pneumococcal serotypes. Invasive pneumococcal disease (IPD) is usually caused by a single serotype, and dual serotype IPD is rare. To assess factors associated with dual serotype IPD, patient information obtained through laboratory-based surveillance for IPD from 2005 through 2014 in South Africa was reviewed. Genomes of isolate pairs from co-infected individuals were sequenced to determine their molecular characteristics. For 30 (91%) of 33 patients with dual serotypes, one or both isolates were a pneumococcal conjugate vaccine (PCV13) serotype. Dual serotype IPD was associated with children <5 years of age [adjusted odds ratio (aOR), 4.7; 95% confidence interval (95% CI), 1.8-11.7], underlying illness (other than HIV) (aOR, 2.8; 95% CI, 1.1-6.6) and death (aOR, 2.5; 95% CI, 1.08-6.09). For each co-infecting pair, isolates were genotypically unrelated and their genotypes were common among isolates of the same serotype in South Africa. Of 701 accessory genes identified among dual serotype IPD isolates, four were common between isolate pairs. Co-infecting isolate pairs had different genotypic backgrounds. The association of dual serotypes with death warrants increased awareness of IPD co-infection caused by two or more serotypes.

    Journal of clinical microbiology 2017

  • In Vivo Regulation of the Zebrafish Endoderm Progenitor Niche by T-Box Transcription Factors.

    Nelson AC, Cutty SJ, Gasiunas SN, Deplae I, Stemple DL and Wardle FC

    Randall Division of Cell and Molecular Biophysics, King's College London, London SE1 1UL, UK; Sir William Dunn School of Pathology, University of Oxford, Oxford OX1 3RE, UK; School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK.

    T-box transcription factors T/Brachyury homolog A (Ta) and Tbx16 are essential for correct mesoderm development in zebrafish. The downstream transcriptional networks guiding their functional activities are poorly understood. Additionally, important contributions elsewhere are likely masked due to redundancy. Here, we exploit functional genomic strategies to identify Ta and Tbx16 targets in early embryogenesis. Surprisingly, we discovered they not only activate mesodermal gene expression but also redundantly regulate key endodermal determinants, leading to substantial loss of endoderm in double mutants. To further explore the gene regulatory networks (GRNs) governing endoderm formation, we identified targets of Ta/Tbx16-regulated homeodomain transcription factor Mixl1, which is absolutely required in zebrafish for endoderm formation. Interestingly, we find many endodermal determinants coordinately regulated through common genomic occupancy by Mixl1, Eomesa, Smad2, Nanog, Mxtx2, and Pou5f3. Collectively, these findings augment the endoderm GRN and reveal a panel of target genes underlying the Ta, Tbx16, and Mixl1 mutant phenotypes.

    Cell reports 2017;19;13;2782-2795

  • Association analyses based on false discovery rate implicate new loci for coronary artery disease.

    Nelson CP, Goel A, Butterworth AS, Kanoni S, Webb TR, Marouli E, Zeng L, Ntalla I, Lai FY, Hopewell JC, Giannakopoulou O, Jiang T, Hamby SE, Di Angelantonio E, Assimes TL, Bottinger EP, Chambers JC, Clarke R, Palmer CNA, Cubbon RM, Ellinor P, Ermel R, Evangelou E, Franks PW, Grace C, Gu D, Hingorani AD, Howson JMM, Ingelsson E, Kastrati A, Kessler T, Kyriakou T, Lehtimäki T, Lu X, Lu Y, März W, McPherson R, Metspalu A, Pujades-Rodriguez M, Ruusalepp A, Schadt EE, Schmidt AF, Sweeting MJ, Zalloua PA, AlGhalayini K, Keavney BD, Kooner JS, Loos RJF, Patel RS, Rutter MK, Tomaszewski M, Tzoulaki I, Zeggini E, Erdmann J, Dedoussis G, Björkegren JLM, EPIC-CVD Consortium, CARDIoGRAMplusC4D, UK Biobank CardioMetabolic Consortium CHD working group, Schunkert H, Farrall M, Danesh J, Samani NJ, Watkins H and Deloukas P

    Department of Cardiovascular Sciences, University of Leicester, Leicester, UK.

    Genome-wide association studies (GWAS) in coronary artery disease (CAD) had identified 66 loci at 'genome-wide significance' (P < 5 × 10<sup>-8</sup>) at the time of this analysis, but a much larger number of putative loci at a false discovery rate (FDR) of 5% (refs. 1,2,3,4). Here we leverage an interim release of UK Biobank (UKBB) data to evaluate the validity of the FDR approach. We tested a CAD phenotype inclusive of angina (SOFT; n<sub>cases</sub> = 10,801) as well as a stricter definition without angina (HARD; n<sub>cases</sub> = 6,482) and selected cases with the former phenotype to conduct a meta-analysis using the two most recent CAD GWAS. This approach identified 13 new loci at genome-wide significance, 12 of which were on our previous list of loci meeting the 5% FDR threshold, thus providing strong support that the remaining loci identified by FDR represent genuine signals. The 304 independent variants associated at 5% FDR in this study explain 21.2% of CAD heritability and identify 243 loci that implicate pathways in blood vessel morphogenesis as well as lipid metabolism, nitric oxide signaling and inflammation.

    Funded by: British Heart Foundation: FS/12/80/29821, FS/14/76/30933, PG/12/32/29544, RG/08/014/24067, RG/10/12/28456, RG/14/5/30893, RG/15/12/31616; Medical Research Council: G0601966, G0700931, G0800270, MC_QA137853, MR/L003120/1; NIH HHS: S10 OD018522

    Nature genetics 2017;49;9;1385-1391

  • Commensal Koch's postulates: establishing causation in human microbiota research.

    Neville BA, Forster SC and Lawley TD

    Host-Microbiota Interactions Laboratory, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    Advances in high-throughput sequencing technologies and the development of sophisticated bioinformatics analysis methods, algorithms, and pipelines to handle the large amounts of data generated have driven the field of human microbiome research forward. This specialist knowledge has been crucial to thoroughly mine the human gut microbiota, particularly in the absence of methods for the routine cultivation of most enteric microorganisms. In recent years, however, significant efforts have been made to address the 'great plate count anomaly' and to overcome the barriers to cultivation of the fastidious and mostly strictly anaerobic bacteria that reside in the human gut. As a result, many new species have been discovered, characterised, genome sequenced, and deposited in culture collections. These continually expanding resources enable experimental investigation of the human gut microbiota, validation of hypotheses made with sequence-based analyses, and phenotypic characterisation of its constituent microbes. Herein we propose a variant of Koch's postulates, aimed at providing a framework to establish causation in microbiome studies, with a particular focus on demonstrating the health-promoting role of the commensal gut microbiota.

    Current opinion in microbiology 2017;42;47-52

  • A Genome-wide Association Study of Dupuytren Disease Reveals 17 Additional Variants Implicated in Fibrosis.

    Ng M, Thakkar D, Southam L, Werker P, Ophoff R, Becker K, Nothnagel M, Franke A, Nürnberg P, Espirito-Santo AI, Izadi D, Hennies HC, Nanchahal J, Zeggini E and Furniss D

    Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Science, University of Oxford, Botnar Research Centre, Windmill Road, Oxford OX3 7HE, UK.

    Individuals with Dupuytren disease (DD) are commonly seen by physicians and surgeons across multiple specialties. It is an increasingly common and disabling fibroproliferative disorder of the palmar fascia, which leads to flexion contractures of the digits, and is associated with other tissue-specific fibroses. DD affects between 5% and 25% of people of European descent and is the most common inherited disease of connective tissue. We undertook the largest GWAS to date in individuals with a surgically validated diagnosis of DD from the UK, with replication in British, Dutch, and German individuals. We validated association at all nine previously described signals and discovered 17 additional variants with p ≤ 5 × 10<sup>-8</sup>. As a proof of principle, we demonstrated correlation of the high-risk genotype at the statistically most strongly associated variant with decreased secretion of the soluble WNT-antagonist SFRP4, in surgical specimen-derived DD myofibroblasts. These results highlight important pathways involved in the pathogenesis of fibrosis, including WNT signaling, extracellular matrix modulation, and inflammation. In addition, many associated loci contain genes that were hitherto unrecognized as playing a role in fibrosis, opening up new avenues of research that may lead to novel treatments for DD and fibrosis more generally. DD represents an ideal human model disease for fibrosis research.

    American journal of human genetics 2017;101;3;417-427

  • Tracing the peopling of the world through genomics.

    Nielsen R, Akey JM, Jakobsson M, Pritchard JK, Tishkoff S and Willerslev E

    Department of Integrative Biology, University of California, Berkeley, Berkeley, California 94720, USA.

    Advances in the sequencing and the analysis of the genomes of both modern and ancient peoples have facilitated a number of breakthroughs in our understanding of human evolutionary history. These include the discovery of interbreeding between anatomically modern humans and extinct hominins; the development of an increasingly detailed description of the complex dispersal of modern humans out of Africa and their population expansion worldwide; and the characterization of many of the genetic adaptations of humans to local environmental conditions. Our interpretation of the evolutionary history and adaptation of humans is being transformed by analyses of these new genomic data.

    Funded by: NIDDK NIH HHS: R01 DK104339; NIGMS NIH HHS: R01 GM110068, R01 GM113657, R01 GM116044

    Nature 2017;541;7637;302-310

  • Mutational Signatures in Breast Cancer: The Problem at the DNA Level.

    Nik-Zainal S and Morganella S

    Wellcome Trust Sanger Institute, Hinxton Genome Campus, Cambridge, United Kingdom.

    A breast cancer genome is a record of the historic mutagenic activity that has occurred throughout the development of the tumor. Indeed, every mutation may be informative. Although driver mutations were the main focus of cancer research for a long time, passenger mutational signatures, the imprints of DNA damage and DNA repair processes that have been operative during tumorigenesis, are also biologically illuminating. This review is a chronicle of how the concept of mutational signatures arose and brings the reader up-to-date on this field, particularly in breast cancer. Mutational signatures have now been advanced to include mutational processes that involve rearrangements, and novel cancer biological insights have been gained through studying these in great detail. Furthermore, there are efforts to take this field into the clinical sphere. If validated, mutational signatures could thus form an additional weapon in the arsenal of cancer precision diagnostics and therapeutic stratification in the modern war against cancer.

    Funded by: Wellcome Trust

    Clinical cancer research : an official journal of the American Association for Cancer Research 2017;23;11;2617-2629

  • Protein-Truncating Variants at the Cholesteryl Ester Transfer Protein Gene and Risk for Coronary Heart Disease.

    Nomura A, Won HH, Khera AV, Takeuchi F, Ito K, McCarthy S, Emdin CA, Klarin D, Natarajan P, Zekavat SM, Gupta N, Peloso GM, Borecki IB, Teslovich TM, Asselta R, Duga S, Merlini PA, Correa A, Kessler T, Wilson JG, Bown MJ, Hall AS, Braund PS, Carey DJ, Murray MF, Kirchner HL, Leader JB, Lavage DR, Manus JN, Hartze DN, Samani NJ, Schunkert H, Marrugat J, Elosua R, McPherson R, Farrall M, Watkins H, Juang JJ, Hsiung CA, Lin SY, Wang JS, Tada H, Kawashiri MA, Inazu A, Yamagishi M, Katsuya T, Nakashima E, Nakatochi M, Yamamoto K, Yokota M, Momozawa Y, Rotter JI, Lander ES, Rader DJ, Danesh J, Ardissino D, Gabriel S, Willer CJ, Abecasis GR, Saleheen D, Kubo M, Kato N, Ida Chen YD, Dewey FE and Kathiresan S

    Rationale: Therapies that inhibit CETP (cholesteryl ester transfer protein) have failed to demonstrate a reduction in risk for coronary heart disease (CHD). Human DNA sequence variants that truncate the CETP gene may provide insight into the efficacy of CETP inhibition.

    Objective: To test whether protein-truncating variants (PTVs) at the CETP gene were associated with plasma lipid levels and CHD.

    Methods and results: We sequenced the exons of the CETP gene in 58 469 participants from 12 case-control studies (18 817 CHD cases, 39 652 CHD-free controls). We defined PTVs as variants that lead to a premature stop, disrupt canonical splice sites, or lead to insertions/deletions that shift frame. We also genotyped 1 Japanese-specific PTV in 27 561 participants from 3 case-control studies (14 286 CHD cases, 13 275 CHD-free controls). We tested association of CETP PTV carrier status with both plasma lipids and CHD. Among 58 469 participants with CETP gene-sequencing data available, average age was 51.5 years and 43% were women; 1 in 975 participants carried a PTV at the CETP gene. Compared with noncarriers, carriers of PTV at CETP had higher high-density lipoprotein cholesterol (effect size, 22.6 mg/dL; 95% confidence interval, 18-27; P<1.0×10⁻⁴), lower low-density lipoprotein cholesterol (-12.2 mg/dL; 95% confidence interval, -23 to -0.98; P=0.033), and lower triglycerides (-6.3%; 95% confidence interval, -12 to -0.22; P=0.043). CETP PTV carrier status was associated with reduced risk for CHD (summary odds ratio, 0.70; 95% confidence interval, 0.54-0.90; P=5.1×10⁻³).

    Conclusions: Compared with noncarriers, carriers of PTV at CETP displayed higher high-density lipoprotein cholesterol, lower low-density lipoprotein cholesterol, lower triglycerides, and lower risk for CHD.

    Circulation research 2017;121;1;81-88

  • A graph extension of the positional Burrows-Wheeler transform and its applications.

    Novak AM, Garrison E and Paten B

    Genomics Institute, University of California Santa Cruz, CBSE, 501C Engineering 2, MS: CBSE, 1156 High St., Santa Cruz, CA 95064 USA.

    We present a generalization of the positional Burrows-Wheeler transform, or PBWT, to genome graphs, which we call the gPBWT. A genome graph is a collapsed representation of a set of genomes described as a graph. In a genome graph, a haplotype corresponds to a restricted form of walk. The gPBWT is a compressible representation of a set of these graph-encoded haplotypes that allows for efficient subhaplotype match queries. We give efficient algorithms for gPBWT construction and query operations. As a demonstration, we use the gPBWT to quickly count the number of haplotypes consistent with random walks in a genome graph, and with the paths taken by mapped reads; results suggest that haplotype consistency information can be practically incorporated into graph-based read mappers. We estimate that, with the gPBWT, of the order of 100,000 diploid genomes, including all forms of structural variation, could be stored and made searchable for haplotype queries using a single large compute node.

    Algorithms for molecular biology : AMB 2017;12;18
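
    As background to the entry above, the ordinary (linear, non-graph) PBWT that the gPBWT generalizes can be sketched in a few lines. This is a minimal illustration of the standard positional prefix array construction for a binary haplotype panel, not the authors' graph-based implementation; the function name and the toy panel are assumptions made for the example.

```python
def pbwt_prefix_orders(haplotypes):
    """Return the haplotype ordering before site 0 and after each site.

    After site k, haplotypes are sorted by their reversed prefixes
    (alleles at sites k, k-1, ..., 0), which is the ordering the PBWT
    maintains. Alleles are assumed to be 0 or 1.
    """
    n_sites = len(haplotypes[0])
    order = list(range(len(haplotypes)))
    orders = [order]
    for k in range(n_sites):
        # Stable partition: haplotypes carrying allele 0 at site k keep
        # their relative order and come first, followed by those carrying 1.
        zeros = [h for h in order if haplotypes[h][k] == 0]
        ones = [h for h in order if haplotypes[h][k] == 1]
        order = zeros + ones
        orders.append(order)
    return orders


# Toy panel (hypothetical data): 4 haplotypes over 3 biallelic sites.
panel = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
orders = pbwt_prefix_orders(panel)
```

    Because haplotypes that share a long match ending at a given site become adjacent in this ordering, the permuted allele columns tend to contain long runs, which is what makes the representation compressible.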

  • A population-based analysis of germline BAP1 mutations in melanoma.

    O'Shea SJ, Robles-Espinoza CD, McLellan L, Harrigan J, Jacq X, Hewinson J, Iyer V, Merchant W, Elliott F, Harland M, Bishop DT, Newton-Bishop JA and Adams DJ

    Section of Epidemiology and Biostatistics, Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK.

    Germline mutation of the BRCA1 associated protein-1 (BAP1) gene has been linked to uveal melanoma, mesothelioma, meningioma, renal cell carcinoma and basal cell carcinoma. Germline variants have also been found in familial cutaneous melanoma pedigrees, but their contribution to sporadic melanoma has not been fully assessed. We sequenced BAP1 in 1,977 melanoma cases and 754 controls and used deubiquitinase assays, a pedigree analysis, and a histopathological review to assess the consequences of the mutations found. Sequencing revealed 30 BAP1 variants in total, of which 27 were rare (ExAC allele frequency <0.002). Of the 27 rare variants, 22 were present in cases (18 missense, one splice acceptor, one frameshift and two near splice regions) and five in controls (all missense). A missense change (S98R) in a case that completely abolished BAP1 deubiquitinase activity was identified. Analysis of cancers in the pedigree of the proband carrying the S98R variant and in two other pedigrees carrying clear loss-of-function alleles showed the presence of BAP1-associated cancers such as renal cell carcinoma, mesothelioma and meningioma, but not uveal melanoma. Two of these three probands carrying BAP1 loss-of-function variants also had melanomas with histopathological features suggestive of a germline BAP1 mutation. The remaining cases with germline mutations, which were predominantly missense mutations, were associated with less typical pedigrees and tumours lacking a characteristic BAP1-associated histopathological appearance, but may still represent less penetrant variants. Germline BAP1 alleles defined as loss-of-function or predicted to be deleterious/damaging are rare in cutaneous melanoma.

    Funded by: Cancer Research UK: 13031, C37059/A17894, C588/A19167; Wellcome Trust

    Human molecular genetics 2017;26;4;717-728

  • Limited impact of adolescent meningococcal ACWY vaccination on group W carriage in university students.

    Oldfield NJ, Green LR, Parkhill J, Bayliss CD and Turner DPJ

    School of Life Sciences, University of Nottingham, Nottingham NG7 2RD, UK.

    Background: In the UK, rising disease levels due to Neisseria meningitidis serogroup W clonal complex ST-11 (MenW:cc11) strains led to the introduction of conjugate MenACWY vaccination for teenagers. We investigated the impact of immunization on carriage of targeted meningococci by whole genome sequencing of isolates recovered from a cohort of vaccinated university students.

    Methods: Strain designation data were extracted from whole genome sequence data. Genomes from carried and invasive MenW:cc11 were compared using a gene-by-gene approach. Serogrouping identified isolates expressing capsule antigens targeted by the vaccine.

    Results: Isolates with a W: P1.5,2: F1-1: ST-11 (cc11) designation, and belonging to the emerging '2013-strain' of the South American-UK MenW:cc11 sub-lineage, were responsible for an increase in carried group W. A multifocal expansion was evident with close transmission networks extending beyond individual dormitories. Carried group Y isolates were predominantly from clonal complex 23, but showed significant heterogeneity and individual strain designations were only sporadically recovered. No shifts towards acapsulate phenotypes were detected in targeted meningococcal populations.

    Conclusions: In a setting with high levels of conjugate MenACWY vaccination, expansion of capsule-expressing isolates from the 2013-strain of MenW:cc11, but not MenY:cc23 isolates, is indicative of differential susceptibilities to vaccine-induced immunity.

    The Journal of infectious diseases 2017

  • Micro-epidemiological structuring of Plasmodium falciparum parasite populations in regions with varying transmission intensities in Africa.

    Omedo I, Mogeni P, Bousema T, Rockett K, Amambua-Ngwa A, Oyier I, Stevenson JC, Baidjoe AY, de Villiers EP, Fegan G, Ross A, Hubbart C, Jeffreys A, Williams TN, Kwiatkowski D and Bejon P

    KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya.

    Background: The first models of malaria transmission assumed a completely mixed and homogeneous population of parasites. Recent models include spatial heterogeneity and variably mixed populations. However, there are few empirical estimates of parasite mixing with which to parameterize such models.

    Methods: Here we genotype 276 single nucleotide polymorphisms (SNPs) in 5199 P. falciparum isolates from two Kenyan sites and one Gambian site to determine the spatio-temporal extent of parasite mixing, and use Principal Component Analysis (PCA) and linear regression to examine the relationship between genetic relatedness and relatedness in space and time for parasite pairs.

    Results: We show that there are no discrete geographically restricted parasite sub-populations, but instead we see a diffuse spatio-temporal structure to parasite genotypes. Genetic relatedness of sample pairs is predicted by relatedness in space and time.

    Conclusions: Our findings suggest that targeted malaria control will benefit the surrounding community, but unfortunately also that emerging drug resistance will spread rapidly through the population.

    Wellcome open research 2017;2;10

  • Geographic-genetic analysis of Plasmodium falciparum parasite populations from surveys of primary school children in Western Kenya.

    Omedo I, Mogeni P, Rockett K, Kamau A, Hubbart C, Jeffreys A, Ochola-Oyier LI, de Villiers EP, Gitonga CW, Noor AM, Snow RW, Kwiatkowski D and Bejon P

    KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya.

    Background: Malaria control, and finally malaria elimination, requires the identification and targeting of residual foci or hotspots of transmission. However, the level of parasite mixing within and between geographical locations is likely to impact the effectiveness and durability of control interventions and thus should be taken into consideration when developing control programs.

    Methods: In order to determine the geographic-genetic patterns of Plasmodium falciparum parasite populations at a sub-national level in Kenya, we used the Sequenom platform to genotype 111 genome-wide distributed single nucleotide polymorphic (SNP) positions in 2486 isolates collected from children in 95 primary schools in western Kenya. We analysed these parasite genotypes for genetic structure using principal component analysis and assessed local and global clustering using statistical measures of spatial autocorrelation. We further examined the region for spatial barriers to parasite movement as well as directionality in the patterns of parasite movement.

    Results: We found no evidence of population structure and little evidence of spatial autocorrelation of parasite genotypes (correlation coefficients <0.03 among parasite pairs in distance classes of 1 km, 2 km and 5 km; p value <0.01). An analysis of the geographical distribution of allele frequencies showed weak evidence of variation in distribution of alleles, with clusters representing a higher than expected number of samples with the major allele being identified for 5 SNPs. Furthermore, we found no evidence of the existence of spatial barriers to parasite movement within the region, but observed directional movement of parasites among schools in two separate sections of the region studied.

    Conclusions: Our findings illustrate a pattern of high parasite mixing within the study region. If this mixing is due to rapid gene flow, then "one-off" targeted interventions may not be currently effective at the sub-national scale in Western Kenya, due to the high parasite movement that is likely to lead to re-introduction of infection from surrounding regions. However, repeated targeted interventions may reduce transmission in the surrounding regions.

    Wellcome open research 2017;2;29

  • Role of sapA and yfgA in Susceptibility to Antibody-Mediated Complement-Dependent Killing and Virulence of Salmonella enterica Serovar Typhimurium.

    Ondari EM, Heath JN, Klemm EJ, Langridge G, Barquist L, Goulding DA, Clare S, Dougan G, Kingsley RA and MacLennan CA

    Swiss Tropical Public Health Institute, Basel, Switzerland.

    The ST313 pathovar of Salmonella enterica serovar Typhimurium contributes to a high burden of invasive disease among African infants and HIV-infected adults. It is characterized by genome degradation (loss of coding capacity) and has increased resistance to antibody-dependent complement-mediated killing compared with enterocolitis-causing strains of S. Typhimurium. Vaccination is an attractive disease-prevention strategy, and leading candidates focus on the induction of bactericidal antibodies. Antibody-resistant strains arising through further gene deletion could compromise such a strategy. Exposing a saturating transposon insertion mutant library of S. Typhimurium to immune serum identified a repertoire of S. Typhimurium genes that, when interrupted, result in increased resistance to serum killing. These genes included several involved in bacterial envelope biogenesis, protein translocation, and metabolism. We generated defined mutant derivatives using S. Typhimurium SL1344 as the host. Based on their initial levels of enhanced resistance to killing, yfgA and sapA mutants were selected for further characterization. The S. Typhimurium yfgA mutant lost the characteristic Salmonella rod-shaped appearance, exhibited increased sensitivity to osmotic and detergent stress, lacked very long lipopolysaccharide, was unable to invade enterocytes, and demonstrated decreased ability to infect mice. In contrast, the S. Typhimurium sapA mutants had similar sensitivity to osmotic and detergent stress and lipopolysaccharide profile and an increased ability to infect enterocytes compared with the wild type, but they had no increased ability to cause in vivo infection. These findings indicate that increased resistance to antibody-dependent complement-mediated killing secondary to genetic deletion is not necessarily accompanied by increased virulence and suggest the presence of different mechanisms of antibody resistance.

    Funded by: Wellcome Trust

    Infection and immunity 2017;85;9

  • Optimised metrics for CRISPR-KO screens with second-generation gRNA libraries.

    Ong SH, Li Y, Koike-Yusa H and Yusa K

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK.

    Genome-wide CRISPR-based knockout (CRISPR-KO) screening is an emerging technique which enables systematic genetic analysis of a cellular or molecular phenotype in question. Continuous improvements, such as modifications to the guide RNA (gRNA) scaffold and the development of gRNA on-target prediction algorithms, have since been made to increase their screening performance. We compared the performance of three available second-generation human genome-wide CRISPR-KO libraries that included at least one of the improvements, and examined the effect of gRNA scaffold, number of gRNAs per gene and number of replicates on screen performance. We identified duplicated screens using a library with 6 gRNAs per gene as providing the best trade-off. Despite the improvements, we found that each improved library still has library-specific false negatives and, for the first time, estimated the false negative rates of CRISPR-KO screens, which are between 10% and 20%. Our newly-defined optimal screening parameters would be helpful in designing screens and constructing bespoke gRNA libraries.

    Funded by: Wellcome Trust

    Scientific reports 2017;7;1;7384

  • Human Germline Genome Editing.

    Ormond KE, Mortlock DP, Scholes DT, Bombard Y, Brody LC, Faucett WA, Garrison NA, Hercher L, Isasi R, Middleton A, Musunuru K, Shriner D, Virani A and Young CE

    Department of Genetics and Stanford Center for Biomedical Ethics, School of Medicine, Stanford University, Stanford, CA 94305, USA.

    With CRISPR/Cas9 and other genome-editing technologies, successful somatic and germline genome editing are becoming feasible. To respond, an American Society of Human Genetics (ASHG) workgroup developed this position statement, which was approved by the ASHG Board in March 2017. The workgroup included representatives from the UK Association of Genetic Nurses and Counsellors, Canadian Association of Genetic Counsellors, International Genetic Epidemiology Society, and US National Society of Genetic Counselors. These groups, as well as the American Society for Reproductive Medicine, Asia Pacific Society of Human Genetics, British Society for Genetic Medicine, Human Genetics Society of Australasia, Professional Society of Genetic Counselors in Asia, and Southern African Society for Human Genetics, endorsed the final statement. The statement includes the following positions. (1) At this time, given the nature and number of unanswered scientific, ethical, and policy questions, it is inappropriate to perform germline gene editing that culminates in human pregnancy. (2) Currently, there is no reason to prohibit in vitro germline genome editing on human embryos and gametes, with appropriate oversight and consent from donors, to facilitate research on the possible future clinical applications of gene editing. There should be no prohibition on making public funds available to support this research. (3) Future clinical application of human germline genome editing should not proceed unless, at a minimum, there is (a) a compelling medical rationale, (b) an evidence base that supports its clinical use, (c) an ethical justification, and (d) a transparent public process to solicit and incorporate stakeholder input.

    Funded by: NHGRI NIH HHS: ZIA HG200362

    American journal of human genetics 2017;101;2;167-176

  • Variability of human pluripotent stem cell lines.

    Ortmann D and Vallier L

    Wellcome Trust - Medical Research Council Cambridge Stem Cell Institute @ Anne McLaren Laboratory for Regenerative Medicine, Department of Surgery, University of Cambridge, UK.

    Human pluripotent stem cells derived from embryos (human Embryonic Stem Cells or hESCs) or generated by direct reprogramming of somatic cells (human Induced Pluripotent Stem Cells or hiPSCs) can proliferate almost indefinitely in vitro while maintaining the capacity to differentiate into a broad diversity of cell types. These two properties (self-renewal and pluripotency) make human pluripotent stem cells uniquely interesting for clinical applications, since they could allow the production of infinite quantities of cells for disease modelling, drug screening and cell-based therapy. However, recent studies have clearly established that human pluripotent stem cell lines can display variable capacity to differentiate into specific lineages. Consequently, the development of universal protocols of differentiation which could work efficiently with any human pluripotent cell line is complicated substantially. As a consequence, each protocol needs to be adapted to every cell line, thereby limiting large-scale applications and precluding personalised therapies. Here, we summarise our knowledge concerning the origin of this variability and describe potential solutions currently available to bypass this major challenge.

    Current opinion in genetics & development 2017;46;179-185

  • Transcriptome and proteome analysis of Salmonella enterica serovar Typhimurium systemic infection of wild type and immune-deficient mice.

    Oshota O, Conway M, Fookes M, Schreiber F, Chaudhuri RR, Yu L, Morgan FJE, Clare S, Choudhary J, Thomson NR, Lio P, Maskell DJ, Mastroeni P and Grant AJ

    Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom.

    Salmonella enterica are a threat to public health. Current vaccines are not fully effective. The ability to grow in infected tissues within phagocytes is required for S. enterica virulence in systemic disease. As the infection progresses the bacteria are exposed to a complex host immune response. Consequently, in order to continue growing in the tissues, S. enterica requires the coordinated regulation of fitness genes. Bacterial gene regulation has so far been investigated largely using exposure to artificial environmental conditions or to in vitro cultured cells, and little information is available on how S. enterica adapts in vivo to sustain cell division and survival. We have studied the transcriptome, proteome and metabolic flux of Salmonella, and the transcriptome of the host during infection of wild type C57BL/6 and immune-deficient gp91-/-phox mice. Our analyses advance the understanding of how S. enterica and the host behave during infection to a more sophisticated level than has previously been reported.

    PloS one 2017;12;8;e0181365

  • Melanoma: a global perspective.

    Ossio R, Roldán-Marín R, Martínez-Said H, Adams DJ and Robles-Espinoza CD

    Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Campus Juriquilla, Blvd Juriquilla 3001, Santiago de Querétaro 76230, México.

    Most of our current knowledge of melanoma is derived from the study of patients from populations of European descent, for whom public health, sun protection initiatives and screening measures have appreciably decreased disease mortality. Notably, some melanoma subtypes that most commonly develop in other populations are not associated with exposure to ultraviolet (UV) light, suggesting a different disease aetiology. Further study of these subtypes is necessary to understand their risk factors and genomic architecture, and to tailor therapies and public health campaigns to benefit patients of all ethnic groups.

    Funded by: Cancer Research UK: 13031

    Nature reviews. Cancer 2017;17;7;393-394

  • Emergence and clonal spread of colistin resistance due to multiple mutational mechanisms in carbapenemase-producing Klebsiella pneumoniae in London.

    Otter JA, Doumith M, Davies F, Mookerjee S, Dyakova E, Gilchrist M, Brannigan ET, Bamford K, Galletly T, Donaldson H, Aanensen DM, Ellington MJ, Hill R, Turton JF, Hopkins KL, Woodford N and Holmes A

    National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Healthcare Associated Infection and Antimicrobial Resistance at Imperial College London & Public Health England, Hammersmith Hospital, Du Cane Road, W12 0HS, London, UK.

    Carbapenemase-producing Enterobacteriaceae (CPE) are emerging worldwide, limiting therapeutic options. Mutational and plasmid-mediated mechanisms of colistin resistance have both been reported. The emergence and clonal spread of colistin resistance was analysed in 40 epidemiologically-related NDM-1 carbapenemase-producing Klebsiella pneumoniae isolates identified during an outbreak in a group of London hospitals. Isolates from July 2014 to October 2015 were tested for colistin susceptibility using agar dilution, and characterised by whole genome sequencing (WGS). Colistin resistance was detected in 25/38 (65.8%) cases for which colistin susceptibility was tested. WGS found that three potential mechanisms of colistin resistance had emerged separately, two due to different mutations in mgrB, and one due to a mutation in phoQ, with onward transmission of two distinct colistin-resistant variants, resulting in two sub-clones associated with transmission at separate hospitals. A high rate of colistin resistance (66%) emerged over a 10-month period. WGS demonstrated that mutational colistin resistance emerged three times during the outbreak, with transmission of two colistin-resistant variants.

    Scientific reports 2017;7;1;12711

  • Characterization of Posa and Posa-like virus genomes in fecal samples from humans, pigs, rats, and bats collected from a single location in Vietnam.

    Oude Munnink BB, Phan MVT, VIZIONS Consortium, Simmonds P, Koopmans MPG, Kellam P, van der Hoek L and Cotten M

    Department of Virus Genomics, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    Porcine stool-associated RNA virus (posavirus) and human stool-associated RNA virus (husavirus) are viruses in the order Picornavirales recently described in porcine and human fecal samples. The tentative group (Posa and Posa-like viruses: PPLVs) also includes fish stool-associated RNA virus (fisavirus) as well as members detected in insects (Drosophila subobscura and Anopheles sinensis) and parasites (Ascaris suum). As part of an agnostic deep sequencing survey of animal and human viruses in Vietnam, we detected three husaviruses in human fecal samples, two of which share 97-98% amino acid identity to Dutch husavirus strains and one highly divergent husavirus with only 25% amino acid identity to known husaviruses. In addition, the current study found forty-seven complete posavirus genomes from pigs, ten novel rat stool-associated RNA virus genomes (tentatively named rasavirus), and sixteen novel bat stool-associated RNA virus genomes (tentatively named basavirus). The five expected Picornavirales protein domains (helicase, 3C-protease, RNA-dependent RNA polymerase, and two Picornavirus capsid domains) were found to be encoded by all PPLV genomes. In addition, a nucleotide composition analysis revealed that the PPLVs shared compositional properties with arthropod viruses and predicted non-mammalian hosts for all PPLV lineages. The study adds seventy-six genomes to the twenty-nine PPLV genomes currently available and greatly extends our sequence knowledge of this group of viruses within the Picornavirales order.

    Virus evolution 2017;3;2;vex022

  • Accurate Classification of Protein Subcellular Localization from High-Throughput Microscopy Images Using Deep Learning.

    Pärnamaa T and Parts L

    Institute of Computer Science, University of Tartu, 50409, Estonia.

    High-throughput microscopy of many single cells generates high-dimensional data that are far from straightforward to analyze. One important problem is automatically detecting the cellular compartment where a fluorescently-tagged protein resides, a task relatively simple for an experienced human, but difficult to automate on a computer. Here, we train an 11-layer neural network on data from mapping thousands of yeast proteins, achieving per cell localization classification accuracy of 91%, and per protein accuracy of 99% on held-out images. We confirm that low-level network features correspond to basic image characteristics, while deeper layers separate localization classes. Using this network as a feature calculator, we train standard classifiers that assign proteins to previously unseen compartments after observing only a small number of training examples. Our results are the most accurate subcellular localization classifications to date, and demonstrate the usefulness of deep learning for high-throughput microscopy.

    Funded by: Wellcome Trust

    G3 (Bethesda, Md.) 2017;7;5;1385-1392

  • An Ethnolinguistic and Genetic Perspective on the Origins of the Dravidian-Speaking Brahui in Pakistan.

    Pagani L, Colonna V, Tyler-Smith C and Ayub Q

    Department of Archaeology and Anthropology, University of Cambridge, United Kingdom, Department of Biological, Geological and Environmental Sciences, University of Bologna, Italy.

    Pakistan is a part of South Asia that modern humans encountered soon after they left Africa ~50,000-70,000 years ago. Approximately 9,000 years ago they began establishing cities that eventually expanded to represent the Harappan culture, rivalling the early city states of Mesopotamia. The modern state constitutes the north-western land mass of the Indian sub-continent and is now the abode of almost 200 million humans representing many ethnicities and linguistic groups. Studies utilising autosomal, Y chromosomal and mitochondrial DNA markers in selected Pakistani populations revealed a mixture of Western Eurasian-, South- and East Asian-specific lineages, some of which were unequivocally associated with past migrations. Overall in Pakistan, genetic relationships are generally predicted more accurately by geographic proximity than linguistic origin. The Dravidian-speaking Brahui population are a prime example of this. They currently reside in south-western Pakistan, surrounded by Indo-European speakers with whom they share a common genetic origin. In contrast, the Hazara share the highest affinity with East Asians, despite their Indo-European linguistic affiliation. In this report we reexamine the genetic origins of the Brahuis, and compare them with diverse populations from India, including several Dravidian-speaking groups, and present a genetic perspective on ethnolinguistic groups in present-day Pakistan. Given the high affinity of Brahui to the other Indo-European Pakistani populations and the absence of population admixture with any of the examined Indian Dravidian groups, we conclude that Brahui are an example of cultural (linguistic) retention following a major population replacement.

    Man in India 2017;97;1;267-278

  • Comparison of classical multi-locus sequence typing software for next-generation sequencing data.

    Page AJ, Alikhan NF, Carleton HA, Seemann T, Keane JA and Katz LS

    Pathogen Genomics, Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    Multi-locus sequence typing (MLST) is a widely used method for categorizing bacteria. Increasingly, MLST is being performed using next-generation sequencing (NGS) data by reference laboratories and for clinical diagnostics. Many software applications have been developed to calculate sequence types from NGS data; however, there has been no comprehensive review to date on these methods. We have compared eight of these applications against real and simulated data, and present results on: (1) the accuracy of each method against traditional typing methods, (2) the performance on real outbreak datasets, (3) the impact of contamination and varying depth of coverage, and (4) the computational resource requirements.

    Microbial genomics 2017;3;8;e000124
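
    The core idea of classical MLST described in the abstract above (assigning an allele number at each locus, then looking up the combined profile as a sequence type) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any of the eight compared applications: the locus names, allele sequences and profile table are hypothetical toy data, and the step of deriving allele sequences from NGS reads or assemblies is not shown.

```python
from typing import Dict, Optional, Tuple

# Hypothetical three-locus scheme (classical schemes typically use about seven loci).
LOCI: Tuple[str, ...] = ("locusA", "locusB", "locusC")

# Toy allele database (made up): locus -> {allele sequence: allele number}.
ALLELES: Dict[str, Dict[str, int]] = {
    "locusA": {"ACGT": 1, "ACGA": 2},
    "locusB": {"TTGA": 1, "TTGC": 2},
    "locusC": {"GGCA": 1, "GGCT": 2},
}

# Toy profile table (made up): allele numbers in LOCI order -> sequence type.
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1): 1,
    (2, 1, 2): 2,
}


def call_sequence_type(sample: Dict[str, str]) -> Optional[int]:
    """Return the sequence type for a sample.

    `sample` maps locus name -> allele sequence. Returns None when any
    locus is missing, has an unknown allele, or the profile is not listed.
    """
    profile = []
    for locus in LOCI:
        allele_number = ALLELES[locus].get(sample.get(locus, ""))
        if allele_number is None:
            return None  # missing locus or novel allele: no exact call
        profile.append(allele_number)
    return PROFILES.get(tuple(profile))
```

    Exact matching is used here for simplicity; how real tools handle partial matches, contamination or low coverage is outside the scope of this sketch.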

  • A Recurrent De Novo Nonsense Variant in ZSWIM6 Results in Severe Intellectual Disability without Frontonasal or Limb Malformations.

    Palmer EE, Kumar R, Gordon CT, Shaw M, Hubert L, Carroll R, Rio M, Murray L, Leffler M, Dudding-Byth T, Oufadem M, Lalani SR, Lewis AM, Xia F, Tam A, Webster R, Brammah S, Filippini F, Pollard J, Spies J, Minoche AE, Cowley MJ, Risen S, Powell-Hamilton NN, Tusi JE, Immken L, Nagakura H, Bole-Feysot C, Nitschké P, Garrigue A, de Saint Basile G, Kivuva E, DDD Study, Scott RH, Rendon A, Munnich A, Newman W, Kerr B, Besmond C, Rosenfeld JA, Amiel J, Field M and Gecz J

    Genetics of Learning Disability Service, Hunter Genetics, Waratah, NSW 2298, Australia; School of Women and Children's Health, University of New South Wales, Randwick, NSW 2031, Australia; The Kinghorn Cancer Centre, Garvan Institute of Medical Research, Darlinghurst NSW 2010, Australia.

    A recurrent de novo missense variant within the C-terminal Sin3-like domain of ZSWIM6 was previously reported to cause acromelic frontonasal dysostosis (AFND), an autosomal-dominant severe frontonasal and limb malformation syndrome, associated with neurocognitive and motor delay, via a proposed gain-of-function effect. We present detailed phenotypic information on seven unrelated individuals with a recurrent de novo nonsense variant (c.2737C>T [p.Arg913Ter]) in the penultimate exon of ZSWIM6 who have severe-profound intellectual disability and additional central and peripheral nervous system symptoms but an absence of frontonasal or limb malformations. We show that the c.2737C>T variant does not trigger nonsense-mediated decay of the ZSWIM6 mRNA in affected individual-derived cells. This finding supports the existence of a truncated ZSWIM6 protein lacking the Sin3-like domain, which could have a dominant-negative effect. This study builds support for a key role for ZSWIM6 in neuronal development and function, in addition to its putative roles in limb and craniofacial development, and provides a striking example of different variants in the same gene leading to distinct phenotypes.

    Funded by: NIGMS NIH HHS: T32 GM007526; Wellcome Trust

    American journal of human genetics 2017;101;6;995-1005

  • Evolve and survive.

    Pance A

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Nature reviews. Microbiology 2017;15;5;258

  • Radiographic endophenotyping in hip osteoarthritis improves the precision of genetic association analysis.

    Panoutsopoulou K, Thiagarajah S, Zengini E, Day-Williams AG, Ramos YF, Meessen JM, Huetink K, Nelissen RG, Southam L, Rayner NW, arcOGEN Consortium, Doherty M, Meulenbelt I, Zeggini E and Wilkinson JM

    Department of Human Genetics, Wellcome Trust Sanger Institute, Hinxton, UK.

    Objective: Osteoarthritis (OA) has a strong genetic component but the success of previous genome-wide association studies (GWAS) has been restricted due to insufficient sample sizes and phenotype heterogeneity. Our aim was to examine the effect of clinically relevant endophenotyping according to site of maximal joint space narrowing (maxJSN) and bone remodelling response on GWAS signal detection in hip OA.

    Methods: A stratified GWAS meta-analysis was conducted in 2118 radiographically defined hip OA cases and 6500 population-based controls. Signals were followed up by analysing differential expression of proximal genes for bone remodelling endophenotypes in 33 pairs of macroscopically intact and OA-affected cartilage.

    Results: We report suggestive evidence (p<5×10⁻⁶) of association at 6 variants with OA endophenotypes that would have been missed by using presence of hip OA as the disease end point. For example, in the analysis of hip OA cases with superior maxJSN versus cases with non-superior maxJSN we detected association with a variant in the LRCH1 gene (rs754106, p=1.49×10⁻⁷, OR (95% CIs) 0.70 (0.61 to 0.80)). In the comparison of hypertrophic with non-hypertrophic OA the most significant variant was located between STT3B and GADL1 (rs6766414, p=3.13×10⁻⁶, OR (95% CIs) 1.45 (1.24 to 1.69)). Both of these associations were fully attenuated in non-stratified analyses of all hip OA cases versus population controls (p>0.05). STT3B was significantly upregulated in OA-affected versus intact cartilage, particularly in the analysis of hypertrophic and normotrophic compared with atrophic bone remodelling pattern (p=4.2×10⁻⁴).

    Conclusions: Our findings demonstrate that stratification of OA cases into more homogeneous endophenotypes can identify genes of potential functional importance otherwise obscured by disease heterogeneity.

    Funded by: Medical Research Council: G0100594, G0600237, G0900753, G0901461, MR/K002279/1, MR/K006312/1; Wellcome Trust

    Annals of the rheumatic diseases 2017;76;7;1199-1206

  • Antimalarial efficacy of MMV390048, an inhibitor of Plasmodium phosphatidylinositol 4-kinase.

    Paquet T, Le Manach C, Cabrera DG, Younis Y, Henrich PP, Abraham TS, Lee MCS, Basak R, Ghidelli-Disse S, Lafuente-Monasterio MJ, Bantscheff M, Ruecker A, Blagborough AM, Zakutansky SE, Zeeman AM, White KL, Shackleford DM, Mannila J, Morizzi J, Scheurer C, Angulo-Barturen I, Martínez MS, Ferrer S, Sanz LM, Gamo FJ, Reader J, Botha M, Dechering KJ, Sauerwein RW, Tungtaeng A, Vanachayangkul P, Lim CS, Burrows J, Witty MJ, Marsh KC, Bodenreider C, Rochford R, Solapure SM, Jiménez-Díaz MB, Wittlin S, Charman SA, Donini C, Campo B, Birkholtz LM, Hanson KK, Drewes G, Kocken CHM, Delves MJ, Leroy D, Fidock DA, Waterson D, Street LJ and Chibale K

    Department of Chemistry, University of Cape Town, Rondebosch 7701, South Africa.

    As part of the global effort toward malaria eradication, phenotypic whole-cell screening revealed the 2-aminopyridine class of small molecules as a good starting point to develop new antimalarial drugs. Stemming from this series, we found that the derivative, MMV390048, lacked cross-resistance with current drugs used to treat malaria. This compound was efficacious against all Plasmodium life cycle stages, apart from late hypnozoites in the liver. Efficacy was shown in the humanized Plasmodium falciparum mouse model, and modest reductions in mouse-to-mouse transmission were achieved in the Plasmodium berghei mouse model. Experiments in monkeys revealed the ability of MMV390048 to be used for full chemoprotection. Although MMV390048 was not able to eliminate liver hypnozoites, it delayed relapse in a Plasmodium cynomolgi monkey model. Both genomic and chemoproteomic studies identified a kinase of the Plasmodium parasite, phosphatidylinositol 4-kinase, as the molecular target of MMV390048. The ability of MMV390048 to block all life cycle stages of the malaria parasite suggests that this compound should be further developed and may contribute to malaria control and eradication as part of a single-dose combination treatment.

    Funded by: NIAID NIH HHS: R01 AI103058, R01 AI109023; NIGMS NIH HHS: T32 GM008283; Wellcome Trust: WT078285

    Science translational medicine 2017;9;387

  • Resolving Affinity Purified Protein Complexes by Blue Native PAGE and Protein Correlation Profiling.

    Pardo M, Bode D, Yu L and Choudhary JS

    Proteomic Mass Spectrometry, Wellcome Trust Sanger Institute.

    Most proteins act in association with others; hence, it is crucial to characterize these functional units in order to fully understand biological processes. Affinity purification coupled to mass spectrometry (AP-MS) has become the method of choice for identifying protein-protein interactions. However, although conventional AP-MS studies provide information on protein interactions, the organizational information is lost. To address this issue, we developed a strategy to unravel the distinct functional assemblies a protein might be involved in, by resolving affinity-purified protein complexes prior to their characterization by mass spectrometry. Protein complexes isolated through affinity purification of a bait protein using an epitope tag and competitive elution are separated through blue native electrophoresis. Comparison of protein migration profiles through correlation profiling using quantitative mass spectrometry allows assignment of interacting proteins to distinct molecular entities. This method is able to resolve protein complexes of close molecular weights that might not be resolved by traditional chromatographic techniques such as gel filtration. With little more work than conventional AP-geLC-MS/MS, we demonstrate this strategy may in many cases be adequate for obtaining protein complex topological information concomitantly to identifying protein interactions.

    Journal of visualized experiments : JoVE 2017;122
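
    The correlation profiling step mentioned in the abstract above compares how proteins migrate across gel slices: proteins in the same complex should show similar abundance profiles. The sketch below illustrates that idea with a plain Pearson correlation. It is an assumption-laden illustration, not the authors' protocol: the protein names, the intensity values and the choice of Pearson correlation as the similarity measure are all hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length abundance profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a flat profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical quantitative intensities across eight gel slices (toy values).
PROFILES: Dict[str, List[float]] = {
    "bait":      [0, 1, 5, 9, 4, 1, 0, 0],
    "partner_a": [0, 2, 6, 10, 5, 1, 0, 0],  # co-migrates with the bait
    "partner_b": [0, 0, 0, 1, 2, 8, 9, 3],   # peaks in different slices
}


def correlate_with(reference: str, profiles: Dict[str, List[float]]) -> Dict[str, float]:
    """Correlate every other protein's migration profile with the reference."""
    ref = profiles[reference]
    return {name: pearson(ref, prof) for name, prof in profiles.items() if name != reference}


scores = correlate_with("bait", PROFILES)
```

    In this toy data, a high score for `partner_a` and a low one for `partner_b` would suggest that the two proteins sit in different assemblies; any cutoff used to group proteins would need to be chosen for the real dataset.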

  • Myst2/Kat7 histone acetyltransferase interaction proteomics reveals tumour-suppressor Niam as a novel binding partner in embryonic stem cells.

    Pardo M, Yu L, Shen S, Tate P, Bode D, Letney BL, Quelle DE, Skarnes W and Choudhary JS

    Proteomic Mass Spectrometry, Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom.

    MYST histone acetyltransferases have crucial functions in transcription, replication and DNA repair and are hence implicated in development and cancer. Here we characterise Myst2/Kat7/Hbo1 protein interactions in mouse embryonic stem cells by affinity purification coupled to mass spectrometry. This study confirms that in embryonic stem cells Myst2 is part of H3 and H4 histone acetylation complexes similar to those described in somatic cells. We identify a novel Myst2-associated protein, the tumour suppressor protein Niam (Nuclear Interactor of ARF and Mdm2). Human NIAM is involved in chromosome segregation, p53 regulation and cell proliferation in somatic cells, but its role in embryonic stem cells is unknown. We describe the first Niam embryonic stem cell interactome, which includes proteins with roles in DNA replication and repair, transcription, splicing and ribosome biogenesis. Many of the Myst2 and Niam binding partners are required for correct embryonic development, implicating Myst2 and Niam in the cooperative regulation of this process and suggesting a novel role for Niam in embryonic biology. The data provide a useful resource for exploring the essential cellular functions of Myst2 and Niam and should contribute to a deeper understanding of early development and survival of the organism as well as cancer. Data are available via ProteomeXchange with identifier PXD005987.

    Scientific reports 2017;7;1;8157

  • Design and Application of Multiplex PCR Seq for the Detection of Somatic Mutations Associated with Myeloid Malignancies.

    Park N and Vassiliou G

    DNA Pipelines Research and Development, Wellcome Trust Sanger Institute, Cambridge, CB10 1SA, UK.

    Targeted sequencing, in which only a selected set of genomic loci are sequenced, enables a much higher coverage of each target than what is obtained using whole genome or exome sequencing. Multiplex PCR offers a simple and affordable technique for specific capture of target regions and can be easily adapted to generate next-generation sequencing (NGS)-ready amplicons. Here we describe a multiplex PCR (MxPCR) approach for capturing 13 leukemia-associated mutation hotspots followed by MiSeq sequencing that enables robust detection of mutations with a variant allele fraction (VAF) as low as 0.8% (0.008) in blood DNA.

    Methods in molecular biology (Clifton, N.J.) 2017;1633;87-99
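
    The variant allele fraction (VAF) quoted in the abstract above is the fraction of reads at a site that carry the variant allele. A minimal sketch of that arithmetic follows. The function names are hypothetical, and treating 0.008 (the lowest VAF the abstract reports as robustly detected) as a reporting cutoff is an assumption made only for illustration, not the authors' variant-calling procedure.

```python
def variant_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """VAF = reads supporting the variant allele / all reads covering the site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must be between 0 and total_reads")
    return alt_reads / total_reads


def passes_vaf_cutoff(alt_reads: int, total_reads: int, min_vaf: float = 0.008) -> bool:
    """Check a site against an assumed minimum VAF (0.008, i.e. 0.8%)."""
    return variant_allele_fraction(alt_reads, total_reads) >= min_vaf


# Example: 20 variant reads at a depth of 2000 gives a VAF of 0.01 (1%).
example_vaf = variant_allele_fraction(20, 2000)
```

    The example also shows why deep targeted coverage matters: at a depth of 2000, a VAF of 0.8% corresponds to only about 16 supporting reads, so shallower sequencing would leave very few reads to distinguish a real variant from error.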