Sanger Institute - Publications 2020

Number of papers published in 2020: 635

  • Longitudinal Cytokine Profiling Identifies GRO-α and EGF as Potential Biomarkers of Disease Progression in Essential Thrombocythemia.

    Øbro NF, Grinfeld J, Belmonte M, Irvine M, Shepherd MS, Rao TN, Karow A, Riedel LM, Harris OB, Baxter EJ, Nangalia J, Godfrey A, Harrison CN, Li J, Skoda RC, Campbell PJ, Green AR and Kent DG

    Wellcome MRC Cambridge Stem Cell Institute, University of Cambridge, Hills Road, Cambridge, CB2 0XY, United Kingdom.

    Myeloproliferative neoplasms (MPNs) are characterized by deregulation of mature blood cell production and increased risk of myelofibrosis (MF) and leukemic transformation. Numerous driver mutations have been identified but substantial disease heterogeneity remains unexplained, implying the involvement of additional as yet unidentified factors. The inflammatory microenvironment has recently attracted attention as a crucial factor in MPN biology, in particular whether inflammatory cytokines and chemokines contribute to disease establishment or progression. Here we present a large-scale study of serum cytokine profiles in more than 400 MPN patients and identify an essential thrombocythemia (ET)-specific inflammatory cytokine signature consisting of Eotaxin, GRO-α, and EGF. Levels of 2 of these markers (GRO-α and EGF) in ET patients were associated with disease transformation in initial sample collection (GRO-α) or longitudinal sampling (EGF). In ET patients with extensive genomic profiling data (n = 183) cytokine levels added significant prognostic value for predicting transformation from ET to MF. Furthermore, CD56<sup>+</sup>CD14<sup>+</sup> pro-inflammatory monocytes were identified as a novel source of increased GRO-α levels. These data implicate the immune cell microenvironment as a significant player in ET disease evolution and illustrate the utility of cytokines as potential biomarkers for reaching beyond genomic classification for disease stratification and monitoring.

    HemaSphere 2020;4;3;e371

  • N6-methyladenosine regulates the stability of RNA:DNA hybrids in human cells.

    Abakir A, Giles TC, Cristini A, Foster JM, Dai N, Starczak M, Rubio-Roldan A, Li M, Eleftheriou M, Crutchley J, Flatt L, Young L, Gaffney DJ, Denning C, Dalhus B, Emes RD, Gackowski D, Corrêa IR, Garcia-Perez JL, Klungland A, Gromak N and Ruzov A

    Department of Stem Cell Biology, University of Nottingham, Nottingham, UK.

    R-loops are nucleic acid structures formed by an RNA:DNA hybrid and unpaired single-stranded DNA that represent a source of genomic instability in mammalian cells<sup>1-4</sup>. Here we show that N<sup>6</sup>-methyladenosine (m<sup>6</sup>A) modification, contributing to different aspects of messenger RNA metabolism<sup>5,6</sup>, is detectable on the majority of RNA:DNA hybrids in human pluripotent stem cells. We demonstrate that m<sup>6</sup>A-containing R-loops accumulate during G<sub>2</sub>/M and are depleted at G<sub>0</sub>/G<sub>1</sub> phases of the cell cycle, and that the m<sup>6</sup>A reader promoting mRNA degradation, YTHDF2 (ref. <sup>7</sup>), interacts with R-loop-enriched loci in dividing cells. Consequently, YTHDF2 knockout leads to increased R-loop levels, cell growth retardation and accumulation of γH2AX, a marker for DNA double-strand breaks, in mammalian cells. Our results suggest that m<sup>6</sup>A regulates accumulation of R-loops, implying a role for this modification in safeguarding genomic stability.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/N005759/1; British Heart Foundation: PG/14/59/31000, SP/15/9/31605; British Heart Foundation (BHF): SP/15/9/31605, PG/14/59/31000; EC | EU Framework Programme for Research and Innovation H2020 | H2020 Priority Excellent Science | H2020 European Research Council (H2020 Excellent Science - European Research Council): ERC-STG-2012-233764; Heart Research UK: TRP01/12; Medical Research Council: 1792340; National Centre for the Replacement Refinement and Reduction of Animals in Research (NC3Rs): CRACK-IT:35911-259146; National Centre for the Replacement, Refinement and Reduction of Animals in Research: NC/C013105/1, NC/C013202/1, NC/K000225/1; Oxford University | John Fell Fund, University of Oxford (John Fell OUP Research Fund): BVD07340; RCUK | Biotechnology and Biological Sciences Research Council (BBSRC): BB/N005759/1; RCUK | MRC | Medical Research Foundation: MR/M017354/1, MR/N013913/1; RCUK | Medical Research Council (MRC): MR/M017354/1; Royal Society: University Research Fellowship; Wellcome Trust; Wellcome Trust (Wellcome): WT206194

    Nature genetics 2020;52;1;48-55

  • High-throughput phenotyping reveals expansive genetic and structural underpinnings of immune variation.

    Abeler-Dörner L, Laing AG, Lorenc A, Ushakov DS, Clare S, Speak AO, Duque-Correa MA, White JK, Ramirez-Solis R, Saran N, Bull KR, Morón B, Iwasaki J, Barton PR, Caetano S, Hng KI, Cambridge E, Forman S, Crockford TL, Griffiths M, Kane L, Harcourt K, Brandt C, Notley G, Babalola KO, Warren J, Mason JC, Meeniga A, Karp NA, Melvin D, Cawthorne E, Weinrick B, Rahim A, Drissler S, Meskas J, Yue A, Lux M, Song-Zhao GX, Chan A, Ballesteros Reviriego C, Abeler J, Wilson H, Przemska-Kosicka A, Edmans M, Strevens N, Pasztorek M, Meehan TF, Powrie F, Brinkman R, Dougan G, Jacobs W, Lloyd CM, Cornall RJ, Maloy KJ, Grencis RK, Griffiths GM, Adams DJ and Hayday AC

    Department of Immunobiology, King's College London, London, UK.

    By developing a high-density murine immunophenotyping platform compatible with high-throughput genetic screening, we have established profound contributions of genetics and structure to immune variation. Specifically, high-throughput phenotyping of 530 unique mouse gene knockouts identified 140 monogenic 'hits', of which most had no previous immunologic association. Furthermore, hits were collectively enriched in genes for which humans show poor tolerance to loss of function. The immunophenotyping platform also exposed dense correlation networks linking immune parameters with each other and with specific physiologic traits. Such linkages limit freedom of movement for individual immune parameters, thereby imposing genetically regulated 'immunologic structures', the integrity of which was associated with immunocompetence. Hence, we provide an expanded genetic resource and structural perspective for understanding and monitoring immune variation in health and disease.

    Funded by: Medical Research Council: MC_UU_00008/6; Wellcome Trust (Wellcome): 100156/Z/12/Z

    Nature immunology 2020;21;1;86-100

  • Integration of intra-sample contextual error modeling for improved detection of somatic mutations from deep sequencing.

    Abelson S, Zeng AGX, Nofech-Mozes I, Wang TT, Ng SWK, Minden MD, Pugh TJ, Awadalla P, Shlush LI, Murphy T, Chan SM, Dick JE and Bratman SV

    Ontario Institute for Cancer Research, Toronto, ON, Canada.

    Sensitive mutation detection by next-generation sequencing is critical for early cancer detection, monitoring minimal/measurable residual disease (MRD), and guiding precision oncology. Nevertheless, because of artifacts introduced during library preparation and sequencing, the detection of low-frequency variants at high specificity is problematic. Here, we present Espresso, an error suppression method that considers local sequence features to accurately detect single-nucleotide variants (SNVs). Compared to other advanced error suppression techniques, Espresso consistently demonstrated lower numbers of false-positive mutation calls and greater sensitivity. We demonstrated Espresso's superior performance in detecting MRD in the peripheral blood of patients with acute myeloid leukemia (AML) throughout their treatment course. Furthermore, we showed that accurate mutation calling in a small number of informative genomic loci might provide a cost-efficient strategy for pragmatic risk prediction of AML development in healthy individuals. More broadly, we aim for Espresso to aid with accurate mutation detection in many other research and clinical settings.

    Science advances 2020;6;50

  • Interferon-gamma polymorphisms and risk of iron deficiency and anaemia in Gambian children.

    Abuga KM, Rockett KA, Muriuki JM, Koch O, Nairz M, Sirugo G, Bejon P, Kwiatkowski DP, Prentice AM and Atkinson SH

    Kenya Medical Research Institute (KEMRI) Centre for Geographic Medicine Coast, KEMRI-Wellcome Trust Research Programme, Kilifi, Kenya.

    <b>Background:</b> Anaemia is a major public health concern, especially in African children living in malaria-endemic regions. Interferon-gamma (IFN-γ) is elevated during malaria infection and is thought to influence erythropoiesis and iron status. Genetic variants in the IFN-γ gene (<i>IFNG</i>) are associated with increased IFN-γ production. We investigated putative functional single nucleotide polymorphisms (SNPs) and haplotypes of <i>IFNG</i> in relation to nutritional iron status and anaemia in Gambian children over a malaria season. <b>Methods:</b> We used previously available data from Gambian family trios to determine informative SNPs and then used the Agena Bioscience MassArray platform to type five SNPs from the <i>IFNG</i> gene in a cohort of 780 Gambian children. We also measured haemoglobin and biomarkers of iron status and inflammation at the start and end of a malaria season. <b>Results:</b> We identified five <i>IFNG</i> haplotype-tagging SNPs (<i>IFNG</i>-1616 [rs2069705], <i>IFNG</i>+874 [rs2430561], <i>IFNG</i>+2200 [rs1861493], <i>IFNG</i>+3234 [rs2069718] and <i>IFNG</i>+5612 [rs2069728]). The <i>IFNG</i>+2200C [rs1861493] allele was associated with reduced haemoglobin concentrations (adjusted β -0.44 [95% CI -0.75, -0.12]; Bonferroni adjusted P = 0.03) and a trend towards iron deficiency compared to wild-type at the end of the malaria season in multivariable models adjusted for potential confounders. A haplotype uniquely identified by <i>IFNG</i>+2200C was similarly associated with reduced haemoglobin levels and trends towards iron deficiency, anaemia and iron deficiency anaemia at the end of the malaria season in models adjusted for age, sex, village, inflammation and malaria parasitaemia. <b>Conclusion:</b> We found limited statistical evidence linking <i>IFNG</i> polymorphisms with a risk of developing iron deficiency and anaemia in Gambian children. More definitive studies are needed to investigate the effects of genetically influenced IFN-γ levels on the risk of iron deficiency and anaemia in children living in malaria-endemic areas.

    Wellcome open research 2020;5;40

  • Analysis of erythrocyte signalling pathways during Plasmodium falciparum infection identifies targets for host-directed antimalarial intervention.

    Adderley JD, John von Freyend S, Jackson SA, Bird MJ, Burns AL, Anar B, Metcalf T, Semblat JP, Billker O, Wilson DW and Doerig C

    Centre for Chronic Inflammatory and Infectious Diseases, Biomedical Sciences Cluster, School of Health and Biomedical Sciences, RMIT University, Bundoora, VIC, 3083, Australia.

    Intracellular pathogens mobilize signaling pathways of their host cell to promote their own survival. Evidence is emerging that signal transduction elements are activated in anucleate erythrocytes in response to infection with malaria parasites, but the extent of this phenomenon remains unknown. Here, we fill this knowledge gap through a comprehensive and dynamic assessment of host erythrocyte signaling during infection with Plasmodium falciparum. We used arrays of 878 antibodies directed against human signaling proteins to interrogate the activation status of host erythrocyte phospho-signaling pathways at three blood stages of parasite asexual development. This analysis reveals a dynamic modulation of many host signaling proteins across parasite development. Here we focus on the hepatocyte growth factor receptor (c-MET) and the MAP kinase pathway component B-Raf, providing a proof of concept that human signaling kinases identified as activated by malaria infection represent attractive targets for antimalarial intervention.

    Funded by: Department of Health | National Health and Medical Research Council (NHMRC): APP1082619

    Nature communications 2020;11;1;4015

  • Disruption of chromatin folding domains by somatic genomic rearrangements in human cancer.

    Akdemir KC, Le VT, Chandran S, Li Y, Verhaak RG, Beroukhim R, Campbell PJ, Chin L, Dixon JR, Futreal PA, PCAWG Structural Variation Working Group and PCAWG Consortium

    Department of Genomic Medicine, University of Texas MD Anderson Cancer Center, Houston, TX, USA.

    Chromatin is folded into successive layers to organize linear DNA. Genes within the same topologically associating domains (TADs) demonstrate similar expression and histone-modification profiles, and boundaries separating different domains have important roles in reinforcing the stability of these features. Indeed, domain disruptions in human cancers can lead to misregulation of gene expression. However, the frequency of domain disruptions in human cancers remains unclear. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), which aggregated whole-genome sequencing data from 2,658 cancers across 38 tumor types, we analyzed 288,457 somatic structural variations (SVs) to understand the distributions and effects of SVs across TADs. Notably, SVs can lead to the fusion of discrete TADs, and complex rearrangements markedly change chromatin folding maps in the cancer genomes. Notably, only 14% of the boundary deletions resulted in a change in expression in nearby genes of more than twofold.

    Funded by: NCI NIH HHS: R01 CA095175, R01 CA217991; NIGMS NIH HHS: R35 GM127029; NIH HHS: DP5 OD023071

    Nature genetics 2020;52;3;294-305

  • Somatic mutation distributions in cancer genomes vary with three-dimensional chromatin structure.

    Akdemir KC, Le VT, Kim JM, Killcoyne S, King DA, Lin YP, Tian Y, Inoue A, Amin SB, Robinson FS, Nimmakayalu M, Herrera RE, Lynn EJ, Chan K, Seth S, Klimczak LJ, Gerstung M, Gordenin DA, O'Brien J, Li L, Deribe YL, Verhaak RG, Campbell PJ, Fitzgerald R, Morrison AJ, Dixon JR and Futreal PA

    Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.

    Somatic mutations in driver genes may ultimately lead to the development of cancer. Understanding how somatic mutations accumulate in cancer genomes and the underlying factors that generate somatic mutations is therefore crucial for developing novel therapeutic strategies. To understand the interplay between spatial genome organization and specific mutational processes, we studied 3,000 tumor-normal-pair whole-genome datasets from 42 different human cancer types. Our analyses reveal that the change in somatic mutational load in cancer genomes is co-localized with topologically-associating-domain boundaries. Domain boundaries constitute a better proxy to track mutational load change than replication timing measurements. We show that different mutational processes lead to distinct somatic mutation distributions where certain processes generate mutations in active domains, and others generate mutations in inactive domains. Overall, the interplay between three-dimensional genome organization and active mutational processes has a substantial influence on the large-scale mutation-rate variations observed in human cancers.

    Funded by: Welch Foundation: G-0040

    Nature genetics 2020

  • A single-nucleotide polymorphism in a Plasmodium berghei ApiAP2 transcription factor alters the development of host immunity.

    Akkaya M, Bansal A, Sheehan PW, Pena M, Molina-Cruz A, Orchard LM, Cimperman CK, Qi CF, Ross P, Yazew T, Sturdevant D, Anzick SL, Thiruvengadam G, Otto TD, Billker O, Llinás M, Miller LH and Pierce SK

    Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville, MD, USA.

    The acquisition of malaria immunity is both remarkably slow and unpredictable. At present, we know little about the malaria parasite genes that influence the host's ability to mount a protective immune response. Here, we show that a single-nucleotide polymorphism (SNP) resulting in a single amino acid change (S to F) in an ApiAP2 transcription factor in the rodent malaria parasite <i>Plasmodium berghei</i> (<i>Pb</i>) NK65 allowed infected mice to mount a T helper cell 1 (T<sub>H</sub>1)-type immune response that controlled subsequent infections. As compared to <i>Pb</i>NK65<sup>S</sup>, <i>Pb</i>NK65<sup>F</sup> parasites differentially expressed 46 genes, most of which are predicted to play roles in immune evasion. <i>Pb</i>NK65<sup>F</sup> infections resulted in an early interferon-γ response and a later expansion of germinal centers, resulting in high levels of infected red blood cell-specific T<sub>H</sub>1-type immunoglobulin G2b (IgG2b) and IgG2c antibodies. Thus, the <i>Pb</i> ApiAP2 transcription factor functions as a critical parasite virulence factor in malaria infections.

    Funded by: NIAID NIH HHS: R01 AI125565

    Science advances 2020;6;6;eaaw6957

  • A Single-Cell Transcriptomics CRISPR-Activation Screen Identifies Epigenetic Regulators of the Zygotic Genome Activation Program.

    Alda-Catalinas C, Bredikhin D, Hernando-Herraez I, Santos F, Kubinyecz O, Eckersley-Maslin MA, Stegle O and Reik W

    Epigenetics Programme, Babraham Institute, Cambridge CB22 3AT, UK; Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

    Zygotic genome activation (ZGA) is an essential transcriptional event in embryonic development that coincides with extensive epigenetic reprogramming. Complex manipulation techniques and maternal stores of proteins preclude large-scale functional screens for ZGA regulators within early embryos. Here, we combined pooled CRISPR activation (CRISPRa) with single-cell transcriptomics to identify regulators of ZGA-like transcription in mouse embryonic stem cells, which serve as a tractable, in vitro proxy of early mouse embryos. Using multi-omics factor analysis (MOFA+) applied to ∼200,000 single-cell transcriptomes comprising 230 CRISPRa perturbations, we characterized molecular signatures of ZGA and uncovered 24 factors that promote a ZGA-like response. Follow-up assays validated top screen hits, including the DNA-binding protein Dppa2, the chromatin remodeler Smarca5, and the transcription factor Patz1, and functional experiments revealed that Smarca5's regulation of ZGA-like transcription is dependent on Dppa2. Together, our single-cell transcriptomic profiling of CRISPRa-perturbed cells provides both system-level and molecular insights into the mechanisms that orchestrate ZGA.

    Cell systems 2020

  • Single cell transcriptomics comes of age.

    Aldridge S and Teichmann SA

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Nature communications 2020;11;1;4307

  • The repertoire of mutational signatures in human cancer.

    Alexandrov LB, Kim J, Haradhvala NJ, Huang MN, Tian Ng AW, Wu Y, Boot A, Covington KR, Gordenin DA, Bergstrom EN, Islam SMA, Lopez-Bigas N, Klimczak LJ, McPherson JR, Morganella S, Sabarinathan R, Wheeler DA, Mustonen V, PCAWG Mutational Signatures Working Group, Getz G, Rozen SG, Stratton MR and PCAWG Consortium

    Department of Cellular and Molecular Medicine, Department of Bioengineering, Moores Cancer Center, University of California, San Diego, CA, USA.

    Somatic mutations in cancer genomes are caused by multiple mutational processes, each of which generates a characteristic mutational signature<sup>1</sup>. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium<sup>2</sup> of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we characterized mutational signatures using 84,729,690 somatic mutations from 4,645 whole-genome and 19,184 exome sequences that encompass most types of cancer. We identified 49 single-base-substitution, 11 doublet-base-substitution, 4 clustered-base-substitution and 17 small insertion-and-deletion signatures. The substantial size of our dataset, compared with previous analyses<sup>3-15</sup>, enabled the discovery of new signatures, the separation of overlapping signatures and the decomposition of signatures into components that may represent associated, but distinct, DNA damage, repair and/or replication mechanisms. By estimating the contribution of each signature to the mutational catalogues of individual cancer genomes, we revealed associations of signatures to exogenous or endogenous exposures, as well as to defective DNA-maintenance processes. However, many signatures are of unknown cause. This analysis provides a systematic perspective on the repertoire of mutational processes that contribute to the development of human cancer.

    Funded by: European Research Council; NCI NIH HHS: U24 CA143843, U24 CA143845, U24 CA210999; NIGMS NIH HHS: T32 GM008313; Wellcome Trust: 206194

    Nature 2020;578;7793;94-101

  • NtrBC Regulates Invasiveness and Virulence of Pseudomonas aeruginosa During High-Density Infection.

    Alford MA, Baghela A, Yeung ATY, Pletzer D and Hancock REW

    Centre for Microbial Diseases and Immunity Research, University of British Columbia, Vancouver, BC, Canada.

    <i>Pseudomonas aeruginosa</i> is an opportunistic pathogen that is a major cause of nosocomial and chronic infections contributing to morbidity and mortality in cystic fibrosis patients. One of the reasons for its success as a pathogen is its ability to adapt to a broad range of circumstances. Here, we show the involvement of the general nitrogen regulator NtrBC, which is structurally conserved but functionally diverse across species, in pathogenic and adaptive states of <i>P. aeruginosa</i>. The role of NtrB and NtrC was examined in progressive or chronic infections, which revealed that mutants (Δ<i>ntrB</i>, Δ<i>ntrC</i>, and Δ<i>ntrBC</i>) were reduced in their ability to invade and cause damage in a high-density abscess model <i>in vivo.</i> Progressive infections were established with mutants in the highly virulent PA14 genetic background, whereas chronic infections were established with mutants in the less virulent clinical isolate LESB58 genetic background. Characterization of adaptive lifestyles <i>in vitro</i> confirmed that the double Δ<i>ntrBC</i> mutant demonstrated >40% inhibition of biofilm formation, a nearly complete inhibition of swarming motility, and a modest decrease and altered surfing motility colony appearance; with the exception of swarming, single mutants generally had more subtle or no changes. Transcriptional profiles of deletion mutants under swarming conditions were defined using RNA-Seq and unveiled dysregulated expression of hundreds of genes implicated in virulence in PA14 and LESB58 chronic lung infections, as well as carbon and nitrogen metabolism. Thus, transcriptional profiles were validated by testing responsiveness of mutants to several key intermediates of central metabolic pathways. These results indicate that NtrBC is a global regulatory system involved in both pathological and physiological processes relevant to the success of <i>Pseudomonas</i> in high-density infection.

    Frontiers in microbiology 2020;11;773

  • Population Structure, Stratification, and Introgression of Human Structural Variation.

    Almarri MA, Bergström A, Prado-Martinez J, Yang F, Fu B, Dunham AS, Chen Y, Hurles ME, Tyler-Smith C and Xue Y

    Wellcome Sanger Institute, Hinxton CB10 1SA, UK.

    Structural variants contribute substantially to genetic diversity and are important evolutionarily and medically, but they are still understudied. Here we present a comprehensive analysis of structural variation in the Human Genome Diversity panel, a high-coverage dataset of 911 samples from 54 diverse worldwide populations. We identify, in total, 126,018 variants, 78% of which were not identified in previous global sequencing projects. Some reach high frequency and are private to continental groups or even individual populations, including regionally restricted runaway duplications and putatively introgressed variants from archaic hominins. By de novo assembly of 25 genomes using linked-read sequencing, we discover 1,643 breakpoint-resolved unique insertions, in aggregate accounting for 1.9 Mb of sequence absent from the GRCh38 reference. Our results illustrate the limitation of a single human reference and the need for high-quality genomes from diverse populations to fully discover and understand human genetic variation.

    Funded by: Cancer Research UK: FC001595; Medical Research Council: FC001595; Wellcome Trust: FC001595

    Cell 2020;182;1;189-199.e15

  • A unified catalog of 204,938 reference genomes from the human gut microbiome.

    Almeida A, Nayfach S, Boland M, Strozzi F, Beracochea M, Shi ZJ, Pollard KS, Sakharova E, Parks DH, Hugenholtz P, Segata N, Kyrpides NC and Finn RD

    European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK.

    Comprehensive, high-quality reference genomes are required for functional characterization and taxonomic assignment of the human gut microbiota. We present the Unified Human Gastrointestinal Genome (UHGG) collection, comprising 204,938 nonredundant genomes from 4,644 gut prokaryotes. These genomes encode >170 million protein sequences, which we collated in the Unified Human Gastrointestinal Protein (UHGP) catalog. The UHGP more than doubles the number of gut proteins in comparison to those present in the Integrated Gene Catalog. More than 70% of the UHGG species lack cultured representatives, and 40% of the UHGP lack functional annotations. Intraspecies genomic variation analyses revealed a large reservoir of accessory genes and single-nucleotide variants, many of which are specific to individual human populations. The UHGG and UHGP collections will enable studies linking genotypes to phenotypes in the human gut microbiome.

    Funded by: EC | EU Framework Programme for Research and Innovation H2020 | H2020 Priority Excellent Science | H2020 European Research Council (H2020 Excellent Science - European Research Council): ERC- STG MetaPG-716575; RCUK | Biotechnology and Biological Sciences Research Council (BBSRC): BB/N018354/1, BB/R015228/1

    Nature biotechnology 2020

  • Molecular characterization of Brucella ovis in Argentina.

    Alvarez LP, Ruiz-Villalobos N, Suárez-Esquivel M, Thomson NR, Marcellino R, Víquez-Ruiz E, Robles CA and Guzmán-Verri C

    Centro de Referencia en Levaduras y Tecnología Cervecera (CRELTEC), Instituto Andino Patagónico de Tecnologías Biológicas y Geoambientales (IPATEC), Consejo Nacional de Investigaciones Científicas y Técnicas, Universidad Nacional del Comahue, San Carlos de Bariloche, Río Negro, Argentina.

    Brucellosis in rams is caused by Brucella ovis or Brucella melitensis and is considered one of the most important infectious diseases of males in sheep-raising countries. Multi-locus variable number of tandem repeats analysis (MLVA) is a powerful tool to genotype Brucella spp. However, data regarding B. ovis genotyping are scarce. Thus, the aim of this study was to characterize the molecular diversity of B. ovis field strains in Argentina. A total of 115 isolates of B. ovis from Argentina and Uruguay were genotyped using MLVA-16 and analyzed together with 14 publicly available B. ovis genotypes from Brazil. The Discriminatory Power (D) was 0.996 for MLVA-16 and 0.0998 for MLVA-8 and MLVA-11. Analysis of MLVA-16 revealed 100 different genotypes, all of them novel, including 90 unique ones. There was no correlation between geographical distribution and genotype, and results showed a higher diversity within provinces than between provinces. Clustering analysis of the strains from Argentina, Uruguay and Brazil revealed that the 129 isolates were grouped into two clades. Whole Genome Sequencing analysis of the 19 B. ovis genomes available in public databases, including some of the Argentinian strains used in this study, revealed clustering of the Argentinian isolates and a closer relationship with B. ovis from New Zealand and Australia. This work adds new data to the poorly understood regional and worldwide distribution map of B. ovis genotypes, and it constitutes the largest study of B. ovis molecular genotyping to date.

    Veterinary microbiology 2020;245;108703

  • Schistosoma species detection by environmental DNA assays in African freshwaters.

    Alzaylaee H, Collins RA, Rinaldi G, Shechonge A, Ngatunga B, Morgan ER and Genner MJ

    School of Biological Sciences, University of Bristol, Life Sciences Building, Bristol, United Kingdom.

    Background: Schistosomiasis is a neglected tropical parasitic disease associated with severe pathology, mortality and economic loss worldwide. Programs for disease control may benefit from specific and sensitive diagnostic methods to detect Schistosoma trematodes in aquatic environments. Here we report the development of novel environmental DNA (eDNA) qPCR assays for the presence of the human-infecting species Schistosoma mansoni, S. haematobium and S. japonicum.

    Methodology/principal findings: We first tested the specificity of the assays across the three species using genomic DNA preparations which showed successful amplification of target sequences with no cross amplification between the three focal species. In addition, we evaluated the specificity of the assays using synthetic DNA of multiple Schistosoma species, and demonstrated a high overall specificity; however, S. japonicum and S. haematobium assays showed cross-species amplification with very closely-related species. We next tested the effectiveness of the S. mansoni assay using eDNA samples from aquaria containing infected host gastropods, with the target species revealed as present in all infected aquaria. Finally, we evaluated the effectiveness of the S. mansoni and S. haematobium assays using eDNA samples from eight discrete natural freshwater sites in Tanzania, and demonstrated strong correspondence between infection status established using eDNA and conventional assays of parasite prevalence in host snails.

    Conclusions/significance: Collectively, our results suggest that eDNA monitoring is able to detect schistosomes in freshwater bodies, but refinement of the field sampling, storage and assay methods is likely to optimise its performance. We anticipate that environmental DNA-based approaches will help to inform epidemiological studies and contribute to efforts to control and eliminate schistosomiasis in endemic areas.

    PLoS neglected tropical diseases 2020;14;3;e0008129

  • High-throughput genotyping assays for identification of glycophorin B deletion variants in population studies.

    Amuzu DS, Rockett KA, Leffler EM, Ansah F, Amoako N, Morang'a CM, Hubbart C, Rowlands K, Jeffreys AE, Amenga-Etego LN, Kwiatkowski DP and Awandare GA

    West African Centre for Cell Biology of Infectious Pathogens, University of Ghana, Accra, GH 0233, Ghana.

    Glycophorins are the most abundant sialoglycoproteins on the surface of human erythrocyte membranes. Genetic variation in the glycophorin region of human chromosome 4 (containing <i>GYPA</i>, <i>GYPB</i>, and <i>GYPE</i> genes) is of interest because the gene products serve as receptors for pathogens of major public health interest, including <i>Plasmodium</i> sp., <i>Babesia</i> sp., Influenza virus, <i>Vibrio cholerae</i> El Tor Hemolysin, and <i>Escherichia coli</i>. A large structural rearrangement and hybrid glycophorin variant, known as <i>Dantu</i>, which was identified in East African populations, has been linked with a 40% reduction in risk for severe malaria. Apart from <i>Dantu</i>, other large structural variants exist, with the most common being deletion of the whole <i>GYPB</i> gene and its surrounding region, resulting in multiple different deletion forms. In West Africa particularly, these deletions are estimated to account for between 5 and 15% of the variation in different populations, mostly attributed to the forms known as DEL1 and DEL2. Due to the lack of specific variant assays, little is known of the distribution of these variants. Here, we report a modification of a previous <i>GYPB</i> DEL1 assay and the development of a novel <i>GYPB</i> DEL2 assay as high-throughput PCR-RFLP assays, as well as the identification of the crossover/breakpoint for <i>GYPB</i> DEL2. Using 393 samples from three study sites in Ghana as well as samples from HapMap and 1000 Genomes projects for validation, we show that our assays are sensitive and reliable for genotyping <i>GYPB</i> DEL1 and DEL2. To the best of our knowledge, this is the first report of such high-throughput genotyping assays by PCR-RFLP for identifying specific <i>GYPB</i> deletion types in populations. These assays will enable better identification of <i>GYPB</i> deletions for large genetic association studies and functional experiments to understand the role of this gene cluster region in susceptibility to malaria and other diseases.

    Experimental biology and medicine (Maywood, N.J.) 2020;1535370220968545

  • Extracellular non-coding RNA signatures of the metacestode stage of Echinococcus multilocularis.

    Ancarola ME, Lichtenstein G, Herbig J, Holroyd N, Mariconti M, Brunetti E, Berriman M, Albrecht K, Marcilla A, Rosenzvit MC, Kamenetzky L, Brehm K and Cucher M

    Department of Microbiology, School of Medicine, University of Buenos Aires, Buenos Aires, Argentina.

    Extracellular RNAs (ex-RNAs) are secreted by cells through different means that may involve association with proteins, lipoproteins or extracellular vesicles (EV). In the context of parasitism, ex-RNAs represent new and exciting communication intermediaries with promising potential as novel biomarkers. In recent years, it was shown that helminth parasites secrete ex-RNAs; however, most work has mainly focused on RNA secretion mediated by EV. Ex-RNA study is of special interest in those helminth infections that still lack biomarkers for early and/or follow-up diagnosis, such as echinococcosis, a neglected zoonotic disease caused by cestodes of the genus Echinococcus. In this work, we have characterised the ex-RNA profile secreted by in vitro grown metacestodes of Echinococcus multilocularis, the causative agent of alveolar echinococcosis. We have used high throughput RNA-sequencing together with RT-qPCR to characterise the ex-RNA profile secreted towards the extra- and intra-parasite milieus in EV-enriched and EV-depleted fractions. We show that a polarized secretion of small RNAs takes place, with microRNAs mainly secreted to the extra-parasite milieu and rRNA- and tRNA-derived sequences mostly secreted to the intra-parasite milieu. In addition, we show by nanoparticle tracking analyses that viable metacestodes secrete EV mainly into the metacestode inner vesicular fluid (MVF); however, the number of nanoparticles in culture medium and MVF increases > 10-fold when metacestodes show signs of tegument impairment. Interestingly, we confirm the presence of host miRNAs in the intra-parasite milieu, implying their internalization and transport through the tegument towards the MVF. Finally, our assessment of the detection of Echinococcus miRNAs in patient samples by RT-qPCR yielded negative results, suggesting the tested miRNAs may not be good biomarkers for this disease. A comprehensive study of the secretion mechanisms throughout the life cycle of these parasites will help to understand parasite interaction with the host and also improve current diagnostic tools.

    PLoS neglected tropical diseases 2020;14;11;e0008890

  • Tutorial: guidelines for the computational analysis of single-cell RNA sequencing data.

    Andrews TS, Kiselev VY, McCarthy D and Hemberg M

    Wellcome Sanger Institute, Hinxton, UK.

    Single-cell RNA sequencing (scRNA-seq) is a popular and powerful technology that allows you to profile the whole transcriptome of a large number of individual cells. However, the analysis of the large volumes of data generated from these experiments requires specialized statistical and computational methods. Here we present an overview of the computational workflow involved in processing scRNA-seq data. We discuss some of the most common tasks and the tools available for addressing central biological questions. In this article and our companion website, we provide guidelines regarding best practices for performing computational analyses. This tutorial provides a hands-on guide for experimentalists interested in analyzing their data as well as an overview for bioinformaticians seeking to develop new computational methods.

    Funded by: Baker Foundation: NA; Department of Health | National Health and Medical Research Council (NHMRC): GNT1112681, GNT1162829; Wellcome Trust (Wellcome): NA

    Nature protocols 2020

  • Characterization of putative drug resistant biomarkers in Plasmodium falciparum isolated from Ghanaian blood donors.

    Aninagyei E, Duedu KO, Rufai T, Tetteh CD, Chandi MG, Ampomah P and Acheampong DO

    Department of Biomedical Sciences, School of Basic and Biomedical Sciences, University of Health and Allied Sciences, Ho, Volta Region, Ghana.

    Background: Plasmodium falciparum parasites, which could harbour anti-malaria drug resistance genes, are commonly detected in blood donors in malaria-endemic areas. Notwithstanding, anti-malaria drug resistant biomarkers have not been characterized in blood donors with asymptomatic P. falciparum infection.

    Methods: A total of 771 blood donors were selected from five districts in the Greater Accra Region, Ghana. Each donor sample was screened with a malaria rapid diagnostic test (RDT) kit and parasitaemia was quantified microscopically. Dried blood spots from malaria positive samples were genotyped for P. falciparum chloroquine resistance transporter (Pfcrt), P. falciparum multi-drug resistance (Pfmdr1), P. falciparum dihydropteroate-synthetase (Pfdhps), P. falciparum dihydrofolate-reductase (Pfdhfr) and Kelch 13 propeller domain on chromosome 13 (Kelch 13) genes.

    Results: Of the 771 blood donors, 91 (11.8%) were positive by RDT. Analysis of sequence reads indicated successful genotyping of Pfcrt, Pfmdr1, Pfdhfr, Pfdhps and Kelch 13 genes in 84.6, 81.3, 86.8, 86.9 and 92.3% of the isolates respectively. Overall, 21 different mutant haplotypes were identified in 69 isolates (75.8%). In Pfcrt, the CVIET haplotype was observed in 11.6% of samples, while in Pfmdr1, a triple mutation (resulting in the YFN haplotype) was detected in 8.1% of isolates. In the Pfdhfr gene, a triple mutation resulting in the IRNI haplotype and in the Pfdhps gene, a quintuple mutation resulting in the AGESS haplotype was identified in 17.7% of parasite isolates. Finally, five non-synonymous Kelch 13 alleles were detected: C580Y (3.6%), P615L (4.8%), A578S (4.8%), I543V (2.4%) and A676S (1.2%).

    Conclusion: Results obtained in this study indicated various frequencies of mutant alleles in Pfcrt, Pfmdr1, Pfdhfr, Pfdhps and Kelch 13 genes from P. falciparum infected blood donors. These alleles could reduce the efficacy of standard malaria treatment in transfusion-transmitted malaria cases. Incorporating malaria screening into donor screening protocol to defer infected donors is therefore recommended.

    BMC infectious diseases 2020;20;1;533

  • Professional duties are now considered legal duties of care within genomic medicine.

    Anna M, Christine P, Jonathan R, Richard M, Alessia C, Lauren R and Jerome A

    Society and Ethics Research, Connecting Science, Wellcome Genome Campus, Cambridge, UK.

    The legal duty to protect patient confidentiality is common knowledge amongst healthcare professionals. However, what may not be widely known is that this duty is not always absolute. In the United Kingdom, the General Medical Council, which governs the practice of all doctors, and many other professional codes of practice recognise that, under certain circumstances, it may be appropriate to break confidentiality. This arises when there is a wider duty to protect the health of others, and when the risk of non-disclosure outweighs the potential harm from breaking confidentiality. We discuss this situation specifically in relation to genomic medicine where relatives in a family may have differing views on the sharing of familial genetic information. Overruling a patient's wishes is predicated on balancing the duty of care towards the patient versus protecting their relative from serious harm. We discuss the practice implications of a pivotal legal case that concluded recently in the High Court of Justice in England and Wales, ABC v St Georges Healthcare NHS Trust & Ors. Professional guidance is already clear that genetic healthcare professionals must undertake a balancing exercise to weigh up contradictory duties of care. However, the judge has provided a new legal weighting to these professional duties: 'The scope of the duty extends not only to conducting the necessary balancing exercise but also to acting in accordance with its outcome' [1: 189]. In the context of genomic medicine, this has important consequences for clinical practice.

    Funded by: Wellcome Trust (Wellcome): 206194

    European journal of human genetics : EJHG 2020

  • In situ CRISPR-Cas9 base editing for the development of genetically engineered mouse models of breast cancer.

    Annunziato S, Lutz C, Henneman L, Bhin J, Wong K, Siteur B, van Gerwen B, de Korte-Grimmerink R, Zafra MP, Schatoff EM, Drenth AP, van der Burg E, Eijkman T, Mukherjee S, Boroviak K, Wessels LF, van de Ven M, Huijbers IJ, Adams DJ, Dow LE and Jonkers J

    Division of Molecular Pathology, The Netherlands Cancer Institute, Amsterdam, The Netherlands.

    Genetically engineered mouse models (GEMMs) of cancer have proven to be of great value for basic and translational research. Although CRISPR-based gene disruption offers a fast-track approach for perturbing gene function and circumvents certain limitations of standard GEMM development, it does not provide a flexible platform for recapitulating clinically relevant missense mutations in vivo. To this end, we generated knock-in mice with Cre-conditional expression of a cytidine base editor and tested their utility for precise somatic engineering of missense mutations in key cancer drivers. Upon intraductal delivery of sgRNA-encoding vectors, we could install point mutations with high efficiency in one or multiple endogenous genes in situ and assess the effect of defined allelic variants on mammary tumorigenesis. While the system also produces bystander insertions and deletions that can stochastically be selected for when targeting a tumor suppressor gene, we could effectively recapitulate oncogenic nonsense mutations. We successfully applied this system in a model of triple-negative breast cancer, providing the proof of concept for extending this flexible somatic base editing platform to other tissues and tumor types.

    Funded by: Cancer Genomics Netherlands (CGCNL): 024001028; Cancer Systems Biology Center (CSBC): 85300120; EC | FP7 | FP7 Ideas: European Research Council (FP7 Ideas); ERC Synergy project CombatCancer: 319661; NCI NIH HHS: F31 CA224800; National Roadmap grant for Large-Scale Research Facilities: 184032303; Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NWO): VICI 91814643; Netherlands Genomics Initiative (NGI) Zenith: 93512009; Oncode Institute

    The EMBO journal 2020;39;5;e102169

  • Genome variation and population structure among 1142 mosquitoes of the African malaria vector species Anopheles gambiae and Anopheles coluzzii.

    Anopheles gambiae 1000 Genomes Consortium

    Mosquito control remains a central pillar of efforts to reduce malaria burden in sub-Saharan Africa. However, insecticide resistance is entrenched in malaria vector populations, and countries with a high malaria burden face a daunting challenge to sustain malaria control with a limited set of surveillance and intervention tools. Here we report on the second phase of a project to build an open resource of high-quality data on genome variation among natural populations of the major African malaria vector species <i>Anopheles gambiae</i> and <i>Anopheles coluzzii</i>. We analyzed whole genomes of 1142 individual mosquitoes sampled from the wild in 13 African countries, as well as a further 234 individuals comprising parents and progeny of 11 laboratory crosses. The data resource includes high-confidence single-nucleotide polymorphism (SNP) calls at 57 million variable sites, genome-wide copy number variation (CNV) calls, and haplotypes phased at biallelic SNPs. We use these data to analyze genetic population structure and characterize genetic diversity within and between populations. We illustrate the utility of these data by investigating species differences in isolation by distance, genetic variation within proposed gene drive target sequences, and patterns of resistance to pyrethroid insecticides. This data resource provides a foundation for developing new operational systems for molecular surveillance and for accelerating research and development of new vector control tools. It also provides a unique resource for the study of population genomics and evolutionary biology in eukaryotic species with high levels of genetic diversity under strong anthropogenic evolutionary pressures.

    Genome research 2020;30;10;1533-1546

  • Modulation of Triple Artemisinin-Based Combination Therapy Pharmacodynamics by Plasmodium falciparum Genotype.

    Ansbro MR, Itkin Z, Chen L, Zahoranszky-Kohalmi G, Amaratunga C, Miotto O, Peryea T, Hobbs CV, Suon S, Sá JM, Dondorp AM, van der Pluijm RW, Wellems TE, Simeonov A and Eastman RT

    Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States.

    The first-line treatments for uncomplicated <i>Plasmodium falciparum</i> malaria are artemisinin-based combination therapies (ACTs), consisting of an artemisinin derivative combined with a longer acting partner drug. However, the spread of <i>P. falciparum</i> with decreased susceptibility to artemisinin and partner drugs presents a significant challenge to malaria control efforts. To stem the spread of drug resistant parasites, novel chemotherapeutic strategies are being evaluated, including the implementation of triple artemisinin-based combination therapies (TACTs). Currently, there is limited knowledge on the pharmacodynamic and pharmacogenetic interactions of proposed TACT drug combinations. To evaluate these interactions, we established an <i>in vitro</i> high-throughput process for measuring the drug concentration-response to three distinct antimalarial drugs present in a TACT. Sixteen different TACT combinations were screened against 15 parasite lines from Cambodia, with a focus on parasites with differential susceptibilities to piperaquine and artemisinins. Analysis revealed drug-drug interactions unique to specific genetic backgrounds, including antagonism between piperaquine and pyronaridine associated with gene amplification of <i>plasmepsin II/III</i>, two aspartic proteases that localize to the parasite digestive vacuole. From this initial study, we identified parasite genotypes with decreased susceptibility to specific TACTs, as well as potential TACTs that display antagonism in a genotype-dependent manner. Our assay and analysis platform can be further leveraged to inform drug implementation decisions and evaluate next-generation TACTs.

    Funded by: Wellcome Trust

    ACS pharmacology & translational science 2020;3;6;1144-1157

  • Development of copy number assays for detection and surveillance of piperaquine resistance associated plasmepsin 2/3 copy number variation in Plasmodium falciparum.

    Ansbro MR, Jacob CG, Amato R, Kekre M, Amaratunga C, Sreng S, Suon S, Miotto O, Fairhurst RM, Wellems TE and Kwiatkowski DP

    Wellcome Sanger Institute, Hinxton, UK.

    Background: Long regarded as an epicenter of drug-resistant malaria, Southeast Asia continues to provide new challenges to the control of Plasmodium falciparum malaria. Recently, resistance to the artemisinin combination therapy partner drug piperaquine has been observed in multiple locations across Southeast Asia. Genetic studies have identified single nucleotide polymorphisms as well as copy number variations in the plasmepsin 2 and plasmepsin 3 genes, which encode haemoglobin-degrading proteases that associate with clinical and in vitro piperaquine resistance.

    Results: To accurately and quickly determine the presence of copy number variations in the plasmepsin 2/3 genes in field isolates, this study developed a quantitative PCR assay using TaqMan probes. Copy number estimates were validated using a separate SYBR green-based quantitative PCR assay as well as a novel PCR-based breakpoint assay to detect the hybrid gene product. Field samples from 2012 to 2015 across three sites in Cambodia were tested using DNA extracted from dried blood spots and whole blood to monitor the extent of plasmepsin 2/3 gene amplifications, as well as amplifications in the multidrug resistance transporter 1 gene (pfmdr1), a marker of mefloquine resistance. This study found high concordance across all methods of copy number detection. For samples derived from dried blood spots, a success rate greater than 80% was found in each assay, with more recent samples performing better. Evidence of extensive plasmepsin 2/3 copy number amplifications was observed in Pursat (94%, 2015) (Western Cambodia) and Preah Vihear (87%, 2014) (Northern Cambodia), and lower levels in Ratanakiri (16%, 2014) (Eastern Cambodia). A shift was observed from two copies of plasmepsin 2 in Pursat in 2013 to three copies in 2014-2015 (25% to 64%). Pfmdr1 amplifications were absent in all samples from Preah Vihear and Ratanakiri in 2014 and absent in Pursat in 2015.

    Conclusions: The multiplex TaqMan assay is a robust tool for monitoring both plasmepsin 2/3 and pfmdr1 copy number variations in field isolates, and the SYBR-green and breakpoint assays are useful for monitoring plasmepsin 2/3 amplifications. This study shows increasing levels of plasmepsin 2 copy numbers across Cambodia from 2012 to 2015 and a complete reversion of multicopy pfmdr1 parasites to single copy parasites in all study locations.

    Funded by: Bill and Melinda Gates Foundation: OPP1118166; Department for International Development: MR/M005212/1; Medical Research Council UK: G0600718; Wellcome Trust: 090770/Z/09/Z, 098051, 206194

    Malaria journal 2020;19;1;181

  • Tet3 ablation in adult brain neurons increases anxiety-like behavior and regulates cognitive function in mice.

    Antunes C, Da Silva JD, Guerra-Gomes S, Alves ND, Ferreira F, Loureiro-Campos E, Branco MR, Sousa N, Reik W, Pinto L and Marques CJ

    Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, 4710-057, Braga, Portugal.

    TET3 is a member of the ten-eleven translocation (TET) family of enzymes which oxidize 5-methylcytosine (5mC) into 5-hydroxymethylcytosine (5hmC). Tet3 is highly expressed in the brain, where 5hmC levels are most abundant. In adult mice, we observed that TET3 is present in mature neurons and oligodendrocytes but is absent in astrocytes. To investigate the function of TET3 in adult postmitotic neurons, we crossed Tet3 floxed mice with a neuronal Cre-expressing mouse line, Camk2a-CreERT2, obtaining a Tet3 conditional KO (cKO) mouse line. Ablation of Tet3 in adult mature neurons resulted in increased anxiety-like behavior with concomitant hypercorticalism, and impaired hippocampal-dependent spatial orientation. Transcriptome and gene-specific expression analysis of the hippocampus showed dysregulation of genes involved in glucocorticoid signaling pathway (HPA axis) in the ventral hippocampus, whereas upregulation of immediate early genes was observed in both dorsal and ventral hippocampal areas. In addition, Tet3 cKO mice exhibit increased dendritic spine maturation in the ventral CA1 hippocampal subregion. Based on these observations, we suggest that TET3 is involved in molecular alterations that govern hippocampal-dependent functions. These results reveal a critical role for epigenetic modifications in modulating brain functions, opening new insights into the molecular basis of neurological disorders.

    Molecular psychiatry 2020

  • Mechanisms of stretch-mediated skin expansion at single-cell resolution.

    Aragona M, Sifrim A, Malfait M, Song Y, Van Herck J, Dekoninck S, Gargouri S, Lapouge G, Swedlund B, Dubois C, Baatsen P, Vints K, Han S, Tissir F, Voet T, Simons BD and Blanpain C

    Laboratory of Stem Cells and Cancer, Université Libre de Bruxelles, Brussels, Belgium.

    The ability of the skin to grow in response to stretching has been exploited in reconstructive surgery<sup>1</sup>. Although the response of epidermal cells to stretching has been studied in vitro<sup>2,3</sup>, it remains unclear how mechanical forces affect their behaviour in vivo. Here we develop a mouse model in which the consequences of stretching on skin epidermis can be studied at single-cell resolution. Using a multidisciplinary approach that combines clonal analysis with quantitative modelling and single-cell RNA sequencing, we show that stretching induces skin expansion by creating a transient bias in the renewal activity of epidermal stem cells, while a second subpopulation of basal progenitors remains committed to differentiation. Transcriptional and chromatin profiling identifies how cell states and gene-regulatory networks are modulated by stretching. Using pharmacological inhibitors and mouse mutants, we define the step-by-step mechanisms that control stretch-mediated tissue expansion at single-cell resolution in vivo.

    Nature 2020

  • Evidence of human occupation in Mexico around the Last Glacial Maximum.

    Ardelean CF, Becerra-Valdivia L, Pedersen MW, Schwenninger JL, Oviatt CG, Macías-Quintero JI, Arroyo-Cabrales J, Sikora M, Ocampo-Díaz YZE, Rubio-Cisneros II, Watling JG, de Medeiros VB, De Oliveira PE, Barba-Pingarón L, Ortiz-Butrón A, Blancas-Vázquez J, Rivera-González I, Solís-Rosales C, Rodríguez-Ceja M, Gandy DA, Navarro-Gutierrez Z, De La Rosa-Díaz JJ, Huerta-Arellano V, Marroquín-Fernández MB, Martínez-Riojas LM, López-Jiménez A, Higham T and Willerslev E

    Unidad Académica de Antropología, Universidad Autónoma de Zacatecas, Zacatecas, Mexico.

    The initial colonization of the Americas remains a highly debated topic<sup>1</sup>, and the exact timing of the first arrivals is unknown. The earliest archaeological record of Mexico, which holds a key geographical position in the Americas, is poorly known and understudied. Historically, the region has remained on the periphery of research focused on the first American populations<sup>2</sup>. However, recent investigations provide reliable evidence of a human presence in the northwest region of Mexico<sup>3,4</sup>, the Chiapas Highlands<sup>5</sup>, Central Mexico<sup>6</sup> and the Caribbean coast<sup>7-9</sup> during the Late Pleistocene and Early Holocene epochs. Here we present results of recent excavations at Chiquihuite Cave, a high-altitude site in central-northern Mexico, that corroborate previous findings in the Americas<sup>10-17</sup> of cultural evidence that dates to the Last Glacial Maximum (26,500-19,000 years ago)<sup>18</sup>, and which push back dates for human dispersal to the region possibly as early as 33,000-31,000 years ago. The site yielded about 1,900 stone artefacts within a 3-m-deep stratified sequence, revealing a previously unknown lithic industry that underwent only minor changes over millennia. More than 50 radiocarbon and luminescence dates provide chronological control, and genetic, palaeoenvironmental and chemical data document the changing environments in which the occupants lived. Our results provide new evidence for the antiquity of humans in the Americas, illustrate the cultural diversity of the earliest dispersal groups (which predate those of the Clovis culture) and open new directions of research.

    Nature 2020

  • Type 1 Interferon Responses Underlie Tumor-Selective Replication of Oncolytic Measles Virus.

    Aref S, Castleton AZ, Bailey K, Burt R, Dey A, Leongamornlert D, Mitchell RJ, Okasha D and Fielding AK

    UCL Cancer Institute, London WC1E 6DD, UK.

    The mechanism of tumor-selective replication of oncolytic measles virus (MV) is poorly understood. Using a stepwise model of cellular transformation, in which oncogenic hits were additively expressed in human bone marrow-derived mesenchymal stromal cells, we show that MV-induced oncolysis increased progressively with transformation. The type 1 interferon (IFN) response to MV infection was significantly reduced and delayed, in accordance with the level of transformation. Consistently, we observed delayed and reduced signal transducer and activator of transcription (STAT1) phosphorylation in the fully transformed cells. Pre-treatment with IFNβ restored resistance to MV-mediated oncolysis. Gene expression profiling to identify the genetic correlates of susceptibility to MV oncolysis revealed a dampened basal level of immune-related genes in the fully transformed cells compared to their normal counterparts. IFN-induced transmembrane protein 1 (IFITM1) was the foremost basally downregulated immune gene. Stable IFITM1 overexpression in MV-susceptible cells resulted in a 50% increase in cell viability and a significant reduction in viral replication at 24 h after MV infection. Overall, our data indicate that the basal reduction in functions of the type 1 IFN pathway is a major contributor to the oncolytic selectivity of MV. In particular, we have identified IFITM1 as a restriction factor for oncolytic MV, acting at early stages of infection.

    Molecular therapy : the journal of the American Society of Gene Therapy 2020;28;4;1043-1055

  • MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data.

    Argelaguet R, Arnol D, Bredikhin D, Deloro Y, Velten B, Marioni JC and Stegle O

    European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK.

    Technological advances have enabled the profiling of multiple molecular layers at single-cell resolution, assaying cells from multiple samples or conditions. Consequently, there is a growing need for computational strategies to analyze data from complex experimental designs that include multiple data modalities and multiple groups of samples. We present Multi-Omics Factor Analysis v2 (MOFA+), a statistical framework for the comprehensive and scalable integration of single-cell multi-modal data. MOFA+ reconstructs a low-dimensional representation of the data using computationally efficient variational inference and supports flexible sparsity constraints, allowing variation to be modelled jointly across multiple sample groups and data modalities.

    Genome biology 2020;21;1;111

  • Integrating whole-genome sequencing within the National Antimicrobial Resistance Surveillance Program in the Philippines.

    Argimón S, Masim MAL, Gayeta JM, Lagrada ML, Macaranas PKV, Cohen V, Limas MT, Espiritu HO, Palarca JC, Chilam J, Jamoralin MC, Villamin AS, Borlasa JB, Olorosa AM, Hernandez LFT, Boehme KD, Jeffrey B, Abudahab K, Hufano CM, Sia SB, Stelling J, Holden MTG, Aanensen DM and Carlos CC

    Centre for Genomic Pathogen Surveillance, Wellcome Genome Campus, Hinxton, UK.

    National networks of laboratory-based surveillance of antimicrobial resistance (AMR) monitor resistance trends and disseminate these data to AMR stakeholders. Whole-genome sequencing (WGS) can support surveillance by pinpointing resistance mechanisms and uncovering transmission patterns. However, genomic surveillance is rare in low- and middle-income countries. Here, we implement WGS within the established Antimicrobial Resistance Surveillance Program of the Philippines via a binational collaboration. In parallel, we characterize bacterial populations of key bug-drug combinations via a retrospective sequencing survey. By linking the resistance phenotypes to genomic data, we reveal the interplay of genetic lineages (strains), AMR mechanisms, and AMR vehicles underlying the expansion of specific resistance phenotypes that coincide with the growing carbapenem resistance rates observed since 2010. Our results enhance our understanding of the drivers of carbapenem resistance in the Philippines, while also serving as the genetic background to contextualize ongoing local prospective surveillance.

    Funded by: Department of Health: 16_136_111; Medical Research Council: MR/N019296/1; NCI NIH HHS: U01 CA207167; NCRR NIH HHS: R01 RR025040

    Nature communications 2020;11;1;2719

  • gplas: a comprehensive tool for plasmid analysis using short-read graphs.

    Arredondo-Alonso S, Bootsma M, Hein Y, Rogers MRC, Corander J, Willems RJL and Schürch AC

    Department of Medical Microbiology, University Medical Center Utrecht, Utrecht University, 3584 CX Utrecht, The Netherlands.

    Summary: Plasmids can horizontally transmit genetic traits, enabling rapid bacterial adaptation to new environments and hosts. Short-read whole-genome sequencing data are often applied to large-scale bacterial comparative genomics projects but the reconstruction of plasmids from these data is facing severe limitations, such as the inability to distinguish plasmids from each other in a bacterial genome. We developed gplas, a new approach to reliably separate plasmid contigs into discrete components using sequence composition, coverage, assembly graph information and network partitioning based on a pruned network of plasmid unitigs. Gplas facilitates the analysis of large numbers of bacterial isolates and allows a detailed analysis of plasmid epidemiology based solely on short-read sequence data.

    Availability and implementation: Gplas is written in R and Bash and uses a Snakemake pipeline as a workflow management system. Gplas is available under the GNU General Public License v3.0.

    Supplementary information: Supplementary data are available at Bioinformatics online.

    Bioinformatics (Oxford, England) 2020;36;12;3874-3876

  • Plasmids Shaped the Recent Emergence of the Major Nosocomial Pathogen Enterococcus faecium.

    Arredondo-Alonso S, Top J, McNally A, Puranen S, Pesonen M, Pensar J, Marttinen P, Braat JC, Rogers MRC, van Schaik W, Kaski S, Willems RJL, Corander J and Schürch AC

    Department of Medical Microbiology, University Medical Center Utrecht, Utrecht, The Netherlands.

    <i>Enterococcus faecium</i> is a gut commensal of humans and animals but is also listed on the WHO global priority list of multidrug-resistant pathogens. Many of its antibiotic resistance traits reside on plasmids and have the potential to be disseminated by horizontal gene transfer. Here, we present the first comprehensive population-wide analysis of the pan-plasmidome of a clinically important bacterium, by whole-genome sequence analysis of 1,644 isolates from hospital, commensal, and animal sources of <i>E. faecium</i>. Long-read sequencing on a selection of isolates resulted in the completion of 305 plasmids that exhibited high levels of sequence modularity. We further investigated the entirety of all plasmids of each isolate (plasmidome) using a combination of short-read sequencing and machine-learning classifiers. Clustering of the plasmid sequences unraveled different <i>E. faecium</i> populations with a clear association with hospitalized patient isolates, suggesting different optimal configurations of plasmids in the hospital environment. The characterization of these populations allowed us to identify common mechanisms of plasmid stabilization such as toxin-antitoxin systems and genes exclusively present in particular plasmidome populations exemplified by copper resistance, phosphotransferase systems, or bacteriocin genes potentially involved in niche adaptation. Based on the distribution of k-mer distances between isolates, we concluded that plasmidomes rather than chromosomes are most informative for source specificity of <i>E. faecium</i>.

    <b>IMPORTANCE</b> <i>Enterococcus faecium</i> is one of the most frequent nosocomial pathogens of hospital-acquired infections. <i>E. faecium</i> has gained resistance against most commonly available antibiotics, most notably, against ampicillin, gentamicin, and vancomycin, which renders infections difficult to treat. Many antibiotic resistance traits, in particular, vancomycin resistance, can be encoded in autonomous and extrachromosomal elements called plasmids. These sequences can be disseminated to other isolates by horizontal gene transfer and confer novel mechanisms to source specificity. In our study, we elucidated the total plasmid content, referred to as the plasmidome, of 1,644 <i>E. faecium</i> isolates by using short- and long-read whole-genome technologies with the combination of a machine-learning classifier. This was fundamental to investigate the full collection of plasmid sequences present in our collection (pan-plasmidome) and to observe the potential transfer of plasmid sequences between <i>E. faecium</i> hosts. We observed that <i>E. faecium</i> isolates from hospitalized patients carried a larger number of plasmid sequences compared to that from other sources, and they elucidated different configurations of plasmidome populations in the hospital environment. We assessed the contribution of different genomic components and observed that plasmid sequences have the highest contribution to source specificity. Our study suggests that <i>E. faecium</i> plasmids are regulated by complex ecological constraints rather than physical interaction between hosts.

    mBio 2020;11;1

  • The secreted protease Adamts18 links hormone action to activation of the mammary stem cell niche.

    Ataca D, Aouad P, Constantin C, Laszlo C, Beleut M, Shamseddin M, Rajaram RD, Jeitziner R, Mead TJ, Caikovski M, Bucher P, Ambrosini G, Apte SS and Brisken C

    Ecole Polytechnique Fédérale de Lausanne, Station 19, CH-1015, Lausanne, Switzerland.

    Estrogens and progesterone control breast development and carcinogenesis via their cognate receptors expressed in a subset of luminal cells in the mammary epithelium. How they control the extracellular matrix, important to breast physiology and tumorigenesis, remains unclear. Here we report that both hormones induce the secreted protease Adamts18 in myoepithelial cells by controlling Wnt4 expression with consequent paracrine canonical Wnt signaling activation. Adamts18 is required for stem cell activation, has multiple binding partners in the basement membrane and interacts genetically with the basal membrane-specific proteoglycan, Col18a1, pointing to the basement membrane as part of the stem cell niche. In vitro, ADAMTS18 cleaves fibronectin; in vivo, Adamts18 deletion causes increased collagen deposition during puberty, which results in impaired Hippo signaling and reduced Fgfr2 expression both of which control stem cell function. Thus, Adamts18 links luminal hormone receptor signaling to basement membrane remodeling and stem cell activation.

    Funded by: Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (Swiss National Science Foundation): 31003A_141248

    Nature communications 2020;11;1;1571

  • KAT7 is a genetic vulnerability of acute myeloid leukemias driven by MLL rearrangements.

    Au YZ, Gu M, De Braekeleer E, Gozdecka M, Aspris D, Tarumoto Y, Cooper J, Yu J, Ong SH, Chen X, Tzelepis K, Huntly BJP, Vassiliou G and Yusa K

    Stem Cell Genetics, Wellcome Sanger Institute, Hinxton, Cambridge, UK.

    Histone acetyltransferases (HATs) catalyze the transfer of an acetyl group from acetyl-CoA to lysine residues of histones and play a central role in transcriptional regulation in diverse biological processes. Dysregulation of HAT activity can lead to human diseases including developmental disorders and cancer. Through genome-wide CRISPR-Cas9 screens, we identified several HATs of the MYST family as fitness genes for acute myeloid leukemia (AML). Here we investigate the essentiality of lysine acetyltransferase KAT7 in AMLs driven by the MLL-X gene fusions. We found that KAT7 loss leads to a rapid and complete loss of both H3K14ac and H4K12ac marks, in association with reduced proliferation, increased apoptosis, and differentiation of AML cells. Acetyltransferase activity of KAT7 is essential for the proliferation of these cells. Mechanistically, our data propose that acetylated histones provide a platform for the recruitment of MLL-fusion-associated adaptor proteins such as BRD4 and AF4 to gene promoters. Upon KAT7 loss, these factors together with RNA polymerase II rapidly dissociate from several MLL-fusion target genes that are essential for AML cell proliferation, including MEIS1, PBX3, and SENP6. Our findings reveal that KAT7 is a plausible therapeutic target for this poor prognosis AML subtype.

    Funded by: Bloodwise: 17006; Cancer Research UK (CRUK): C22324/A23015; Kay Kendall Leukaemia Fund (KKLF): KKL920; Wellcome Trust (Wellcome): RG94424, WT206194

    Leukemia 2020

  • Genomic and transcriptomic evidence for descent from Plasmodium and loss of blood schizogony in Hepatocystis parasites from naturally infected red colobus monkeys.

    Aunin E, Böhme U, Sanderson T, Simons ND, Goldberg TL, Ting N, Chapman CA, Newbold CI, Berriman M and Reid AJ

    Parasite Genomics, Wellcome Sanger Institute, Hinxton, Cambridge, United Kingdom.

    Hepatocystis is a genus of single-celled parasites infecting, amongst other hosts, monkeys, bats and squirrels. Although thought to have descended from malaria parasites (Plasmodium spp.), Hepatocystis spp. are thought not to undergo replication in the blood, the part of the Plasmodium life cycle that causes the symptoms of malaria. Furthermore, Hepatocystis is transmitted by biting midges, not mosquitoes. Comparative genomics of Hepatocystis and Plasmodium species therefore presents an opportunity to better understand some of the most important aspects of malaria parasite biology. We were able to generate a draft genome for Hepatocystis sp. using DNA sequencing reads from the blood of a naturally infected red colobus monkey. We provide robust phylogenetic support for Hepatocystis sp. as a sister group to Plasmodium parasites infecting rodents. We show transcriptomic support for a lack of replication in the blood and genomic support for a complete loss of a family of genes involved in red blood cell invasion. Our analyses highlight the rapid evolution of genes involved in parasite vector stages, revealing genes that may be critical for interactions between malaria parasites and mosquitoes.

    Funded by: Medical Research Council: MR/M003906/1; Wellcome Trust: 104792/Z/14/Z, 206194/Z/17/Z, 210918/Z/18/Z

    PLoS pathogens 2020;16;8;e1008717

  • Identification of Intrinsic Drug Resistance and Its Biomarkers in High-Throughput Pharmacogenomic and CRISPR Screens.

    Ayestaran I, Galhoz A, Spiegel E, Sidders B, Dry JR, Dondelinger F, Bender A, McDermott U, Iorio F and Menden MP

    Institute of Computational Biology, Helmholtz Zentrum München GmbH-German Research Center for Environmental Health, Neuherberg 85764, Germany.

    High-throughput drug screens in cancer cell lines test compounds at low concentrations, thereby enabling the identification of drug-sensitivity biomarkers, while resistance biomarkers remain underexplored. Dissecting meaningful drug responses at high concentrations is challenging due to cytotoxicity, i.e., off-target effects, thus limiting resistance biomarker discovery to frequently mutated cancer genes. To address this, we interrogate subpopulations carrying sensitivity biomarkers and consecutively investigate unexpectedly resistant (UNRES) cell lines for unique genetic alterations that may drive resistance. By analyzing the GDSC and CTRP datasets, we find 53 and 35 UNRES cases, respectively. For 24 and 28 of them, we highlight putative resistance biomarkers. We find clinically relevant cases such as EGFR<sup>T790M</sup> mutation in NCI-H1975 or PTEN loss in NCI-H1650 cells, in lung adenocarcinoma treated with EGFR inhibitors. Interrogating the underpinnings of drug resistance with publicly available CRISPR phenotypic assays assists in prioritizing resistance drivers, offering hypotheses for drug combinations.

    Patterns (New York, N.Y.) 2020;1;5;100065

  • Frequency-dependent selection can forecast evolution in Streptococcus pneumoniae.

    Azarian T, Martinez PP, Arnold BJ, Qiu X, Grant LR, Corander J, Fraser C, Croucher NJ, Hammitt LL, Reid R, Santosham M, Weatherholtz RC, Bentley SD, O'Brien KL, Lipsitch M and Hanage WP

    Burnett School of Biomedical Sciences, University of Central Florida, Orlando, Florida, United States of America.

    Predicting how pathogen populations will change over time is challenging. Such has been the case with Streptococcus pneumoniae, an important human pathogen, and the pneumococcal conjugate vaccines (PCVs), which target only a fraction of the strains in the population. Here, we use the frequencies of accessory genes to predict changes in the pneumococcal population after vaccination, hypothesizing that these frequencies reflect negative frequency-dependent selection (NFDS) on the gene products. We find that the standardized predicted fitness of a strain, estimated by an NFDS-based model at the time the vaccine is introduced, enables us to predict whether the strain increases or decreases in prevalence following vaccination. Further, we are able to forecast the equilibrium post-vaccine population composition and assess the invasion capacity of emerging lineages. Overall, we provide a method for predicting the impact of an intervention on pneumococcal populations with potential application to other bacterial pathogens in which NFDS is a driving force.

    PLoS biology 2020;18;10;e3000878

  • Functional signatures of evolutionarily young CTCF binding sites.

    Azazi D, Mudge JM, Odom DT and Flicek P

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

    Background: The introduction of novel CTCF binding sites in gene regulatory regions in the rodent lineage is partly the effect of transposable element expansion, particularly in the murine lineage. The exact mechanism and functional impact of evolutionarily novel CTCF binding sites are not yet fully understood. We investigated the impact of novel subspecies-specific CTCF binding sites in two Mus genus subspecies, Mus musculus domesticus and Mus musculus castaneus, that diverged 0.5 million years ago.

    Results: CTCF binding site evolution is influenced by the action of the B2-B4 family of transposable elements independently in both lineages, leading to the proliferation of novel CTCF binding sites. A subset of evolutionarily young sites may harbour transcriptional functionality as evidenced by the stability of their binding across multiple tissues in M. musculus domesticus (BL6), while overall the distance of subspecies-specific CTCF binding to the nearest transcription start sites and/or topologically associated domains (TADs) is largely similar to musculus-common CTCF sites. Remarkably, we discovered a recurrent regulatory architecture consisting of a CTCF binding site and an interferon gene that appears to have been tandemly duplicated to create a 15-gene cluster on chromosome 4, thus forming a novel BL6 specific immune locus in which CTCF may play a regulatory role.

    Conclusions: Our results demonstrate that thousands of CTCF binding sites show multiple functional signatures rapidly after incorporation into the genome.

    Funded by: Cancer Research UK: 20412; H2020 European Research Council: 615584; NHGRI NIH HHS: U41HG007234; Wellcome Trust: WT108749/Z/15/Z, WT202878/B/16/Z, WT202878/Z/16/Z

    BMC biology 2020;18;1;132

  • Adipose Tissue-Liver Cross Talk in the Control of Whole-Body Metabolism: Implications in Nonalcoholic Fatty Liver Disease.

    Azzu V, Vacca M, Virtue S, Allison M and Vidal-Puig A

    Wellcome Trust-Medical Research Council Institute of Metabolic Science-Metabolic Research Laboratories, Addenbrooke's Hospital; The Liver Unit, Department of Medicine, Cambridge University Hospitals National Health Service Foundation Trust, Cambridge Biomedical Campus, Hills Road, Cambridge.

    Adipose tissue and the liver play significant roles in the regulation of whole-body energy homeostasis, but they have not evolved to cope with the continuous, chronic, nutrient surplus seen in obesity. In this review, we detail how prolonged metabolic stress leads to adipose tissue dysfunction, inflammation, and adipokine release that results in increased lipid flux to the liver. Overall, the upshot of hepatic fat accumulation alongside an insulin-resistant state is that hepatic lipid enzymatic pathways are modulated and overwhelmed, resulting in the selective buildup of toxic lipid species, which worsens the pro-inflammatory and pro-fibrotic shift observed in nonalcoholic steatohepatitis.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/H002731/1; Medical Research Council: G0400192, G0600717, G0802051, MC_G0802535, MC_UU_00014/2, MC_UU_12012/2, MC_UU_12012/5

    Gastroenterology 2020;158;7;1899-1912

  • Multi-locus genotyping reveals established endemicity of a geographically distinct Plasmodium vivax population in Mauritania, West Africa.

    Ba H, Auburn S, Jacob CG, Goncalves S, Duffy CW, Stewart LB, Price RN, Deh YB, Diallo MY, Tandia A, Kwiatkowski DP and Conway DJ

    Institut National de Recherche en Santé Publique, Nouakchott, Mauritania.

    Background: Plasmodium vivax has been recently discovered as a significant cause of malaria in Mauritania, although very rare elsewhere in West Africa. It has not been known if this is a recently introduced or locally remnant parasite population, nor whether the genetic structure reflects epidemic or endemic transmission.

    Methodology/principal findings: To investigate the P. vivax population genetic structure in Mauritania and compare with populations previously analysed elsewhere, multi-locus genotyping was undertaken on 100 clinical isolates, using a genome-wide panel of 38 single nucleotide polymorphisms (SNPs), plus seven SNPs in drug resistance genes. The Mauritanian P. vivax population is shown to be genetically diverse and divergent from populations elsewhere, indicated consistently by genetic distance matrix analysis, principal components analyses, and fixation indices. Only one isolate had a genotype clearly indicating recent importation, from a southeast Asian source. There was no linkage disequilibrium in the local parasite population, and only a small number of infections appeared to be closely genetically related, indicating that there is ongoing genetic recombination consistent with endemic transmission. The P. vivax diversity in a remote mining town was similar to that in the capital Nouakchott, with no indication of local substructure or of epidemic population structure. Drug resistance alleles were virtually absent in Mauritania, in contrast with P. vivax in other areas of the world.

    Conclusions/significance: The molecular epidemiology indicates that there is long-standing endemic transmission that will be very challenging to eliminate. The virtual absence of drug resistance alleles suggests that most infections have been untreated, and that this endemic infection has been more neglected in comparison to P. vivax elsewhere.

    PLoS neglected tropical diseases 2020;14;12;e0008945

  • Microarray analyses reveal strain-specific antibody responses to Plasmodium falciparum apical membrane antigen 1 variants following natural infection and vaccination.

    Bailey JA, Berry AA, Travassos MA, Ouattara A, Boudova S, Dotsey EY, Pike A, Jacob CG, Adams M, Tan JC, Bannen RM, Patel JJ, Pablo J, Nakajima R, Jasinskas A, Dutta S, Takala-Harrison S, Lyke KE, Laurens MB, Niangaly A, Coulibaly D, Kouriba B, Doumbo OK, Thera MA, Felgner PL and Plowe CV

    Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA.

    Vaccines based on Plasmodium falciparum apical membrane antigen 1 (AMA1) have failed due to extensive polymorphism in AMA1. To assess the strain-specificity of antibody responses to malaria infection and AMA1 vaccination, we designed protein and peptide microarrays representing hundreds of unique AMA1 variants. Following clinical malaria episodes, children had short-lived, sequence-independent increases in average whole-protein seroreactivity, as well as strain-specific responses to peptides representing diverse epitopes. Vaccination resulted in dramatically increased seroreactivity to all 263 AMA1 whole-protein variants. High-density peptide analysis revealed that vaccinated children had increases in seroreactivity to four distinct epitopes that exceeded responses to natural infection. A single amino acid change was critical to seroreactivity to peptides in a region of AMA1 associated with strain-specific vaccine efficacy. Antibody measurements using whole antigens may be biased towards conserved, immunodominant epitopes. Peptide microarrays may help to identify immunogenic epitopes, define correlates of vaccine protection, and measure strain-specific vaccine-induced antibodies.

    Funded by: Division of Intramural Research, National Institute of Allergy and Infectious Diseases (Division of Intramural Research of the NIAID): R21AI119733; NIAID NIH HHS: K23 AI125720, R01 AI093635, R21 AI119733, T32 AI007524, U01 AI065683, U19 AI129386; U.S. Department of Health & Human Services | NIH | Fogarty International Center (FIC): D43TW001589; U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI): R01HL130750; U.S. Department of Health & Human Services | National Institutes of Health (NIH): R01AI093635, U01AI065683

    Scientific reports 2020;10;1;3952

  • AZD4320, A Dual Inhibitor of Bcl-2 and Bcl-xL, Induces Tumor Regression in Hematologic Cancer Models without Dose-limiting Thrombocytopenia.

    Balachander SB, Criscione SW, Byth KF, Cidado J, Adam A, Lewis P, Macintyre T, Wen S, Lawson D, Burke K, Lubinski T, Tyner JW, Kurtz SE, McWeeney SK, Varnes J, Diebold RB, Gero T, Ioannidis S, Hennessy EJ, McCoull W, Saeh JC, Tabatabai A, Tavana O, Su N, Schuller A, Garnett MJ, Jaaks P, Coker EA, Gregory GP, Newbold A, Johnstone RW, Gangl E, Wild M, Zinda M, Secrist JP, Davies BR, Fawell SE and Gibbons FD

    Bioscience, Oncology R&D, AstraZeneca, Boston, Massachusetts.

    Purpose: Targeting Bcl-2 family members upregulated in multiple cancers has emerged as an important area of cancer therapeutics. While venetoclax, a Bcl-2-selective inhibitor, has had success in the clinic, another family member, Bcl-x<sub>L</sub>, has also emerged as an important target and as a mechanism of resistance. Therefore, we developed a dual Bcl-2/Bcl-x<sub>L</sub> inhibitor that broadens the therapeutic activity while minimizing Bcl-x<sub>L</sub>-mediated thrombocytopenia.

    Experimental design: We used structure-based chemistry to design a small-molecule inhibitor of Bcl-2 and Bcl-x<sub>L</sub> and assessed the activity against <i>in vitro</i> cell lines, patient samples, and <i>in vivo</i> models. We applied pharmacokinetic/pharmacodynamic (PK/PD) modeling to integrate our understanding of on-target activity of the dual inhibitor in tumors and platelets across dose levels and over time.

    Results: We discovered AZD4320, which has nanomolar affinity for Bcl-2 and Bcl-x<sub>L</sub>, and mechanistically drives cell death through the mitochondrial apoptotic pathway. AZD4320 demonstrates activity in both Bcl-2- and Bcl-x<sub>L</sub>-dependent hematologic cancer cell lines and enhanced activity in acute myeloid leukemia (AML) patient samples compared with the Bcl-2-selective agent venetoclax. A single intravenous bolus dose of AZD4320 induces tumor regression with transient thrombocytopenia, which recovers in less than a week, suggesting a clinical weekly schedule would enable targeting of Bcl-2/Bcl-x<sub>L</sub>-dependent tumors without incurring dose-limiting thrombocytopenia. AZD4320 demonstrates monotherapy activity in patient-derived AML and venetoclax-resistant xenograft models.

    Conclusions: AZD4320 is a potent molecule with manageable thrombocytopenia risk to explore the utility of a dual Bcl-2/Bcl-x<sub>L</sub> inhibitor across a broad range of tumor types with dysregulation of Bcl-2 prosurvival proteins.

    Clinical cancer research : an official journal of the American Association for Cancer Research 2020;26;24;6535-6549

  • NRASQ61K melanoma tumor formation is reduced by p38-MAPK14 activation in zebrafish models and NRAS-mutated human melanoma cells.

    Banik I, Cheng PF, Dooley CM, Travnickova J, Merteroglu M, Dummer R, Patton EE, Busch-Nentwich EM and Levesque MP

    University Hospital Zurich, University of Zurich, Zurich, Switzerland.

    Oncogenic BRAF and NRAS mutations drive human melanoma initiation. We used transgenic zebrafish to model NRAS mutant melanoma and the rapid tumor onset allowed us to study candidate tumor suppressors. We identified P38α-MAPK14 as a potential tumor suppressor in The Cancer Genome Atlas melanoma cohort of NRAS mutant melanomas, and overexpression significantly increased the time to tumor onset in transgenic zebrafish with NRAS-driven melanoma. Pharmacological activation of P38α-MAPK14 using anisomycin reduced in vitro viability of melanoma cultures, which we confirmed by stable overexpression of p38α. We observed that the viability of MEK-inhibitor resistant melanoma cells could be reduced by combined treatment of anisomycin and MEK-inhibition. Our study demonstrates that activating the p38α-MAPK14 pathway in the presence of oncogenic NRAS abrogates melanoma in vitro and in vivo.

    Pigment cell & melanoma research 2020

  • Ageing compromises mouse thymus function and remodels epithelial cell differentiation.

    Baran-Gale J, Morgan MD, Maio S, Dhalla F, Calvo-Asensio I, Deadman ME, Handel AE, Maynard A, Chen S, Green F, Sit RV, Neff NF, Darmanis S, Tan W, May AP, Marioni JC, Ponting CP and Holländer GA

    MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, United Kingdom.

    Ageing is characterised by cellular senescence, leading to imbalanced tissue maintenance, cell death and compromised organ function. This is first observed in the thymus, the primary lymphoid organ that generates and selects T cells. However, the molecular and cellular mechanisms underpinning these ageing processes remain unclear. Here, we show that mouse ageing leads to less efficient T cell selection, decreased self-antigen representation and increased T cell receptor repertoire diversity. Using a combination of single-cell RNA-seq and lineage-tracing, we find that progenitor cells are the principal targets of ageing, whereas the function of individual mature thymic epithelial cells is compromised only modestly. Specifically, an early-life precursor cell population, retained in the mouse cortex postnatally, is virtually extinguished at puberty. Concomitantly, a medullary precursor cell quiesces, thereby impairing maintenance of the medullary epithelium. Thus, ageing disrupts thymic progenitor differentiation and impairs the core immunological functions of the thymus.

    Funded by: European Molecular Biology Laboratory: 17197; Medical Research Council: MC_UU_00007/15; Swiss National Science Foundation: 310030_184672, IZLJZ3_171050; Wellcome: 105045/Z/14/Z, 109032/Z/15/Z

    eLife 2020;9

  • Expert curation of the human and mouse olfactory receptor gene repertoires identifies conserved coding regions split across two exons.

    Barnes IHA, Ibarra-Soria X, Fitzgerald S, Gonzalez JM, Davidson C, Hardy MP, Manthravadi D, Van Gerven L, Jorissen M, Zeng Z, Khan M, Mombaerts P, Harrow J, Logan DW and Frankish A

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

    Background: Olfactory receptor (OR) genes are the largest multi-gene family in the mammalian genome, with 874 in human and 1483 loci in mouse (including pseudogenes). The expansion of the OR gene repertoire has occurred through numerous duplication events followed by diversification, resulting in a large number of highly similar paralogous genes. These characteristics have made the annotation of the complete OR gene repertoire a complex task. Most OR genes have been predicted in silico and are typically annotated as intronless coding sequences.

    Results: Here we have developed an expert curation pipeline to analyse and annotate every OR gene in the human and mouse reference genomes. By combining evidence from structural features, evolutionary conservation and experimental data, we have unified the annotation of these gene families, and have systematically determined the protein-coding potential of each locus. We have defined the non-coding regions of many OR genes, enabling us to generate full-length transcript models. We found that 13 human and 41 mouse OR loci have coding sequences that are split across two exons. These split OR genes are conserved across mammals, and are expressed at the same level as protein-coding OR genes with an intronless coding region. Our findings challenge the long-standing and widespread notion that the coding region of a vertebrate OR gene is contained within a single exon.

    Conclusions: This work provides the most comprehensive curation effort of the human and mouse OR gene repertoires to date. The complete annotation has been integrated into the GENCODE reference gene set, for immediate availability to the research community.

    Funded by: NHGRI NIH HHS: 2U41HG007234, U41 HG007234

    BMC genomics 2020;21;1;196

  • Sociodemographic inequities associated with participation in leisure-time physical activity in sub-Saharan Africa: an individual participant data meta-analysis.

    Barr AL, Partap U, Young EH, Agoudavi K, Balde N, Kagaruki GB, Mayige MT, Longo-Mbenza B, Mutungi G, Mwalim O, Wesseh CS, Bahendeka SK, Guwatudde D, Jørgensen JMA, Bovet P, Motala AA and Sandhu MS

    Department of Medicine, University of Cambridge, Cambridge, UK.

    Background: Leisure-time physical activity (LTPA) is an important contributor to total physical activity and the focus of many interventions promoting activity in high-income populations. Little is known about LTPA in sub-Saharan Africa (SSA), and with expected declines in physical activity due to rapid urbanisation and lifestyle changes we aimed to assess the sociodemographic differences in the prevalence of LTPA in the adult populations of this region to identify potential barriers for equitable participation.

    Methods: A two-step individual participant data meta-analysis was conducted using data collected in SSA through 10 population health surveys that included the Global Physical Activity Questionnaire. For each sociodemographic characteristic, the pooled adjusted prevalence and risk ratios (RRs) for participation in LTPA were calculated using the random effects method. Between-study heterogeneity was explored through meta-regression analyses and tests for interaction.

    Results: Across the 10 populations (N = 26,022), 18.9% (95%CI: 14.3, 24.1; I<sup>2</sup> = 99.0%) of adults (≥ 18 years) participated in LTPA. Men were more likely to participate in LTPA compared with women (RR for women: 0.43; 95%CI: 0.32, 0.60; P < 0.001; I<sup>2</sup> = 97.5%), while age was inversely associated with participation. Higher levels of education were associated with increased LTPA participation (RR: 1.30; 95%CI: 1.09, 1.55; P = 0.004; I<sup>2</sup> = 98.1%), with those living in rural areas or self-employed less likely to participate in LTPA. These associations remained after adjusting for time spent physically active at work or through active travel.

    Conclusions: In these populations, participation in LTPA was low, and strongly associated with sex, age, education, self-employment and urban residence. Identifying the potential barriers that reduce participation in these groups is necessary to enable equitable access to the health and social benefits associated with LTPA.

    Funded by: Medical Research Council: MR/K013491/1; Medical Research Foundation: MR/K013491/1; Wellcome Trust: WT206194

    BMC public health 2020;20;1;927

  • Nosocomial outbreak of the Middle East Respiratory Syndrome coronavirus: A phylogenetic, epidemiological, clinical and infection control analysis.

    Barry M, Phan MV, Akkielah L, Al-Majed F, Alhetheel A, Somily A, Alsubaie SS, McNabb SJ, Cotten M, Zumla A and Memish ZA

    Infectious Diseases Division, Faculty of Medicine, King Khalid University Hospital, King Saud University, Riyadh, Saudi Arabia.

    Background: Middle East Respiratory Syndrome coronavirus (MERS-CoV) continues to cause intermittent community and nosocomial outbreaks. Obtaining data on specific source(s) and transmission dynamics of MERS-CoV during nosocomial outbreaks has been challenging. We performed a clinical, epidemiological and phylogenetic investigation of an outbreak of MERS-CoV at a University Hospital in Riyadh, Kingdom of Saudi Arabia.

    Methods: Clinical, epidemiological and infection control data were obtained from patients and healthcare workers (HCWs). Full genome sequencing was conducted on nucleic acid extracted directly from MERS-CoV PCR-confirmed clinical samples, and phylogenetic analysis was performed in combination with published MERS-CoV genomes. HCW compliance with infection control practices was also assessed.

    Results: Of 235 persons investigated, there were 23 laboratory-confirmed MERS cases: 10 inpatients and 13 HCWs. Eight of 10 MERS inpatients died (80% mortality). There were no deaths among HCWs. The primary index case assumed from epidemiological investigation was not substantiated phylogenetically. 17 of 18 MERS cases were linked both phylogenetically and epidemiologically. One asymptomatic HCW yielded a MERS-CoV genome not directly linked to any other case in the investigation. Five HCWs with mild symptoms yielded >75% full MERS-CoV genome sequences. HCW compliance with use of gowns was 62.1%, gloves 69.7%, and masks 57.6%.

    Conclusions: Several factors and sources, including an HCW MERS-CoV 'carrier phenomenon', occur during nosocomial MERS-CoV outbreaks. Phylogenetic analyses of MERS-CoV linked to clinical and epidemiological information are essential for outbreak investigation. The specific role of apparently healthy HCWs in causing nosocomial outbreaks requires further definition.

    Travel medicine and infectious disease 2020;101807

  • Mouse Models of Myeloid Malignancies.

    Basheer F and Vassiliou G

    Wellcome-MRC Cambridge Stem Cell Institute, Jeffrey Cheah Biomedical Centre, Department of Haematology, University of Cambridge, Cambridge CB2 0AW, United Kingdom.

    Mouse models of human myeloid malignancies support the detailed and focused investigation of selected driver mutations and represent powerful tools in the study of these diseases. Carefully developed murine models can closely recapitulate human myeloid malignancies in vivo, enabling the interrogation of a number of aspects of these diseases including their preclinical course, interactions with the microenvironment, effects of pharmacological agents, and the role of non-cell-autonomous factors, as well as the synergy between co-occurring mutations. Importantly, advances in gene-editing technologies, particularly CRISPR-Cas9, have opened new avenues for the development and study of genetically modified mice and also enable the direct modification of mouse and human hematopoietic cells. In this review we provide a concise overview of some of the important mouse models that have advanced our understanding of myeloid leukemogenesis with an emphasis on models relevant to clonal hematopoiesis, myelodysplastic syndromes, and acute myeloid leukemia with a normal karyotype.

    Cold Spring Harbor perspectives in medicine 2020

  • Dissecting the early steps of MLL induced leukaemogenic transformation using a mouse model of AML.

    Basilico S, Wang X, Kennedy A, Tzelepis K, Giotopoulos G, Kinston SJ, Quiros PM, Wong K, Adams DJ, Carnevalli LS, Huntly BJP, Vassiliou GS, Calero-Nieto FJ and Göttgens B

    Wellcome and MRC Cambridge Stem Cell Institute and University of Cambridge Department of Haematology, Jeffrey Cheah Biomedical Centre, Puddicombe Way, Cambridge, CB2 0AW, UK.

    Leukaemogenic mutations commonly disrupt cellular differentiation and/or enhance proliferation, thus perturbing the regulatory programs that control self-renewal and differentiation of stem and progenitor cells. Translocations involving the Mll1 (Kmt2a) gene generate powerful oncogenic fusion proteins, predominantly affecting infant and paediatric AML and ALL patients. The early stages of leukaemogenic transformation are typically inaccessible from human patients and conventional mouse models. Here, we take advantage of cells conditionally blocked at the multipotent haematopoietic progenitor stage to develop an MLL-r model capturing early cellular and molecular consequences of MLL-ENL expression based on a clear clonal relationship between parental and leukaemic cells. Through a combination of scRNA-seq, ATAC-seq and genome-scale CRISPR-Cas9 screening, we identify pathways and genes likely to drive the early phases of leukaemogenesis. Finally, we demonstrate the broad utility of using matched parental and transformed cells for small molecule inhibitor studies by validating both previously known and other potential therapeutic targets.

    Nature communications 2020;11;1;1407

  • Acral lentiginous melanoma: Basic facts, biological characteristics and research perspectives of an understudied disease.

    Basurto-Lozada P, Molina-Aguilar C, Castañeda-Garcia C, Vázquez-Cruz ME, Garcia-Salinas OI, Álvarez-Cano A, Martínez-Said H, Roldán-Marín R, Adams DJ, Possik PA and Robles-Espinoza CD

    Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Campus Juriquilla, Blvd Juriquilla 3001, Santiago de Querétaro, 76230, México.

    Acral lentiginous melanoma is a histological subtype of cutaneous melanoma that occurs in the glabrous skin of the palms, soles and in the nail unit. Although in some countries, particularly in Latin America, Africa and Asia, it represents the most frequently diagnosed subtype of the disease, it only represents a small proportion of melanoma cases in European-descent populations, which is partially why it has not been studied to the same extent as other forms of melanoma. As a result, its unique genomic drivers remain comparatively poorly explored, as well as its causes, with current evidence supporting a UV-independent path to tumorigenesis. In this Review, we discuss current knowledge of the aetiology and diagnostic criteria of acral lentiginous melanoma, as well as its epidemiological and histopathological characteristics. We also describe what is known about the genomic landscape of this disease and review the available biological models to explore potential therapeutic targets.

    Pigment cell & melanoma research 2020

  • Identification of region-specific astrocyte subtypes at single cell resolution.

    Batiuk MY, Martirosyan A, Wahis J, de Vin F, Marneffe C, Kusserow C, Koeppen J, Viana JF, Oliveira JF, Voet T, Ponting CP, Belgard TG and Holt MG

    Laboratory of Glia Biology, VIB-KU Leuven Center for Brain and Disease Research, Leuven, Belgium.

    Astrocytes, a major cell type found throughout the central nervous system, have general roles in the modulation of synapse formation and synaptic transmission, blood-brain barrier formation, and regulation of blood flow, as well as metabolic support of other brain resident cells. Crucially, emerging evidence shows specific adaptations and astrocyte-encoded functions in regions, such as the spinal cord and cerebellum. To investigate the true extent of astrocyte molecular diversity across forebrain regions, we used single-cell RNA sequencing. Our analysis identifies five transcriptomically distinct astrocyte subtypes in adult mouse cortex and hippocampus. Validation of our data in situ reveals distinct spatial positioning of defined subtypes, reflecting the distribution of morphologically and physiologically distinct astrocyte populations. Our findings are evidence for specialized astrocyte subtypes between and within brain regions. The data are available through an online database, providing a resource on which to base explorations of local astrocyte diversity and function in the brain.

    Nature communications 2020;11;1;1220

  • Evolution of Salmonella enterica serotype Typhimurium driven by anthropogenic selection and niche adaptation.

    Bawn M, Alikhan NF, Thilliez G, Kirkwood M, Wheeler NE, Petrovska L, Dallman TJ, Adriaenssens EM, Hall N and Kingsley RA

    Quadram Institute Biosciences, Norwich Research Park, Norwich, United Kingdom.

    Salmonella enterica serotype Typhimurium (S. Typhimurium) is a leading cause of gastroenteritis and bacteraemia worldwide, and a model organism for the study of host-pathogen interactions. Two S. Typhimurium strains (SL1344 and ATCC14028) are widely used to study host-pathogen interactions, yet genotypic variation results in strains with diverse host range, pathogenicity and risk to food safety. The population structure of diverse strains of S. Typhimurium revealed a major phylogroup of predominantly sequence type 19 (ST19) and minor of ST36. The major phylogroup had a population structure with two high order clades (α and β) and multiple subclades on extended internal branches, that exhibited distinct signatures of host adaptation and anthropogenic selection. Clade α contained a number of subclades composed of strains from well characterized epidemics in domesticated animals, while clade β contained multiple subclades associated with wild avian species. The contrasting epidemiology of strains in clade α and β was reflected by the distinct distribution of antimicrobial resistance (AMR) genes, accumulation of hypothetically disrupted coding sequences (HDCS), and signatures of functional diversification. These observations were consistent with elevated anthropogenic selection of clade α lineages from adaptation to circulation in populations of domesticated livestock, and the predisposition of clade β lineages to undergo adaptation to an invasive lifestyle by a process of convergent evolution with host-adapted Salmonella serotypes. Gene flux was predominantly driven by acquisition and recombination of prophage and associated cargo genes, with only occasional loss of these elements. The acquisition of large chromosomally-encoded genetic islands was limited, but notably, a feature of two recent pandemic clones (DT104 and monophasic S. Typhimurium ST34) of clade α (SGI-1 and SGI-4).

    PLoS genetics 2020;16;6;e1008850

  • Astrocyte layers in the mammalian cerebral cortex revealed by a single-cell in situ transcriptomic map.

    Bayraktar OA, Bartels T, Holmqvist S, Kleshchevnikov V, Martirosyan A, Polioudakis D, Ben Haim L, Young AMH, Batiuk MY, Prakash K, Brown A, Roberts K, Paredes MF, Kawaguchi R, Stockley JH, Sabeur K, Chang SM, Huang E, Hutchinson P, Ullian EM, Hemberg M, Coppola G, Holt MG, Geschwind DH and Rowitch DH

    Department of Paediatrics, Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK.

    Although the cerebral cortex is organized into six excitatory neuronal layers, it is unclear whether glial cells show distinct layering. In the present study, we developed a high-content pipeline, the large-area spatial transcriptomic (LaST) map, which can quantify single-cell gene expression in situ. Screening 46 candidate genes for astrocyte diversity across the mouse cortex, we identified superficial, mid and deep astrocyte identities in gradient layer patterns that were distinct from those of neurons. Astrocyte layer features, established in the early postnatal cortex, mostly persisted in adult mouse and human cortex. Single-cell RNA sequencing and spatial reconstruction analysis further confirmed the presence of astrocyte layers in the adult cortex. Satb2 and Reeler mutations that shifted neuronal post-mitotic development were sufficient to alter glial layering, indicating an instructive role for neuronal cues. Finally, astrocyte layer patterns diverged between mouse cortical regions. These findings indicate that excitatory neurons and astrocytes are organized into distinct lineage-associated laminae.

    Funded by: Department of Health; Howard Hughes Medical Institute; NIMH NIH HHS: R01 MH109912, U01 MH105991; NINDS NIH HHS: P30 NS062691; Wellcome Trust: 108139

    Nature neuroscience 2020;23;4;500-509

  • Archaeogenetics: What Can Ancient Genomes Tell Us about the Origin of Syphilis?

    Beale MA and Lukehart SA

    Parasites and Microbes Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    The origin of syphilis has been hotly debated for decades. Ancient pathogen DNA may provide new evidence to redefine our understanding of this mystery, but is the mystery itself flawed in its assumptions?

    Current biology : CB 2020;30;19;R1092-R1095

  • Defining the clinical genomic landscape for real-world precision oncology.

    Beer PA, Cooke SL, Chang DK and Biankin AV

    Sanger Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SA, United Kingdom; Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow, Garscube Estate, Switchback Road, Bearsden, Glasgow, Scotland G61 1QH, United Kingdom.

    Through the delivery of large international projects including ICGC and TCGA, knowledge of cancer genomics is reaching saturation point. Enabling this to improve patient outcomes now requires embedding comprehensive genomic profiling into routine oncology practice. Towards this goal, this study defined the biologically and clinically relevant genomic features of adult cancer through detailed curation and analysis of large genomic datasets, accumulated literature and biomarker-driven therapeutics in clinic and development. The characteristics and prevalence of these features were then interrogated in 2348 whole genome sequences, covering 21 solid tumour types, generated by the PCAWG project. This analysis highlights the predominant contribution of copy number alterations and identifies a critical role for disruptive structural variants in the inactivation of clinically important tumour suppressor genes, including PTEN and RB1, which are not currently captured by diagnostic assays. This study defines a set of essential genomic features for the characterisation of common adult cancers.

    Genomics 2020

  • Fundamental differences in physiology of Bordetella pertussis dependent on the two-component system Bvg revealed by gene essentiality studies.

    Belcher T, MacArthur I, King JD, Langridge GC, Mayho M, Parkhill J and Preston A

    Present address: Institute Pasteur Lille, Lille, France.

    The identification of genes essential for a bacterium's growth reveals much about its basic physiology under different conditions. <i>Bordetella pertussis</i>, the causative agent of whooping cough, adopts both virulent and avirulent states through the activity of the two-component system, Bvg. The genes essential for <i>B. pertussis</i> growth <i>in vitro</i> were defined using transposon sequencing, for different Bvg-determined growth states. In addition, comparison of the insertion indices of each gene between Bvg phases identified those genes whose mutation exerted a significantly different fitness cost between phases. As expected, many of the genes identified as essential for growth in other bacteria were also essential for <i>B. pertussis</i>. However, the essentiality of some genes was dependent on Bvg. In particular, a number of key cell wall biosynthesis genes, including the entire <i>mre</i>/<i>mrd</i> locus, were essential for growth of the avirulent (Bvg minus) phase but not the virulent (Bvg plus) phase. In addition, cell wall biosynthesis was identified as a fundamental process that when disrupted produced greater fitness costs for the Bvg minus phase compared to the Bvg plus phase. Bvg minus phase growth was more susceptible than Bvg plus phase growth to the cell wall-disrupting antibiotic ampicillin, demonstrating the increased susceptibility of the Bvg minus phase to disruption of cell wall synthesis. This Bvg-dependent conditional essentiality was not due to Bvg-regulation of expression of cell wall biosynthesis genes; suggesting that this fundamental process differs between the Bvg phases in <i>B. pertussis</i> and is more susceptible to disruption in the Bvg minus phase. The ability of a bacterium to modify its cell wall synthesis is important when considering the action of antibiotics, particularly if developing novel drugs targeting cell wall synthesis.

    Microbial genomics 2020

  • Reticular Fibroblasts Expressing the Transcription Factor WT1 Define a Stromal Niche That Maintains and Replenishes Splenic Red Pulp Macrophages.

    Bellomo A, Mondor I, Spinelli L, Lagueyrie M, Stewart BJ, Brouilly N, Malissen B, Clatworthy MR and Bajénoff M

    Aix Marseille Univ, CNRS, INSERM, CIML, Marseille, France.

    Located within red pulp cords, splenic red pulp macrophages (RPMs) are constantly exposed to the blood flow, clearing senescent red blood cells (RBCs) and recycling iron from hemoglobin. Here, we studied the mechanisms underlying RPM homeostasis, focusing on the involvement of stromal cells as these cells perform anchoring and nurturing macrophage niche functions in lymph nodes and liver. Microscopy revealed that RPMs are embedded in a reticular meshwork of red pulp fibroblasts characterized by the expression of the transcription factor Wilms' Tumor 1 (WT1) and colony stimulating factor 1 (CSF1). Conditional deletion of Csf1 in WT1<sup>+</sup> red pulp fibroblasts, but not white pulp fibroblasts, drastically altered the RPM network without altering circulating CSF1 levels. Upon RPM depletion, red pulp fibroblasts transiently produced the monocyte chemoattractants CCL2 and CCL7, thereby contributing to the replenishment of the RPM network. Thus, red pulp fibroblasts anchor and nurture RPM, a function likely conserved in humans.

    Immunity 2020

  • Considerations in assessing germline variant pathogenicity using cosegregation analysis.

    Belman S, Parsons MT, Spurdle AB, Goldgar DE and Feng BJ

    University of Utah, Salt Lake City, UT, USA.

    Purpose: The American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) have developed guidelines for classifying germline variants as pathogenic or benign to interpret genetic testing results. Cosegregation analysis is an important component of the guidelines. There are two main approaches for cosegregation analysis: meiosis counting and Bayes factor-based quantitative methods. Of these, the ACMG/AMP guidelines employ only meiosis counting. The accuracy of either approach has not been sufficiently addressed in previous works.

    Methods: We analyzed hypothetical, simulated, and real-life data to evaluate the accuracy of each approach for cancer-associated genes.

    Results: We demonstrate that meiosis counting can provide incorrect classifications when the underlying genetic basis of the disease departs from simple Mendelian situations. Some Bayes factor approaches are currently implemented with inappropriate penetrance. We propose an improved penetrance model and describe several critical considerations, including the accuracy of cosegregation for moderate-risk genes and the impact of pleiotropy, population, and birth year. We highlight a webserver, COOL (Co-segregation Online), that implements an accurate Bayes factor cosegregation analysis.

    Conclusion: An appropriate penetrance model improves the accuracy of Bayes factor cosegregation analysis for high-penetrant variants, and is a better choice than meiosis counting whenever feasible.

    Genetics in medicine : official journal of the American College of Medical Genetics 2020

  • The fix is in.

    Bentley S

    Parasites and Microbes, Wellcome Sanger Institute, Hinxton, UK.

    Nature microbiology 2020;5;3;393-394

  • Insights into human genetic variation and population history from 929 diverse genomes.

    Bergström A, McCarthy SA, Hui R, Almarri MA, Ayub Q, Danecek P, Chen Y, Felkel S, Hallast P, Kamm J, Blanché H, Deleuze JF, Cann H, Mallick S, Reich D, Sandhu MS, Skoglund P, Scally A, Xue Y, Durbin R and Tyler-Smith C

    Wellcome Sanger Institute, Hinxton CB10 1SA, UK.

    Genome sequences from diverse human groups are needed to understand the structure of genetic variation in our species and the history of, and relationships between, different populations. We present 929 high-coverage genome sequences from 54 diverse human populations, 26 of which are physically phased using linked-read sequencing. Analyses of these genomes reveal an excess of previously undocumented common genetic variation private to southern Africa, central Africa, Oceania, and the Americas, but an absence of such variants fixed between major geographical regions. We also find deep and gradual population separations within Africa, contrasting population size histories between hunter-gatherer and agriculturalist groups in the past 10,000 years, and a contrast between single Neanderthal but multiple Denisovan source populations contributing to present-day human populations.

    Funded by: Cancer Research UK: FC001595; European Research Council; Howard Hughes Medical Institute; Medical Research Council: FC001595; Wellcome Trust: 098051, 206194, 207492, FC001595

    Science (New York, N.Y.) 2020;367;6484

  • Generating realistic null hypothesis of cancer mutational landscapes using SigProfilerSimulator.

    Bergstrom EN, Barnes M, Martincorena I and Alexandrov LB

    Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, 92093, USA.

    Background: Performing a statistical test requires a null hypothesis. In cancer genomics, a key challenge is the fast generation of accurate somatic mutational landscapes that can be used as a realistic null hypothesis for making biological discoveries.

    Results: Here we present SigProfilerSimulator, a powerful tool that is capable of simulating the mutational landscapes of thousands of cancer genomes at different resolutions within seconds. Applying SigProfilerSimulator to 2144 whole-genome sequenced cancers reveals: (i) that most doublet base substitutions are not due to two adjacent single base substitutions but likely occur as single genomic events; (ii) that an extended sequencing context of ± 2 bp is required to more completely capture the patterns of substitution mutational signatures in human cancer; (iii) information on false-positive discovery rate of commonly used bioinformatics tools for detecting driver genes.

    Conclusions: SigProfilerSimulator's breadth of features allows one to construct a tailored null hypothesis and use it for evaluating the accuracy of other bioinformatics tools or for downstream statistical analysis for biological discoveries. SigProfilerSimulator is freely available, with extensive documentation.

    Funded by: Cancer Research UK Grand Challenge: C98/A24032

    BMC bioinformatics 2020;21;1;438

  • High-Resolution mRNA and Secretome Atlas of Human Enteroendocrine Cells.

    Beumer J, Puschhof J, Bauzá-Martinez J, Martínez-Silgado A, Elmentaite R, James KR, Ross A, Hendriks D, Artegiani B, Busslinger GA, Ponsioen B, Andersson-Rolf A, Saftien A, Boot C, Kretzschmar K, Geurts MH, Bar-Ephraim YE, Pleguezuelos-Manzano C, Post Y, Begthel H, van der Linden F, Lopez-Iglesias C, van de Wetering WJ, van der Linden R, Peters PJ, Heck AJR, Goedhart J, Snippert H, Zilbauer M, Teichmann SA, Wu W and Clevers H

    Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences (KNAW) and UMC Utrecht, 3584 CT Utrecht, the Netherlands; Oncode Institute, Hubrecht Institute, 3584 CT Utrecht, the Netherlands.

    Enteroendocrine cells (EECs) sense intestinal content and release hormones to regulate gastrointestinal activity, systemic metabolism, and food intake. Little is known about the molecular make-up of human EEC subtypes and the regulated secretion of individual hormones. Here, we describe an organoid-based platform for functional studies of human EECs. EEC formation is induced in vitro by transient expression of NEUROG3. A set of gut organoids was engineered in which the major hormones are fluorescently tagged. A single-cell mRNA atlas was generated for the different EEC subtypes, and their secreted products were recorded by mass-spectrometry. We note key differences to murine EECs, including hormones, sensory receptors, and transcription factors. Notably, several hormone-like molecules were identified. Inter-EEC communication is exemplified by secretin-induced GLP-1 secretion. Indeed, individual EEC subtypes carry receptors for various EEC hormones. This study provides a rich resource to study human EEC development and function.

    Cell 2020

  • STROBE-metagenomics: a STROBE extension statement to guide the reporting of metagenomics studies.

    Bharucha T, Oeser C, Balloux F, Brown JR, Carbo EC, Charlett A, Chiu CY, Claas ECJ, de Goffau MC, de Vries JJC, Eloit M, Hopkins S, Huggett JF, MacCannell D, Morfopoulou S, Nath A, O'Sullivan DM, Reoma LB, Shaw LP, Sidorov I, Simner PJ, Van Tan L, Thomson EC, van Dorp L, Wilson MR, Breuer J and Field N

    Department of Biochemistry, University of Oxford, Oxford, UK; Wellcome Trust Research Unit, Lao-Oxford-Mahosot Hospital, Vientiane, Laos.

    The term metagenomics refers to the use of sequencing methods to simultaneously identify genomic material from all organisms present in a sample, with the advantage of greater taxonomic resolution than culture or other methods. Applications include pathogen detection and discovery, species characterisation, antimicrobial resistance detection, virulence profiling, and study of the microbiome and microecological factors affecting health. However, metagenomics involves complex and multistep processes and there are important technical and methodological challenges that require careful consideration to support valid inference. We co-ordinated a multidisciplinary, international expert group to establish reporting guidelines that address specimen processing, nucleic acid extraction, sequencing platforms, bioinformatics considerations, quality assurance, limits of detection, power and sample size, confirmatory testing, causality criteria, cost, and ethical issues. The guidance recognises that metagenomics research requires pragmatism and caution in interpretation, and that this field is rapidly evolving.

    Funded by: NINDS NIH HHS: K08 NS096117

    The Lancet. Infectious diseases 2020

  • Find and fuse: Unsolved mysteries in sperm-egg recognition.

    Bianchi E and Wright GJ

    Cell Surface Signalling Laboratory, Wellcome Sanger Institute, Cambridge, United Kingdom.

    Sexual reproduction is such a successful way of creating progeny with subtle genetic variations that the vast majority of eukaryotic species use it. In mammals, it involves the formation of highly specialised cells: the sperm in males and the egg in females, each carrying the genetic inheritance of an individual. The interaction of sperm and egg culminates with the fusion of their cell membranes, triggering the molecular events that result in the formation of a new genetically distinct organism. Although we have a good cellular description of fertilisation in mammals, many of the molecules involved remain unknown, and especially the identity and role of cell surface proteins that are responsible for sperm-egg recognition, binding, and fusion. Here, we will highlight and discuss these gaps in our knowledge and how the role of some recently discovered sperm cell surface and secreted proteins contribute to our understanding of this fundamental process.

    PLoS biology 2020;18;11;e3000953

  • Unsupervised generative and graph representation learning for modelling cell differentiation.

    Bica I, Andrés-Terré H, Cvejic A and Liò P

    Department of Engineering Science, University of Oxford, Oxford, OX1 3PJ, United Kingdom.

    Using machine learning techniques to build representations from biomedical data can help us understand the latent biological mechanism of action and lead to important discoveries. Recent developments in single-cell RNA-sequencing protocols have allowed measuring gene expression for individual cells in a population, thus opening up the possibility of finding answers to biomedical questions about cell differentiation. In this paper, we explore unsupervised generative neural methods, based on the variational autoencoder, that can model cell differentiation by building meaningful representations from the high dimensional and complex gene expression data. We use disentanglement methods based on information theory to improve the data representation and achieve better separation of the biological factors of variation in the gene expression data. In addition, we use a graph autoencoder consisting of graph convolutional layers to predict relationships between single-cells. Based on these models, we develop a computational framework that consists of methods for identifying the cell types in the dataset, finding driver genes for the differentiation process and obtaining a better understanding of relationships between cells. We illustrate our methods on datasets from multiple species and also from different sequencing technologies.

    Funded by: Alan Turing Institute: EP/N510129/1

    Scientific reports 2020;10;1;9790

  • The genome sequence of the channel bull blenny, Cottoperca gobio (Günther, 1861).

    Bista I, McCarthy SA, Wood J, Ning Z, Detrich III HW, Desvignes T, Postlethwait J, Chow W, Howe K, Torrance J, Smith M, Oliver K, Vertebrate Genomes Project Consortium, Miska EA and Durbin R

    Wellcome Sanger Institute, Cambridge, CB10 1SA, UK.

    We present a genome assembly for <i>Cottoperca gobio</i> (channel bull blenny, (Günther, 1861)); Chordata; Actinopterygii (ray-finned fishes), a temperate water outgroup for Antarctic Notothenioids. The size of the genome assembly is 609 megabases, with the majority of the assembly scaffolded into 24 chromosomal pseudomolecules. Gene annotation on Ensembl of this assembly has identified 21,662 coding genes.

    Wellcome open research 2020;5;148

  • Evolution and dissemination of L and M plasmid lineages carrying antibiotic resistance genes in diverse Gram-negative bacteria.

    Blackwell GA, Doughty EL and Moran RA

    EMBL-EBI, Wellcome Genome Campus, Hinxton, UK; Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    Conjugative, broad host-range plasmids of the L/M complex have been associated with antibiotic resistance since the 1970s. They are found in Gram-negative bacterial genera that cause human infections and persist in hospital environments. It is crucial that these plasmids are typed accurately so that their clinical and global dissemination can be traced in epidemiological studies. The L/M complex has previously been divided into L, M1 and M2 subtypes. However, those types do not encompass all diversity seen in the group. Here, we have examined 148 complete L/M plasmid sequences in order to understand the diversity of the complex and trace the evolution of distinct lineages. The backbone sequence of each plasmid was determined by removing translocatable genetic elements and reversing their effects in silico. The sequence identities of replication regions and complete backbones were then considered for typing. This supported the distinction of L and M plasmids and revealed that there are five L and eight M types, where each type is comprised of further sub-lineages that are distinguished by variation in their backbone and translocatable element content. Regions containing antibiotic resistance genes in L and M sub-lineages have often formed by initial rare insertion events, followed by insertion of other translocatable elements within the inceptive element. As such, islands evolve in situ to contain genes conferring resistance to multiple antibiotics. In some cases, different plasmid sub-lineages have acquired the same or related resistance genes independently. This highlights the importance of these plasmids in acting as vehicles for the dissemination of emerging resistance genes. Materials are provided here for typing plasmids of the L/M complex from complete sequences or draft genomes. This should enable rapid identification of novel types and facilitate tracking the evolution of existing lineages.

    Plasmid 2020;102528

  • Leupaxin Expression Is Dispensable for B Cell Immune Responses.

    Bonaud A, Clare S, Bisio V, Sowerby JM, Yao S, Ostergaard H, Balabanian K, Smith KGC and Espéli M

    Inflammation Chemokines and Immunopathology, Institut National de la Santé et de la Recherche Medicale (INSERM), Faculté de Médecine, Université Paris-Sud, Université Paris-Saclay, Clamart, France.

    The generation of a potent humoral immune response by B cells relies on the integration of signals induced by the B cell receptor, toll-like receptors and both negative and positive co-receptors. Several reports also suggest that integrin signaling plays an important role in this process. How integrin signaling is regulated in B cells is, however, still only partially understood. Integrin activity and function are controlled by several mechanisms including regulation by molecular adaptors of the paxillin family. In B cells, Leupaxin (Lpxn) is the most expressed member of the family and <i>in vitro</i> studies suggest that it could dampen BCR signaling. Here, we report that <i>Lpxn</i> expression is increased in germinal center B cells compared to naïve B cells. Moreover, <i>Lpxn</i> deficiency leads to decreased B cell differentiation into plasma cells <i>in vitro</i>. However, Lpxn seems dispensable for the generation of a potent B cell immune response <i>in vivo</i>. Altogether our results suggest that Lpxn is dispensable for T-dependent and T-independent B cell immune responses.

    Frontiers in immunology 2020;11;466

  • Draft Genome Sequences of the Type Strains of Actinobacillus indolicus (46K2C) and Actinobacillus porcinus (NM319), Two NAD-Dependent Bacterial Species Found in the Respiratory Tract of Pigs.

    Bossé JT, Li Y, Fernandez Crespo R, Angen Ø, Holden MTG, Weinert LA, Maskell DJ, Tucker AW, Wren BW, Rycroft AN, Langford PR and BRaDP1T consortium

    Section of Paediatric Infectious Disease, Department of Infectious Disease, Imperial College London, London, United Kingdom.

    We report here the draft genome sequences of the type strains of <i>Actinobacillus indolicus</i> (46K2C) and <i>Actinobacillus porcinus</i> (NM319). These NAD-dependent bacterial species are frequently found in the upper respiratory tract of pigs and are occasionally associated with lung pathology.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/G018553/1; Wellcome Trust

    Microbiology resource announcements 2020;9;1

  • The immunological network in the developing human skin.

    Botting RA and Haniffa M

    Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK.

    Establishment of a well-functioning immune network in skin is crucial for its barrier function. This begins in utero alongside the structural differentiation and maturation of skin, and continues to expand and diversify across the human lifespan. The microenvironment of the developing human skin supports immune cell differentiation and has an overall anti-inflammatory profile. Immunologically inert and skewed immune populations found in developing human skin promote wound healing, and as such may play a crucial role in the structural changes occurring during skin development.

    Immunology 2020

  • SID-2 negatively regulates development likely independent of nutritional dsRNA uptake.

    Braukmann F, Jordan D, Jenkins B, Koulman A and Miska EA

    Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK.

    RNA interference (RNAi) is a gene regulatory mechanism based on RNA-RNA interaction conserved through eukaryotes. Surprisingly, many animals can take up human-made double stranded RNA (dsRNA) from the environment to initiate RNAi, suggesting a mechanism for dsRNA-based information exchange between organisms and their environment. However, no naturally occurring example has been identified since the discovery of the phenomenon 22 years ago. Therefore it remains enigmatic why animals are able to take up dsRNA. Here, we explore other possible functions by performing phenotypic studies of dsRNA uptake deficient <i>sid-2</i> mutants in <i>Caenorhabditis elegans</i>. We find that SID-2 does not have a nutritional role in feeding experiments using genetically sensitized mutants. Furthermore, we use robot assisted imaging to show that <i>sid-2</i> mutants accelerate growth rate and, by maternal contribution, body length at hatching. Finally, we perform transcriptome and lipidome analysis showing that <i>sid-2</i> has no effect on energy storage lipids, but affects signalling lipids and the embryo transcriptome. Overall, these results suggest that <i>sid-2</i> has mild effects on development and is unlikely to function in the nutritional uptake of dsRNA. These findings broaden our understanding of the biological role of SID-2 and motivate studies identifying the role of environmental dsRNA uptake.

    RNA biology 2020;1-12

  • Persistent circulation of a fluoroquinolone-resistant Salmonella enterica Typhi clone in the Indian subcontinent.

    Britto CD, Dyson ZA, Mathias S, Bosco A, Dougan G, Jose S, Nagaraj S, Holt KE and Pollard AJ

    Oxford Vaccine Group, Department of Paediatrics, University of Oxford, and the NIHR Oxford Biomedical Research Centre, Oxford, OX3 7LE, UK.

    Background: The molecular structure of circulating enteric fever pathogens was studied using hospital-based genomic surveillance in a tertiary care referral centre in South India as a first genomic surveillance study, to our knowledge, of blood culture-confirmed enteric fever in the region.

    Methods: Blood culture surveillance was conducted at St John's Medical College Hospital, Bengaluru, between July 2016 and June 2017. The bacterial isolates collected were linked to demographic variables of patients and subjected to WGS. The resulting pathogen genomic data were also globally contextualized to gauge possible phylogeographical patterns.

    Results: Hospital-based genomic surveillance for enteric fever in Bengaluru, India, identified 101 Salmonella enterica Typhi and 14 S. Paratyphi A in a 1 year period. Ninety-six percent of isolates displayed non-susceptibility to fluoroquinolones. WGS showed the dominant pathogen was S. Typhi genotype (H58 lineage II). A fluoroquinolone-resistant triple-mutant clone of S. Typhi previously associated with gatifloxacin treatment failure in Nepal was implicated in 18% of enteric fever cases, indicating ongoing inter-regional circulation.

    Conclusions: Enteric fever in South India continues to be a major public health issue and is strongly associated with antimicrobial resistance. Robust microbiological surveillance is necessary to direct appropriate treatment and preventive strategies. Of particular concern is the emergence and expansion of the highly fluoroquinolone-resistant triple-mutant S. Typhi clone and its ongoing inter- and intra-country transmission in South Asia, which highlights the need for regional coordination of intervention strategies, including vaccination and longer-term strategies such as improvements to support hygiene and sanitation.

    The Journal of antimicrobial chemotherapy 2020;75;2;337-341

  • Pathogen genomic surveillance of typhoidal Salmonella infection in adults and children reveals no association between clinical outcomes and infecting genotypes.

    Britto CD, Mathias S, Bosco A, Dyson ZA, Dougan G, Raveendran S, Abin VL, Jose S, Nagaraj S, Holt KE and Pollard AJ

    Oxford Vaccine Group, Department of Paediatrics, University of Oxford and the NIHR Oxford Biomedical Research Centre, Oxford, OX3 7LE, UK.

    Background: India is endemic for enteric fever, and it is not known whether the variations in clinical manifestations between patients are due to host, environmental or pathogen factors.

    Methods: Blood culture surveillance was conducted at St. John's Medical College Hospital, Bangalore, between July 2016 and June 2017. Clinical, laboratory and demographic data were collected from each case, and bacterial isolates were subjected to whole genome sequencing. Comparative analysis between adults and paediatric patients was carried out to ascertain differences between adult and paediatric disease.

    Results: Among the 113 cases of blood culture-confirmed enteric fever, young adults (16-30 years) and children < 15 years accounted for 47% and 37% of cases, respectively. Anaemia on presentation was seen in 46% of cases, and 19% had an abnormal leucocyte count on presentation. The majority received treatment as inpatients (70%), and among these, adults had a significantly longer duration of admission when compared with children (<i>p</i> = 0.002). There were atypical presentations including arthritis, acute haemolysis and a case of repeated typhoid infection with two separate <i>S.</i> Typhi genotypes. There was no association between infecting genotype/serovar and treatment status (outpatient vs inpatient), month of isolation, duration of admission, patient age (adult or child), antimicrobial susceptibility, Widal positivity or haematologic parameters.

    Conclusions: Amidst the many public health concerns of South India, enteric fever continues to contribute substantially to hospital burden, with non-specific as well as uncommon clinical features in both paediatric and adult populations, likely driven by host and environmental factors. Robust clinical surveillance as well as monitoring of pathogen population structure is required to inform treatment and preventive strategies.

    Tropical medicine and health 2020;48;58

  • Prophage exotoxins enhance colonization fitness in epidemic scarlet fever-causing Streptococcus pyogenes.

    Brouwer S, Barnett TC, Ly D, Kasper KJ, De Oliveira DMP, Rivera-Hernandez T, Cork AJ, McIntyre L, Jespersen MG, Richter J, Schulz BL, Dougan G, Nizet V, Yuen KY, You Y, McCormick JK, Sanderson-Smith ML, Davies MR and Walker MJ

    Australian Infectious Diseases Research Centre and School of Chemistry and Molecular Biosciences, The University of Queensland, St. Lucia, QLD, Australia.

    The re-emergence of scarlet fever poses a new global public health threat. The capacity of North-East Asian serotype M12 (emm12) Streptococcus pyogenes (group A Streptococcus, GAS) to cause scarlet fever has been linked epidemiologically to the presence of novel prophages, including prophage ΦHKU.vir encoding the secreted superantigens SSA and SpeC and the DNase Spd1. Here, we report the molecular characterization of ΦHKU.vir-encoded exotoxins. We demonstrate that streptolysin O (SLO)-induced glutathione efflux from host cellular stores is a previously unappreciated GAS virulence mechanism that promotes SSA release and activity, representing the first description of a thiol-activated bacterial superantigen. Spd1 is required for resistance to neutrophil killing. Investigating single, double and triple isogenic knockout mutants of the ΦHKU.vir-encoded exotoxins, we find that SpeC and Spd1 act synergistically to facilitate nasopharyngeal colonization in a mouse model. These results offer insight into the pathogenesis of scarlet fever-causing GAS mediated by prophage ΦHKU.vir exotoxins.

    Funded by: CIHR; Wellcome Trust

    Nature communications 2020;11;1;5018

  • MYC-induced human acute myeloid leukemia requires a continuing IL-3/GM-CSF costimulus.

    Bulaeva E, Pellacani D, Nakamichi N, Hammond CA, Beer PA, Lorzadeh A, Moksa M, Carles A, Bilenky M, Lefort S, Shu J, Wilhelm BT, Weng AP, Hirst M and Eaves CJ

    Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada.

    Hematopoietic clones with leukemogenic mutations arise in healthy people as they age, but progression to acute myeloid leukemia (AML) is rare. Recent evidence suggests that the microenvironment may play an important role in modulating human AML population dynamics. To investigate this concept further, we examined the combined and separate effects of an oncogene (c-MYC) and exposure to interleukin-3 (IL-3), granulocyte-macrophage colony-stimulating factor (GM-CSF), and stem cell factor (SCF) on the experimental genesis of a human AML in xenografted immunodeficient mice. Initial experiments showed that normal human CD34+ blood cells transduced with a lentiviral MYC vector and then transplanted into immunodeficient mice produced a hierarchically organized, rapidly fatal, and serially transplantable blast population, phenotypically and transcriptionally similar to human AML cells, but only in mice producing IL-3, GM-CSF, and SCF transgenically or in regular mice in which the cells were exposed to IL-3 or GM-CSF delivered using a cotransduction strategy. In their absence, the MYC+ human cells produced a normal repertoire of lymphoid and myeloid progeny in transplanted mice for many months, but, on transfer to secondary mice producing the human cytokines, the MYC+ cells rapidly generated AML. Indistinguishable diseases were also obtained efficiently from both primitive (CD34+CD38-) and late granulocyte-macrophage progenitor (GMP) cells. These findings underscore the critical role that these cytokines can play in activating a malignant state in normally differentiating human hematopoietic cells in which MYC expression has been deregulated. They also introduce a robust experimental model of human leukemogenesis to further elucidate key mechanisms involved and test strategies to suppress them.

    Blood 2020;136;24;2764-2773

  • Periostin: contributor to abnormal airway epithelial function in asthma?

    Burgess JK, Jonker MR, Berg M, Ten Hacken NTH, Meyer KB, van den Berge M, Nawijn MC and Heijink IH

    University of Groningen, University Medical Centre Groningen, Department of Pathology & Medical Biology, Experimental Pulmonology and Inflammation Research, Groningen, The Netherlands.

    Periostin may serve as a biomarker for type-2-mediated eosinophilic airway inflammation in asthma. We hypothesised that type-2 cytokine IL-13 induces airway epithelial expression of periostin, which in turn contributes to epithelial changes observed in asthma. We studied the effect of IL-13 on periostin expression in BEAS-2B and air-liquid interface (ALI)-differentiated primary bronchial epithelial cells (PBECs). Additionally, effects of recombinant human periostin on epithelial-to-mesenchymal transition (EMT) markers and mucin genes were assessed. In bronchial biopsies and induced sputum from asthma patients and healthy controls, we analysed periostin single cell gene expression and protein levels. IL-13 increased <i>POSTN</i> expression in both cell types, which was accompanied by EMT-related features in BEAS-2B. In ALI-differentiated PBECs, IL-13 increased periostin basolateral and apical release. Apical administration of periostin increased the expression of <i>MMP9, MUC5B</i> and <i>MUC5AC</i>. In bronchial biopsies, <i>POSTN</i> expression was mainly confined to basal epithelial cells, ionocytes, endothelial cells and fibroblasts, showing higher expression in basal epithelial cells from asthma patients <i>versus</i> controls. Higher protein levels of periostin, expressed in epithelial and subepithelial layers, were confirmed in bronchial biopsies from asthma patients compared to healthy controls. Although sputum periostin levels were not higher in asthma, levels correlated with eosinophil numbers and coughing up mucus. Periostin expression is increased by IL-13 in bronchial epithelial cells and higher in bronchial biopsies from asthma patients. This may have important consequences, as administration of periostin increased epithelial expression of mucin genes, supporting the relationship of periostin with type-2 mediated asthma and mucus secretion.

    The European respiratory journal 2020

  • Dynamic regulation of hypoxia-inducible factor-1α activity is essential for normal B cell development.

    Burrows N, Bashford-Rogers RJM, Bhute VJ, Peñalver A, Ferdinand JR, Stewart BJ, Smith JEG, Deobagkar-Lele M, Giudice G, Connor TM, Inaba A, Bergamaschi L, Smith S, Tran MGB, Petsalaki E, Lyons PA, Espeli M, Huntly BJP, Smith KGC, Cornall RJ, Clatworthy MR and Maxwell PH

    Cambridge Institute for Medical Research, The Keith Peters Building, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK.

    B lymphocyte development and selection are central to adaptive immunity and self-tolerance. These processes require B cell receptor (BCR) signaling and occur in bone marrow, an environment with variable hypoxia, but whether hypoxia-inducible factor (HIF) is involved is unknown. We show that HIF activity is high in human and murine bone marrow pro-B and pre-B cells and decreases at the immature B cell stage. This stage-specific HIF suppression is required for normal B cell development because genetic activation of HIF-1α in murine B cells led to reduced repertoire diversity, decreased BCR editing and developmental arrest of immature B cells, resulting in reduced peripheral B cell numbers. HIF-1α activation lowered surface BCR, CD19 and B cell-activating factor receptor and increased expression of proapoptotic BIM. BIM deletion rescued the developmental block. Administration of a HIF activator in clinical use markedly reduced bone marrow and transitional B cells, which has therapeutic implications. Together, our work demonstrates that dynamic regulation of HIF-1α is essential for normal B cell development.

    Funded by: DH | National Institute for Health Research (NIHR): CL-2006-14-006, RP-2017-08-ST2-002; Lister Institute of Preventive Medicine: Summer studentship; RCUK | Medical Research Council (MRC): G0802266; Rosetrees Trust: G102721; Wellcome Trust (Wellcome): 097922/Z/11/Z, 100140, 19710, WT106068AIA

    Nature immunology 2020

  • Cysteine synthases CYSL-1 and CYSL-2 mediate C. elegans heritable adaptation to P. vranovensis infection.

    Burton NO, Riccio C, Dallaire A, Price J, Jenkins B, Koulman A and Miska EA

    Centre for Trophoblast Research, Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, CB2 3EG, UK.

    Parental exposure to pathogens can prime offspring immunity in diverse organisms. The mechanisms by which this heritable priming occurs are largely unknown. Here we report that the soil bacterium Pseudomonas vranovensis is a natural pathogen of the nematode Caenorhabditis elegans and that parental exposure of animals to P. vranovensis promotes offspring resistance to infection. Furthermore, we demonstrate a multigenerational enhancement of progeny survival when three consecutive generations of animals are exposed to P. vranovensis. By investigating the mechanisms by which animals heritably adapt to P. vranovensis infection, we found that parental infection by P. vranovensis results in increased expression of the cysteine synthases cysl-1 and cysl-2 and the regulator of hypoxia inducible factor rhy-1 in progeny, and that these three genes are required for adaptation to P. vranovensis. These observations establish a CYSL-1, CYSL-2, and RHY-1 dependent mechanism by which animals heritably adapt to infection.

    Funded by: Cancer Research UK (CRUK): C13474/A18583, C6946/A14492; RCUK | Biotechnology and Biological Sciences Research Council (BBSRC): BB/M027252/1; Wellcome Trust (Wellcome): 092096/Z/10/Z, 104640/Z/14/Z

    Nature communications 2020;11;1;1741

  • Interferon-induced Protein-44 and Interferon-induced Protein 44-like restrict replication of Respiratory Syncytial Virus.

    Busse DC, Habgood-Coote D, Clare S, Brandt C, Bassano I, Kaforou M, Herberg J, Levin M, Eleouet JF, Kellam P and Tregoning JS

    Department of Infectious Disease, Imperial College London, St. Mary's Campus, London, United Kingdom.

    Cellular intrinsic immunity, mediated by the expression of an array of interferon-stimulated antiviral genes, is a vital part of host defence. We have previously used a bioinformatic screen to identify two interferon stimulated genes (ISG) with poorly characterised function, Interferon-induced protein 44 (IFI44) and interferon-induced protein 44-like (IFI44L), as potentially being important in Respiratory Syncytial Virus (RSV) infection. Using overexpression systems, CRISPR-Cas9-mediated knockout, and a knockout mouse model we investigated the antiviral capability of these genes in the control of RSV replication. Over-expression of IFI44 or IFI44L was sufficient to restrict RSV infection at an early time post infection. Knocking out these genes in mammalian airway epithelial cells increased levels of infection. Both genes express antiproliferative factors that have no effect on RSV attachment but reduce RSV replication in a minigenome assay. The loss of <i>Ifi44</i> was associated with a more severe infection phenotype in a mouse model of infection. These studies demonstrate a function for IFI44 and IFI44L in controlling RSV infection.

    <b>IMPORTANCE</b> RSV infects all children under two years of age, but only a subset of children get severe disease. We hypothesize that susceptibility to severe RSV necessitating hospitalization in children without pre-defined risk factors is in part mediated at the anti-viral gene level. But there is a large array of anti-viral genes, particularly in the ISG family about which the mechanism is poorly understood. Having previously identified IFI44 and IFI44L as possible genes of interest in a bioinformatic screen, we dissected the function of these two genes in the control of RSV. Through a range of over-expression and knockout studies we show that the genes are anti-viral and anti-proliferative. This study is important because IFI44 and IFI44L are upregulated after a wide range of viral infections and IFI44L can serve as a diagnostic bio-marker of viral infection.

    Journal of virology 2020

  • The influence of tumour mutational burden on renal cancer immune infiltration and survival.

    Byrne MHV and Mitchell TJ

    Department of Surgery, University of Cambridge, Addenbrooke's Hospital, Cambridge, UK.

    Annals of translational medicine 2020;8;6;271

  • Human and mouse essentiality screens as a resource for disease gene discovery.

    Cacheiro P, Muñoz-Fuentes V, Murray SA, Dickinson ME, Bucan M, Nutter LMJ, Peterson KA, Haselimashhadi H, Flenniken AM, Morgan H, Westerberg H, Konopka T, Hsu CW, Christiansen A, Lanza DG, Beaudet AL, Heaney JD, Fuchs H, Gailus-Durner V, Sorg T, Prochazka J, Novosadova V, Lelliott CJ, Wardle-Jones H, Wells S, Teboul L, Cater H, Stewart M, Hough T, Wurst W, Sedlacek R, Adams DJ, Seavitt JR, Tocchini-Valentini G, Mammano F, Braun RE, McKerlie C, Herault Y, de Angelis MH, Mallon AM, Lloyd KCK, Brown SDM, Parkinson H, Meehan TF, Smedley D, Genomics England Research Consortium and International Mouse Phenotyping Consortium

    Clinical Pharmacology, William Harvey Research Institute, School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK.

    The identification of causal variants in sequencing studies remains a considerable challenge that can be partially addressed by new gene-specific knowledge. Here, we integrate measures of how essential a gene is to supporting life, as inferred from viability and phenotyping screens performed on knockout mice by the International Mouse Phenotyping Consortium and essentiality screens carried out on human cell lines. We propose a cross-species gene classification across the Full Spectrum of Intolerance to Loss-of-function (FUSIL) and demonstrate that genes in five mutually exclusive FUSIL categories have differing biological properties. Most notably, Mendelian disease genes, particularly those associated with developmental disorders, are highly overrepresented among genes non-essential for cell survival but required for organism development. After screening developmental disorder cases from three independent disease sequencing consortia, we identify potentially pathogenic variants in genes not previously associated with rare diseases. We therefore propose FUSIL as an efficient approach for disease gene discovery.

    Funded by: Medical Research Council: MC_U142684171, MC_U142684172, MR/S006753/1; NHGRI NIH HHS: UM1 HG006348, UM1 HG006370; NIH HHS: UM1 OD023221

    Nature communications 2020;11;1;655

  • FAMIN Is a Multifunctional Purine Enzyme Enabling the Purine Nucleotide Cycle.

    Cader MZ, de Almeida Rodrigues RP, West JA, Sewell GW, Md-Ibrahim MN, Reikine S, Sirago G, Unger LW, Iglesias-Romero AB, Ramshorn K, Haag LM, Saveljeva S, Ebel JF, Rosenstiel P, Kaneider NC, Lee JC, Lawley TD, Bradley A, Dougan G, Modis Y, Griffin JL and Kaser A

    Cambridge Institute of Therapeutic Immunology and Infectious Disease, Jeffrey Cheah Biomedical Centre, University of Cambridge, Cambridge CB2 0AW, UK; Division of Gastroenterology and Hepatology, Department of Medicine, University of Cambridge, Addenbrooke's Hospital, Cambridge CB2 0QQ, UK.

    Mutations in FAMIN cause arthritis and inflammatory bowel disease in early childhood, and a common genetic variant increases the risk for Crohn's disease and leprosy. We developed an unbiased liquid chromatography-mass spectrometry screen for enzymatic activity of this orphan protein. We report that FAMIN phosphorolytically cleaves adenosine into adenine and ribose-1-phosphate. Such activity was considered absent from eukaryotic metabolism. FAMIN and its prokaryotic orthologs additionally have adenosine deaminase, purine nucleoside phosphorylase, and S-methyl-5'-thioadenosine phosphorylase activity, hence, combine activities of the namesake enzymes of central purine metabolism. FAMIN enables in macrophages a purine nucleotide cycle (PNC) between adenosine and inosine monophosphate and adenylosuccinate, which consumes aspartate and releases fumarate in a manner involving fatty acid oxidation and ATP-citrate lyase activity. This macrophage PNC synchronizes mitochondrial activity with glycolysis by balancing electron transfer to mitochondria, thereby supporting glycolytic activity and promoting oxidative phosphorylation and mitochondrial H<sup>+</sup> and phosphate recycling.

    Funded by: Wellcome Trust

    Cell 2020;180;2;278-295.e23

  • Minimal phenotyping yields genome-wide association signals of low specificity for major depression.

    Cai N, Revez JA, Adams MJ, Andlauer TFM, Breen G, Byrne EM, Clarke TK, Forstner AJ, Grabe HJ, Hamilton SP, Levinson DF, Lewis CM, Lewis G, Martin NG, Milaneschi Y, Mors O, Müller-Myhsok B, Penninx BWJH, Perlis RH, Pistis G, Potash JB, Preisig M, Shi J, Smoller JW, Streit F, Tiemeier H, Uher R, Van der Auwera S, Viktorin A, Weissman MM, MDD Working Group of the Psychiatric Genomics Consortium, Kendler KS and Flint J

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    Minimal phenotyping refers to the reliance on the use of a small number of self-reported items for disease case identification, increasingly used in genome-wide association studies (GWAS). Here we report differences in genetic architecture between depression defined by minimal phenotyping and strictly defined major depressive disorder (MDD): the former has a lower genotype-derived heritability that cannot be explained by inclusion of milder cases and a higher proportion of the genome contributing to this shared genetic liability with other conditions than for strictly defined MDD. GWAS based on minimal phenotyping definitions preferentially identifies loci that are not specific to MDD, and, although it generates highly predictive polygenic risk scores, the predictive power can be explained entirely by large sample sizes rather than by specificity for MDD. Our results show that reliance on results from minimal phenotyping may bias views of the genetic architecture of MDD and impede the ability to identify pathways specific to MDD.

    Nature genetics 2020

  • Repeatable ecological dynamics govern the response of experimental communities to antibiotic pulse perturbation.

    Cairns J, Jokela R, Becks L, Mustonen V and Hiltunen T

    Wellcome Sanger Institute, Cambridge, UK.

    In an era of pervasive anthropogenic ecological disturbances, there is a pressing need to understand the factors that constitute community response and resilience. A detailed understanding of disturbance response needs to go beyond associations and incorporate features of disturbances, species traits, rapid evolution and dispersal. Multispecies microbial communities that experience antibiotic perturbation represent a key system with important medical dimensions. However, previous microbiome studies on this theme have relied on high-throughput sequencing data from uncultured species without the ability to explicitly account for the role of species traits and immigration. Here, we serially passage a 34-species defined bacterial community through different levels of pulse antibiotic disturbance, manipulating the presence or absence of species immigration. To understand the ecological community response measured using amplicon sequencing, we combine initial trait data measured for each species separately and metagenome sequencing data revealing adaptive mutations during the experiment. We found that the ecological community response was highly repeatable within the experimental treatments, which could be attributed in part to key species traits (antibiotic susceptibility and growth rate). Increasing antibiotic levels were also coupled with an increasing probability of species extinction, making species immigration critical for community resilience. Moreover, we detected signals of antibiotic-resistance evolution occurring within species at the same time scale, leaving evolutionary changes in communities despite recovery at the species compositional level. Together, these observations reveal a disturbance response that presents as classic species sorting, but is nevertheless accompanied by rapid within-species evolution.

    Funded by: Academy of Finland (Suomen Akatemia): 106993, 313270; Deutsche Forschungsgemeinschaft (German Research Foundation): 4135/9; Jenny ja Antti Wihurin Rahasto (Jenny and Antti Wihuri Foundation): 190040

    Nature ecology & evolution 2020

  • Evolution in interacting species alters predator life-history traits, behaviour and morphology in experimental microbial communities.

    Cairns J, Moerman F, Fronhofer EA, Altermatt F and Hiltunen T

    Wellcome Sanger Institute, Cambridge CB10 1SA, UK.

    Predator-prey interactions heavily influence the dynamics of many ecosystems. An increasing body of evidence suggests that rapid evolution and coevolution can alter these interactions, with important ecological implications, by acting on traits determining fitness, including reproduction, anti-predatory defence and foraging efficiency. However, most studies to date have focused only on evolution in the prey species, and the predator traits in (co)evolving systems remain poorly understood. Here, we investigated changes in predator traits after approximately 600 generations in a predator-prey (ciliate-bacteria) evolutionary experiment. Predators independently evolved on seven different prey species, allowing generalization of the predator's evolutionary response. We used highly resolved automated image analysis to quantify changes in predator life history, morphology and behaviour. Consistent with previous studies, we found that prey evolution impaired growth of the predator, although the effect depended on the prey species. By contrast, predator evolution did not cause a clear increase in predator growth when feeding on ancestral prey. However, predator evolution affected morphology and behaviour, increasing size, speed and directionality of movement, which have all been linked to higher prey search efficiency. These results show that in (co)evolving systems, predator adaptation can occur in traits relevant to foraging efficiency without translating into an increased ability of the predator to grow on the ancestral prey type.

    Proceedings. Biological sciences 2020;287;1928;20200652

  • Comparison of visualization tools for single-cell RNAseq data.

    Cakir B, Prete M, Huang N, van Dongen S, Pir P and Kiselev VY

    Wellcome Sanger Institute, Hinxton, CB10 1SA, UK.

    In the last decade, single cell RNAseq (scRNAseq) datasets have grown in size from a single cell to millions of cells. Due to its high dimensionality, it is not always feasible to visualize scRNAseq data and share it in a scientific report or an article publication format. Recently, many interactive analysis and visualization tools have been developed to address this issue and facilitate knowledge transfer in the scientific community. In this study, we review several of the currently available scRNAseq visualization tools and benchmark the subset that allows users to visualize the data on the web and share it with others. We consider the memory and time required to prepare datasets for sharing as the number of cells increases, and additionally review the user experience and features available in the web interface. To address the problem of format compatibility, we have also developed a user-friendly R package, <i>sceasy</i>, which allows users to convert their own scRNAseq datasets into a specific data format for visualization.

    NAR genomics and bioinformatics 2020;2;3;lqaa052

  • The Structure of the Cysteine-Rich Domain of Plasmodium falciparum P113 Identifies the Location of the RH5 Binding Site.

    Campeotto I, Galaway F, Mehmood S, Barfod LK, Quinkert D, Kotraiah V, Phares TW, Wright KE, Snijders AP, Draper SJ, Higgins MK and Wright GJ

    Department of Biochemistry, University of Oxford, Oxford, United Kingdom.

    <i>Plasmodium falciparum</i> RH5 is a secreted parasite ligand that is essential for erythrocyte invasion through direct interaction with the host erythrocyte receptor basigin. RH5 forms a tripartite complex with two other secreted parasite proteins, CyRPA and RIPR, and is tethered to the surface of the parasite through membrane-anchored P113. Antibodies against RH5, CyRPA, and RIPR can inhibit parasite invasion, suggesting that vaccines containing these three components have the potential to prevent blood-stage malaria. To further explore the role of the P113-RH5 interaction, we selected monoclonal antibodies against P113 that were either inhibitory or noninhibitory for RH5 binding. Using a Fab fragment as a crystallization chaperone, we determined the crystal structure of the RH5 binding region of P113 and showed that it is composed of two domains with structural similarities to rhamnose-binding lectins. We identified the RH5 binding site on P113 by using a combination of hydrogen-deuterium exchange mass spectrometry and site-directed mutagenesis. We found that a monoclonal antibody to P113 that bound to this interface and inhibited the RH5-P113 interaction did not inhibit parasite blood-stage growth. These findings provide further structural information on the protein interactions of RH5 and will be helpful in guiding the development of blood-stage malaria vaccines that target RH5.

    <b>IMPORTANCE</b> Malaria is a deadly infectious disease primarily caused by the parasite <i>Plasmodium falciparum</i>. It remains a major global health problem, and there is no highly effective vaccine. A parasite protein called RH5 is centrally involved in the invasion of host red blood cells, making it, and the other parasite proteins it interacts with, promising vaccine targets. We recently identified a protein called P113 that binds RH5, suggesting that it anchors RH5 to the parasite surface. In this paper, we use structural biology to locate and characterize the RH5 binding region on P113. These findings will be important to guide the development of new antimalarial vaccines to ultimately prevent this disease, which affects some of the poorest people on the planet.

    mBio 2020;11;5

  • Reconstitution of a functional human thymus by postnatal stromal progenitor cells and natural whole-organ scaffolds.

    Campinoti S, Gjinovci A, Ragazzini R, Zanieri L, Ariza-McNaughton L, Catucci M, Boeing S, Park JE, Hutchinson JC, Muñoz-Ruiz M, Manti PG, Vozza G, Villa CE, Phylactopoulos DE, Maurer C, Testa G, Stauss HJ, Teichmann SA, Sebire NJ, Hayday AC, Bonnet D and Bonfanti P

    Epithelial Stem Cell Biology & Regenerative Medicine laboratory, The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK.

    The thymus is a primary lymphoid organ, essential for T cell maturation and selection. There has been long-standing interest in processes underpinning thymus generation and the potential to manipulate it clinically, because alterations of thymus development or function can result in severe immunodeficiency and autoimmunity. Here, we identify epithelial-mesenchymal hybrid cells, capable of long-term expansion in vitro, and able to reconstitute an anatomic phenocopy of the native thymus, when combined with thymic interstitial cells and a natural decellularised extracellular matrix (ECM) obtained by whole thymus perfusion. This anatomical human thymus reconstruction is functional, as judged by its capacity to support mature T cell development in vivo after transplantation into humanised immunodeficient mice. These findings establish a basis for dissecting the cellular and molecular crosstalk between stroma, ECM and thymocytes, and offer practical prospects for treating congenital and acquired immunological diseases.

    Funded by: Cancer Research UK: FC001045; Department of Health; Medical Research Council: FC001045, MC_PC_17180; Wellcome Trust: 106292/Z/14/Z, FC001045

    Nature communications 2020;11;1;6372

  • Interstitial Cell Remodeling Promotes Aberrant Adipogenesis in Dystrophic Muscles.

    Camps J, Breuls N, Sifrim A, Giarratana N, Corvelyn M, Danti L, Grosemans H, Vanuytven S, Thiry I, Belicchi M, Meregalli M, Platko K, MacDonald ME, Austin RC, Gijsbers R, Cossu G, Torrente Y, Voet T and Sampaolesi M

    Laboratory of Translational Cardiomyology, Department of Development and Regeneration, Stem Cell Research Institute, KU Leuven, 3000 Leuven, Belgium; Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany.

    Fibrosis and fat replacement in skeletal muscle are major complications that lead to a loss of mobility in chronic muscle disorders, such as muscular dystrophy. However, the in vivo properties of adipogenic stem and precursor cells remain unclear, mainly due to the high cell heterogeneity in skeletal muscles. Here, we use single-cell RNA sequencing to decomplexify interstitial cell populations in healthy and dystrophic skeletal muscles. We identify an interstitial CD142-positive cell population in mice and humans that is responsible for the inhibition of adipogenesis through GDF10 secretion. Furthermore, we show that the interstitial cell composition is completely altered in muscular dystrophy, with a near absence of CD142-positive cells. The identification of these adipo-regulatory cells in the skeletal muscle aids our understanding of the aberrant fat deposition in muscular dystrophy, paving the way for treatments that could counteract degeneration in patients with muscular dystrophy.

    Cell reports 2020;31;5;107597

  • From GWAS to Function: Using Functional Genomics to Identify the Mechanisms Underlying Complex Diseases.

    Cano-Gamez E and Trynka G

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom.

    Genome-wide association studies (GWAS) have successfully mapped thousands of loci associated with complex traits. These associations could reveal the molecular mechanisms altered in common complex diseases and result in the identification of novel drug targets. However, GWAS have also left a number of outstanding questions. In particular, the majority of disease-associated loci lie in non-coding regions of the genome and, even though they are thought to play a role in gene expression regulation, it is unclear which genes they regulate and in which cell types or physiological contexts this regulation occurs. This has hindered the translation of GWAS findings into clinical interventions. In this review we summarize how these challenges have been addressed over the last decade, with a particular focus on the integration of GWAS results with functional genomics datasets. Firstly, we investigate how the tissues and cell types involved in diseases can be identified using methods that test for enrichment of GWAS variants in genomic annotations. Secondly, we explore how to find the genes regulated by GWAS loci using methods that test for colocalization of GWAS signals with molecular phenotypes such as quantitative trait loci (QTLs). Finally, we highlight potential future research avenues such as integrating GWAS results with single-cell sequencing read-outs, designing functionally informed polygenic risk scores (PRS), and validating disease associated genes using genetic engineering. These tools will be crucial to identify new drug targets for common complex diseases.

    Frontiers in genetics 2020;11;424

  • Single-cell transcriptomics identifies an effectorness gradient shaping the response of CD4+ T cells to cytokines.

    Cano-Gamez E, Soskic B, Roumeliotis TI, So E, Smyth DJ, Baldrighi M, Willé D, Nakic N, Esparza-Gordillo J, Larminie CGC, Bronson PG, Tough DF, Rowan WC, Choudhary JS and Trynka G

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.

    Naïve CD4<sup>+</sup> T cells coordinate the immune response by acquiring an effector phenotype in response to cytokines. However, the cytokine responses in memory T cells remain largely understudied. Here we use quantitative proteomics, bulk RNA-seq, and single-cell RNA-seq of over 40,000 human naïve and memory CD4<sup>+</sup> T cells to show that responses to cytokines differ substantially between these cell types. Memory T cells are unable to differentiate into the Th2 phenotype, and acquire a Th17-like phenotype in response to iTreg polarization. Single-cell analyses show that T cells constitute a transcriptional continuum that progresses from naïve to central and effector memory T cells, forming an effectorness gradient accompanied by an increase in the expression of chemokines and cytokines. Finally, we show that T cell activation and cytokine responses are influenced by the effectorness gradient. Our results illustrate the heterogeneity of T cell responses, furthering our understanding of inflammation.

    Funded by: Cancer Research UK: C309/A25144; Wellcome Trust: WT206194

    Nature communications 2020;11;1;1801

  • Analysis of endothelial-to-haematopoietic transition at the single cell level identifies cell cycle regulation as a driver of differentiation.

    Canu G, Athanasiadis E, Grandy RA, Garcia-Bernardo J, Strzelecka PM, Vallier L, Ortmann D and Cvejic A

    Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, Cambridge, UK.

    Background: Haematopoietic stem cells (HSCs) first arise during development in the aorta-gonad-mesonephros (AGM) region of the embryo from a population of haemogenic endothelial cells which undergo endothelial-to-haematopoietic transition (EHT). Despite the progress achieved in recent years, the molecular mechanisms driving EHT are still poorly understood, especially in human where the AGM region is not easily accessible.

    Results: In this study, we take advantage of a human pluripotent stem cell (hPSC) differentiation system and single-cell transcriptomics to recapitulate EHT in vitro and uncover mechanisms by which the haemogenic endothelium generates early haematopoietic cells. We show that most of the endothelial cells reside in a quiescent state and progress to the haematopoietic fate within a defined time window, within which they need to re-enter into the cell cycle. If cell cycle is blocked, haemogenic endothelial cells lose their EHT potential and adopt a non-haemogenic identity. Furthermore, we demonstrate that CDK4/6 and CDK1 play a key role not only in the transition but also in allowing haematopoietic progenitors to establish their full differentiation potential.

    Conclusion: We propose a direct link between the molecular machineries that control cell cycle progression and EHT.

    Funded by: British Heart Foundation (GB): PhD Studentship; Cancer Research UK: C45041/A14953; European Research Council: 677501; Wellcome Trust and MRC: Core support grant to Cambridge Stem Cell Institute

    Genome biology 2020;21;1;157

  • The concerted action of two B3-like prophage genes exclude superinfecting bacteriophages by blocking DNA entry into Pseudomonas aeruginosa.

    Carballo-Ontiveros MA, Cazares A, Vinuesa P, Kameyama L and Guarneros G

    Departamento de Genética y Biología Molecular, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV-IPN), Mexico City, Mexico.

    In this study, we describe seven vegetative phage genomes homologous to the historic phage B3 that infect <i>Pseudomonas aeruginosa</i>. Like other phage groups, the B3-like group contains conserved (core) and variable (accessory) Open Reading Frames (ORFs) grouped at fixed regions in their genomes; however, in either case, many ORFs remain without assigned functions. We constructed lysogens of the seven B3-like phages in strain Ps33 of <i>P. aeruginosa</i>, a novel clinical isolate, and assayed the exclusion phenotype against a variety of temperate and virulent superinfecting phages. In addition to the classic exclusion conferred by the phage immunity repressor, the phenotype observed in B3-like lysogens suggested the presence of other exclusion genes. We set out to identify the genes responsible for this exclusion phenotype. Phage Ps56 was chosen as the study subject since it excluded numerous temperate and virulent phages. Restriction of Ps56 genome, cloning of several fragments, and resection of the fragments that retained the exclusion phenotype allowed us to identify two core ORFs, so far without any assigned function, as responsible for a type of exclusion. Neither gene expressed separately from plasmids showed activity, but the concurrent expression of both ORFs is needed for exclusion. Our data suggest that phage adsorption occurs, but phage genome translocation to the host's cytoplasm is defective. To our knowledge, this is the first report on this type of exclusion mediated by a prophage in <i>P. aeruginosa</i>. <b>IMPORTANCE</b> <i>Pseudomonas aeruginosa</i> is a Gram-negative bacterium, frequently isolated from infected immunocompromised patients, and the strains are resistant to a broad spectrum of antibiotics. Recently, the use of phages has been proposed as an alternative therapy against multidrug-resistant bacteria. However, this approach may present various hurdles. This work addresses the problem that pathogenic bacteria may be lysogenized by phages carrying genes encoding resistance against secondary infections, such as those used in phage therapy. Discovering phage genes that exclude superinfecting phages not only assigns novel functions to orphan genes in databases but also provides insight into the selection of proper phages for use in phage therapy.

    Journal of virology 2020

  • Movement of transposable elements contributes to cichlid diversity.

    Carleton KL, Conte MA, Malinsky M, Nandamuri SP, Sandkam BA, Meier JI, Mwaiko S, Seehausen O and Kocher TD

    Department of Biology, University of Maryland, College Park, MD, 20742, USA.

    African cichlid fishes are a prime model for studying speciation mechanisms. Despite the development of extensive genomic resources, it has been difficult to determine which sources of genetic variation are responsible for cichlid phenotypic variation. One of their most variable phenotypes is visual sensitivity, with some of the largest spectral shifts among vertebrates. These shifts arise primarily from differential expression of seven cone opsin genes. By mapping expression quantitative trait loci (eQTL) in intergeneric crosses of Lake Malawi cichlids, we previously identified four causative genetic variants that correspond to indels in the promoters of either key transcription factors or an opsin gene. In this comprehensive study, we show that these indels are the result of the movement of transposable elements (TEs) that correlate with opsin expression variation across the Malawi flock. In tracking the evolutionary history of these particular indels, we found they are endemic to Lake Malawi, suggesting that these TEs are recently active and are segregating within the Malawi cichlid lineage. However, an independent indel has arisen at a similar genomic location in one locus outside of the Malawi flock. The convergence in TE movement suggests these loci are primed for TE insertion and subsequent deletions. Increased TE mobility may be associated with interspecific hybridization, which disrupts mechanisms of TE suppression. This might provide a link between cichlid hybridization and accelerated regulatory variation. Overall, our study suggests that TEs may be an important driver of key regulatory changes, facilitating rapid phenotypic change and possibly speciation in African cichlids.

    Molecular ecology 2020

  • Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis.

    Carlevaro-Fita J, Lanzós A, Feuerbach L, Hong C, Mas-Ponte D, Pedersen JS, PCAWG Drivers and Functional Interpretation Group, Johnson R and PCAWG Consortium

    Department of Medical Oncology, Inselspital, University Hospital and University of Bern, 3010, Bern, Switzerland.

    Long non-coding RNAs (lncRNAs) are a growing focus of cancer genomics studies, creating the need for a resource of lncRNAs with validated cancer roles. Furthermore, it remains debated whether mutated lncRNAs can drive tumorigenesis, and whether such functions could be conserved during evolution. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we introduce the Cancer LncRNA Census (CLC), a compilation of 122 GENCODE lncRNAs with causal roles in cancer phenotypes. In contrast to existing databases, CLC requires strong functional or genetic evidence. CLC genes are enriched amongst driver genes predicted from somatic mutations, and display characteristic genomic features. Strikingly, CLC genes are enriched for driver mutations from unbiased, genome-wide transposon-mutagenesis screens in mice. We identified 10 tumour-causing mutations in orthologues of 8 lncRNAs, including LINC-PINT and NEAT1, but not MALAT1. Thus CLC represents a dataset of high-confidence cancer lncRNAs. Mutagenesis maps are a novel means for identifying deeply-conserved roles of lncRNAs in tumorigenesis.

    Communications biology 2020;3;1;56

  • Defining multiplicity of vector uptake in transfected Plasmodium parasites.

    Carrasquilla M, Adjalley S, Sanderson T, Marin-Menendez A, Coyle R, Montandon R, Rayner JC, Pance A and Lee MCS

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    The recurrent emergence of drug resistance in Plasmodium falciparum increases the urgency to genetically validate drug resistance mechanisms and identify new targets. Reverse genetics have facilitated genome-scale knockout screens in Plasmodium berghei and Toxoplasma gondii, in which pooled transfections of multiple vectors were critical to increasing scale and throughput. These approaches have not yet been implemented in human malaria species such as P. falciparum and P. knowlesi, in part because the extent to which pooled transfections can be performed in these species remains to be evaluated. Here we use next-generation sequencing to quantitate uptake of a pool of 94 barcoded vectors. The distribution of vector acquisition allowed us to estimate the number of barcodes and DNA molecules taken up by the parasite population. Dilution cloning of P. falciparum transfectants showed that individual clones possess as many as seven episomal barcodes, revealing that an intake of multiple vectors is a frequent event despite the low transfection efficiency. Transfection of three spectrally-distinct fluorescent reporters allowed us to evaluate different transfection methods and revealed that schizont-stage transfection limited the tendency for parasites to take up multiple vectors. In contrast to P. falciparum, we observed that the higher transfection efficiency of P. knowlesi resulted in near complete representation of the library. These findings have important implications for how reverse genetics can be scaled in culturable Plasmodium species.

    Funded by: Wellcome Trust; Wellcome Trust (Wellcome): 206194

    Scientific reports 2020;10;1;10894

  • GM-CSF Calibrates Macrophage Defense and Wound Healing Programs during Intestinal Infection and Inflammation.

    Castro-Dopico T, Fleming A, Dennison TW, Ferdinand JR, Harcourt K, Stewart BJ, Cader Z, Tuong ZK, Jing C, Lok LSC, Mathews RJ, Portet A, Kaser A, Clare S and Clatworthy MR

    Molecular Immunity Unit, University of Cambridge Department of Medicine, MRC Laboratory of Molecular Biology, Cambridge, UK.

    Macrophages play a central role in intestinal immunity, but inappropriate macrophage activation is associated with inflammatory bowel disease (IBD). Here, we identify granulocyte-macrophage colony stimulating factor (GM-CSF) as a critical regulator of intestinal macrophage activation in patients with IBD and mice with dextran sodium sulfate (DSS)-induced colitis. We find that GM-CSF drives the maturation and polarization of inflammatory intestinal macrophages, promoting anti-microbial functions while suppressing wound-healing transcriptional programs. Group 3 innate lymphoid cells (ILC3s) are a major source of GM-CSF in intestinal inflammation, with a strong positive correlation observed between ILC or CSF2 transcripts and M1 macrophage signatures in IBD mucosal biopsies. Furthermore, GM-CSF-dependent macrophage polarization results in a positive feedback loop that augmented ILC3 activation and type 17 immunity. Together, our data reveal an important role for GM-CSF-mediated ILC-macrophage crosstalk in calibrating intestinal macrophage phenotype to enhance anti-bacterial responses, while inhibiting pro-repair functions associated with fibrosis and stricturing, with important clinical implications.

    Cell reports 2020;32;1;107857

  • The Ncoa7 locus regulates V-ATPase formation and function, neurodevelopment and behaviour.

    Castroflorio E, den Hoed J, Svistunova D, Finelli MJ, Cebrian-Serrano A, Corrochano S, Bassett AR, Davies B and Oliver PL

    MRC Harwell Institute, Harwell Campus, Oxfordshire, OX11 0RD, UK.

    Members of the Tre2/Bub2/Cdc16 (TBC), lysin motif (LysM), domain catalytic (TLDc) protein family are associated with multiple neurodevelopmental disorders, although their exact roles in disease remain unclear. For example, nuclear receptor coactivator 7 (NCOA7) has been associated with autism, although almost nothing is known regarding the mode-of-action of this TLDc protein in the nervous system. Here we investigated the molecular function of NCOA7 in neurons and generated a novel mouse model to determine the consequences of deleting this locus in vivo. We show that NCOA7 interacts with the cytoplasmic domain of the vacuolar (V)-ATPase in the brain and demonstrate that this protein is required for normal assembly and activity of this critical proton pump. Neurons lacking Ncoa7 exhibit altered development alongside defective lysosomal formation and function; accordingly, Ncoa7 deletion animals exhibited abnormal neuronal patterning defects and a reduced expression of lysosomal markers. Furthermore, behavioural assessment revealed anxiety and social defects in mice lacking Ncoa7. In summary, we demonstrate that NCOA7 is an important V-ATPase regulatory protein in the brain, modulating lysosomal function, neuronal connectivity and behaviour; thus our study reveals a molecular mechanism controlling endolysosomal homeostasis that is essential for neurodevelopment.

    Funded by: FP7 Ideas: European Research Council: PAROSIN 311384; Medical Research Council (UK): MR/P502005/1; Wellcome Trust: 203141/Z/16/Z

    Cellular and molecular life sciences : CMLS 2020

  • Multihost Transmission of Schistosoma mansoni in Senegal, 2015-2018.

    Catalano S, Léger E, Fall CB, Borlase A, Diop SD, Berger D, Webster BL, Faye B, Diouf ND, Rollinson D, Sène M, Bâ K and Webster JP

    In West Africa, Schistosoma spp. are capable of infecting multiple definitive hosts, a lifecycle feature that may complicate schistosomiasis control. We characterized the evolutionary relationships among multiple Schistosoma mansoni isolates collected from snails (intermediate hosts), humans (definitive hosts), and rodents (definitive hosts) in Senegal. On a local scale, diagnosis of S. mansoni infection ranged 3.8%-44.8% in school-aged children, 1.7%-52.6% in Mastomys huberti mice, and 1.8%-7.1% in Biomphalaria pfeifferi snails. Our phylogenetic framework confirmed the presence of multiple S. mansoni lineages that could infect both humans and rodents; divergence times of these lineages varied (0.13-0.02 million years ago). We propose that extensive movement of persons across West Africa might have contributed to the establishment of these various multihost S. mansoni clades. High S. mansoni prevalence in rodents at transmission sites frequented by humans further highlights the implications that alternative hosts could have on future public health interventions.

    Emerging infectious diseases 2020;26;6;1234-1242

  • Eco-Evolutionary Effects of Bacterial Cooperation on Phage Therapy: An Unknown Risk?

    Cazares A, García-Contreras R and Pérez-Velázquez J

    EMBL's European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom.

    If there is something we have learned from the antibiotic era, it is that indiscriminate use of a therapeutic agent without a clear understanding of its long-term evolutionary impact can have enormous health repercussions. This knowledge is particularly relevant when the therapeutic agents are remarkably adaptable and diverse biological entities capable of a plethora of interactions, most of which remain largely unexplored. Although phage therapy (PT) undoubtedly holds the potential to save lives, its current efficacy in case studies recalls the golden era of antibiotics, when these compounds were highly effective and the possibility of them becoming ineffective seemed remote. Safe PT schemes depend on our understanding of how phages interact with, and evolve in, highly complex environments. Here, we summarize and review emerging evidence in a commonly overlooked theme in PT: bacteria-phage interactions. In particular, we discuss the influence of quorum sensing (QS) on phage susceptibility, the consequent role of phages in modulating bacterial cooperation, and the potential implications of this relationship in PT, including how we can use this knowledge to inform PT strategies. We highlight that the influence of QS on phage susceptibility seems to be widespread but can have contrasting outcomes depending on the bacterial host, underscoring the need to thoroughly characterize this link in various bacterial models. Furthermore, we encourage researchers to exploit competition experiments, experimental evolution, and mathematical modeling to explore this relationship further in relevant infection models. Finally, we emphasize that long-term PT success requires research on phage ecology and evolution to inform the design of optimal therapeutic schemes.

    Frontiers in microbiology 2020;11;590294

  • Heterotypic cell-cell communication regulates glandular stem cell multipotency.

    Centonze A, Lin S, Tika E, Sifrim A, Fioramonti M, Malfait M, Song Y, Wuidart A, Van Herck J, Dannau A, Bouvencourt G, Dubois C, Dedoncker N, Sahay A, de Maertelaer V, Siebel CW, Van Keymeulen A, Voet T and Blanpain C

    Laboratory of Stem Cells and Cancer, Université Libre de Bruxelles (ULB), Brussels, Belgium.

    Glandular epithelia, including the mammary and prostate glands, are composed of basal cells (BCs) and luminal cells (LCs)<sup>1,2</sup>. Many glandular epithelia develop from multipotent basal stem cells (BSCs) that are replaced in adult life by distinct pools of unipotent stem cells<sup>1,3-8</sup>. However, adult unipotent BSCs can reactivate multipotency under regenerative conditions and upon oncogene expression<sup>3,9-13</sup>. This suggests that an active mechanism restricts BSC multipotency under normal physiological conditions, although the nature of this mechanism is unknown. Here we show that the ablation of LCs reactivates the multipotency of BSCs from multiple epithelia both in vivo in mice and in vitro in organoids. Bulk and single-cell RNA sequencing revealed that, after LC ablation, BSCs activate a hybrid basal and luminal cell differentiation program before giving rise to LCs, reminiscent of the genetic program that regulates multipotency during embryonic development<sup>7</sup>. By predicting ligand-receptor pairs from single-cell data<sup>14</sup>, we find that TNF, which is secreted by LCs, restricts BC multipotency under normal physiological conditions. By contrast, the Notch, Wnt and EGFR pathways were activated in BSCs and their progeny after LC ablation; blocking these pathways, or stimulating the TNF pathway, inhibited regeneration-induced BC multipotency. Our study demonstrates that heterotypic communication between LCs and BCs is essential to maintain lineage fidelity in glandular epithelial stem cells.

    Nature 2020;584;7822;608-613

  • Bacterial survival: evolve and adapt or perish.

    Chaguza C

    Genomics of Pneumonia and Meningitis, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    Nature reviews. Microbiology 2020;18;1;5

  • Early Signals of Vaccine-driven Perturbation Seen in Pneumococcal Carriage Population Genomic Data.

    Chaguza C, Heinsbroek E, Gladstone RA, Tafatatha T, Alaerts M, Peno C, Cornick JE, Musicha P, Bar-Zeev N, Kamng'ona A, Kadioglu A, McGee L, Hanage WP, Breiman RF, Heyderman RS, French N, Everett DB and Bentley SD

    Parasites and Microbes Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge.

    Background: Pneumococcal conjugate vaccines (PCVs) have reduced pneumococcal diseases globally. Pneumococcal genomic surveys elucidate PCV effects on population structure but are rarely conducted in low-income settings despite the high disease burden.

    Methods: We undertook whole-genome sequencing (WGS) of 660 pneumococcal isolates collected through surveys from healthy carriers 2 years from 13-valent PCV (PCV13) introduction and 1 year after rollout in northern Malawi. We investigated changes in population structure, within-lineage serotype dynamics, serotype diversity, and frequency of antibiotic resistance (ABR) and accessory genes.

    Results: In children <5 years of age, frequency and diversity of vaccine serotypes (VTs) decreased significantly post-PCV, but no significant changes occurred in persons ≥5 years of age. Clearance of VT serotypes was consistent across different genetic backgrounds (lineages). There was an increase of nonvaccine serotypes (NVTs), namely 7C, 15B/C, and 23A, in children <5 years of age, but 28F increased in both age groups. While carriage rates have been recently shown to remain stable post-PCV due to replacement serotypes, there was no change in diversity of NVTs. Additionally, frequency of intermediate-penicillin-resistant lineages decreased post-PCV. Although frequency of ABR genes remained stable, other accessory genes, especially those associated with mobile genetic elements and bacteriocins, showed changes in frequency post-PCV.

    Conclusions: We demonstrate evidence of significant population restructuring post-PCV driven by decreasing frequency of vaccine serotypes and increasing frequency of few NVTs mainly in children under 5. Continued surveillance with WGS remains crucial to fully understand dynamics of the residual VTs and replacement NVT serotypes post-PCV.

    Funded by: Medical Research Council: MR/P011284/1, MR/R002592/1; NIAID NIH HHS: R01 AI106786; Wellcome Trust: 084679/Z/08/Z

    Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 2020;70;7;1294-1303

  • Using genomics to improve preparedness and response of future epidemics or pandemics in Africa.

    Chaguza C, Nyaga MM, Mwenda JM, Esona MD and Jere KC

    Parasites and Microbes Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge CB10 1SA, UK.

    The Lancet. Microbe 2020;1;7;e275-e276

  • Within-host microevolution of Streptococcus pneumoniae is rapid and adaptive during natural colonisation.

    Chaguza C, Senghore M, Bojang E, Gladstone RA, Lo SW, Tientcheu PE, Bancroft RE, Worwui A, Foster-Nyarko E, Ceesay F, Okoi C, McGee L, Klugman KP, Breiman RF, Barer MR, Adegbola RA, Antonio M, Bentley SD and Kwambana-Adams BA

    Parasites and Microbes Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK.

    Genomic evolution, transmission and pathogenesis of Streptococcus pneumoniae, an opportunistic human-adapted pathogen, is driven principally by nasopharyngeal carriage. However, little is known about genomic changes during natural colonisation. Here, we use whole-genome sequencing to investigate within-host microevolution of naturally carried pneumococci in ninety-eight infants intensively sampled sequentially from birth until twelve months in a high-carriage African setting. We show that neutral evolution and nucleotide substitution rates up to forty-fold faster than observed over longer timescales in S. pneumoniae and other bacteria drives high within-host pneumococcal genetic diversity. Highly divergent co-existing strain variants emerge during colonisation episodes through real-time intra-host homologous recombination while the rest are co-transmitted or acquired independently during multiple colonisation episodes. Genic and intergenic parallel evolution occur particularly in antibiotic resistance, immune evasion and epithelial adhesion genes. Our findings suggest that within-host microevolution is rapid and adaptive during natural colonisation.

    Funded by: Bill and Melinda Gates Foundation (Bill & Melinda Gates Foundation): OPP1034556

    Nature communications 2020;11;1;3442

  • Bacterial genome-wide association study of hyper-virulent pneumococcal serotype 1 identifies genetic variation associated with neurotropism.

    Chaguza C, Yang M, Cornick JE, du Plessis M, Gladstone RA, Kwambana-Adams BA, Lo SW, Ebruke C, Tonkin-Hill G, Peno C, Senghore M, Obaro SK, Ousmane S, Pluschke G, Collard JM, Sigaùque B, French N, Klugman KP, Heyderman RS, McGee L, Antonio M, Breiman RF, von Gottberg A, Everett DB, Kadioglu A and Bentley SD

    Parasites and Microbes Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK.

    Hyper-virulent Streptococcus pneumoniae serotype 1 strains are endemic in Sub-Saharan Africa and frequently cause lethal meningitis outbreaks. It remains unknown whether genetic variation in serotype 1 strains modulates tropism into cerebrospinal fluid to cause central nervous system (CNS) infections, particularly meningitis. Here, we address this question through a large-scale linear mixed model genome-wide association study of 909 African pneumococcal serotype 1 isolates collected from CNS and non-CNS human samples. By controlling for host age, geography, and strain population structure, we identify genome-wide statistically significant genotype-phenotype associations in surface-exposed choline-binding (P = 5.00 × 10<sup>-08</sup>) and helicase proteins (P = 1.32 × 10<sup>-06</sup>) important for invasion, immune evasion and pneumococcal tropism to CNS. The small effect sizes and negligible heritability indicated that causation of CNS infection requires multiple genetic and other factors reflecting a complex and polygenic aetiology. Our findings suggest that certain pathogen genetic variation modulates pneumococcal survival and tropism to CNS tissue, and therefore, virulence for meningitis.

    Funded by: Bill and Melinda Gates Foundation (Bill & Melinda Gates Foundation): OPP1023440, OPP1034556

    Communications biology 2020;3;1;559

  • Genome-wide CRISPR screens of oral squamous cell carcinoma reveal fitness genes in the Hippo pathway.

    Chai AWY, Yee PS, Price S, Yee SM, Lee HM, Tiong VK, Gonçalves E, Behan FM, Bateson J, Gilbert J, Tan AC, McDermott U, Garnett MJ and Cheong SC

    Head and Neck Cancer Research Team, Cancer Research Malaysia, Subang Jaya, Malaysia.

    New therapeutic targets for oral squamous cell carcinoma (OSCC) are urgently needed. We conducted genome-wide CRISPR-Cas9 screens in 21 OSCC cell lines, primarily derived from Asians, to identify genetic vulnerabilities that can be explored as therapeutic targets. We identify known and novel fitness genes and demonstrate that many previously identified OSCC-related cancer genes are non-essential and could have limited therapeutic value, while other fitness genes warrant further investigation for their potential as therapeutic targets. We validate a distinctive dependency on YAP1 and WWTR1 of the Hippo pathway, where the loss-of-fitness effect of one paralog can be compensated only in a subset of lines. We also discover that OSCCs with WWTR1 dependency signature are significantly associated with biomarkers of favourable response towards immunotherapy. In summary, we have delineated the genetic vulnerabilities of OSCC, enabling the prioritization of therapeutic targets for further exploration, including the targeting of YAP1 and WWTR1.

    Funded by: Cancer Research Malaysia; Medical Research Council: MR/P013457/1; Wellcome Trust: 206194

    eLife 2020;9

  • BlobToolKit - Interactive Quality Assessment of Genome Assemblies.

    Challis R, Richards E, Rajan J, Cochrane G and Blaxter M

    University of Edinburgh; Wellcome Trust Sanger Institute

    Reconstruction of target genomes from sequence data produced by instruments that are agnostic as to the species-of-origin may be confounded by contaminant DNA. Whether introduced during sample processing or through co-extraction alongside the target DNA, if insufficient care is taken during the assembly process, the final assembled genome may be a mixture of data from several species. Such assemblies can confound sequence-based biological inference and, when deposited in public databases, may be included in downstream analyses by users unaware of underlying problems. We present BlobToolKit, a software suite to aid researchers in identifying and isolating non-target data in draft and publicly available genome assemblies. BlobToolKit can be used to process assembly, read and analysis files for fully reproducible interactive exploration in the browser-based Viewer. BlobToolKit can be used during assembly to filter non-target DNA, helping researchers produce assemblies with high biological credibility. We have been running an automated BlobToolKit pipeline on eukaryotic assemblies publicly available in the International Nucleotide Sequence Data Collaboration and are making the results available through a public instance of the Viewer at . We aim to complete analysis of all publicly available genomes and then maintain currency with the flow of new genomes. We have worked to embed these views into the presentation of genome assemblies at the European Nucleotide Archive, providing an indication of assembly quality alongside the public record with links out to allow full exploration in the Viewer.

    G3 (Bethesda, Md.) 2020

  • Transcription phenotypes of pancreatic cancer are driven by genomic events during tumor evolution.

    Chan-Seng-Yue M, Kim JC, Wilson GW, Ng K, Figueroa EF, O'Kane GM, Connor AA, Denroche RE, Grant RC, McLeod J, Wilson JM, Jang GH, Zhang A, Dodd A, Liang SB, Borgida A, Chadwick D, Kalimuthu S, Lungu I, Bartlett JMS, Krzyzanowski PM, Sandhu V, Tiriac H, Froeling FEM, Karasinska JM, Topham JT, Renouf DJ, Schaeffer DF, Jones SJM, Marra MA, Laskin J, Chetty R, Stein LD, Zogopoulos G, Haibe-Kains B, Campbell PJ, Tuveson DA, Knox JJ, Fischer SE, Gallinger S and Notta F

    Princess Margaret Cancer Centre, Toronto, Ontario, Canada.

    Pancreatic adenocarcinoma presents as a spectrum of a highly aggressive disease in patients. The basis of this disease heterogeneity has proved difficult to resolve due to poor tumor cellularity and extensive genomic instability. To address this, a dataset of whole genomes and transcriptomes was generated from purified epithelium of primary and metastatic tumors. Transcriptome analysis demonstrated that molecular subtypes are a product of a gene expression continuum driven by a mixture of intratumoral subpopulations, which was confirmed by single-cell analysis. Integrated whole-genome analysis uncovered that molecular subtypes are linked to specific copy number aberrations in genes such as mutant KRAS and GATA6. By mapping tumor genetic histories, tetraploidization emerged as a key mutational process behind these events. Taken together, these data support the premise that the constellation of genomic aberrations in the tumor gives rise to the molecular subtype, and that disease heterogeneity is due to ongoing genomic instability during progression.

    Nature genetics 2020;52;2;231-240

  • Refining the transcriptome of the human malaria parasite Plasmodium falciparum using amplification-free RNA-seq.

    Chappell L, Ross P, Orchard L, Russell TJ, Otto TD, Berriman M, Rayner JC and Llinás M

    Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, CB10 1SA, UK.

    Background: Plasmodium parasites undergo several major developmental transitions during their complex lifecycle, which are enabled by precisely ordered gene expression programs. Transcriptomes from the 48-h blood stages of the major human malaria parasite Plasmodium falciparum have been described using cDNA microarrays and RNA-seq, but these assays have not always performed well within non-coding regions, where the AT-content is often 90-95%.

    Results: We developed a directional, amplification-free RNA-seq protocol (DAFT-seq) to reduce bias against AT-rich cDNA, which we have applied to three strains of P. falciparum (3D7, HB3 and IT). While strain-specific differences were detected, overall there is strong conservation between the transcriptional profiles. For the 3D7 reference strain, transcription was detected from 89% of the genome, with over 78% of the genome transcribed into mRNAs. We also find that transcription from bidirectional promoters frequently results in non-coding, antisense transcripts. These datasets allowed us to refine the 5' and 3' untranslated regions (UTRs), which can be variable, long (> 1000 nt), and often overlap those of adjacent transcripts.

    Conclusions: The approaches applied in this study allow a refined description of the transcriptional landscape of P. falciparum and demonstrate that very little of the densely packed P. falciparum genome is inactive or redundant. By capturing the 5' and 3' ends of mRNAs, we reveal both constant and dynamic use of transcriptional start sites across the intraerythrocytic developmental cycle that will be useful in guiding the definition of regulatory regions for use in future experimental gene expression studies.

    Funded by: Burroughs Wellcome Fund: 1007041.02; NIGMS NIH HHS: P50 GM071508; NIH HHS: 1DP2OD001315; Wellcome Trust: 206194

    BMC genomics 2020;21;1;395

  • High-Throughput Quantitative RT-PCR in Single and Bulk C. elegans Samples Using Nanofluidic Technology.

    Chauve L, Le Pen J, Hodge F, Todtenhaupt P, Biggins L, Miska EA, Andrews S and Casanueva O

    Babraham Institute.

    This paper presents a high-throughput reverse transcription quantitative PCR (RT-qPCR) assay for Caenorhabditis elegans that is fast, robust, and highly sensitive. This protocol obtains precise measurements of gene expression from single worms or from bulk samples. The protocol presented here provides a novel adaptation of existing methods for complementary DNA (cDNA) preparation coupled to a nanofluidic RT-qPCR platform. The first part of this protocol, named 'Worm-to-CT', allows cDNA production directly from nematodes without the need for prior mRNA isolation. It increases experimental throughput by allowing the preparation of cDNA from 96 worms in 3.5 h. The second part of the protocol uses existing nanofluidic technology to run high-throughput RT-qPCR on the cDNA. This paper evaluates two different nanofluidic chips: the first runs 96 samples and 96 targets, resulting in 9,216 reactions in approximately 1.5 days of benchwork. The second chip type consists of six 12 x 12 arrays, resulting in 864 reactions. Here, the Worm-to-CT method is demonstrated by quantifying mRNA levels of genes encoding heat shock proteins from single worms and from bulk samples. Provided is an extensive list of primers designed to amplify processed RNA for the majority of coding genes within the C. elegans genome.

    Journal of visualized experiments : JoVE 2020;159
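
    The reaction counts quoted in the abstract above follow from simple multiplication. The sketch below reproduces them; it assumes that each 12 x 12 array pairs 12 samples with 12 targets, which the abstract does not state explicitly, and the function name is hypothetical.

```python
# Reaction counts for the two nanofluidic chip layouts described in the abstract.
# Assumption: a "12 x 12 array" means 12 samples combined with 12 targets.

def reactions_per_chip(samples: int, targets: int, arrays: int = 1) -> int:
    """Every sample is combined with every target, once per array."""
    return samples * targets * arrays

full_chip = reactions_per_chip(samples=96, targets=96)              # 96 x 96 layout
array_chip = reactions_per_chip(samples=12, targets=12, arrays=6)   # six 12 x 12 arrays

print(full_chip)   # 9216
print(array_chip)  # 864
```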

  • Paired rRNA-depleted and polyA-selected RNA sequencing data and supporting multi-omics data from human T cells.

    Chen L, Yang R, Kwan T, Tang C, Watt S, Zhang Y, Bourque G, Ge B, Downes K, Frontini M, Ouwehand WH, Lin JW, Soranzo N, Pastinen T and Chen L

    Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE, Department of Laboratory Medicine, State Key Laboratory of Biotherapy, West China Second Hospital, Sichuan University, Chengdu, Sichuan, 610041, China.

    Both poly(A) enrichment and ribosomal RNA depletion are commonly used for RNA sequencing. Each has its advantages and disadvantages that may lead to biases in the downstream analyses. To better assess these effects, we carried out both ribosomal RNA-depleted and poly(A)-selected RNA-seq for CD4<sup>+</sup> T naive cells isolated from 40 healthy individuals from the Blueprint Project. For these 40 individuals, the genomic and epigenetic data were also available. This dataset offers a unique opportunity to understand how library construction influences differential gene expression, alternative splicing and molecular QTL (quantitative trait loci) analyses for human primary cells.

    Funded by: EC | EC Seventh Framework Programme | FP7 Health (FP7-HEALTH - Specific Programme "Cooperation": Health): HEALTH-F5-2011-282510; Gouvernement du Canada | Canadian Institutes of Health Research (Instituts de Recherche en Santé du Canada): EP1-120608, EP2-120609

    Scientific data 2020;7;1;376

  • Trans-ethnic and Ancestry-Specific Blood-Cell Genetics in 746,667 Individuals from 5 Global Populations.

    Chen MH, Raffield LM, Mousas A, Sakaue S, Huffman JE, Moscati A, Trivedi B, Jiang T, Akbari P, Vuckovic D, Bao EL, Zhong X, Manansala R, Laplante V, Chen M, Lo KS, Qian H, Lareau CA, Beaudoin M, Hunt KA, Akiyama M, Bartz TM, Ben-Shlomo Y, Beswick A, Bork-Jensen J, Bottinger EP, Brody JA, van Rooij FJA, Chitrala K, Cho K, Choquet H, Correa A, Danesh J, Di Angelantonio E, Dimou N, Ding J, Elliott P, Esko T, Evans MK, Floyd JS, Broer L, Grarup N, Guo MH, Greinacher A, Haessler J, Hansen T, Howson JMM, Huang QQ, Huang W, Jorgenson E, Kacprowski T, Kähönen M, Kamatani Y, Kanai M, Karthikeyan S, Koskeridis F, Lange LA, Lehtimäki T, Lerch MM, Linneberg A, Liu Y, Lyytikäinen LP, Manichaikul A, Martin HC, Matsuda K, Mohlke KL, Mononen N, Murakami Y, Nadkarni GN, Nauck M, Nikus K, Ouwehand WH, Pankratz N, Pedersen O, Preuss M, Psaty BM, Raitakari OT, Roberts DJ, Rich SS, Rodriguez BAT, Rosen JD, Rotter JI, Schubert P, Spracklen CN, Surendran P, Tang H, Tardif JC, Trembath RC, Ghanbari M, Völker U, Völzke H, Watkins NA, Zonderman AB, VA Million Veteran Program, Wilson PWF, Li Y, Butterworth AS, Gauchat JF, Chiang CWK, Li B, Loos RJF, Astle WJ, Evangelou E, van Heel DA, Sankaran VG, Okada Y, Soranzo N, Johnson AD, Reiner AP, Auer PL and Lettre G

    The Framingham Heart Study, National Heart, Lung and Blood Institute, Framingham, MA 01702, USA; Population Sciences Branch, Division of Intramural Research, National Heart, Lung and Blood Institute, Framingham, MA 01702, USA.

    Most loci identified by GWASs have been found in populations of European ancestry (EUR). In trans-ethnic meta-analyses for 15 hematological traits in 746,667 participants, including 184,535 non-EUR individuals, we identified 5,552 trait-variant associations at p < 5 × 10<sup>-9</sup>, including 71 novel associations not found in EUR populations. We also identified 28 additional novel variants in ancestry-specific, non-EUR meta-analyses, including an IL7 missense variant in South Asians associated with lymphocyte count in vivo and IL-7 secretion levels in vitro. Fine-mapping prioritized variants annotated as functional and generated 95% credible sets that were 30% smaller when using the trans-ethnic as opposed to the EUR-only results. We explored the clinical significance and predictive value of trans-ethnic variants in multiple populations and compared genetic architecture and the effect of natural selection on these blood phenotypes between populations. Altogether, our results for hematological traits highlight the value of a more global representation of populations in genetic studies.

    Cell 2020;182;5;1198-1213.e14

  • Re-evaluation of human BDCA-2+ DC during acute sterile skin inflammation

    Chen, Yi-Ling, Gomes, Tomas, Hardman, Clare S., Vieira Braga, Felipe A., Gutowska-Owsiak, Danuta, Salimi, Maryam, Gray, Nicki, Duncan, David A., Reynolds, Gary, Johnson, David, Salio, Mariolina, Cerundolo, Vincenzo, Barlow, Jillian L., McKenzie, Andrew N.J., Teichmann, Sarah A., Haniffa, Muzlifah and Ogg, Graham

    Journal of Experimental Medicine 2020;217;3

  • Inflammatory Signals Induce AT2 Cell-Derived Damage-Associated Transient Progenitors that Mediate Alveolar Regeneration.

    Choi J, Park JE, Tsagkogeorga G, Yanagita M, Koo BK, Han N and Lee JH

    Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK.

    Tissue regeneration is a multi-step process mediated by diverse cellular hierarchies and states that are also implicated in tissue dysfunction and pathogenesis. Here we leveraged single-cell RNA sequencing in combination with in vivo lineage tracing and organoid models to finely map the trajectories of alveolar-lineage cells during injury repair and lung regeneration. We identified a distinct AT2-lineage population, damage-associated transient progenitors (DATPs), that arises during alveolar regeneration. We found that interstitial macrophage-derived IL-1β primes a subset of AT2 cells expressing Il1r1 for conversion into DATPs via a HIF1α-mediated glycolysis pathway, which is required for mature AT1 cell differentiation. Importantly, chronic inflammation mediated by IL-1β prevents AT1 differentiation, leading to aberrant accumulation of DATPs and impaired alveolar regeneration. Together, this stepwise mapping to cell fate transitions shows how an inflammatory niche controls alveolar regeneration by controlling stem cell fate and behavior.

    Cell stem cell 2020

  • Investigating Cellular Recognition Using CRISPR/Cas9 Genetic Screening.

    Chong ZS, Wright GJ and Sharma S

    Cell Surface Signalling Laboratory, Wellcome Sanger Institute, Cambridge CB10 1SA, UK.

    Neighbouring cells can recognise and communicate with each other by direct binding between cell surface receptor and ligand pairs. Examples of cellular recognition events include pathogen entry into a host cell, sperm-egg fusion, and self/nonself discrimination by the immune system. Despite growing appreciation of cell surface recognition molecules as potential therapeutic targets, identifying key factors contributing to cellular recognition remains technically challenging to perform on a genome-wide scale. Recently, genome-scale clustered regularly interspaced short palindromic repeats (CRISPR) knockout or activation (CRISPR-KO/CRISPRa) screens have been applied to identify the molecular determinants of cellular recognition. In this review, we discuss how CRISPR-KO/CRISPRa screening has contributed to our understanding of cellular recognition processes, and how it can be applied to investigate these important interactions in a range of biological contexts.

    Trends in cell biology 2020

  • Targeting NLRP3 and staphylococcal pore-forming toxin receptors in human-induced pluripotent stem cell-derived macrophages.

    Chow SH, Deo P, Yeung ATY, Kostoulias XP, Jeon Y, Gao ML, Seidi A, Olivier FAB, Sridhar S, Nethercott C, Cameron D, Robertson AAB, Robert R, Mackay CR, Traven A, Jin ZB, Hale C, Dougan G, Peleg AY and Naderer T

    Infection & Immunity Program, Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Clayton, Victoria, Australia.

    Staphylococcus aureus causes necrotizing pneumonia by secreting toxins such as leukocidins that target front-line immune cells. The mechanism by which leukocidins kill innate immune cells and trigger inflammation during S. aureus lung infection, however, remains unresolved. Here, we explored human-induced pluripotent stem cell-derived macrophages (hiPSC-dMs) to study the interaction of the leukocidins Panton-Valentine leukocidin (PVL) and LukAB with lung macrophages, which are the initial leukocidin targets during S. aureus lung invasion. hiPSC-dMs were susceptible to the leukocidins PVL and LukAB and both leukocidins triggered NLRP3 inflammasome activation resulting in IL-1β secretion. hiPSC-dM cell death after LukAB exposure, however, was only temporarily dependent on NLRP3, although NLRP3 triggered marked cell death after PVL treatment. CRISPR/Cas9-mediated deletion of the PVL receptor, C5aR1, protected hiPSC-dMs from PVL cytotoxicity, despite the expression of other leukocidin receptors, such as CD45. PVL-deficient S. aureus had reduced ability to induce lung IL-1β levels in human C5aR1 knock-in mice. Unexpectedly, inhibiting NLRP3 activity resulted in increased wild-type S. aureus lung burdens. Our findings suggest that NLRP3 induces macrophage death and IL-1β secretion after PVL exposure and controls S. aureus lung burdens.

    Funded by: Australian National Health and Medical Research Council Practitioner Fellowship: APP1117940; Australian Research Council Future Fellows: FT170100313, FT190100733; National Health and Medical Research Council: 1163556; National Key R&D Program of China: 2017YFA0105300

    Journal of leukocyte biology 2020

  • Outbreak of Dirkmeia churashimaensis Fungemia in a Neonatal Intensive Care Unit, India.

    Chowdhary A, Sharada K, Singh PK, Bhagwani DK, Kumar N, de Groot T and Meis JF

    Bloodstream infections caused by uncommon or novel fungal species are challenging to identify and treat. We report a series of cases of fungemia due to a rare basidiomycete yeast, Dirkmeia churashimaensis, in neonatal patients in India. Whole-genome sequence typing demonstrated that the patient isolates were genetically indistinguishable, indicating a single-source infection.

    Emerging infectious diseases 2020;26;4;764-768

  • Analysis of CRISPR-Cas9 screens identifies genetic dependencies in melanoma.

    Christodoulou E, Rashid M, Pacini C, Alastair D, Robertson H, van Groningen T, Teunisse AFAS, Iorio F, Jochemsen AG, Adams DJ and van Doorn R

    Department of Dermatology, Leiden University Medical Center, Leiden, The Netherlands.

    Targeting the MAPK signaling pathway has transformed the treatment of metastatic melanoma. CRISPR-Cas9 viability screens provide a genome-wide approach to uncover novel genetic dependencies. Here we analyzed recently reported CRISPR-Cas9 screens comparing data from 28 melanoma cell lines and 313 cell lines of other tumor types in order to identify fitness genes related to melanoma. We found an average of 1,494 fitness genes in each melanoma cell line. We identified 33 genes inactivation of which specifically reduced the fitness of melanoma. This set of tumor type-specific genes includes established melanoma fitness genes as well as many genes that have not previously been associated with melanoma growth. Several genes encode proteins that can be targeted using available inhibitors. We verified that genetic inactivation of DUSP4 and PPP2R2A reduces the proliferation of melanoma cells. DUSP4 encodes an inhibitor of ERK, suggesting that further activation of MAPK signaling activity through its loss is selectively deleterious to melanoma cells. Collectively, these data present a resource of genetic dependencies in melanoma that may be explored as potential therapeutic targets.

    Pigment cell & melanoma research 2020

  • The Organoid Cell Atlas

    Christoph Bock, Michael Boutros, J. Gray Camp, Laura Clarke, Hans Clevers, Juergen A. Knoblich, Prisca Liberali, Aviv Regev, Anne C. Rios, Oliver Stegle, Hendrik G. Stunnenberg, Sarah A. Teichmann, Barbara Treutlein and Robert G. J. Vries

    Nature Biotechnology 2020;39;1;13

  • Relative abundance of the Prevotella genus within the human gut microbiota of elderly volunteers determines the inter-individual responses to dietary supplementation with wheat bran arabinoxylan-oligosaccharides.

    Chung WSF, Walker AW, Bosscher D, Garcia-Campayo V, Wagner J, Parkhill J, Duncan SH and Flint HJ

    Gut Health Group, Rowett Institute, University of Aberdeen, Foresterhill, Aberdeen, Scotland, AB25 2ZD, UK.

    Background: The human colon is colonised by a dense microbial community whose species composition and metabolism are linked to health and disease. The main energy sources for colonic bacteria are dietary polysaccharides and oligosaccharides. These play a major role in modulating gut microbial composition and metabolism, which in turn can impact on health outcomes.

    Results: We investigated the influence of wheat bran arabinoxylan oligosaccharides (AXOS) and maltodextrin supplements in modulating the composition of the colonic microbiota and metabolites in healthy adults over the age of 60. Male and female volunteers (n = 21, mean BMI 25.2 ± 0.7 kg/m<sup>2</sup>) participated in the double-blind, crossover supplement study. Faecal samples were collected for analysis of microbiota, short chain fatty acid levels and calprotectin. Blood samples were collected to measure glucose, cholesterol and triglyceride levels. There was no change in these markers nor in calprotectin levels in response to the supplements. Both supplements were well-tolerated by the volunteers. Microbiota analysis across the whole volunteer cohort revealed a significant increase in the proportional abundance of faecal Bifidobacterium species (P ≤ 0.01) in response to AXOS, but not maltodextrin, supplementation. There was considerable inter-individual variation in the other bacterial taxa that responded, with a clear stratification of volunteers as either Prevotella-plus (n = 8; > 0.1% proportional abundance) or Prevotella-minus (n = 13; ≤ 0.1% proportional abundance) subjects based on baseline sample profiles. There was a significant increase in the proportional abundance of both faecal Bifidobacterium (P ≤ 0.01) and Prevotella species (P ≤ 0.01) in Prevotella-plus volunteers during AXOS supplementation, while Prevotella and Bacteroides relative abundances showed an inverse relationship. Proportional abundance of 26 OTUs, including bifidobacteria and Anaerostipes hadrus, differed significantly between baseline samples of Prevotella-plus compared to Prevotella-minus individuals.

    Conclusions: The wheat bran AXOS supplementation was bifidogenic and resulted in changes in human gut microbiota composition that depended on the initial microbiota profile, specifically the presence or absence of Prevotella spp. as a major component of the microbiota. Our data therefore suggest that initial profiling of individuals through gut microbiota analysis should be considered important when contemplating nutritional interventions that rely on prebiotics.

    Trial registration: Clinical trial registration number: NCT02693782. Registered 29 February 2016 - Retrospectively registered.

    Funded by: Biotechnology and Biological Sciences Research Council: A08456

    BMC microbiology 2020;20;1;283
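
    The stratification rule stated in the abstract above (Prevotella-plus if baseline proportional abundance is above 0.1%, Prevotella-minus otherwise) can be illustrated with a minimal sketch. The volunteer identifiers and abundance values are hypothetical and are not data from the study; the function name is likewise an assumption made for illustration.

```python
# Stratify volunteers by baseline Prevotella proportional abundance.
# The 0.1% cut-off comes from the abstract; everything else is illustrative.

THRESHOLD_PERCENT = 0.1

def prevotella_group(baseline_abundance_percent: float) -> str:
    """Return the stratum for one volunteer's baseline sample."""
    if baseline_abundance_percent > THRESHOLD_PERCENT:
        return "Prevotella-plus"
    return "Prevotella-minus"  # covers abundances at or below the cut-off

# Hypothetical baseline abundances (percent of total community).
baseline = {"V01": 12.4, "V02": 0.1, "V03": 0.0, "V04": 0.35}

groups = {volunteer: prevotella_group(value) for volunteer, value in baseline.items()}
for volunteer, group in groups.items():
    print(volunteer, group)
```

    A value exactly at the cut-off falls in the Prevotella-minus group, matching the "≤ 0.1%" wording of the abstract.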

  • KMT2B-related disorders: expansion of the phenotypic spectrum and long-term efficacy of deep brain stimulation.

    Cif L, Demailly D, Lin JP, Barwick KE, Sa M, Abela L, Malhotra S, Chong WK, Steel D, Sanchis-Juan A, Ngoh A, Trump N, Meyer E, Vasques X, Rankin J, Allain MW, Applegate CD, Attaripour Isfahani S, Baleine J, Balint B, Bassetti JA, Baple EL, Bhatia KP, Blanchet C, Burglen L, Cambonie G, Seng EC, Bastaraud SC, Cyprien F, Coubes C, d'Hardemare V, Deciphering Developmental Disorders Study, Doja A, Dorison N, Doummar D, Dy-Hollins ME, Farrelly E, Fitzpatrick DR, Fearon C, Fieg EL, Fogel BL, Forman EB, Fox RG, Genomics England Research Consortium, Gahl WA, Galosi S, Gonzalez V, Graves TD, Gregory A, Hallett M, Hasegawa H, Hayflick SJ, Hamosh A, Hully M, Jansen S, Jeong SY, Krier JB, Krystal S, Kumar KR, Laurencin C, Lee H, Lesca G, François LL, Lynch T, Mahant N, Martinez-Agosto JA, Milesi C, Mills KA, Mondain M, Morales-Briceno H, NIHR BioResource, Ostergaard JR, Pal S, Pallais JC, Pavillard F, Perrigault PF, Petersen AK, Polo G, Poulen G, Rinne T, Roujeau T, Rogers C, Roubertie A, Sahagian M, Schaefer E, Selim L, Selway R, Sharma N, Signer R, Soldatos AG, Stevenson DA, Stewart F, Tchan M, Undiagnosed Diseases Network, Verma IC, de Vries BBA, Wilson JL, Wong DA, Zaitoun R, Zhen D, Znaczko A, Dale RC, de Gusmão CM, Friedman J, Fung VSC, King MD, Mohammad SS, Rohena L, Waugh JL, Toro C, Raymond FL, Topf M, Coubes P, Gorman KM and Kurian MA

    Département de Neurochirurgie, Unité des Pathologies Cérébrales Résistantes, Unité de Recherche sur les Comportements et Mouvements Anormaux, Hôpital Gui de Chauliac, Centre Hospitalier Régional Montpellier, Montpellier, France.

    Heterozygous mutations in KMT2B are associated with an early-onset, progressive and often complex dystonia (DYT28). Key characteristics of typical disease include focal motor features at disease presentation, evolving through a caudocranial pattern into generalized dystonia, with prominent oromandibular, laryngeal and cervical involvement. Although KMT2B-related disease is emerging as one of the most common causes of early-onset genetic dystonia, much remains to be understood about the full spectrum of the disease. We describe a cohort of 53 patients with KMT2B mutations, with detailed delineation of their clinical phenotype and molecular genetic features. We report new disease presentations, including atypical patterns of dystonia evolution and a subgroup of patients with a non-dystonic neurodevelopmental phenotype. In addition to the previously reported systemic features, our study has identified co-morbidities, including the risk of status dystonicus, intrauterine growth retardation, and endocrinopathies. Analysis of this study cohort (n = 53) in tandem with published cases (n = 80) revealed that patients with chromosomal deletions and protein truncating variants had a significantly higher burden of systemic disease (with earlier onset of dystonia) than those with missense variants. Eighteen individuals had detailed longitudinal data available after insertion of deep brain stimulation for medically refractory dystonia. Median age at deep brain stimulation was 11.5 years (range: 4.5-37.0 years). Follow-up after deep brain stimulation ranged from 0.25 to 22 years. Significant improvement of motor function and disability (as assessed by the Burke Fahn Marsden's Dystonia Rating Scales, BFMDRS-M and BFMDRS-D) was evident at 6 months, 1 year and last follow-up (motor, P = 0.001, P = 0.004, and P = 0.012; disability, P = 0.009, P = 0.002 and P = 0.012). At 1 year post-deep brain stimulation, >50% of subjects showed BFMDRS-M and BFMDRS-D improvements of >30%. In the long-term deep brain stimulation cohort (deep brain stimulation inserted for >5 years, n = 8), improvement of >30% was maintained in 5/8 and 3/8 subjects for the BFMDRS-M and BFMDRS-D, respectively. The greatest BFMDRS-M improvements were observed for trunk (53.2%) and cervical (50.5%) dystonia, with less clinical impact on laryngeal dystonia. Improvements in gait dystonia decreased from 20.9% at 1 year to 16.2% at last assessment; no patient maintained a fully independent gait. Reduction of BFMDRS-D was maintained for swallowing (52.9%). Five patients developed mild parkinsonism following deep brain stimulation. KMT2B-related disease comprises an expanding continuum from infancy to adulthood, with early evidence of genotype-phenotype correlations. Except for laryngeal dysphonia, deep brain stimulation provides a significant improvement in quality of life and function with sustained clinical benefit depending on symptoms distribution.

    Funded by: NHGRI NIH HHS: U01 HG007690, U01 HG007703, U54 HG006493, UM1 HG006493; NINDS NIH HHS: K23 NS101096, P01 NS087997; Wellcome Trust

    Brain : a journal of neurology 2020;143;11;3242-3261

  • A brief history of human disease genetics.

    Claussnitzer M, Cho JH, Collins R, Cox NJ, Dermitzakis ET, Hurles ME, Kathiresan S, Kenny EE, Lindgren CM, MacArthur DG, North KN, Plon SE, Rehm HL, Risch N, Rotimi CN, Shendure J, Soranzo N and McCarthy MI

    Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA.

    A primary goal of human genetics is to identify DNA sequence variants that influence biomedical traits, particularly those related to the onset and progression of human disease. Over the past 25 years, progress in realizing this objective has been transformed by advances in technology, foundational genomic resources and analytical tools, and by access to vast amounts of genotype and phenotype data. Genetic discoveries have substantially improved our understanding of the mechanisms responsible for many rare and common diseases and driven development of novel preventative and therapeutic strategies. Medical innovation will increasingly focus on delivering care tailored to individual patterns of genetic predisposition.

    Funded by: Howard Hughes Medical Institute; NIDDK NIH HHS: R01 DK106593, U01 DK062422, U01 DK062429; Wellcome Trust: 090532, 098381, 106130, 203141, 212259

    Nature 2020;577;7789;179-189

  • Peripheral T3 signaling is target of pesticides in zebrafish larvae and adult liver.

    Colella M, Nittoli V, Porciello A, Porreca I, Reale C, Russo F, Russo NA, Roberto L, Albano F, De Felice M, Mallardo M and Ambrosino C

    M Colella, Science and Technology, University of Sannio, Benevento, Italy.

    The intra-tissue level of thyroid hormones (THs) regulates organ functions. Environmental factors can impair it by damaging the thyroid gland and/or the peripheral TH metabolism. We investigated the effects of embryonic and/or long-life exposure to low-dose pesticides, ethylenethiourea (ETU), chlorpyrifos (CPF) and their mixture, on the intra-tissue T4/T3 metabolism/signalling in zebrafish at different life stages. Hypothyroidism was evidenced in exposed larvae that showed a reduced number of follicles and induced tshb mRNAs. Despite that, we evidenced the increase of free T4 (fT4) and free T3 (fT3) levels/signalling that was confirmed by the transcriptional regulation of TH metabolic enzymes (deiodinases) and T3-regulated mRNAs (cpt1, igfbp1a). The second-generation larvae showed effects on thyroid and TH signalling even when not directly exposed, suggesting a role of the parental exposure. In adult zebrafish we found a sex-dependent damage of hepatic T3 level/signalling associated with liver steatosis, more pronounced in females, with a sex-dependent alteration of transcripts codifying the key enzymes involved in "de novo lipogenesis" and of β-oxidation. We found an impaired activation of liver T3 and PPARα/Foxo3a pathways whose deregulation was already involved in mammalian liver steatosis. The data underscore the intra-tissue imbalance of T3 level as a target of thyroid endocrine disruptors (THDC) and suggest that the effects of slight modification of T3 signalling might be amplified by its direct regulation or crosstalk with PPAR/Foxo3a pathways. Because T3 levels define the hypothyroid/hyperthyroid status of each organ, our findings might explain the pleiotropic and site-dependent effects of pesticides.

    The Journal of endocrinology 2020

  • Genome-wide gene-environment analyses of major depressive disorder and reported lifetime traumatic experiences in UK Biobank.

    Coleman JRI, Peyrot WJ, Purves KL, Davis KAS, Rayner C, Choi SW, Hübel C, Gaspar HA, Kan C, Van der Auwera S, Adams MJ, Lyall DM, Choi KW, on the behalf of Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium, Dunn EC, Vassos E, Danese A, Maughan B, Grabe HJ, Lewis CM, O'Reilly PF, McIntosh AM, Smith DJ, Wray NR, Hotopf M, Eley TC and Breen G

    Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK.

    Depression is more frequent among individuals exposed to traumatic events. Both trauma exposure and depression are heritable. However, the relationship between these traits, including the role of genetic risk factors, is complex and poorly understood. When modelling trauma exposure as an environmental influence on depression, both gene-environment correlations and gene-environment interactions have been observed. The UK Biobank concurrently assessed Major Depressive Disorder (MDD) and self-reported lifetime exposure to traumatic events in 126,522 genotyped individuals of European ancestry. We contrasted genetic influences on MDD stratified by reported trauma exposure (final sample size range: 24,094-92,957). The SNP-based heritability of MDD with reported trauma exposure (24%) was greater than MDD without reported trauma exposure (12%). Simulations showed that this is not confounded by the strong, positive genetic correlation observed between MDD and reported trauma exposure. We also observed that the genetic correlation between MDD and waist circumference was only significant in individuals reporting trauma exposure (r<sub>g</sub> = 0.24, p = 1.8 × 10<sup>-7</sup> versus r<sub>g</sub> = -0.05, p = 0.39 in individuals not reporting trauma exposure, difference p = 2.3 × 10<sup>-4</sup>). Our results suggest that the genetic contribution to MDD is greater when reported trauma is present, and that a complex relationship exists between reported trauma exposure, body composition, and MDD.

    Funded by: Department of Health | National Health and Medical Research Council (NHMRC): 1078901, 1087889; Medical Research Council: G0200243, MC_PC_12028, MC_PC_17209, MC_PC_17228, MC_QA137853, MR/N015746/1; NIMH NIH HHS: U01 MH109514, U01 MH109528, U01 MH109532; Nederlandse Organisatie voor Wetenschappelijk Onderzoek (Netherlands Organisation for Scientific Research): 91619152

    Molecular psychiatry 2020;25;7;1430-1446

  • Designing ecologically optimized pneumococcal vaccines using population genomics.

    Colijn C, Corander J and Croucher NJ

    Department of Mathematics, Simon Fraser University, Burnaby, BC, Canada.

    Streptococcus pneumoniae (the pneumococcus) is a common nasopharyngeal commensal that can cause invasive pneumococcal disease (IPD). Each component of current protein-polysaccharide conjugate vaccines (PCVs) generally induces immunity specific to one of the approximately 100 pneumococcal serotypes, and typically eliminates it from carriage and IPD through herd immunity. Overall carriage rates remain stable owing to replacement by non-PCV serotypes. Consequently, the net change in IPD incidence is determined by the relative invasiveness of the pre- and post-PCV-carried pneumococcal populations. In the present study, we identified PCVs expected to minimize the post-vaccine IPD burden by applying Bayesian optimization to an ecological model of serotype replacement that integrated epidemiological and genomic data. We compared optimal formulations for reducing infant-only or population-wide IPD, and identified potential benefits to including non-conserved pneumococcal carrier proteins. Vaccines were also devised to minimize IPD resistant to antibiotic treatment, despite the ecological model assuming that resistance levels in the carried population would be preserved. We found that expanding infant-administered PCV valency is likely to result in diminishing returns, and that complementary pairs of infant- and adult-administered vaccines could be a superior strategy. PCV performance was highly dependent on the circulating pneumococcal population, further highlighting the advantages of a diversity of anti-pneumococcal vaccination strategies.

    Funded by: EC | EC Seventh Framework Programme | FP7 Ideas: European Research Council (FP7-IDEAS-ERC - Specific Programme: "Ideas" Implementing the Seventh Framework Programme of the European Community for Research, Technological Development and Demonstration Activities (2007 to 2013)): 742158; RCUK | Engineering and Physical Sciences Research Council (EPSRC): EP/K026003/1, EP/N014529/1; Wellcome Trust (Wellcome): 104169/Z/14/A

    Nature microbiology 2020;5;3;473-485

  • Definition of a genetic relatedness cutoff to exclude recent transmission of meticillin-resistant Staphylococcus aureus: a genomic epidemiology analysis.

    Coll F, Raven KE, Knight GM, Blane B, Harrison EM, Leek D, Enoch DA, Brown NM, Parkhill J and Peacock SJ

    Department of Infection Biology, Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, London, UK.

    Background: Whole-genome sequencing (WGS) can be used in genomic epidemiology investigations to confirm or refute outbreaks of bacterial pathogens, and to support targeted and efficient infection control interventions. We aimed to define a genetic relatedness cutoff, quantified as a number of single-nucleotide polymorphisms (SNP), for meticillin-resistant <i>Staphylococcus aureus</i> (MRSA), above which recent (ie, within 6 months) patient-to-patient transmission could be ruled out.

    Methods: We did a retrospective genomic and epidemiological analysis of MRSA data from two prospective observational cohort studies in the UK to establish SNP cutoffs for genetic relatedness, above which recent transmission was unlikely. We used three separate approaches to calculate these thresholds. First, we applied a linear mixed model to estimate the <i>S aureus</i> substitution rate and 95th percentile within-host diversity in a cohort in which multiple isolates were sequenced per individual. Second, we applied a simulated transmission model to this same genomic dataset. Finally, in a second cohort, we determined the genetic distance (ie, the number of SNPs) that would capture 95% of epidemiologically linked cases. We applied the three approaches to both whole-genome and core-genome sequences.

    Findings: In the linear mixed model, the estimated substitution rate was roughly 5 whole-genome SNPs (wgSNPs) or 3 core-genome SNPs (cgSNPs) per genome per year, and the 95th percentile within-host diversity was 19 wgSNPs or 10 cgSNPs. The combined SNP cutoffs for detection of MRSA transmission within 6 months per this model were thus 24 wgSNPs or 13 cgSNPs. The simulated transmission model suggested that cutoffs of 17 wgSNPs or 12 cgSNPs would detect 95% of MRSA transmission events within the same timeframe. Finally, in the second cohort, cutoffs of 22 wgSNPs or 11 cgSNPs captured 95% of epidemiologically linked cases within 6 months.

    Interpretation: On the basis of our results, we propose conservative cutoffs of 25 wgSNPs or 15 cgSNPs above which transmission of MRSA within the previous 6 months can be ruled out. These cutoffs could potentially be used as part of a genomic sequencing approach to the management of outbreaks of MRSA in conjunction with traditional epidemiological techniques.

    Funding: UK Department of Health, Wellcome Trust, UK National Institute for Health Research.

    The Lancet. Microbe 2020;1;8;e328-e335
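
    The combined cutoffs reported in the Findings above (24 wgSNPs, 13 cgSNPs) equal the 95th percentile within-host diversity plus one year-equivalent of substitutions. The sketch below reproduces that arithmetic under an assumption the abstract does not state: that 6 months of divergence is counted along each of two lineages. The function name and parameters are illustrative and are not taken from the paper.

```python
def combined_snp_cutoff(rate_per_year, within_host_p95, months=6, lineages=2):
    """Assumed formula (not stated in the abstract): within-host diversity
    plus substitutions accumulated on `lineages` branches over `months`."""
    accumulated = rate_per_year * (months / 12) * lineages
    return within_host_p95 + accumulated


# Rounded values quoted in the abstract for the linear mixed model.
print(combined_snp_cutoff(rate_per_year=5, within_host_p95=19))  # whole-genome SNPs
print(combined_snp_cutoff(rate_per_year=3, within_host_p95=10))  # core-genome SNPs
```

    With the rounded inputs from the abstract this gives 24 and 13, matching the reported combined cutoffs; the proposed conservative cutoffs (25 and 15) sit slightly above these.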

  • Spatial competition shapes the dynamic mutational landscape of normal esophageal epithelium.

    Colom B, Alcolea MP, Piedrafita G, Hall MWJ, Wabik A, Dentro SC, Fowler JC, Herms A, King C, Ong SH, Sood RK, Gerstung M, Martincorena I, Hall BA and Jones PH

    Wellcome Sanger Institute, Hinxton, UK.

    During aging, progenitor cells acquire mutations, which may generate clones that colonize the surrounding tissue. By middle age, normal human tissues, including the esophageal epithelium (EE), become a patchwork of mutant clones. Despite their relevance for understanding aging and cancer, the processes that underpin mutational selection in normal tissues remain poorly understood. Here, we investigated this issue in the esophageal epithelium of mutagen-treated mice. Deep sequencing identified numerous mutant clones with multiple genes under positive selection, including Notch1, Notch2 and Trp53, which are also selected in human esophageal epithelium. Transgenic lineage tracing revealed strong clonal competition that evolved over time. Clone dynamics were consistent with a simple model in which the proliferative advantage conferred by positively selected mutations depends on the nature of the neighboring cells. When clones with similar competitive fitness collide, mutant cell fate reverts towards homeostasis, a constraint that explains how selection operates in normal-appearing epithelium.

    Funded by: Cancer Research UK: A17257, A21777, C57387/A21777, C609/A17257, C609/A27326; Medical Research Council: MC_UU_12022/9, MR/S000216/1; Wellcome Trust: 098051, 296194

    Nature genetics 2020;52;6;604-614

  • Screening for functional transcriptional and splicing regulatory variants with GenIE.

    Cooper SE, Schwartzentruber J, Bello E, Coomber EL and Bassett AR

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    Genome-wide association studies (GWAS) have identified numerous genetic loci underlying human diseases, but a fundamental challenge remains to accurately identify the underlying causal genes and variants. Here, we describe an arrayed CRISPR screening method, Genome engineering-based Interrogation of Enhancers (GenIE), which assesses the effects of defined alleles on transcription or splicing when introduced in their endogenous genomic locations. We use this sensitive assay to validate the activity of transcriptional enhancers and splice regulatory elements in human induced pluripotent stem cells (hiPSCs), and develop a software package (rgenie) to analyse the data. We screen the 99% credible set of Alzheimer's disease (AD) GWAS variants identified at the clusterin (CLU) locus to identify a subset of likely causal variants, and employ GenIE to understand the impact of specific mutations on splicing efficiency. We thus establish GenIE as an efficient tool to rapidly screen for the role of transcribed variants on gene expression.

    Nucleic acids research 2020

  • Lineage-Independent Tumors in Bilateral Neuroblastoma.

    Coorens THH, Farndon SJ, Mitchell TJ, Jain N, Lee S, Hubank M, Sebire N, Anderson J and Behjati S

    From the Wellcome Sanger Institute, Hinxton (T.H.H.C., T.J.M., S.L., S.B.), Cambridge University Hospitals NHS Foundation Trust (S.J.F., T.J.M., S.B.) and the Departments of Surgery (T.J.M.) and Paediatrics (S.B.), University of Cambridge, Cambridge, and UCL Great Ormond Street Institute of Child Health (N.J., N.S., J.A.), Great Ormond Street Hospital for Children NHS Foundation Trust (N.J., N.S., J.A.), and the Royal Marsden NHS Foundation Trust (M.H.), London - all in the United Kingdom.

    Childhood tumors that occur synchronously in different anatomical sites usually represent metastatic disease. However, such tumors can be independent neoplasms. We investigated whether cases of bilateral neuroblastoma represented independent tumors in two children with pathogenic germline mutations by genotyping somatic mutations shared between tumors and blood. Our results suggested that in both children, the lineages that had given rise to the tumors had segregated within the first cell divisions of the zygote, without being preceded by a common premalignant clone. In one patient, the tumors had parallel evolution, including distinct second hits in <i>SMARCA4</i>, a putative predisposition gene for neuroblastoma. These findings portray cases of bilateral neuroblastoma as having independent lesions mediated by a germline predisposition. (Funded by Children with Cancer UK and Wellcome.).

    The New England journal of medicine 2020;383;19;1860-1865

  • Refinement of the clinical and mutational spectrum of UBE2A deficiency syndrome.

    Cordeddu V, Macke EL, Radio FC, Lo Cicero S, Pantaleoni F, Tatti M, Bellacchio E, Ciolfi A, Agolini E, Bruselles A, Brunetti-Pierri N, Suri M, Josephs KS, McEntagart M, Lanpher B, Nickels KC, Haworth A, Reed L, Cappuccio G, Mammi I, Tarnowski JM, Novelli A, Deciphering Developmental Disorders Study, Melis D, Callewaert B, Dallapiccola B, Klee E and Tartaglia M

    National Center for Drug Research and Evaluation, Istituto Superiore di Sanità, Rome, Italy.

    UBE2A deficiency, that is, intellectual disability (ID) Nascimento type (MIM 300860), is an X-linked syndrome characterized by developmental delay, moderate to severe ID, seizures, dysmorphisms, skin anomalies, and urogenital malformations. Forty affected subjects have been reported thus far, with 31 cases having intragenic UBE2A variants. Here, we report on eight additional affected subjects from seven unrelated families who were found to be hemizygous for previously unreported UBE2A missense variants (p.Glu62Lys, p.Arg95Cys, p.Thr99Ala, and p.Arg135Trp) or small in-frame deletions (p.Val81_Ala83del and p.Asp101del). A wide phenotypic spectrum was documented in these subjects, ranging from moderate ID associated with mild dysmorphisms to severe features including congenital heart defects (CHD), severe cognitive impairment, and pineal gland tumors. Four variants affected residues (Glu62, Arg95, Thr99 and Asp101) that contribute to stabilizing the structure of the E3 binding domain. The three-residue in-frame deletion, p.Val81_Ala83del, resulted from aberrant processing of the transcript. This variant and p.Arg135Trp mapped to regions of the protein located far from the E3 binding region, and caused variably accelerated protein degradation. By reviewing available clinical information, we revise the clinical and molecular profile of the disorder and document genotype-phenotype correlations. Pineal gland cysts/tumors, CHD and hypogammaglobulinemia emerge as recurrent features.

    Funded by: Fondazione Bambino Gesù: Vite Coraggiose; Italian Ministry of Health: Ricerca Corrente 2019; Telethon Foundation: GSP15001

    Clinical genetics 2020;98;2;172-178

  • Protein antibiotics: mind your language.

    Correia A and Weimann A

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK.

    Nature reviews. Microbiology 2020

  • Baseline Gut Microbiota Composition Is Associated With Schistosoma mansoni Infection Burden in Rodent Models.

    Cortés A, Clare S, Costain A, Almeida A, McCarthy C, Harcourt K, Brandt C, Tolley C, Rooney J, Berriman M, Lawley T, MacDonald AS, Rinaldi G and Cantacessi C

    Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom.

    In spite of growing evidence supporting the occurrence of complex interactions between <i>Schistosoma</i> and gut bacteria in mice and humans, no data are yet available on whether worm-mediated changes in microbiota composition are dependent on the baseline gut microbial profile of the vertebrate host. In addition, the impact of such changes on the susceptibility to, and pathophysiology of, schistosomiasis remains largely unexplored. In this study, mice colonized with gut microbial populations from a human donor (HMA mice), as well as microbiota-wild type (WT) animals, were infected with <i>Schistosoma mansoni</i>, and alterations of their gut microbial profiles at 50 days post-infection were compared to those occurring in uninfected HMA and WT rodents, respectively. Significantly higher worm and egg burdens, together with increased specific antibody responses to parasite antigens, were observed in HMA compared to WT mice. These differences were associated with extensive dissimilarities between the gut microbial profiles of the HMA and WT groups of mice at baseline; in particular, the gut microbiota of HMA animals was characterized by low microbial alpha diversity and expanded Proteobacteria, as well as by the absence of putative immunomodulatory bacteria (e.g. <i>Lactobacillus</i>). Furthermore, differences in infection-associated changes in gut microbiota composition were observed between HMA and WT mice. Altogether, our findings support the hypothesis that susceptibility to <i>S. mansoni</i> infection in mice is partially dependent on the composition of the host baseline microbiota. Moreover, this study highlights the applicability of HMA mouse models to address key biological questions on host-parasite-microbiota relationships in human helminthiases.

    Frontiers in immunology 2020;11;593838

  • Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing.

    Cortés-Ciriano I, Lee JJ, Xi R, Jain D, Jung YL, Yang L, Gordenin D, Klimczak LJ, Zhang CZ, Pellman DS, PCAWG Structural Variation Working Group, Park PJ and PCAWG Consortium

    Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.

    Chromothripsis is a mutational phenomenon characterized by massive, clustered genomic rearrangements that occurs in cancer and other diseases. Recent studies in selected cancer types have suggested that chromothripsis may be more common than initially inferred from low-resolution copy-number data. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we analyze patterns of chromothripsis across 2,658 tumors from 38 cancer types using whole-genome sequencing data. We find that chromothripsis events are pervasive across cancers, with a frequency of more than 50% in several cancer types. Whereas canonical chromothripsis profiles display oscillations between two copy-number states, a considerable fraction of events involve multiple chromosomes and additional structural alterations. In addition to non-homologous end joining, we detect signatures of replication-associated processes and templated insertions. Chromothripsis contributes to oncogene amplification and to inactivation of genes such as mismatch-repair-related genes. These findings show that chromothripsis is a major process that drives genome evolution in human cancer.

    Funded by: EC | EU Framework Programme for Research and Innovation H2020 | H2020 European Institute of Innovation and Technology (H2020 The European Institute of Innovation and Technology): 703543

    Nature genetics 2020;52;3;331-341

  • Polyclonal Campylobacter fetus Infections Among Unrelated Patients, Montevideo, Uruguay, 2013-2018.

    Costa D, Betancor L, Gadea P, Cabezas L, Caiata L, Palacio R, Seija V, Galiana A, Vieytes M, Cristophersen I, Calleros L and Iraola G

    Microbial Genomics Laboratory, Institut Pasteur de Montevideo, Montevideo, Uruguay.

    In Montevideo (2013-2018), 8 Campylobacter fetus extraintestinal infections were reported. The polyclonal nature of strains revealed by whole-genome sequencing and the apparent lack of epidemiological links were incompatible with a single contamination source, supporting alternative routes of transmission.

    Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 2020;70;6;1236-1239

  • A Way Straight-Forward for Leishmania Genetics.

    Cotton JA and Franssen SU

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.

    Genetic exchange between Leishmania parasites was demonstrated in sandflies over 10 years ago. Louradour et al. have shown in vitro hybridization of two Leishmania tropica isolates, with the potential to remove a major roadblock to using forward genetics in Leishmania, understanding Leishmania reproductive biology, and analyzing gene flow in natural populations.

    Trends in parasitology 2020

  • Genomic analysis of natural intra-specific hybrids among Ethiopian isolates of Leishmania donovani.

    Cotton JA, Durrant C, Franssen SU, Gelanew T, Hailu A, Mateus D, Sanders MJ, Berriman M, Volf P, Miles MA and Yeo M

    Wellcome Sanger Institute, Hinxton, United Kingdom.

    Parasites of the genus Leishmania (Kinetoplastida: Trypanosomatidae) cause widespread and devastating human diseases. Visceral leishmaniasis due to Leishmania donovani is endemic in Ethiopia where it has also been responsible for major epidemics. The presence of hybrid genotypes has been widely reported in surveys of natural populations, genetic variation reported in a number of Leishmania species, and the extant capacity for genetic exchange demonstrated in laboratory experiments. However, patterns of recombination and the evolutionary history of admixture that produced these hybrid populations remain unclear. Here, we use whole-genome sequence data to investigate Ethiopian L. donovani isolates previously characterized as hybrids by microsatellite and multi-locus sequencing. To date there is only one previous study on a natural population of Leishmania hybrids based on whole-genome sequences. We propose that these hybrids originate from recombination between two different lineages of Ethiopian L. donovani occurring in the same region. Patterns of inheritance are more complex than previously reported with multiple, apparently independent, origins from similar parents that include backcrossing with parental types. Analysis indicates that hybrids are representative of at least three different histories. Furthermore, isolates were highly polysomic at the level of chromosomes with differences between parasites recovered from a recrudescent infection from a previously treated individual. The results demonstrate that recombination is a significant feature of natural populations and contributes to the growing body of data that shows how recombination, and gene flow, shape natural populations of Leishmania.

    Funded by: Wellcome Trust: 206194

    PLoS neglected tropical diseases 2020;14;4;e0007143

  • An integrated national scale SARS-CoV-2 genomic surveillance network.

    COVID-19 Genomics UK (COG-UK)

    Funded by: Medical Research Council: MC_PC_19027

    The Lancet. Microbe 2020;1;3;e99-e100

  • Three-Dimensional Printed Molds for Image-Guided Surgical Biopsies: An Open Source Computational Platform.

    Crispin-Ortuzar M, Gehrung M, Ursprung S, Gill AB, Warren AY, Beer L, Gallagher FA, Mitchell TJ, Mendichovszky IA, Priest AN, Stewart GD, Sala E and Markowetz F

    Cancer Research UK, Cambridge Institute, University of Cambridge, Cambridge, United Kingdom.

    Purpose: Spatial heterogeneity of tumors is a major challenge in precision oncology. The relationship between molecular and imaging heterogeneity is still poorly understood because it relies on the accurate coregistration of medical images and tissue biopsies. Tumor molds can guide the localization of biopsies, but their creation is time consuming, technologically challenging, and difficult to interface with routine clinical practice. These hurdles have so far hindered the progress in the area of multiscale integration of tumor heterogeneity data.

    Methods: We have developed an open-source computational framework to automatically produce patient-specific 3-dimensional-printed molds that can be used in the clinical setting. Our approach achieves accurate coregistration of sampling location between tissue and imaging, and integrates seamlessly with clinical, imaging, and pathology workflows.

    Results: We applied our framework to patients with renal cancer undergoing radical nephrectomy. We created personalized molds for 6 patients, obtaining Dice similarity coefficients between imaging and tissue sections ranging from 0.86 to 0.96 for tumor regions and between 0.70 and 0.76 for healthy kidneys. The framework required minimal manual intervention, producing the final mold design in just minutes, while automatically taking into account clinical considerations such as a preference for specific cutting planes.

    Conclusion: Our work provides a robust and automated interface between imaging and tissue samples, enabling the development of clinical studies to probe tumor heterogeneity on multiple spatial scales.

    JCO clinical cancer informatics 2020;4;736-748
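
    The Results above report agreement between imaging and tissue sections as Dice similarity coefficients. For reference, the standard definition is twice the overlap of two regions divided by the sum of their sizes. The minimal sketch below applies it to two flattened binary masks; the function name and the toy masks are illustrative assumptions and do not come from the authors' open-source framework.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity of two equal-length binary masks:
    2 * |A and B| / (|A| + |B|)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 1.0  # convention chosen here: two empty masks agree fully
    return 2.0 * overlap / (size_a + size_b)


# Toy example (hypothetical data): 3 shared pixels, 4 pixels in each mask.
imaging_mask = [1, 1, 1, 1, 0, 0]
tissue_mask = [0, 1, 1, 1, 1, 0]
print(dice_coefficient(imaging_mask, tissue_mask))
```

    A value of 1 means the two regions coincide exactly and 0 means they do not overlap, which gives some context for the reported ranges of 0.86 to 0.96 (tumor) and 0.70 to 0.76 (healthy kidney).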

  • Screening of a library of recombinant Schistosoma mansoni proteins with sera from murine and human controlled infections identifies early serological markers.

    Crosnier C, Hokke CH, Protasio AV, Brandt C, Rinaldi G, Langenberg MCC, Clare S, Janse JJ, Wilson S, Berriman M, Roestenberg M and Wright GJ

    Wellcome Sanger Institute, Cambridge, UK.

    Schistosomiasis is a major global health problem caused by blood-dwelling parasitic worms, which is currently tackled primarily by mass administration of the drug praziquantel. Appropriate drug treatment strategies are informed by diagnostics that establish the prevalence and intensity of infection, which, in regions of low transmission, should be highly sensitive. To identify sensitive new serological markers of Schistosoma mansoni infections, we have compiled a recombinant protein library of parasite cell-surface and secreted proteins expressed in mammalian cells. Together with a time series of serum samples from volunteers experimentally infected with a defined number of male parasites, we probed this protein library to identify several markers that can detect primary infections with as few as ten parasites and as early as five weeks post infection. These new markers could be further explored as valuable tools to detect ongoing and previous S. mansoni infections, including in endemic regions where transmission is low.

    The Journal of infectious diseases 2020

  • Genome-wide association study identifies 48 common genetic variants associated with handedness.

    Cuellar-Partida G, Tung JY, Eriksson N, Albrecht E, Aliev F, Andreassen OA, Barroso I, Beckmann JS, Boks MP, Boomsma DI, Boyd HA, Breteler MMB, Campbell H, Chasman DI, Cherkas LF, Davies G, de Geus EJC, Deary IJ, Deloukas P, Dick DM, Duffy DL, Eriksson JG, Esko T, Feenstra B, Geller F, Gieger C, Giegling I, Gordon SD, Han J, Hansen TF, Hartmann AM, Hayward C, Heikkilä K, Hicks AA, Hirschhorn JN, Hottenga JJ, Huffman JE, Hwang LD, Ikram MA, Kaprio J, Kemp JP, Khaw KT, Klopp N, Konte B, Kutalik Z, Lahti J, Li X, Loos RJF, Luciano M, Magnusson SH, Mangino M, Marques-Vidal P, Martin NG, McArdle WL, McCarthy MI, Medina-Gomez C, Melbye M, Melville SA, Metspalu A, Milani L, Mooser V, Nelis M, Nyholt DR, O'Connell KS, Ophoff RA, Palmer C, Palotie A, Palviainen T, Pare G, Paternoster L, Peltonen L, Penninx BWJH, Polasek O, Pramstaller PP, Prokopenko I, Raikkonen K, Ripatti S, Rivadeneira F, Rudan I, Rujescu D, Smit JH, Smith GD, Smoller JW, Soranzo N, Spector TD, Pourcain BS, Starr JM, Stefánsson H, Steinberg S, Teder-Laving M, Thorleifsson G, Stefánsson K, Timpson NJ, Uitterlinden AG, van Duijn CM, van Rooij FJA, Vink JM, Vollenweider P, Vuoksimaa E, Waeber G, Wareham NJ, Warrington N, Waterworth D, Werge T, Wichmann HE, Widen E, Willemsen G, Wright AF, Wright MJ, Xu M, Zhao JH, Kraft P, Hinds DA, Lindgren CM, Mägi R, Neale BM, Evans DM and Medland SE

    The University of Queensland Diamantina Institute, The University of Queensland, Woolloongabba, Queensland, Australia.

    Handedness has been extensively studied because of its relationship with language and the over-representation of left-handers in some neurodevelopmental disorders. Using data from the UK Biobank, 23andMe and the International Handedness Consortium, we conducted a genome-wide association meta-analysis of handedness (N = 1,766,671). We found 41 loci associated (P < 5 × 10<sup>-8</sup>) with left-handedness and 7 associated with ambidexterity. Tissue-enrichment analysis implicated the CNS in the aetiology of handedness. Pathways including regulation of microtubules and brain morphology were also highlighted. We found suggestive positive genetic correlations between left-handedness and neuropsychiatric traits, including schizophrenia and bipolar disorder. Furthermore, the genetic correlation between left-handedness and ambidexterity is low (r<sub>G</sub> = 0.26), which implies that these traits are largely influenced by different genetic mechanisms. Our findings suggest that handedness is highly polygenic and that the genetic variants that predispose to left-handedness may underlie part of the association with some psychiatric disorders.

    Funded by: DH | NIHR | Health Services Research Programme (NIHR Health Services Research Programme): 5P50HD028138-27, NF-SI-0617-10090; Department of Education and Training | Australian Research Council (ARC): DE180100976; Department of Health | National Health and Medical Research Council (NHMRC): APP1103623, APP1104818, APP1137714; Wellcome Trust (Wellcome): 090532, 106130, 098381, 203141, 212259, 095101, 200837, 099673/Z/12/Z

    Nature human behaviour 2020

  • Single-cell RNA-sequencing of differentiating iPS cells reveals dynamic genetic effects on gene expression.

    Cuomo ASE, Seaton DD, McCarthy DJ, Martinez I, Bonder MJ, Garcia-Bernardo J, Amatya S, Madrigal P, Isaacson A, Buettner F, Knights A, Natarajan KN, HipSci Consortium, Vallier L, Marioni JC, Chhatriwala M and Stegle O

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, CB10 1SD Hinxton, Cambridge, UK.

    Recent developments in stem cell biology have enabled the study of cell fate decisions in early human development that are impossible to study in vivo. However, understanding how development varies across individuals and, in particular, the influence of common genetic variants during this process has not been characterised. Here, we exploit human iPS cell lines from 125 donors, a pooled experimental design, and single-cell RNA-sequencing to study population variation of endoderm differentiation. We identify molecular markers that are predictive of differentiation efficiency of individual lines, and utilise heterogeneity in the genetic background across individuals to map hundreds of expression quantitative trait loci that influence expression dynamically during differentiation and across cellular contexts.

    Funded by: Wellcome Trust

    Nature communications 2020;11;1;810

  • Lung function and microbiota diversity in cystic fibrosis.

    Cuthbertson L, Walker AW, Oliver AE, Rogers GB, Rivett DW, Hampton TH, Ashare A, Elborn JS, De Soyza A, Carroll MP, Hoffman LR, Lanyon C, Moskowitz SM, O'Toole GA, Parkhill J, Planet PJ, Teneback CC, Tunney MM, Zuckerman JB, Bruce KD and van der Gast CJ

    National Heart and Lung Institute, Imperial College London, London, UK.

    Background: Chronic infection and concomitant airway inflammation is the leading cause of morbidity and mortality for people living with cystic fibrosis (CF). Although chronic infection in CF is undeniably polymicrobial, involving a lung microbiota, infection surveillance and control approaches remain underpinned by classical aerobic culture-based microbiology. How to use microbiomics to direct clinical management of CF airway infections remains a crucial challenge. A pivotal step towards leveraging microbiome approaches in CF clinical care is to understand the ecology of the CF lung microbiome and identify ecological patterns of CF microbiota across a wide spectrum of lung disease. Assessing sputum samples from 299 patients attending 13 CF centres in Europe and the USA, we determined whether the emerging relationship of decreasing microbiota diversity with worsening lung function could be considered a generalised pattern of CF lung microbiota and explored its potential as an informative indicator of lung disease state in CF.

    Results: We tested and found decreasing microbiota diversity with a reduction in lung function to be a significant ecological pattern. Moreover, the loss of diversity was accompanied by an increase in microbiota dominance. Subsequently, we stratified patients into lung disease categories of increasing disease severity to further investigate relationships between microbiota characteristics and lung function, and the factors contributing to microbiota variance. Core taxa group composition became highly conserved within the severe disease category, while the rarer satellite taxa underpinned the high variability observed in the microbiota diversity. Further, the lung microbiota of individual patients were increasingly dominated by recognised CF pathogens as lung function decreased. Conversely, other bacteria, especially obligate anaerobes, increasingly dominated in those with better lung function. Ordination analyses revealed lung function and antibiotics to be the main explanators of compositional variance in the microbiota and the core and satellite taxa. Biogeography was found to influence acquisition of the rarer satellite taxa.

    Conclusions: Our findings demonstrate that microbiota diversity and dominance, as well as the identity of the dominant bacterial species, in combination with measures of lung function, can be used as informative indicators of disease state in CF. Video Abstract.

    Funded by: Wellcome Trust: WT 098051

    Microbiome 2020;8;1;45

  • A restricted spectrum of missense KMT2D variants cause a multiple malformations disorder distinct from Kabuki syndrome.

    Cuvertino S, Hartill V, Colyer A, Garner T, Nair N, Al-Gazali L, Canham N, Faundes V, Flinter F, Hertecant J, Holder-Espinasse M, Jackson B, Lynch SA, Nadat F, Narasimhan VM, Peckham M, Sellers R, Seri M, Montanari F, Southgate L, Squeo GM, Trembath R, van Heel D, Venuto S, Weisberg D, Stals K, Ellard S, Genomics England Research Consortium, Barton A, Kimber SJ, Sheridan E, Merla G, Stevens A, Johnson CA and Banka S

    Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine, and Health, The University of Manchester, Manchester, UK.

    Purpose: To investigate if specific exon 38 or 39 KMT2D missense variants (MVs) cause a condition distinct from Kabuki syndrome type 1 (KS1).

    Methods: Multiple individuals, with MVs in exons 38 or 39 of KMT2D that encode a highly conserved region of 54 amino acids flanked by Val3527 and Lys3583, were identified and phenotyped. Functional tests were performed to study their pathogenicity and understand the disease mechanism.

    Results: The consistent clinical features of the affected individuals, from seven unrelated families, included choanal atresia, athelia or hypoplastic nipples, branchial sinus abnormalities, neck pits, lacrimal duct anomalies, hearing loss, external ear malformations, and thyroid abnormalities. None of the individuals had intellectual disability. The frequency of clinical features, objective software-based facial analysis metrics, and genome-wide peripheral blood DNA methylation patterns in these patients were significantly different from those of KS1. Circular dichroism spectroscopy indicated that these MVs perturb KMT2D secondary structure through an increased disordered-to-α-helical transition.

    Conclusion: KMT2D MVs located in a specific region spanning exons 38 and 39 and affecting highly conserved residues cause a novel multiple malformations syndrome distinct from KS1. Unlike KMT2D haploinsufficiency in KS1, these MVs likely result in disease through a dominant negative mechanism.

    Funded by: British Heart Foundation: FS/13/32/30069; Department of Health; Medical Research Council: MR/K011154/1; Wellcome Trust: 062164/Z/00/Z, 102627/Z/13/Z, 213312/Z/18/Z

    Genetics in medicine : official journal of the American College of Medical Genetics 2020;22;5;867-877

  • Comparative genomics of Salmonella enterica serovar Enteritidis ST-11 isolated in Uruguay reveals lineages associated with particular epidemiological traits.

    D'Alessandro B, Pérez Escanda V, Balestrazzi L, Grattarola F, Iriarte A, Pickard D, Yim L, Chabalgoity JA and Betancor L

    Departamento de Desarrollo Biotecnológico, Instituto de Higiene, Facultad de Medicina, Universidad de la República, Av. Alfredo Navarro 3051, CP, 11600, Montevideo, Uruguay.

    Salmonella enterica serovar Enteritidis has been a major cause of foodborne disease in Uruguay since 1995. We used a genomic approach to study a set of isolates from different sources and years. Whole genome phylogeny showed that most of the strains are distributed in two major lineages (E1 and E2), both belonging to MLST sequence type 11, the major ST among serovar Enteritidis. Strikingly, E2 isolates are over-represented in periods of outbreak abundance in Uruguay, while E1 isolates span all epidemic periods. Both lineages circulate in neighboring countries at the same timescale as in Uruguay, and are present in minor numbers in distant countries. We identified allelic variants associated with each lineage. Three genes, ycdX, pduD and hsdM, have distinctive variants in E1 that may result in defective products. Another four genes (ybiO, yiaN, aas, aceA) present variants specific for the E2 lineage. Overall, this work shows that S. enterica serovar Enteritidis strains circulating in Uruguay have the same phylogenetic profile as strains circulating in the region, as well as in more distant countries. Based on these results, we hypothesize that the E2 lineage, which is more prevalent during epidemics, exhibits a combination of allelic variants that could be associated with its epidemic ability.

    Scientific reports 2020;10;1;3638

  • DNA methylation repels binding of hypoxia-inducible transcription factors to maintain tumor immunotolerance.

    D'Anna F, Van Dyck L, Xiong J, Zhao H, Berrens RV, Qian J, Bieniasz-Krzywiec P, Chandra V, Schoonjans L, Matthews J, De Smedt J, Minnoye L, Amorim R, Khorasanizadeh S, Yu Q, Zhao L, De Borre M, Savvides SN, Simon MC, Carmeliet P, Reik W, Rastinejad F, Mazzone M, Thienpont B and Lambrechts D

    Center for Cancer Biology, VIB, 3000, Leuven, Belgium.

    Background: Hypoxia is pervasive in cancer and other diseases. Cells sense and adapt to hypoxia by activating hypoxia-inducible transcription factors (HIFs), but it is still an outstanding question why cell types differ in their transcriptional response to hypoxia.

    Results: We report that HIFs fail to bind CpG dinucleotides that are methylated in their consensus binding sequence, both in in vitro biochemical binding assays and in vivo studies of differentially methylated isogenic cell lines. Based on in silico structural modeling, we show that 5-methylcytosine indeed causes steric hindrance in the HIF binding pocket. A model wherein cell-type-specific methylation landscapes, as laid down by the differential expression and binding of other transcription factors under normoxia, control cell-type-specific hypoxia responses is observed. We also discover ectopic HIF binding sites in repeat regions which are normally methylated. Genetic and pharmacological DNA demethylation, but also cancer-associated DNA hypomethylation, expose these binding sites, inducing HIF-dependent expression of cryptic transcripts. In line with such cryptic transcripts being more prone to cause double-stranded RNA and viral mimicry, we observe low DNA methylation and high cryptic transcript expression in tumors with high immune checkpoint expression, but not in tumors with low immune checkpoint expression, where they would compromise tumor immunotolerance. In a low-immunogenic tumor model, DNA demethylation upregulates cryptic transcript expression in a HIF-dependent manner, causing immune activation and reducing tumor growth.

    Conclusions: Our data elucidate the mechanism underlying cell-type-specific responses to hypoxia and suggest DNA methylation and hypoxia to underlie tumor immunotolerance.

    Funded by: European Research Council: CHAMELEO 334420, CHAMELEON 617595; Fonds Wetenschappelijk Onderzoek: 11B3818N, G070615N; Stichting Tegen Kanker: FAF-C/2016/876

    Genome biology 2020;21;1;182

  • A Nationwide Outbreak of Invasive Pneumococcal Disease in Israel Caused by Streptococcus Pneumoniae Serotype 2.

    Dagan R, Ben-Shimol S, Benisty R, Regev-Yochay G, Lo SW, Bentley SD, Hawkins PA, McGee L, Ron M, Givon-Lavi N, Valinsky L and Rokney A

    Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel.

    Background: Invasive pneumococcal disease (IPD) caused by Streptococcus pneumoniae serotype 2 (Sp2) is infrequent. Large scale outbreaks have not been reported following pneumococcal conjugate vaccine (PCV) implementation. We describe a Sp2 IPD outbreak in Israel, in the 13-valent PCV (PCV13) era, with focus on Sp2 population structure and evolutionary dynamics.

    Methods: The data derived from a population-based, nationwide active surveillance of IPD since 2009. 7-valent PCV (PCV7)/PCV13 vaccines were introduced in July 2009 and November 2010, respectively. Sp2 isolates were tested for antimicrobial susceptibility, Multilocus Sequence Typing (MLST) and Whole Genome Sequencing (WGS) analysis.

    Results: Overall, 170 Sp2 IPD cases were identified during 2009-2019; Sp2 increased in 2015 and caused 6% of IPD during 2015-2019, a 7-fold increase compared with 2009-2014. The outbreak was caused by a previously unreported molecular type (ST-13578), initially observed in Israel in 2014. This clone caused 88% of Sp2 during 2015-2019. ST-13578 is a single-locus variant of ST-1504, previously reported globally, including in Israel. WGS analysis confirmed clonality among the ST-13578 population. Single-nucleotide polymorphism-dense regions support a hypothesis that the ST-13578 outbreak clone evolved from ST-1504 by recombination. All tested strains were penicillin-susceptible (MIC <0.06 μg/mL). The ST-13578 clone was identified almost exclusively (99%) in the Jewish population and was mainly distributed in 3/7 Israeli districts. The outbreak is still ongoing, although declining since 2017.

    Conclusions: To the best of our knowledge, this is the first widespread Sp2 outbreak since PCV13 introduction worldwide, caused by the emerging ST-13578 clone.

    Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 2020
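
    The term "single-locus variant" used in the abstract above can be made concrete with a short sketch. The seven locus names are those of the standard pneumococcal MLST scheme; the allele numbers and the profile names `reference_profile` and `variant_profile` are hypothetical and do not represent the actual ST-1504 or ST-13578 profiles.

```python
# The seven housekeeping loci of the pneumococcal MLST scheme.
LOCI = ("aroE", "gdh", "gki", "recP", "spi", "xpt", "ddl")


def differing_loci(profile_a, profile_b):
    """Return the loci at which two MLST allele profiles carry different alleles."""
    return [locus for locus in LOCI if profile_a[locus] != profile_b[locus]]


def is_single_locus_variant(profile_a, profile_b):
    """Two sequence types are single-locus variants if exactly one locus differs."""
    return len(differing_loci(profile_a, profile_b)) == 1


# Hypothetical allele numbers, for illustration only.
reference_profile = {"aroE": 1, "gdh": 5, "gki": 4, "recP": 12, "spi": 7, "xpt": 3, "ddl": 8}
variant_profile = dict(reference_profile, ddl=31)  # differs only at ddl

print(differing_loci(reference_profile, variant_profile))
print(is_single_locus_variant(reference_profile, variant_profile))
```

    Comparing allele profiles this way only shows that two types are closely related; as the abstract notes, whole-genome data were needed to support the hypothesis that the outbreak clone arose by recombination.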

  • A Robust Method Uncovers Significant Context-Specific Heritability in Diverse Complex Traits.

    Dahl A, Nguyen K, Cai N, Gandal MJ, Flint J and Zaitlen N

    Department of Neurology, University of California Los Angeles, Los Angeles, CA 90095, USA; Department of Medicine, University of California San Francisco, San Francisco, CA 94158, USA.

    Gene-environment interactions (GxE) can be fundamental in applications ranging from functional genomics to precision medicine and are a conjectured source of substantial heritability. However, unbiased methods to profile GxE genome-wide are nascent and, as we show, cannot accommodate general environment variables, modest sample sizes, heterogeneous noise, and binary traits. To address this gap, we propose a simple, unifying mixed model for gene-environment interaction (GxEMM). In simulations and theory, we show that GxEMM can dramatically improve estimates and eliminate false positives when the assumptions of existing methods fail. We apply GxEMM to a range of human and model organism datasets and find broad evidence of context-specific genetic effects, including GxSex, GxAdversity, and GxDisease interactions across thousands of clinical and molecular phenotypes. Overall, GxEMM is broadly applicable for testing and quantifying polygenic interactions, which can be useful for explaining heritability and invaluable for determining biologically relevant environments.

    Funded by: NCI NIH HHS: R01 CA227237; NHGRI NIH HHS: R01 HG006399, U01 HG009080; NHLBI NIH HHS: K25 HL121295; NIDCR NIH HHS: R03 DE025665; NIEHS NIH HHS: R01 ES029929

    American journal of human genetics 2020;106;1;71-91

  • Integrated chromosomal and plasmid sequence analyses reveal diverse modes of carbapenemase gene spread among Klebsiella pneumoniae.

    David S, Cohen V, Reuter S, Sheppard AE, Giani T, Parkhill J, European Survey of Carbapenemase-Producing Enterobacteriaceae (EuSCAPE) Working Group, ESCMID Study Group for Epidemiological Markers (ESGEM), Rossolini GM, Feil EJ, Grundmann H and Aanensen DM

    Centre for Genomic Pathogen Surveillance, Wellcome Genome Campus, Hinxton, CB10 1SA Cambridge, United Kingdom.

    Molecular and genomic surveillance systems for bacterial pathogens currently rely on tracking clonally evolving lineages. By contrast, plasmids are usually excluded or analyzed with low-resolution techniques, despite being the primary vectors of antibiotic resistance genes across many key pathogens. Here, we used a combination of long- and short-read sequence data of <i>Klebsiella pneumoniae</i> isolates (<i>n</i> = 1,717) from a European survey to perform an integrated, continent-wide study of chromosomal and plasmid diversity. This revealed three contrasting modes of dissemination used by carbapenemase genes, which confer resistance to last-line carbapenems. First, <i>bla</i><sub>OXA-48-like</sub> genes have spread primarily via the single epidemic pOXA-48-like plasmid, which emerged recently in clinical settings and spread rapidly to numerous lineages. Second, <i>bla</i><sub>VIM</sub> and <i>bla</i><sub>NDM</sub> genes have spread via transient associations of many diverse plasmids with numerous lineages. Third, <i>bla</i><sub>KPC</sub> genes have transmitted predominantly by stable association with one successful clonal lineage (ST258/512) yet have been mobilized among diverse plasmids within this lineage. We show that these plasmids, which include pKpQIL-like and IncX3 plasmids, have a long association (and are coevolving) with the lineage, although frequent recombination and rearrangement events between them have led to a complex array of mosaic plasmids carrying <i>bla</i><sub>KPC</sub>. Taken together, these results reveal the diverse trajectories of antibiotic resistance genes in clinical settings, summarized as using one plasmid/multiple lineages, multiple plasmids/multiple lineages, and multiple plasmids/one lineage. Our study provides a framework for the much-needed incorporation of plasmid data into genomic surveillance systems, an essential step toward a more comprehensive understanding of resistance spread.

    Proceedings of the National Academy of Sciences of the United States of America 2020

  • Single-Cell RNA Sequencing Reveals a Dynamic Stromal Niche That Supports Tumor Growth.

    Davidson S, Efremova M, Riedel A, Mahata B, Pramanik J, Huuhtanen J, Kar G, Vento-Tormo R, Hagai T, Chen X, Haniffa MA, Shields JD and Teichmann SA

    Medical Research Council Cancer Unit, University of Cambridge, Hutchison/Medical Research Council Research Centre, Box 197 Cambridge Biomedical Campus, Cambridge, CB2 0XZ, UK.

    Here, using single-cell RNA sequencing, we examine the stromal compartment in murine melanoma and draining lymph nodes (LNs) at points across tumor development, with the data made available online. Naive lymphocytes from LNs undergo activation and clonal expansion within the tumor, before PD1 and Lag3 expression, while tumor-associated myeloid cells promote the formation of a suppressive niche. We identify three temporally distinct stromal populations displaying unique functional signatures, conserved across mouse and human tumors. Whereas "immune" stromal cells are observed in early tumors, "contractile" cells become more prevalent at later time points. Complement component C3 is specifically expressed in the immune population. Its cleavage product C3a supports the recruitment of C3aR<sup>+</sup> macrophages, and perturbation of C3a and C3aR disrupts immune infiltration, slowing tumor growth. Our results highlight the power of scRNA-seq to identify complex interplays and increased stromal diversity as a tumor develops, revealing that stromal cells acquire the capacity to modulate immune landscapes from early disease.

    Funded by: Cancer Research UK: A20193; European Research Council: 260507, 646794; Medical Research Council: MC_UU_12022/5; Wellcome Trust: 206194

    Cell reports 2020;31;7;107628

  • Formin, an opinion.

    Davison A, McDowell GS, Holden JM, Johnson HF, Wade CM, Chiba S, Jackson DJ, Levin M and Blaxter ML

    University of Nottingham, Nottingham, NG7 2RD, UK

    Development (Cambridge, England) 2020;147;1

  • Characterization of nuclear mitochondrial insertions in the whole genomes of primates.

    Dayama G, Zhou W, Prado-Martinez J, Marques-Bonet T and Mills RE

    Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA.

    The transfer and integration of whole and partial mitochondrial genomes into the nuclear genomes of eukaryotes is an ongoing process that has facilitated the transfer of genes and contributed to the evolution of various cellular pathways. Many previous studies have explored the impact of these insertions, referred to as NumtS, but have focused primarily on older events that have become fixed and are therefore present in all individual genomes for a given species. We previously developed an approach to identify novel Numt polymorphisms from next-generation sequence data and applied it to thousands of human genomes. Here, we extend this analysis to 79 individuals of other great ape species including chimpanzee, bonobo, gorilla, orang-utan and also an old world monkey, macaque. We show that recent Numt insertions are prevalent in each species though at different apparent rates, with chimpanzees exhibiting a significant increase in both polymorphic and fixed Numt sequences as compared to other great apes. We further assessed positional effects in each species in terms of evolutionary time and rate of insertion and identified putative hotspots on chromosome 5 for Numt integration, providing insight into both recent polymorphic and older fixed reference NumtS in great apes in comparison to human events.

    NAR genomics and bioinformatics 2020;2;4;lqaa089

  • Opportunistic genomic screening. Recommendations of the European Society of Human Genetics.

    de Wert G, Dondorp W, Clarke A, Dequeker EMC, Cordier C, Deans Z, van El CG, Fellmann F, Hastings R, Hentze S, Howard H, Macek M, Mendes A, Patch C, Rial-Sebbag E, Stefansdottir V, Cornel MC, Forzano F and European Society of Human Genetics

    Department of Health, Ethics and Society, CAPHRI Care and Public Health Research Institute, and Research School GROW for Oncology & Developmental Biology, Maastricht University, Maastricht, The Netherlands.

    If genome sequencing is performed in health care, in theory the opportunity arises to take a further look at the data: opportunistic genomic screening (OGS). The European Society of Human Genetics (ESHG) in 2013 recommended that genome analysis should be restricted to the original health problem at least for the time being. Other organizations have argued that 'actionable' genetic variants should or could be reported (including American College of Medical Genetics and Genomics, French Society of Predictive and Personalized Medicine, Genomics England). They argue that the opportunity should be used to routinely and systematically look for secondary findings, so-called opportunistic screening. From a normative perspective, the distinguishing characteristic of screening is not so much its context (whether public health or health care), but the lack of an indication for having this specific test or investigation in those to whom screening is offered. Screening entails a more precarious benefits-to-risks balance. The ESHG continues to recommend a cautious approach to opportunistic screening. Proportionality and autonomy must be guaranteed, and in collectively funded health-care systems the potential benefits must be balanced against health care expenditures. With regard to genome sequencing in pediatrics, ESHG argues that it is premature to look for later-onset conditions in children. Counseling should be offered and informed consent is and should be a central ethical norm. Depending on developing evidence on penetrance, actionability, and available resources, OGS pilots may be justified to generate data for a future, informed, comparative analysis of OGS and its main alternatives, such as cascade testing.

    Funded by: ZonMw (Netherlands Organisation for Health Research and Development): 80-84600-98-3002

    European journal of human genetics : EJHG 2020

  • A practical framework and online tool for mutational signature analyses show inter-tissue variation and driver dependencies.

    Degasperi A, Amarante TD, Czarnecki J, Shooter S, Zou X, Glodzik D, Morganella S, Nanda AS, Badja C, Koh G, Momen SE, Georgakopoulos-Soares I, Dias JML, Young J, Memari Y, Davies H and Nik-Zainal S

    MRC Cancer Unit, University of Cambridge, Hutchison/MRC Research Centre, Cambridge Biomedical Campus, Cambridge, UK CB2 0XZ.

    <i>Mutational signatures</i> are patterns of mutations that arise during tumorigenesis. We present an enhanced, practical framework for mutational signature analyses. Applying these methods on 3,107 whole genome sequenced (WGS) primary cancers of 21 organs reveals known signatures and nine previously undescribed rearrangement signatures. We highlight inter-organ variability of signatures and present a way of visualizing that diversity, reinforcing our findings in an independent analysis of 3,096 WGS metastatic cancers. Signatures with a high level of genomic instability are dependent on <i>TP53</i> dysregulation. We illustrate how uncertainty in mutational signature identification and assignment to samples affects tumor classification, reinforcing that using multiple orthogonal mutational signature data is not only beneficial, it is essential for accurate tumor stratification. Finally, we present a reference web-based tool for cancer and experimentally-generated mutational signatures, called Signal, which also supports performing mutational signature analyses.

    Funded by: Cancer Research UK: A22932, A23433, A23916; Wellcome Trust: 100183

    Nature cancer 2020;1;2;249-263

  • Defining the Design Principles of Skin Epidermis Postnatal Growth.

    Dekoninck S, Hannezo E, Sifrim A, Miroshnikova YA, Aragona M, Malfait M, Gargouri S, de Neunheuser C, Dubois C, Voet T, Wickström SA, Simons BD and Blanpain C

    Université Libre de Bruxelles, Laboratory of Stem Cells and Cancer, Brussels 1070, Belgium.

    During embryonic and postnatal development, organs and tissues grow steadily to achieve their final size at the end of puberty. However, little is known about the cellular dynamics that mediate postnatal growth. By combining in vivo clonal lineage tracing, proliferation kinetics, single-cell transcriptomics, and in vitro micro-pattern experiments, we resolved the cellular dynamics taking place during postnatal skin epidermis expansion. Our data revealed that harmonious growth is engineered by a single population of developmental progenitors presenting a fixed fate imbalance of self-renewing divisions with an ever-decreasing proliferation rate. Single-cell RNA sequencing revealed that epidermal developmental progenitors form a more uniform population compared with adult stem and progenitor cells. Finally, we found that the spatial pattern of cell division orientation is dictated locally by the underlying collagen fiber orientation. Our results uncover a simple design principle of organ growth where progenitors and differentiated cells expand in harmony with their surrounding tissues.

    Cell 2020;181;3;604-620.e22

  • Accurate and fast identification of Campylobacter fetus in bulls by real-time PCR targeting a 16S rRNA gene sequence.

    Delpiazzo R, Barcellos M, Barros S, Betancor L, Fraga M, Gil J, Iraola G, Morsella C, Paolicchi F, Pérez R, Riet-Correa F, Sanguinetti M, Silva A, da Silva Silveira C and Calleros L

    Departamento de Salud de los Sistemas Pecuarios, Facultad de Veterinaria, Universidad de la República Oriental del Uruguay, Estación Experimental "Dr. Mario A. Cassinoni", Ruta 3 Km. 363, Paysandú, Uruguay.

    <i>Campylobacter fetus</i> is an important animal pathogen that causes infectious infertility, embryonic mortality and abortions in cattle and sheep flocks. There are two recognized subspecies associated with reproductive disorders in livestock: <i>Campylobacter fetus</i> subsp. <i>fetus</i> (Cff) and <i>Campylobacter fetus</i> subsp. <i>venerealis</i> (Cfv). Rapid and reliable detection of this pathogenic species in bulls is of utmost importance for disease control in dairy and beef herds as they are asymptomatic carriers. The aim of the present work was to assess the performance of a real-time PCR (qPCR) method for the diagnosis of <i>Campylobacter fetus</i> in samples from bulls, comparing it with culture and isolation methods. A total of 520 preputial samples were both cultured in Skirrow's medium and analyzed by qPCR. The estimated sensitivity of qPCR was 90.9% (95% CI, 69.4%-100%), and the specificity was 99.4% (95% CI, 98.6%-100%). The proportion of <i>C. fetus</i> positive individuals was 2.1% by isolation and 2.5% by qPCR. Isolates were identified by biochemical tests as Cfv (<i>n</i> = 9) and Cff (<i>n</i> = 2). Our findings support the use of qPCR for fast and accurate detection of <i>C. fetus</i> directly from field samples of preputial smegma of bulls. The qPCR method proved suitable for mass screening because it can be performed on pooled samples without losing accuracy and sensitivity.

    Veterinary and animal science 2020;11;100163
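
    The sensitivity and specificity figures in the abstract above follow from a standard 2x2 comparison of qPCR against culture. The sketch below shows that arithmetic. The four cell counts are not given in the abstract; they are an assumed reconstruction, chosen because they are consistent with the reported 520 samples, 11 isolates, and the published percentages.

```python
def diagnostic_performance(tp, fp, fn, tn):
    """Sensitivity and specificity of a test against a reference method."""
    sensitivity = tp / (tp + fn)  # reference-positive samples the test detects
    specificity = tn / (tn + fp)  # reference-negative samples the test clears
    return sensitivity, specificity


# Assumed 2x2 counts (qPCR vs. culture); reconstructed, not reported in the abstract.
TP, FN, FP, TN = 10, 1, 3, 506
TOTAL = TP + FN + FP + TN

sensitivity, specificity = diagnostic_performance(TP, FP, FN, TN)
culture_positive_rate = (TP + FN) / TOTAL
qpcr_positive_rate = (TP + FP) / TOTAL

print(f"samples: {TOTAL}")
print(f"sensitivity: {sensitivity:.1%}")
print(f"specificity: {specificity:.1%}")
print(f"positive by culture: {culture_positive_rate:.1%}")
print(f"positive by qPCR: {qpcr_positive_rate:.1%}")
```

    With only 11 culture-positive samples, a single discordant result shifts sensitivity by about nine percentage points, which is consistent with the wide confidence interval reported for that estimate.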

  • Biologically indeterminate yet ordered promiscuous gene expression in single medullary thymic epithelial cells.

    Dhalla F, Baran-Gale J, Maio S, Chappell L, Holländer GA and Ponting CP

    Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK.

    To induce central T-cell tolerance, medullary thymic epithelial cells (mTEC) collectively express most protein-coding genes, thereby presenting an extensive library of tissue-restricted antigens (TRAs). To resolve mTEC diversity and whether promiscuous gene expression (PGE) is stochastic or coordinated, we sequenced transcriptomes of 6,894 single mTEC, enriching for 1,795 rare cells expressing either of two TRAs, TSPAN8 or GP2. Transcriptional heterogeneity allowed partitioning of mTEC into 15 reproducible subpopulations representing distinct maturational trajectories, stages and subtypes, including novel mTEC subsets, such as chemokine-expressing and ciliated TEC, which warrant further characterisation. Unexpectedly, 50 modules of genes were robustly defined each showing patterns of co-expression within individual cells, which were mainly not explicable by chromosomal location, biological pathway or tissue specificity. Further, TSPAN8<sup>+</sup> and GP2<sup>+</sup> mTEC were randomly dispersed within thymic medullary islands. Consequently, these data support observations that PGE exhibits ordered co-expression, although mechanisms underlying this instruction remain biologically indeterminate. Ordered co-expression and random spatial distribution of a diverse range of TRAs likely enhance their presentation and encounter with passing thymocytes, while maintaining mTEC identity.

    Funded by: UK Research and Innovation | Medical Research Council (MRC): MC_UU_00007/15; Wellcome Trust; Wellcome Trust (WT): 105045/Z/14/Z, 109032/Z/15/Z

    The EMBO journal 2020;39;1;e101828

  • The changing epidemiology of carbapenemase-producing Klebsiella pneumoniae in Italy: toward polyclonal evolution with emergence of high-risk lineages.

    Di Pilato V, Errico G, Monaco M, Giani T, Del Grosso M, Antonelli A, David S, Lindh E, Camilli R, Aanensen DM, Rossolini GM, Pantosti A and AR-ISS Laboratory Study Group on carbapenemase-producing Klebsiella pneumoniae

    Department of Infectious Diseases, Istituto Superiore di Sanità, Rome, Italy.

    Background: Previous studies showed that the epidemic of carbapenem-resistant Klebsiella pneumoniae (CR-KP) observed in Italy since 2010 was sustained mostly by strains of clonal group (CG) 258 producing KPC-type carbapenemases. In the framework of the National Antibiotic-Resistance Surveillance (AR-ISS), a countrywide survey was conducted in 2016 to explore the evolution of the phenotypic and genotypic characteristics of CR-KP isolates.

    Methods: From March to July 2016, hospital laboratories participating in AR-ISS were requested to provide consecutive, non-duplicated CR-KP (meropenem and/or imipenem MIC >1 mg/L) from invasive infections. Antibiotic susceptibility was determined according to EUCAST recommendations. A WGS approach was adopted to characterize the isolates by investigating phylogeny, resistome and virulome.

    Results: Twenty-four laboratories provided 157 CR-KP isolates, of which 156 were confirmed as K. pneumoniae sensu stricto by WGS and found to carry at least one carbapenemase-encoding gene, corresponding in most cases (96.1%) to blaKPC. MLST- and SNP-based phylogeny revealed that 87.8% of the isolates clustered in four major lineages: CG258 (47.4%), with ST512 as the most common clone, CG307 (19.9%), ST101 (15.4%) and ST395 (5.1%). A close association was identified between lineages and antibiotic resistance phenotypes and genotypes, virulence traits and capsular types. Colistin resistance, mainly associated with mgrB mutations, was common in all major lineages except ST395.

    Conclusions: This WGS-based survey showed that, although CG258 remained the most common CR-KP lineage in Italy, a polyclonal population has emerged with the spread of the new high-risk lineages CG307, ST101 and ST395, while KPC remained the most common carbapenemase.

    The Journal of antimicrobial chemotherapy 2020

  • Single-cell atlas of the first intra-mammalian developmental stage of the human parasite Schistosoma mansoni.

    Diaz Soria CL, Lee J, Chong T, Coghlan A, Tracey A, Young MD, Andrews T, Hall C, Ng BL, Rawlinson K, Doyle SR, Leonard S, Lu Z, Bennett HM, Rinaldi G, Newmark PA and Berriman M

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK.

    Over 250 million people suffer from schistosomiasis, a tropical disease caused by parasitic flatworms known as schistosomes. Humans become infected by free-swimming, water-borne larvae, which penetrate the skin. The earliest intra-mammalian stage, called the schistosomulum, undergoes a series of developmental transitions. These changes are critical for the parasite to adapt to its new environment as it navigates through host tissues to reach its niche, where it will grow to reproductive maturity. Unravelling the mechanisms that drive intra-mammalian development requires knowledge of the spatial organisation and transcriptional dynamics of different cell types that comprise the schistosomulum body. To fill these important knowledge gaps, we perform single-cell RNA sequencing on two-day old schistosomula of Schistosoma mansoni. We identify likely gene expression profiles for muscle, nervous system, tegument, oesophageal gland, parenchymal/primordial gut cells, and stem cells. In addition, we validate cell markers for all these clusters by in situ hybridisation in schistosomula and adult parasites. Taken together, this study provides a comprehensive cell-type atlas for the early intra-mammalian stage of this devastating metazoan parasite.

    Funded by: Howard Hughes Medical Institute; Wellcome Trust: 107475/Z/15/Z, 206194

    Nature communications 2020;11;1;6411

  • Large genome-wide association study identifies three novel risk variants for restless legs syndrome.

    Didriksen M, Nawaz MS, Dowsett J, Bell S, Erikstrup C, Pedersen OB, Sørensen E, Jennum PJ, Burgdorf KS, Burchell B, Butterworth AS, Soranzo N, Rye DB, Trotti LM, Saini P, Stefansdottir L, Magnusson SH, Thorleifsson G, Sigmundsson T, Sigurdsson AP, Van Den Hurk K, Quee F, Tanck MWT, Ouwehand WH, Roberts DJ, Earley EJ, Busch MP, Mast AE, Page GP, Danesh J, Di Angelantonio E, Stefansson H, Ullum H and Stefansson K

    Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, 2100, Copenhagen, Denmark.

    Restless legs syndrome (RLS) is a common neurological sensorimotor disorder often described as an unpleasant sensation associated with an urge to move the legs. Here we report findings from a meta-analysis of genome-wide association studies of RLS including 480,982 Caucasians (cases = 10,257) and a follow-up sample of 24,977 (cases = 6,651). We confirm 19 of the 20 previously reported RLS sequence variants at 19 loci and report three novel RLS associations: rs112716420-G (OR = 1.25, P = 1.5 × 10<sup>-18</sup>), rs10068599-T (OR = 1.09, P = 6.9 × 10<sup>-10</sup>) and rs10769894-A (OR = 0.90, P = 9.4 × 10<sup>-14</sup>). At four of the 22 RLS loci, cis-eQTL analysis indicates a causal impact on gene expression. Using a polygenic risk score for RLS, we extended prior epidemiological findings implicating obesity, smoking and high alcohol intake as risk factors for RLS. To improve our understanding, with the purpose of seeking better treatments, more genetic studies yielding deeper insights into the disease biology are needed.

    Funded by: British Heart Foundation (BHF): RG/13/13/30194

    Communications biology 2020;3;1;703

  • Exclusive enteral nutrition mediates gut microbial and metabolic changes that are associated with remission in children with Crohn's disease.

    Diederen K, Li JV, Donachie GE, de Meij TG, de Waart DR, Hakvoort TBM, Kindermann A, Wagner J, Auyeung V, Te Velde AA, Heinsbroek SEM, Benninga MA, Kinross J, Walker AW, de Jonge WJ and Seppen J

    Department of Pediatric Gastroenterology and Nutrition, Amsterdam UMC, Location AMC & VUmc, Amsterdam, The Netherlands.

    A nutritional intervention, exclusive enteral nutrition (EEN), can induce remission in patients with pediatric Crohn's disease (CD). We characterized changes in the fecal microbiota and metabolome to identify the mechanism of EEN. Feces of 43 children were collected prior to, during and after EEN. Microbiota and metabolites were analyzed by 16S rRNA gene amplicon sequencing and NMR. Selected metabolites were evaluated in relevant model systems. Microbiota and metabolome of patients with CD and controls were different at all time points. Amino acids, primary bile salts, trimethylamine and cadaverine were elevated in patients with CD. Microbiota and metabolome differed between responders and non-responders prior to EEN. EEN decreased microbiota diversity and reduced amino acids, trimethylamine and cadaverine towards control levels. Patients with CD had reduced microbial metabolism of bile acids that partially normalized during EEN. Trimethylamine and cadaverine inhibited intestinal cell growth. Trimethylamine and cadaverine also inhibited LPS-stimulated TNF-alpha and IL-6 secretion by primary human monocytes. A diet rich in free amino acids worsened inflammation in the DSS model of intestinal inflammation. Trimethylamine, cadaverine, bile salts and amino acids could play a role in the mechanism by which EEN induces remission. Prior to EEN, microbiota and metabolome are different between responders and non-responders.

    Scientific reports 2020;10;1;18879

  • Identification of a conserved var gene in different Plasmodium falciparum strains.

    Dimonte S, Bruske EI, Enderes C, Otto TD, Turner L, Kremsner P and Frank M

    Institute of Tropical Medicine, University of Tuebingen, Wilhelmstr. 27, 72074, Tuebingen, Germany.

    Background: The multicopy var gene family of Plasmodium falciparum is of crucial importance for pathogenesis and antigenic variation. So far only var2csa, the var gene responsible for placental malaria, was found to be highly conserved among all P. falciparum strains. Here, a new conserved 3D7 var gene (PF3D7_0617400) is identified in several field isolates.

    Methods: DNA sequencing, transcriptional analysis, Cluster of Differentiation (CD) 36-receptor binding, indirect immunofluorescence with PF3D7_0617400-antibodies and quantification of surface reactivity against semi-immune sera were used to characterize an NF54 clone and a Gabonese field isolate clone (MOA C3) transcribing the gene. A population of 714 whole genome sequenced parasites was analysed to characterize the conservation of the locus in African and Asian isolates. The genetic diversity of two var2csa fragments was compared with the genetic diversity of 57 microsatellite fragments in field isolates.

    Results: PFGA01_060022400 was identified in a Gabonese parasite isolate (MOA) from a chronic infection and found to be 99% identical with PF3D7_0617400 of the 3D7 genome strain. Transcriptional analysis and immunofluorescence showed expression of the gene in an NF54 and a MOA clone but CD36 binding assays and surface reactivity to semi-immune sera differed markedly in the two clones. Long-read Pacific Biosciences whole genome sequencing showed that PFGA01_060022400 is located in the internal cluster of chromosome 6. The full length PFGA01_060022400 was detected in 36 of 714 P. falciparum isolates and 500 bp fragments were identified in more than 100 isolates. var2csa was in parts highly conserved (H<sub>e</sub> = 0) but in other parts as variable (H<sub>e</sub> = 0.86) as the 57 microsatellite markers (H<sub>e</sub> = 0.8).

    Conclusions: Individual var gene sequences exhibit conservation in the global parasite population suggesting that purifying selection may limit overall genetic diversity of some var genes. Notably, field and laboratory isolates expressing the same var gene exhibit markedly different phenotypes.

    Malaria journal 2020;19;1;194

  • Identifying proteins bound to native mitotic ESC chromosomes reveals chromatin repressors are important for compaction.

    Djeghloul D, Patel B, Kramer H, Dimond A, Whilding C, Brown K, Kohler AC, Feytout A, Veland N, Elliott J, Bharat TAM, Tarafder AK, Löwe J, Ng BL, Guo Y, Guy J, Huseyin MK, Klose RJ, Merkenschlager M and Fisher AG

    Lymphocyte Development Group, MRC London Institute of Medical Sciences, Imperial College London, Hammersmith Hospital Campus, Du Cane Road, London, W12 0NN, UK.

    Epigenetic information is transmitted from mother to daughter cells through mitosis. Here, to identify factors that might play a role in conveying epigenetic memory through cell division, we report on the isolation of unfixed, native chromosomes from metaphase-arrested cells using flow cytometry and perform LC-MS/MS to identify chromosome-bound proteins. A quantitative proteomic comparison between metaphase-arrested cell lysates and chromosome-sorted samples reveals a cohort of proteins that were significantly enriched on mitotic ESC chromosomes. These include pluripotency-associated transcription factors, repressive chromatin-modifiers such as PRC2 and DNA methyl-transferases, and proteins governing chromosome architecture. Deletion of PRC2, Dnmt1/3a/3b or Mecp2 in ESCs leads to an increase in the size of individual mitotic chromosomes, consistent with de-condensation. Similar results were obtained by the experimental cleavage of cohesin. Thus, we identify chromosome-bound factors in pluripotent stem cells during mitosis and reveal that PRC2, DNA methylation and Mecp2 are required to maintain chromosome compaction.

    Nature communications 2020;11;1;4118

  • Deciphering immunity at high plexity and resolution.

    Domínguez Conde C and Teichmann SA

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.

    Nature reviews. Immunology 2020;20;2;77-78

  • SpeS: A Novel Superantigen and Its Potential as a Vaccine Adjuvant against Strangles.

    Dominguez-Medina CC, Rash NL, Robillard S, Robinson C, Efstratiou A, Broughton K, Parkhill J, Holden MTG, Lopez-Alvarez MR, Paillot R and Waller AS

    Animal Health Trust, Lanwades Park, Kentford, Newmarket CB8 7UU, UK.

    Bacterial superantigens (sAgs) are powerful activators of the immune response that trigger unspecific T cell responses accompanied by the release of proinflammatory cytokines. <i>Streptococcus equi</i> (<i>S. equi</i>) and <i>Streptococcus zooepidemicus</i> (<i>S. zooepidemicus</i>) produce sAgs that play an important role in their ability to cause disease. Strangles, caused by <i>S. equi</i>, is one of the most common infectious diseases of horses worldwide. Here, we report the identification of a new sAg of <i>S. zooepidemicus</i>, SpeS, and show that mutation of the putative T cell receptor (TCR)-binding motif (YAY to IAY) abrogated TCR-binding, whilst maintaining interaction with major histocompatibility complex (MHC) class II molecules. The fusion of SpeS and SpeS<sup>Y39I</sup> to six <i>S. equi</i> surface proteins using two different peptide linkers was conducted to determine if MHC class II-binding properties were maintained. Proliferation assays, qPCR and flow cytometry analysis showed that SpeS<sup>Y39I</sup> and its fusion proteins induced less mitogenic activity and interferon gamma expression when compared to SpeS, whilst retaining Antigen-Presenting Cell (APC)-binding properties. Our data suggest that SpeS<sup>Y39I</sup>-surface protein fusions could be used to direct vaccine antigens towards antigen-presenting cells in vivo with the potential to enhance antigen presentation and improve immune responses.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/P002757/1

    International journal of molecular sciences 2020;21;12

  • 'Community evolution' - laboratory strains and pedigrees in the age of genomics.

    Dorman MJ and Thomson NR

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.

    Molecular microbiologists depend heavily on laboratory strains of bacteria, which are ubiquitous across the community of research groups working on a common organism. However, this presumes that strains present in different laboratories are in fact identical. Work on a culture of <i>Vibrio cholerae</i> preserved from 1916 provoked us to consider recent studies, which have used both classical genetics and next-generation sequencing to study the heterogeneity of laboratory strains. Here, we review and discuss mutations and phenotypic variation in supposedly isogenic reference strains of <i>V. cholerae</i> and <i>Escherichia coli</i>, and we propose that by virtue of the dissemination of laboratory strains across the world, a large 'community evolution' experiment is currently ongoing.

    Microbiology (Reading, England) 2020

  • Genomics of the Argentinian cholera epidemic elucidate the contrasting dynamics of epidemic and endemic Vibrio cholerae.

    Dorman MJ, Domman D, Poklepovich T, Tolley C, Zolezzi G, Kane L, Viñas MR, Panagópulo M, Moroni M, Binsztein N, Caffer MI, Clare S, Dougan G, Salmond GPC, Parkhill J, Campos J and Thomson NR

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.

    In order to control and eradicate epidemic cholera, we need to understand how epidemics begin, how they spread, and how they decline and eventually end. This requires extensive sampling of epidemic disease over time, alongside the background of endemic disease that may exist concurrently with the epidemic. The unique circumstances surrounding the Argentinian cholera epidemic of 1992-1998 presented an opportunity to do this. Here, we use 490 Argentinian V. cholerae genome sequences to characterise the variation within, and between, epidemic and endemic V. cholerae. We show that, during the 1992-1998 cholera epidemic, the invariant epidemic clone co-existed alongside highly diverse members of the Vibrio cholerae species in Argentina, and we contrast the clonality of epidemic V. cholerae with the background diversity of local endemic bacteria. Our findings refine and add nuance to our genomic definitions of epidemic and endemic cholera, and are of direct relevance to controlling current and future cholera epidemics.

    Funded by: NCATS NIH HHS: KL2 TR001448; Wellcome Trust: 206194

    Nature communications 2020;11;1;4918

  • Discordant bioinformatic predictions of antimicrobial resistance from whole-genome sequencing data of bacterial isolates: an inter-laboratory study.

    Doyle RM, O'Sullivan DM, Aller SD, Bruchmann S, Clark T, Coello Pelegrin A, Cormican M, Diez Benavente E, Ellington MJ, McGrath E, Motro Y, Phuong Thuy Nguyen T, Phelan J, Shaw LP, Stabler RA, van Belkum A, van Dorp L, Woodford N, Moran-Gilad J, Huggett JF and Harris KA

    Clinical Research Department, London School of Hygiene and Tropical Medicine, London, UK.

    Antimicrobial resistance (AMR) poses a threat to public health. Clinical microbiology laboratories typically rely on culturing bacteria for antimicrobial-susceptibility testing (AST). As the implementation costs and technical barriers fall, whole-genome sequencing (WGS) has emerged as a 'one-stop' test for epidemiological and predictive AST results. Few published comparisons exist for the myriad analytical pipelines used for predicting AMR. To address this, we performed an inter-laboratory study providing sets of participating researchers with identical short-read WGS data from clinical isolates, allowing us to assess the reproducibility of the bioinformatic prediction of AMR between participants, and identify problem cases and factors that lead to discordant results. We produced ten WGS datasets of varying quality from cultured carbapenem-resistant organisms obtained from clinical samples sequenced on either an Illumina NextSeq or HiSeq instrument. Nine participating teams ('participants') were provided these sequence data without any other contextual information. Each participant used their choice of pipeline to determine the species, the presence of resistance-associated genes, and to predict susceptibility or resistance to amikacin, gentamicin, ciprofloxacin and cefotaxime. We found participants predicted different numbers of AMR-associated genes and different gene variants from the same clinical samples. The quality of the sequence data, choice of bioinformatic pipeline and interpretation of the results all contributed to discordance between participants. Although much of the inaccurate gene variant annotation did not affect genotypic resistance predictions, we observed low specificity when compared to phenotypic AST results, but this improved in samples with higher read depths. Had the results been used to predict AST and guide treatment, a different antibiotic would have been recommended for each isolate by at least one participant. These challenges, at the final analytical stage of using WGS to predict AMR, suggest the need for refinements when using this technology in clinical settings. Comprehensive public resistance sequence databases, full recommendations on sequence data quality and standardization in the comparisons between genotype and resistance phenotypes will all play a fundamental role in the successful implementation of AST prediction using WGS in clinical microbiology laboratories.

    Microbial genomics 2020;6;2

  • Genomic and transcriptomic variation defines the chromosome-scale assembly of Haemonchus contortus, a model gastrointestinal worm.

    Doyle SR, Tracey A, Laing R, Holroyd N, Bartley D, Bazant W, Beasley H, Beech R, Britton C, Brooks K, Chaudhry U, Maitland K, Martinelli A, Noonan JD, Paulini M, Quail MA, Redman E, Rodgers FH, Sallé G, Shabbir MZ, Sankaranarayanan G, Wit J, Howe KL, Sargison N, Devaney E, Berriman M, Gilleard JS and Cotton JA

    Wellcome Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK.

    Haemonchus contortus is a globally distributed and economically important gastrointestinal pathogen of small ruminants and has become a key nematode model for studying anthelmintic resistance and other parasite-specific traits among a wider group of parasites including major human pathogens. Here, we report using PacBio long-read and OpGen and 10X Genomics long-molecule methods to generate a highly contiguous 283.4 Mbp chromosome-scale genome assembly including a resolved sex chromosome for the MHco3(ISE).N1 isolate. We show a remarkable pattern of conservation of chromosome content with Caenorhabditis elegans, but almost no conservation of gene order. Short and long-read transcriptome sequencing allowed us to define coordinated transcriptional regulation throughout the parasite's life cycle and refine our understanding of cis- and trans-splicing. Finally, we provide a comprehensive picture of chromosome-wide genetic diversity both within a single isolate and globally. These data provide a high-quality comparison for understanding the evolution and genomics of Caenorhabditis and other nematodes and extend the experimental tractability of this model parasitic nematode in understanding helminth biology, drug discovery and vaccine development, as well as important adaptive traits such as drug resistance.

    Funded by: RCUK | Biotechnology and Biological Sciences Research Council (BBSRC): BB/K020048/1, BB/M003949/1, BB/P024610/1; Wellcome Trust (Wellcome): 067811, WT206194

    Communications biology 2020;3;1;656

  • Molecular Evolution of IDH Wild-Type Glioblastomas Treated With Standard of Care Affects Survival and Design of Precision Medicine Trials: A Report From the EORTC 1542 Study.

    Draaisma K, Chatzipli A, Taphoorn M, Kerkhof M, Weyerbrock A, Sanson M, Hoeben A, Lukacova S, Lombardi G, Leenstra S, Hanse M, Fleischeuer R, Watts C, McAbee J, Angelopoulos N, Gorlia T, Golfinopoulos V, Kros JM, Verhaak RGW, Bours V, van den Bent MJ, McDermott U, Robe PA and French PJ

    Erasmus University Medical Center, Rotterdam, the Netherlands.

    Purpose: Precision medicine trials in glioblastoma (GBM) are often conducted at tumor recurrence. However, second surgeries for recurrent GBM are not routinely performed, and therefore, molecular data for trial inclusion are predominantly derived from the primary sample. This study aims to establish whether molecular targets change during tumor progression and, if so, whether this affects precision medicine trial design.

    Materials and methods: We collected 186 pairs of primary-recurrent GBM samples from patients receiving chemoradiotherapy with temozolomide and sequenced approximately 300 cancer genes. <i>MGMT</i>, <i>TERT</i>, and <i>EGFRvIII</i> status was individually determined.

    Results: The molecular profile of our cohort was identical to that of other GBM cohorts (<i>IDH</i> wild-type [WT], 95%; <i>EGFR</i> amplified, approximately 50%), indicating that patients amenable to second surgery do not represent a specific molecular subtype. Molecular events in <i>IDH</i> WT GBMs were stable in approximately 80% of events, but changes in mutation status were observed for all examined genes (range, approximately 90% and 60% for <i>TERT</i> and <i>EGFR</i> mutations, respectively), and such changes strongly affected targeted trial size and design. A similar pattern of GBM driver instability was observed within <i>MGMT</i> promoter-methylated tumors. <i>MGMT</i> promoter methylation status remained prognostic at tumor recurrence. The observation that hypermutation at GBM recurrence was rare (8%) and not correlated with outcome was relevant for immunotherapy-based treatments.

    Conclusion: This large cohort of matched primary and recurrent <i>IDH</i> WT tumors establishes the frequency of GBM driver instability after chemoradiotherapy with temozolomide. This allows per gene or pathway calculation of trial size at tumor recurrence, using molecular data of the primary tumor only. We also identify genes for which repeat surgery is necessary because of low mutation retention rate.

    Journal of clinical oncology : official journal of the American Society of Clinical Oncology 2020;38;1;81-99

  • Targeting DNA Damage Response and Replication Stress in Pancreatic Cancer.

    Dreyer SB, Upstill-Goddard R, Paulus-Hock V, Paris C, Lampraki EM, Dray E, Serrels B, Caligiuri G, Rebus S, Plenker D, Galluzzo Z, Brunton H, Cunningham R, Tesson M, Nourse C, Bailey UM, Jones M, Moran-Jones K, Wright DW, Duthie F, Oien K, Evers L, McKay CJ, McGregor GA, Gulati A, Brough R, Bajrami I, Pettitt S, Dziubinski ML, Candido J, Balkwill F, Barry ST, Grützmann R, Rahib L, Glasgow Precision Oncology Laboratory, Australian Pancreatic Cancer Genome Initiative, Johns A, Pajic M, Froeling FEM, Beer P, Musgrove EA, Petersen GM, Ashworth A, Frame MC, Crawford HC, Simeone DM, Lord C, Mukhopadhyay D, Pilarsky C, Tuveson DA, Cooke SL, Jamieson NB, Morton JP, Sansom OJ, Bailey PJ, Biankin AV and Chang DK

    Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow, Garscube Estate, Switchback Road, Bearsden, Glasgow, Scotland G61 1QH, United Kingdom; West of Scotland Pancreatic Unit, Glasgow Royal Infirmary, Glasgow G31 2ER, United Kingdom.

    Background and aims: Continuing recalcitrance to therapy cements pancreatic cancer (PC) as the most lethal malignancy, which is set to become the second leading cause of cancer death in our society. The study aim was to investigate the association between DNA damage response (DDR), replication stress and novel therapeutic response in PC to develop a biomarker driven therapeutic strategy targeting DDR and replication stress in PC.

    Methods: We interrogated the transcriptome, genome, proteome and functional characteristics of 61 novel PC patient-derived cell lines to define novel therapeutic strategies targeting DDR and replication stress. Validation was done in patient derived xenografts and human PC organoids.

    Results: Patient-derived cell lines faithfully recapitulate the epithelial component of pancreatic tumors including previously described molecular subtypes. Biomarkers of DDR deficiency, including a novel signature of homologous recombination deficiency, co-segregate with response to platinum (P < 0.001) and PARP inhibitor therapy (P < 0.001) in vitro and in vivo. We generated a novel signature of replication stress which predicts response to ATR (P < 0.018) and WEE1 inhibitor (P < 0.029) treatment in both cell lines and human PC organoids. Replication stress was enriched in the squamous subtype of PC (P < 0.001) but not associated with DDR deficiency.

    Conclusions: Replication stress and DDR deficiency are independent of each other, creating opportunities for therapy in DDR proficient PC, and post-platinum therapy.

    Gastroenterology 2020

  • Heterozygous Variants in KDM4B Lead to Global Developmental Delay and Neuroanatomical Defects.

    Duncan AR, Vitobello A, Collins SC, Vancollie VE, Lelliott CJ, Rodan L, Shi J, Seman AR, Agolini E, Novelli A, Prontera P, Guillen Sacoto MJ, Santiago-Sim T, Trimouille A, Goizet C, Nizon M, Bruel AL, Philippe C, Grant PE, Wojcik MH, Stoler J, Genetti CA, van Dooren MF, Maas SM, Alders M, Faivre L, Sorlin A, Yoon G, Yalcin B and Agrawal PB

    Division of Newborn Medicine, Department of Pediatrics, Boston Children's Hospital, Boston, MA 02115, USA.

    KDM4B is a lysine-specific demethylase with a preferential activity on H3K9 tri/di-methylation (H3K9me3/2)-modified histones. H3K9 tri/di-demethylation is an important epigenetic mechanism responsible for silencing of gene expression in animal development and cancer. However, the role of KDM4B in human development is still poorly characterized. Through international data sharing, we gathered a cohort of nine individuals with mono-allelic de novo or inherited variants in KDM4B. All individuals presented with dysmorphic features and global developmental delay (GDD) with language and motor skills most affected. Three individuals had a history of seizures, and four had anomalies on brain imaging ranging from agenesis of the corpus callosum with hydrocephalus to cystic formations, abnormal hippocampi, and polymicrogyria. In mice, lysine demethylase 4B is expressed during brain development with high levels in the hippocampus, a region important for learning and memory. To understand how KDM4B variants can lead to GDD in humans, we assessed the effect of KDM4B disruption on brain anatomy and behavior through an in vivo heterozygous mouse model (Kdm4b<sup>+/-</sup>), focusing on neuroanatomical changes. In mutant mice, the total brain volume was significantly reduced with decreased size of the hippocampal dentate gyrus, partial agenesis of the corpus callosum, and ventriculomegaly. This report demonstrates that variants in KDM4B are associated with GDD/intellectual disability and neuroanatomical defects. Our findings suggest that KDM4B variation leads to a chromatinopathy, broadening the spectrum of this group of Mendelian disorders caused by alterations in epigenetic machinery.

    American journal of human genetics 2020

  • Organoids - New Models for Host-Helminth Interactions.

    Duque-Correa MA, Maizels RM, Grencis RK and Berriman M

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    Organoids are multicellular culture systems that replicate tissue architecture and function, and are increasingly used as models of viral, bacterial, and protozoan infections. Organoids have great potential to improve our current understanding of helminth interactions with their hosts and to replace or reduce the dependence on using animal models. In this review, we discuss the applicability of this technology to helminth infection research, including strategies of co-culture of helminths or their products with organoids and the challenges, advantages, and drawbacks of the use of organoids for these studies. We also explore how complementing organoid systems with other cell types and components may allow more complex models to be generated in the future to further investigate helminth-host interactions.

    Funded by: National Centre for the Replacement, Refinement and Reduction of Animals in Research: NC/P001521/1

    Trends in parasitology 2020;36;2;170-181

  • Development of caecaloids to study host-pathogen interactions: new insights into immunoregulatory functions of Trichuris muris extracellular vesicles in the caecum.

    Duque-Correa MA, Schreiber F, Rodgers FH, Goulding D, Forrest S, White R, Buck A, Grencis RK and Berriman M

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.

    The caecum, an intestinal appendage in the junction of the small and large intestines, displays a unique epithelium that serves as an exclusive niche for a range of pathogens including whipworms (Trichuris spp.). While protocols to grow organoids from small intestine (enteroids) and colon (colonoids) exist, the conditions to culture organoids from the caecum have yet to be described. Here, we report methods to grow, differentiate and characterise mouse adult stem cell-derived caecal organoids, termed caecaloids. We compare the cellular composition of caecaloids with that of enteroids, identifying differences in intestinal epithelial cell (IEC) populations that mimic those found in the caecum and small intestine. The remarkable similarity in the IEC composition and spatial conformation of caecaloids and their tissue of origin enables their use as an in vitro model to study host interactions with important caecal pathogens. Thus, exploiting this system, we investigated the responses of caecal IECs to extracellular vesicles (EVs) secreted/excreted by the intracellular helminth Trichuris muris. Our findings reveal novel immunoregulatory effects of whipworm EVs on the caecal epithelium, including the downregulation of responses to nucleic acid recognition and type-I interferon (IFN) signalling.

    International journal for parasitology 2020

  • Population genomic evidence that human and animal infections in Africa come from the same populations of Dracunculus medinensis.

    Durrant C, Thiele EA, Holroyd N, Doyle SR, Sallé G, Tracey A, Sankaranarayanan G, Lotkowska ME, Bennett HM, Huckvale T, Abdellah Z, Tchindebet O, Wossen M, Logora MSY, Coulibaly CO, Weiss A, Schulte-Hostedde AI, Foster JM, Cleveland CA, Yabsley MJ, Ruiz-Tiben E, Berriman M, Eberhard ML and Cotton JA

    Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom.

    Background: Guinea worm (Dracunculus medinensis) was historically one of the major parasites of humans and has been known since antiquity. Now, Guinea worm is on the brink of eradication, as efforts to interrupt transmission have reduced the annual burden of disease from millions of infections per year in the 1980s to only 54 human cases reported globally in 2019. Despite the enormous success of eradication efforts to date, one complication has arisen. Over the last few years, hundreds of dogs have been found infected with this previously apparently anthroponotic parasite, almost all in Chad. Moreover, the relative numbers of infections in humans and dogs suggest that dogs are currently the principal reservoir of infection and key to maintaining transmission in that country.

    Principal findings: In an effort to shed light on this peculiar epidemiology of Guinea worm in Chad, we have sequenced and compared the genomes of worms from dog, human and other animal infections. Confirming previous work with other molecular markers, we show that all of these worms are D. medinensis and that the same population of worms is causing both infections, confirm the suspected transmission between host species, and detect signs of a population bottleneck due to the eradication efforts. The diversity of worms in Chad appears to exclude the possibility that there were no, or very few, worms present in the country during a 10-year absence of reported cases.

    Conclusions: This work reinforces the importance of adequate surveillance of both human and dog populations in the Guinea worm eradication campaign and suggests that control programs aiming to interrupt disease transmission should stay aware of the possible emergence of unusual epidemiology as pathogens approach elimination.

    PLoS neglected tropical diseases 2020;14;11;e0008623

  • Project Score database: a resource for investigating cancer cell dependencies and prioritizing therapeutic targets.

    Dwane L, Behan FM, Gonçalves E, Lightfoot H, Yang W, van der Meer D, Shepherd R, Pignatelli M, Iorio F and Garnett MJ

    Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

    CRISPR genetic screens in cancer cell models are a powerful tool to elucidate oncogenic mechanisms and to identify promising therapeutic targets. The Project Score database uses genome-wide CRISPR-Cas9 dropout screening data in hundreds of highly annotated cancer cell models to identify genes required for cell fitness and prioritize novel oncology targets. The Project Score database currently allows users to investigate the fitness effect of 18 009 genes tested across 323 cancer cell models. Through interactive interfaces, users can investigate data by selecting a specific gene, cancer cell model or tissue type, as well as browsing all gene fitness scores. Additionally, users can identify and rank candidate drug targets based on an established oncology target prioritization pipeline, incorporating genetic biomarkers and clinical datasets for each target, and including suitability for drug development based on pharmaceutical tractability. Data are freely available and downloadable. To enhance analyses, links to other key resources including Open Targets, COSMIC, the Cell Model Passports, UniProt and the Genomics of Drug Sensitivity in Cancer are provided. The Project Score database is a valuable new tool for investigating genetic dependencies in cancer cells and the identification of candidate oncology targets.

    Nucleic acids research 2020

  • Synergistic Targeting of FLT3 Mutations in AML via Combined Menin-MLL and FLT3 Inhibition.

    Dzama MM, Steiner M, Rausch J, Sasca D, Schönfeld J, Kunz K, Taubert MC, McGeehan GM, Chen CW, Mupo A, Hähnel PS, Theobald M, Kindler T, Koche RP, Vassiliou GS, Armstrong SA and Kühn MWM

    University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.

    The interaction of Menin (MEN1) and MLL (MLL1, KMT2A) is a dependency and potential therapeutic opportunity against NPM1-mutant (NPM1mut) and MLL-rearranged (MLL-r) leukemias. Concomitant activating driver mutations in the gene encoding the tyrosine kinase FLT3 occur in both leukemias and are particularly common in the NPM1mut subtype. Transcriptional profiling upon pharmacological inhibition of the Menin-MLL complex revealed specific changes in gene expression with downregulation of the MEIS1 transcription-factor and its transcriptional target gene FLT3 being most pronounced. Combining Menin-MLL-inhibition with specific small-molecule kinase inhibitors of FLT3-phosphorylation resulted in a significantly superior reduction of phosphorylated FLT3 and transcriptional suppression of genes downstream of FLT3 signaling. The drug combination induced synergistic inhibition of proliferation as well as enhanced apoptosis and differentiation compared to single-drug treatment in models of human and murine NPM1mut and MLL-r leukemias harboring an FLT3 mutation. Primary AML cells harvested from patients with NPM1mutFLT3mut AML showed significantly better responses to combined Menin and FLT3-inhibition than to single-drug or vehicle control treatment, while AML cells with wildtype NPM1, MLL, and FLT3 were not affected by either of the two drugs. In vivo treatment of leukemic animals with MLL-r FLT3mut leukemia reduced leukemia burden significantly and prolonged survival compared to the single-drug and vehicle control groups. Our data suggest that combined Menin-MLL and FLT3-inhibition represents a novel and promising therapeutic strategy for patients with NPM1mut or MLL-r leukemia and concurrent FLT3 mutation.

    Blood 2020

  • Epigenetic priming by Dppa2 and 4 in pluripotency facilitates multi-lineage commitment.

    Eckersley-Maslin MA, Parry A, Blotenburg M, Krueger C, Ito Y, Franklin VNR, Narita M, D'Santos CS and Reik W

    Epigenetics Programme, Babraham Institute, Cambridge, UK.

    How the epigenetic landscape is established in development is still being elucidated. Here, we uncover developmental pluripotency associated 2 and 4 (DPPA2/4) as epigenetic priming factors that establish a permissive epigenetic landscape at a subset of developmentally important bivalent promoters characterized by low expression and poised RNA-polymerase. Differentiation assays reveal that Dppa2/4 double knockout mouse embryonic stem cells fail to exit pluripotency and differentiate efficiently. DPPA2/4 bind both H3K4me3-marked and bivalent gene promoters and associate with COMPASS- and Polycomb-bound chromatin. Comparing knockout and inducible knockdown systems, we find that acute depletion of DPPA2/4 results in rapid loss of H3K4me3 from key bivalent genes, while H3K27me3 is initially more stable but lost following extended culture. Consequently, upon DPPA2/4 depletion, these promoters gain DNA methylation and are unable to be activated upon differentiation. Our findings uncover a novel epigenetic priming mechanism at developmental promoters, poising them for future lineage-specific activation.

    Nature structural & molecular biology 2020

  • Patient-specific logic models of signaling pathways from screenings on cancer biopsies to prioritize personalized combination therapies.

    Eduati F, Jaaks P, Wappler J, Cramer T, Merten CA, Garnett MJ and Saez-Rodriguez J

    European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany.

    Mechanistic modeling of signaling pathways mediating patient-specific response to therapy can help to unveil resistance mechanisms and improve therapeutic strategies. Yet, creating such models for patients, in particular for solid malignancies, is challenging. A major hurdle in building these models is the limited material available, which precludes the generation of large-scale perturbation data. Here, we present an approach that couples ex vivo high-throughput screenings of cancer biopsies using microfluidics with logic-based modeling to generate patient-specific dynamic models of extrinsic and intrinsic apoptosis signaling pathways. We used the resulting models to investigate heterogeneity in pancreatic cancer patients, showing dissimilarities especially in the PI3K-Akt pathway. Variation in model parameters reflected the different tumor stages well. Finally, we used our dynamic models to efficaciously predict new personalized combinatorial treatments. Our results suggest that our combination of microfluidic experiments and mathematical modeling can be a novel tool toward cancer precision medicine.

    Molecular systems biology 2020;16;2;e8664

  • Computational methods for single-cell omics across modalities.

    Efremova M and Teichmann SA

    Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK.

    Nature methods 2020;17;1;14-17

  • CellPhoneDB: inferring cell-cell communication from combined expression of multi-subunit ligand-receptor complexes.

    Efremova M, Vento-Tormo M, Teichmann SA and Vento-Tormo R

    Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK.

    Cell-cell communication mediated by ligand-receptor complexes is critical to coordinating diverse biological processes, such as development, differentiation and inflammation. To investigate how the context-dependent crosstalk of different cell types enables physiological processes to proceed, we developed CellPhoneDB, a novel repository of ligands, receptors and their interactions. In contrast to other repositories, our database takes into account the subunit architecture of both ligands and receptors, representing heteromeric complexes accurately. We integrated our resource with a statistical framework that predicts enriched cellular interactions between two cell types from single-cell transcriptomics data. Here, we outline the structure and content of our repository, provide procedures for inferring cell-cell communication networks from single-cell RNA sequencing data and present a practical step-by-step guide to help implement the protocol. CellPhoneDB v.2.0 is an updated version of our resource that incorporates additional functionalities to enable users to introduce new interacting molecules and reduces the time and resources needed to interrogate large datasets. CellPhoneDB v.2.0 is publicly available, both as code and as a user-friendly web interface; it can be used by both experts and researchers with little experience in computational genomics. In our protocol, we demonstrate how to evaluate meaningful biological interactions with CellPhoneDB v.2.0 using published datasets. This protocol typically takes ~2 h to complete, from installation to statistical analysis and visualization, for a dataset of ~10 GB, 10,000 cells and 19 cell types, and using five threads.

    Funded by: Wellcome Trust (Wellcome): 211276/Z/18/Z, WT206194

    Nature protocols 2020

  • Immunology in the Era of Single-Cell Technologies.

    Efremova M, Vento-Tormo R, Park JE, Teichmann SA and James KR

    Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, United Kingdom.

    Immune cells are characterized by diversity, specificity, plasticity, and adaptability, properties that enable them to contribute to homeostasis and respond specifically and dynamically to the many threats encountered by the body. Single-cell technologies, including the assessment of transcriptomics, genomics, and proteomics at the level of individual cells, are ideally suited to studying these properties of immune cells. In this review we discuss the benefits of adopting single-cell approaches in studying underappreciated qualities of immune cells and highlight examples where these technologies have been critical to advancing our understanding of the immune system in health and disease. Expected final online publication date for the <i>Annual Review of Immunology</i>, Volume 38 is April 26, 2020.

    Annual review of immunology 2020

  • A p53-Dependent Checkpoint Induced upon DNA Damage Alters Cell Fate during hiPSC Differentiation.

    Eldridge CB, Allen FJ, Crisp A, Grandy RA, Vallier L and Sale JE

    MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK.

    The ability of human induced pluripotent stem cells (hiPSCs) to differentiate in vitro to each of the three germ layer lineages has made them an important model of early human development and a tool for tissue engineering. However, the factors that disturb the intricate transcriptional choreography of differentiation remain incompletely understood. Here, we uncover a critical time window during which DNA damage significantly reduces the efficiency and fidelity with which hiPSCs differentiate to definitive endoderm. DNA damage prevents the normal reduction of p53 levels as cells pass through the epithelial-to-mesenchymal transition, diverting the transcriptional program toward mesoderm without induction of an apoptotic response. In contrast, TP53-deficient cells differentiate to endoderm with high efficiency after DNA damage, suggesting that p53 enforces a "differentiation checkpoint" in early endoderm differentiation that alters cell fate in response to DNA damage.

    Stem cell reports 2020

  • An amphipathic peptide with antibiotic activity against multidrug-resistant Gram-negative bacteria.

    Elliott AG, Huang JX, Neve S, Zuegg J, Edwards IA, Cain AK, Boinett CJ, Barquist L, Lundberg CV, Steen J, Butler MS, Mobli M, Porter KM, Blaskovich MAT, Lociuro S, Strandh M and Cooper MA

    Centre for Superbug Solutions, Institute for Molecular Bioscience, The University of Queensland, Queensland, QLD, 4072, Australia.

    Peptide antibiotics are an abundant and synthetically tractable source of molecular diversity, but they are often cationic and can be cytotoxic, nephrotoxic and/or ototoxic, which has limited their clinical development. Here we report structure-guided optimization of an amphipathic peptide, arenicin-3, originally isolated from the marine lugworm Arenicola marina. The peptide induces bacterial membrane permeability and ATP release, with serial passaging resulting in a mutation in mlaC, a phospholipid transport gene. Structure-based design led to AA139, an antibiotic with broad-spectrum in vitro activity against multidrug-resistant and extensively drug-resistant bacteria, including ESBL, carbapenem- and colistin-resistant clinical isolates. The antibiotic induces a 3-4 log reduction in bacterial burden in mouse models of peritonitis, pneumonia and urinary tract infection. Cytotoxicity and haemolysis of the progenitor peptide are ameliorated with AA139, and the 'no observable adverse effect level' (NOAEL) dose in mice is ~10-fold greater than the dose generally required for efficacy in the infection models.

    Funded by: Medical Research Council: G1100100/1; Wellcome Trust: WT098051, WT104797/Z/14/Z

    Nature communications 2020;11;1;3184

  • Reliable detection of somatic mutations in solid tissues by laser-capture microdissection and low-input DNA sequencing.

    Ellis P, Moore L, Sanders MA, Butler TM, Brunner SF, Lee-Six H, Osborne R, Farr B, Coorens THH, Lawson ARJ, Cagan A, Stratton MR, Martincorena I and Campbell PJ

    Cancer, Ageing and Somatic Mutation (CASM), Wellcome Sanger Institute, Hinxton, UK.

    Somatic mutations accumulate in healthy tissues as we age, giving rise to cancer and potentially contributing to ageing. To study somatic mutations in non-neoplastic tissues, we developed a series of protocols to sequence the genomes of small populations of cells isolated from histological sections. Here, we describe a complete workflow that combines laser-capture microdissection (LCM) with low-input genome sequencing, while circumventing the use of whole-genome amplification (WGA). The protocol is subdivided broadly into four steps: tissue processing, LCM, low-input library generation and mutation calling and filtering. The tissue processing and LCM steps are provided as general guidelines that might require tailoring based on the specific requirements of the study at hand. Our protocol for low-input library generation uses enzymatic rather than acoustic fragmentation to generate WGA-free whole-genome libraries. Finally, the mutation calling and filtering strategy has been adapted from previously published protocols to account for artifacts introduced via library creation. To date, we have used this workflow to perform targeted and whole-genome sequencing of small populations of cells (typically 100-1,000 cells) in thousands of microbiopsies from a wide range of human tissues. The low-input DNA protocol is designed to be compatible with liquid handling platforms and make use of equipment and expertise standard to any core sequencing facility. However, obtaining low-input DNA material via LCM requires specialized equipment and expertise. The entire protocol from tissue reception through whole-genome library generation can be accomplished in as little as 1 week, although 2-3 weeks would be a more typical turnaround time.

    Funded by: Cancer Research UK (CRUK): C20/A20917, C57387/A21777, C98/A24032; Nederlandse Organisatie voor Wetenschappelijk Onderzoek (Netherlands Organisation for Scientific Research): 019.153LW.038; Pathological Society of Great Britain and Ireland: 1175; Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (Swiss National Science Foundation): P2SKP3-171753, P400PB-180790; Wellcome Trust (Wellcome): WT088340MA

    Nature protocols 2020

  • Single-Cell Sequencing of Developing Human Gut Reveals Transcriptional Links to Childhood Crohn's Disease.

    Elmentaite R, Ross ADB, Roberts K, James KR, Ortmann D, Gomes T, Nayak K, Tuck L, Pritchard S, Bayraktar OA, Heuschkel R, Vallier L, Teichmann SA and Zilbauer M

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.

    Human gut development requires the orchestrated interaction of differentiating cell types. Here, we generate an in-depth single-cell map of the developing human intestine at 6-10 weeks post-conception. Our analysis reveals the transcriptional profile of cycling epithelial precursor cells, distinct from LGR5-expressing cells. We propose that these cells may contribute to differentiated cell subsets via the generation of LGR5-expressing stem cells and receive signals from surrounding mesenchymal cells. Furthermore, we draw parallels between the transcriptomes of ex vivo tissues and in vitro fetal organoids, revealing the maturation of organoid cultures in a dish. Lastly, we compare scRNA-seq profiles from pediatric Crohn's disease epithelium alongside matched healthy controls to reveal disease-associated changes in the epithelial composition. Contrasting these with the fetal profiles reveals the re-activation of fetal transcription factors in Crohn's disease. Our study provides an accompanying online resource and underscores the importance of unraveling fetal development in understanding disease.

    Funded by: Medical Research Council: MR/T001917/1

    Developmental cell 2020;55;6;771-783.e5

  • A missense variant in Mitochondrial Amidoxime Reducing Component 1 gene and protection against liver disease.

    Emdin CA, Haas ME, Khera AV, Aragam K, Chaffin M, Klarin D, Hindy G, Jiang L, Wei WQ, Feng Q, Karjalainen J, Havulinna A, Kiiskinen T, Bick A, Ardissino D, Wilson JG, Schunkert H, McPherson R, Watkins H, Elosua R, Bown MJ, Samani NJ, Baber U, Erdmann J, Gupta N, Danesh J, Saleheen D, Chang KM, Vujkovic M, Voight B, Damrauer S, Lynch J, Kaplan D, Serper M, Tsao P, Million Veteran Program, Mercader J, Hanis C, Daly M, Denny J, Gabriel S and Kathiresan S

    Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts, United States of America.

    Analyzing 12,361 all-cause cirrhosis cases and 790,095 controls from eight cohorts, we identify a common missense variant in the Mitochondrial Amidoxime Reducing Component 1 gene (MARC1 p.A165T) that associates with protection from all-cause cirrhosis (OR 0.91, p = 2.3 × 10<sup>-11</sup>). This same variant also associates with lower levels of hepatic fat on computed tomographic imaging and lower odds of physician-diagnosed fatty liver as well as lower blood levels of alanine transaminase (-0.025 SD, p = 3.7 × 10<sup>-43</sup>), alkaline phosphatase (-0.025 SD, p = 1.2 × 10<sup>-37</sup>), total cholesterol (-0.030 SD, p = 1.9 × 10<sup>-36</sup>) and LDL cholesterol (-0.027 SD, p = 5.1 × 10<sup>-30</sup>). We identified a series of additional MARC1 alleles (low-frequency missense p.M187K and rare protein-truncating p.R200Ter) that also associated with lower cholesterol levels, liver enzyme levels and reduced risk of cirrhosis (0 cirrhosis cases for 238 R200Ter carriers versus 17,046 cases of cirrhosis among 759,027 non-carriers, p = 0.04) suggesting that deficiency of the MARC1 enzyme may lower blood cholesterol levels and protect against cirrhosis.

    PLoS genetics 2020;16;4;e1008629

  • Single-cell transcriptomics of alloreactive CD4+ T cells over time reveals divergent fates during gut graft-versus-host disease.

    Engel JA, Lee HJ, Williams CG, Kuns R, Olver S, Lansink LI, Soon MS, Andersen SB, Powell JE, Svensson V, Teichmann SA, Hill GR, Varelias A, Koyama M and Haque A

    QIMR Berghofer Medical Research Institute, Herston, Brisbane, Queensland, Australia.

    Acute gastrointestinal (GI) graft-versus-host disease (GVHD) is a primary determinant of mortality after allogeneic hematopoietic stem cell transplantation (alloSCT). The condition is mediated by alloreactive donor CD4+ T cells that differentiate into pathogenic subsets expressing IFN-γ, IL-17A, or GM-CSF and is regulated by subsets expressing IL-10 and/or Foxp3. Developmental relationships between Th cell states during priming in mesenteric lymph nodes (mLNs) and effector function in the GI tract remain undefined at genome scale. We applied scRNA-Seq and computational modeling to a mouse model of donor DC-mediated GVHD exacerbation, creating an atlas of putative CD4+ T cell differentiation pathways in vivo. Computational trajectory inference suggested emergence of pathogenic and regulatory states along a single developmental trajectory in mLNs. Importantly, we inferred an unexpected second trajectory, characterized by little proliferation or cytokine expression, reduced glycolysis, and high tcf7 expression. TCF1hi cells upregulated α4β7 before gut migration and failed to express cytokines. These cells exhibited recall potential and plasticity following secondary transplantation, including cytokine or Foxp3 expression, but reduced T cell factor 1 (TCF1). Thus, scRNA-Seq suggested divergence of alloreactive CD4+ T cells into quiescent and effector states during gut GVHD exacerbation by donor DC, reflecting putative heterogeneous priming in vivo. These findings, which are potentially the first at a single-cell level during GVHD over time, may assist in examination of T cell differentiation in patients undergoing alloSCT.

    Funded by: Wellcome Trust

    JCI insight 2020;5;13

  • Guidelines for reporting single-cell RNA-seq experiments.

    Füllgrabe A, George N, Green M, Nejad P, Aronow B, Fexova SK, Fischer C, Freeberg MA, Huerta L, Morrison N, Scheuermann RH, Taylor D, Vasilevsky N, Clarke L, Gehlenborg N, Kent J, Marioni J, Teichmann S, Brazma A and Papatheodorou I

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

    Funded by: California Institute for Regenerative Medicine (CIRM): GC1R-06673-B; U.S. Department of Health & Human Services | National Institutes of Health (NIH): OT2OD026677; Wellcome Trust (Wellcome): 108437/Z/15/Z

    Nature biotechnology 2020

  • Concordance for clonal hematopoiesis is limited in elderly twins.

    Fabre MA, McKerrell T, Zwiebel M, Vijayabaskar MS, Park N, Wells PM, Rad R, Deloukas P, Small K, Steves CJ and Vassiliou GS

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom.

    Although acquisition of leukemia-associated somatic mutations by 1 or more hematopoietic stem cells is inevitable with advancing age, its consequences are highly variable, ranging from clinically silent clonal hematopoiesis (CH) to leukemic progression. To investigate the influence of heritable factors on CH, we performed deep targeted sequencing of blood DNA from 52 monozygotic (MZ) and 27 dizygotic (DZ) twin pairs (aged 70-99 years). Using this highly sensitive approach, we identified CH (variant allele frequency ≥0.5%) in 62% of individuals. We did not observe higher concordance for CH within MZ twin pairs as compared with that within DZ twin pairs, or to that expected by chance. However, we did identify 2 MZ pairs in which both twins harbored identical rare somatic mutations, suggesting a shared cell of origin. Finally, in 3 MZ twin pairs harboring mutations in the same driver genes, serial blood samples taken 4 to 5 years apart showed substantial twin-to-twin variability in clonal trajectories. Our findings suggest that the inherited genome does not exert a dominant influence on the behavior of adult CH and provide evidence that CH mutations may be acquired in utero.

    Funded by: Medical Research Council: MC_PC_12009; Wellcome Trust

    Blood 2020;135;4;269-273

  • Single-Cell Transcriptomics of Parkinson's Disease Human In Vitro Models Reveals Dopamine Neuron-Specific Stress Responses.

    Fernandes HJR, Patikas N, Foskolou S, Field SF, Park JE, Byrne ML, Bassett AR and Metzakopian E

    UK Dementia Research Institute, Department of Clinical Neurosciences, Cambridge Biomedical Campus, University of Cambridge, Cambridge, CB2 0AH, UK; Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    The advent of induced pluripotent stem cell (iPSC)-derived neurons has revolutionized Parkinson's disease (PD) research, but single-cell transcriptomic analysis suggests unresolved cellular heterogeneity within these models. Here, we perform the largest single-cell transcriptomic study of human iPSC-derived dopaminergic neurons to elucidate gene expression dynamics in response to cytotoxic and genetic stressors. We identify multiple neuronal subtypes with transcriptionally distinct profiles and differential sensitivity to stress, highlighting cellular heterogeneity in dopamine in vitro models. We validate this disease model by showing robust expression of PD GWAS genes and overlap with postmortem adult substantia nigra neurons. Importantly, stress signatures are ameliorated using felodipine, an FDA-approved drug. Using isogenic SNCA-A53T mutants, we find perturbations in glycolysis, cholesterol metabolism, synaptic signaling, and ubiquitin-proteasomal degradation. Overall, our study reveals cell type-specific perturbations in human dopamine neurons, which will further our understanding of PD and have implications for cell replacement therapies.

    Cell reports 2020;33;2;108263

  • FGFR1 Oncogenic Activation Reveals an Alternative Cell of Origin of SCLC in Rb1/p53 Mice.

    Ferone G, Song JY, Krijgsman O, van der Vliet J, Cozijnsen M, Semenova EA, Adams DJ, Peeper D and Berns A

    Oncode Institute, Division of Molecular Genetics, the Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands.

    Fibroblast growth factor receptor 1 (FGFR1) is frequently amplified in human small-cell lung cancer (SCLC), but its contribution to SCLC and other lung tumors has remained elusive. Here, we assess the tumorigenic capacity of constitutive-active FGFR1 (FGFR1<sup>K656E</sup>) with concomitant RB and P53 depletion in mouse lung. Our results reveal a context-dependent effect of FGFR1<sup>K656E</sup>: it impairs SCLC development from CGRP<sup>POS</sup> neuroendocrine (NE) cells, which are considered the major cell of origin of SCLC, whereas it promotes SCLC and low-grade NE bronchial lesions from tracheobronchial-basal cells. Moreover, FGFR1<sup>K656E</sup> induces lung adenocarcinoma (LADC) from most lung cell compartments. However, its expression is not sustained in LADC originating from CGRP<sup>POS</sup> cells. Therefore, cell context and tumor stage should be taken into account when considering FGFR1 inhibition as a therapeutic option.

    Cell reports 2020;30;11;3837-3850.e3

  • simurg: simulate bacterial pangenomes in R.

    Ferrés I, Fresia P and Iraola G

    Microbial Genomics Laboratory, Institut Pasteur Montevideo, Uruguay.

    Motivation: The pangenome concept describes genetic variability as the union of genes shared in a set of genomes and constitutes the current paradigm for comparative analysis of bacterial populations. However, there is a lack of tools to simulate pangenome variability and structure using defined evolutionary models.

    Results: We developed simurg, an R package for simulating bacterial pangenomes using different combinations of evolutionary constraints such as gene gain, gene loss and mutation rates. Our tool allows the straightforward and reproducible simulation of bacterial pangenomes using real sequence data, providing a valuable tool for benchmarking of pangenome software or comparing evolutionary hypotheses.

    Availability and implementation: The simurg package is released under the GPL-3 license, and is freely available for download from GitHub.

    Supplementary information: Supplementary data are available at Bioinformatics online.

    Bioinformatics (Oxford, England) 2020;36;4;1273-1274

  • Update on the pathology, genetics and somatic landscape of sebaceous tumours.

    Ferreira I, Wiedemeyer K, Demetter P, Adams DJ, Arends MJ and Brenn T

    Université Libre de Bruxelles, Brussels, Belgium.

    Cutaneous sebaceous neoplasms show a predilection for the head and neck area of adults and include tumours with benign behaviour, sebaceous adenoma and sebaceoma, and sebaceous carcinoma with potential for an aggressive disease course at the malignant end of the spectrum. The majority of tumours are solitary and sporadic, but a subset of tumours may be associated with Lynch syndrome, also known as hereditary non-polyposis colon cancer (HNPCC) and previously referred to as Muir-Torre syndrome (now known to be part of Lynch syndrome). This review provides an overview of the clinical and histological features of cutaneous sebaceous neoplasia with an emphasis on differentiating features and differential diagnosis. It also offers insights into the recently described molecular pathways involved in the development of sebaceous tumours and their association with Lynch syndrome.

    Funded by: Cancer Research UK: 14356

    Histopathology 2020;76;5;640-649

  • Cohort Profile: East London Genes & Health (ELGH), a community-based population genomics and health study in British Bangladeshi and British Pakistani people.

    Finer S, Martin HC, Khan A, Hunt KA, MacLaughlin B, Ahmed Z, Ashcroft R, Durham C, MacArthur DG, McCarthy MI, Robson J, Trivedi B, Griffiths C, Wright J, Trembath RC and van Heel DA

    Blizard Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK.

    Funded by: Department of Health; Medical Research Council: MR/M009017/1; Wellcome Trust: 102627, 210561, 210561/Z/18/Z

    International journal of epidemiology 2020;49;1;20-21i

  • The Key Glycolytic Enzyme Phosphofructokinase Is Involved in Resistance to Antiplasmodial Glycosides.

    Fisher GM, Cobbold SA, Jezewski A, Carpenter EF, Arnold M, Cowell AN, Tjhin ET, Saliba KJ, Skinner-Adams TS, Lee MCS, Odom John A, Winzeler EA, McConville MJ, Poulsen SA and Andrews KT

    Griffith Institute for Drug Discovery, Griffith University, Queensland, Australia.

    <i>Plasmodium</i> parasites rely heavily on glycolysis for ATP production and for precursors for essential anabolic pathways, such as the methylerythritol phosphate (MEP) pathway. Here, we show that mutations in the <i>Plasmodium falciparum</i> glycolytic enzyme, phosphofructokinase (<i>Pf</i>PFK9), are associated with <i>in vitro</i> resistance to a primary sulfonamide glycoside (PS-3). Flux through the upper glycolysis pathway was significantly reduced in PS-3-resistant parasites, which was associated with reduced ATP levels but increased flux into the pentose phosphate pathway. PS-3 may directly or indirectly target enzymes in these pathways, as PS-3-treated parasites had elevated levels of glycolytic and tricarboxylic acid (TCA) cycle intermediates. PS-3 resistance also led to reduced MEP pathway intermediates, and PS-3-resistant parasites were hypersensitive to the MEP pathway inhibitor, fosmidomycin. Overall, this study suggests that PS-3 disrupts core pathways in central carbon metabolism, which is compensated for by mutations in <i>Pf</i>PFK9, highlighting a novel metabolic drug resistance mechanism in <i>P. falciparum</i>.

    <b>IMPORTANCE</b> Malaria, caused by <i>Plasmodium</i> parasites, continues to be a devastating global health issue, causing 405,000 deaths and 228 million cases in 2018. Understanding key metabolic processes in malaria parasites is critical to the development of new drugs to combat this major infectious disease. The <i>Plasmodium</i> glycolytic pathway is essential to the malaria parasite, providing energy for growth and replication and supplying important biomolecules for other essential <i>Plasmodium</i> anabolic pathways. Despite this overreliance on glycolysis, no current drugs target glycolysis, and there is a paucity of information on critical glycolysis targets. Our work addresses this unmet need, providing new mechanistic insights into this key pathway.

    mBio 2020;11;6

  • Genomic signatures of domestication in Old World camels.

    Fitak RR, Mohandesan E, Corander J, Yadamsuren A, Chuluunbat B, Abdelhadi O, Raziq A, Nagy P, Walzer C, Faye B and Burger PA

    Institute of Population Genetics, Vetmeduni Vienna, Veterinärplatz 1, 1210, Vienna, Austria.

    Domestication begins with the selection of animals showing less fear of humans. In most domesticates, selection signals for tameness have been superimposed by intensive breeding for economical or other desirable traits. Old World camels, conversely, have maintained high genetic variation and lack secondary bottlenecks associated with breed development. By re-sequencing multiple genomes from dromedaries, Bactrian camels, and their endangered wild relatives, here we show that positive selection for candidate genes underlying traits collectively referred to as 'domestication syndrome' is consistent with neural crest deficiencies and altered thyroid hormone-based signaling. Comparing our results with other domestic species, we postulate that the core set of domestication genes is considerably smaller than the pan-domestication set - and overlapping genes are likely a result of chance and redundancy. These results, along with the extensive genomic resources provided, are an important contribution to understanding the evolutionary history of camels and the genomic features of their domestication.

    Funded by: Austrian Science Fund (Fonds zur Förderung der Wissenschaftlichen Forschung): P24706-B25, P29623-B25

    Communications biology 2020;3;1;316

  • Drivers underpinning the malignant transformation of giant cell tumour of bone.

    Fittall MW, Lyskjaer I, Ellery P, Lombard P, Ijaz J, Strobl AC, Oukrif D, Tarabichi M, Sill M, Koelsche C, Mechtersheimer G, Demeulemeester J, Tirabosco R, Amary F, Campbell PJ, Pfister S, Jones DTW, Pillay N, Van Loo P, Behjati S and Flanagan AM

    The Francis Crick Institute, London, UK.

    The rare benign giant cell tumour of bone (GCTB) is defined by an almost unique mutation in the H3.3 family of histone genes H3-3A or H3-3B; however, the same mutation is occasionally found in primary malignant bone tumours which share many features with the benign variant. Moreover, lung metastases can occur despite the absence of malignant histological features in either the primary or metastatic lesions. Herein we investigated the genetic events of 17 GCTBs including benign and malignant variants and the methylation profiles of 122 bone tumour samples including GCTBs. Benign GCTBs possessed few somatic alterations and no other known drivers besides the H3.3 mutation, whereas all malignant tumours harboured at least one additional driver mutation and exhibited genomic features resembling osteosarcomas, including high mutational burden, additional driver event(s) and a high degree of aneuploidy. The H3.3 mutation was found to predate the development of aneuploidy. In contrast to osteosarcomas, malignant H3.3-mutated tumours were enriched for a variety of alterations involving TERT, other than amplification, suggesting telomere dysfunction in the transformation of benign to malignant GCTB. DNA sequencing of the benign metastasising GCTB revealed no additional driver alterations; polyclonal seeding in the lung was identified, implying that the metastatic lesions represent an embolic event. Unsupervised clustering of DNA methylation profiles revealed that malignant H3.3-mutated tumours are distinct from their benign counterpart, and other bone tumours. Differential methylation analysis identified CCND1, encoding cyclin D1, as a plausible cancer driver gene in these tumours because hypermethylation of the CCND1 promoter was specific for GCTBs. We report here the genomic and methylation patterns underlying the rare clinical phenomena of benign metastasising and malignant transformation of GCTB and show how the combination of genomic and epigenomic findings could potentially distinguish benign from malignant GCTBs, thereby predicting aggressive behaviour in challenging diagnostic cases.

    The Journal of pathology 2020

  • Genomically Aided Diagnosis of Severe Developmental Disorders.

    FitzPatrick DR and Firth HV

    MRC Human Genetics Unit, University of Edinburgh, Edinburgh EH4 2XU, United Kingdom.

    Our ability to make accurate and specific genetic diagnoses in individuals with severe developmental disorders has been transformed by data derived from genomic sequencing technologies. These data reveal both the patterns and rates of different mutational mechanisms and identify regions of the human genome with fewer mutations than would be expected. In outbred populations, the most common identifiable cause of severe developmental disorders is de novo mutation affecting the coding region in one of approximately 500 different genes, almost universally showing constraint. Simply combining the location of a de novo genomic event with its predicted consequence on the gene product gives significant diagnostic power. Our knowledge of the diversity of phenotypic consequences associated with comparable diagnostic genotypes at each locus is improving. Computationally useful phenotype data will improve diagnostic interpretation of ultrarare genetic variants and, in the long run, indicate which specific embryonic processes have been perturbed. Expected final online publication date for the <i>Annual Review of Genomics and Human Genetics</i>, Volume 21 is August 31, 2020.

    Annual review of genomics and human genetics 2020

  • Gut-educated IgA plasma cells defend the meningeal venous sinuses.

    Fitzpatrick Z, Frazer G, Ferro A, Clare S, Bouladoux N, Ferdinand J, Tuong ZK, Negro-Demontel ML, Kumar N, Suchanek O, Tajsic T, Harcourt K, Scott K, Bashford-Rogers R, Helmy A, Reich DS, Belkaid Y, Lawley TD, McGavern DB and Clatworthy MR

    Molecular Immunity Unit, Department of Medicine, University of Cambridge, Cambridge, UK.

    The central nervous system has historically been viewed as an immune-privileged site, but recent data have shown that the meninges (the membranes that surround the brain and spinal cord) contain a diverse population of immune cells<sup>1</sup>. So far, studies have focused on macrophages and T cells, but have not included a detailed analysis of meningeal humoral immunity. Here we show that, during homeostasis, the mouse and human meninges contain IgA-secreting plasma cells. These cells are positioned adjacent to dural venous sinuses: regions of slow blood flow with fenestrations that can potentially permit blood-borne pathogens to access the brain<sup>2</sup>. Peri-sinus IgA plasma cells increased with age and following a breach of the intestinal barrier. Conversely, they were scarce in germ-free mice, but their presence was restored by gut re-colonization. B cell receptor sequencing confirmed that meningeal IgA<sup>+</sup> cells originated in the intestine. Specific depletion of meningeal plasma cells or IgA deficiency resulted in reduced fungal entrapment in the peri-sinus region and increased spread into the brain following intravenous challenge, showing that meningeal IgA is essential for defending the central nervous system at this vulnerable venous barrier surface.

    Nature 2020

  • AZD0364 is a potent and selective ERK1/2 inhibitor which enhances anti-tumour activity in KRAS mutant tumour models when combined with the MEK inhibitor selumetinib.

    Flemington V, Davies EJ, Robinson D, Sandin LC, Delpuech O, Zhang P, Hanson L, Farrington P, Bell S, Falenta K, Gibbons FD, Lindsay N, Smith A, Wilson J, Roberts K, Tonge M, Hopcroft P, Willis SE, Roudier MP, Rooney C, Coker EA, Jaaks P, Garnett MJ, Fawell SE, Jones CD, Ward RA, Simpson I, Cosulich SC, Pease JE and Smith PD

    Oncology Bioscience, AstraZeneca.

    The RAS-regulated RAF-MEK1/2-ERK1/2 (RAS/MAPK) signalling pathway is a major driver in oncogenesis and is frequently dysregulated in human cancers, primarily by mutations in BRAF or RAS genes. The clinical benefit of inhibitors of this pathway as single agents has only been realized in BRAF mutant melanoma, with limited effect of single agent pathway inhibitors in KRAS mutant tumours. Combined inhibition of multiple nodes within this pathway, such as MEK1/2 and ERK1/2, may be necessary to effectively suppress pathway signalling in KRAS mutant tumours and achieve meaningful clinical benefit. Here we report the discovery and characterization of AZD0364, a novel, reversible, ATP-competitive ERK1/2 inhibitor with high potency and kinase selectivity. In vitro, AZD0364 treatment resulted in inhibition of proximal and distal biomarkers and reduced proliferation in sensitive BRAF mutant and KRAS mutant cell lines. In multiple in vivo xenograft models, AZD0364 showed dose and time-dependent modulation of ERK1/2-dependent signalling biomarkers resulting in tumour regression in sensitive BRAF and KRAS mutant xenografts. We demonstrate that AZD0364 in combination with the MEK1/2 inhibitor selumetinib (AZD6244, ARRY142886) enhances efficacy in KRAS mutant preclinical models that are moderately sensitive or resistant to MEK1/2 inhibition. This combination results in deeper and more durable suppression of the RAS/MAPK signalling pathway that is not achievable with single agent treatment. The AZD0364 and selumetinib combination also results in significant tumour regressions in multiple KRAS mutant xenograft models. The combination of ERK1/2 and MEK1/2 inhibition thereby represents a viable clinical approach to target KRAS mutant tumours.

    Molecular cancer therapeutics 2020

  • Genomic and drug target evaluation of 90 cardiovascular proteins in 30,931 individuals.

    Folkersen L, Gustafsson S, Wang Q, Hansen DH, Hedman ÅK, Schork A, Page K, Zhernakova DV, Wu Y, Peters J, Eriksson N, Bergen SE, Boutin TS, Bretherick AD, Enroth S, Kalnapenkis A, Gådin JR, Suur BE, Chen Y, Matic L, Gale JD, Lee J, Zhang W, Quazi A, Ala-Korpela M, Choi SH, Claringbould A, Danesh J, Davey Smith G, de Masi F, Elmståhl S, Engström G, Fauman E, Fernandez C, Franke L, Franks PW, Giedraitis V, Haley C, Hamsten A, Ingason A, Johansson Å, Joshi PK, Lind L, Lindgren CM, Lubitz S, Palmer T, Macdonald-Dunlop E, Magnusson M, Melander O, Michaelsson K, Morris AP, Mägi R, Nagle MW, Nilsson PM, Nilsson J, Orho-Melander M, Polasek O, Prins B, Pålsson E, Qi T, Sjögren M, Sundström J, Surendran P, Võsa U, Werge T, Wernersson R, Westra HJ, Yang J, Zhernakova A, Ärnlöv J, Fu J, Smith JG, Esko T, Hayward C, Gyllensten U, Landen M, Siegbahn A, Wilson JF, Wallentin L, Butterworth AS, Holmes MV, Ingelsson E and Mälarstig A

    Department of Medicine, Karolinska Institute, Solna, Sweden.

    Circulating proteins are vital in human health and disease and are frequently used as biomarkers for clinical decision-making or as targets for pharmacological intervention. Here, we map and replicate protein quantitative trait loci (pQTL) for 90 cardiovascular proteins in over 30,000 individuals, resulting in 451 pQTLs for 85 proteins. For each protein, we further perform pathway mapping to obtain trans-pQTL gene and regulatory designations. We substantiate these regulatory findings with orthogonal evidence for trans-pQTLs using mouse knockdown experiments (ABCA1 and TRIB1) and clinical trial results (chemokine receptors CCR2 and CCR5), with consistent regulation. Finally, we evaluate known drug targets, and suggest new target candidates or repositioning opportunities using Mendelian randomization. This identifies 11 proteins with causal evidence of involvement in human disease that have not previously been targeted, including EGF, IL-16, PAPPA, SPON1, F3, ADM, CASP-8, CHI3L1, CXCL16, GDF15 and MMP-12. Taken together, these findings demonstrate the utility of large-scale mapping of the genetics of the proteome and provide a resource for future precision studies of circulating proteins in human health.

    Nature metabolism 2020;2;10;1135-1148

  • IRF5 Promotes Influenza Virus-Induced Inflammatory Responses in Human Induced Pluripotent Stem Cell-Derived Myeloid Cells and Murine Models.

    Forbester JL, Clement M, Wellington D, Yeung A, Dimonte S, Marsden M, Chapman L, Coomber EL, Tolley C, Lees E, Hale C, Clare S, Udalova I, Dong T, Dougan G and Humphreys IR

    Division of Infection and Immunity/Systems Immunity, University Research Institute, Cardiff, United Kingdom.

    Recognition of influenza A virus (IAV) by the innate immune system triggers pathways that restrict viral replication, activate innate immune cells, and regulate adaptive immunity. However, excessive innate immune activation can exaggerate disease. The pathways promoting excessive activation are incompletely understood, with limited experimental models to investigate the mechanisms driving influenza virus-induced inflammation in humans. Interferon regulatory factor 5 (IRF5) is a transcription factor that plays important roles in the induction of cytokines after viral sensing. In an <i>in vivo</i> model of IAV infection, IRF5 deficiency reduced IAV-driven immune pathology and associated inflammatory cytokine production, specifically reducing cytokine-producing myeloid cell populations in <i>Irf5</i><sup>-/-</sup> mice but not impacting type 1 interferon (IFN) production or virus replication. Using cytometry by time of flight (CyTOF), we identified that human lung IRF5 expression was highest in cells of the myeloid lineage. To investigate the role of IRF5 in mediating human inflammatory responses by myeloid cells to IAV, we employed human-induced pluripotent stem cells (hIPSCs) with biallelic mutations in <i>IRF5</i>, demonstrating for the first time that induced pluripotent stem cell-derived dendritic cells (iPS-DCs) with biallelic mutations can be used to investigate the regulation of human virus-induced immune responses. Using this technology, we reveal that IRF5 deficiency in human DCs, or macrophages, corresponded with reduced virus-induced inflammatory cytokine production, with IRF5 acting downstream of Toll-like receptor 7 (TLR7) and, possibly, retinoic acid-inducible gene I (RIG-I) after viral sensing. 
    Thus, IRF5 acts as a regulator of myeloid cell inflammatory cytokine production during IAV infection in mice and humans and drives immune-mediated viral pathogenesis independently of type 1 IFN and virus replication. <b>IMPORTANCE</b> The inflammatory response to influenza A virus (IAV) participates in infection control but contributes to disease severity. After viral detection, intracellular pathways are activated, initiating cytokine production, but these pathways are incompletely understood. We show that interferon regulatory factor 5 (IRF5) mediates IAV-induced inflammation and, in mice, drives pathology. This was independent of antiviral type 1 IFN and virus replication, implying that IRF5 could be specifically targeted to treat influenza virus-induced inflammation. We show for the first time that human iPSC technology can be exploited in genetic studies of virus-induced immune responses. Using this technology, we deleted IRF5 in human myeloid cells. These IRF5-deficient cells exhibited impaired influenza virus-induced cytokine production and revealed that IRF5 acts downstream of Toll-like receptor 7 and possibly retinoic acid-inducible gene I. Our data demonstrate the importance of IRF5 in influenza virus-induced inflammation, suggesting that genetic variation in the IRF5 gene may influence host susceptibility to viral diseases.

    Funded by: Medical Research Council: MR/L018942/1; Wellcome Trust: 207503/Z/17/Z, WT098051

    Journal of virology 2020;94;9

  • Whole genome sequencing of Plasmodium vivax isolates reveals frequent sequence and structural polymorphisms in erythrocyte binding genes.

    Ford A, Kepple D, Abagero BR, Connors J, Pearson R, Auburn S, Getachew S, Ford C, Gunalan K, Miller LH, Janies DA, Rayner JC, Yan G, Yewhalaw D and Lo E

    Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, United States of America.

    Plasmodium vivax malaria is much less common in Africa than the rest of the world because the parasite relies primarily on the Duffy antigen/chemokine receptor (DARC) to invade human erythrocytes, and the majority of Africans are Duffy negative. Recently, there has been a dramatic increase in the reporting of P. vivax cases in Africa, with a high number of them being in Duffy negative individuals, potentially indicating P. vivax has evolved an alternative invasion mechanism that can overcome Duffy negativity. Here, we analyzed single nucleotide polymorphism (SNP) and copy number variation (CNV) in Whole Genome Sequence (WGS) data from 44 P. vivax samples isolated from symptomatic malaria patients in southwestern Ethiopia, where both Duffy positive and Duffy negative individuals are found. A total of 123,711 SNPs were detected, of which 22.7% were nonsynonymous and 77.3% were synonymous mutations. The largest number of SNPs were detected on chromosomes 9 (24,007 SNPs; 19.4% of total) and 10 (16,852 SNPs; 13.6% of total). There were particularly high levels of polymorphism in erythrocyte binding gene candidates including merozoite surface protein 1 (MSP1) and merozoite surface protein 3 (MSP3.5, MSP3.85 and MSP3.9). Two genes, MAEBL and MSP3.8, related to immunogenicity and erythrocyte binding function, were detected with significant signals of positive selection. Variation in gene copy number was also concentrated in genes involved in host-parasite interactions, including the expansion of the Duffy binding protein gene (PvDBP) on chromosome 6 and MSP3.11 on chromosome 10. Based on the phylogeny constructed from the whole genome sequences, the expansion of these genes was an independent process among the P. vivax lineages in Ethiopia. We further inferred transmission patterns of P. vivax infections among study sites and showed various levels of gene flow at a small geographical scale. The genomic features of P. vivax provided baseline data for future comparison with those in Duffy-negative individuals and allowed us to develop a panel of informative Single Nucleotide Polymorphic markers diagnostic at a micro-geographical scale.

    PLoS neglected tropical diseases 2020;14;10;e0008234

  • A bug's life: Delving into the challenges of helminth microbiome studies.

    Formenti F, Cortés A, Brindley PJ, Cantacessi C and Rinaldi G

    Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom.

    PLoS neglected tropical diseases 2020;14;9;e0008446

  • Sex chromosome evolution in parasitic nematodes of humans.

    Foster JM, Grote A, Mattick J, Tracey A, Tsai YC, Chung M, Cotton JA, Clark TA, Geber A, Holroyd N, Korlach J, Li Y, Libro S, Lustigman S, Michalski ML, Paulini M, Rogers MB, Teigen L, Twaddle A, Welch L, Berriman M, Dunning Hotopp JC and Ghedin E

    Division of Protein Expression & Modification, New England Biolabs, Ipswich, MA, 01938, USA.

    Sex determination mechanisms often differ even between related species yet the evolution of sex chromosomes remains poorly understood in all but a few model organisms. Some nematodes such as Caenorhabditis elegans have an XO sex determination system while others, such as the filarial parasite Brugia malayi, have an XY mechanism. We present a complete B. malayi genome assembly and define Nigon elements shared with C. elegans, which we then map to the genomes of other filarial species and more distantly related nematodes. We find a remarkable plasticity in sex chromosome evolution with several distinct cases of neo-X and neo-Y formation, X-added regions, and conversion of autosomes to sex chromosomes from which we propose a model of chromosome evolution across different nematode clades. The phylum Nematoda offers a new and innovative system for gaining a deeper understanding of sex chromosome evolution.

    Funded by: Medical Research Council: MR/L001020/1; NIAID NIH HHS: U19 AI110820; Wellcome Trust: 098051

    Nature communications 2020;11;1;1964

  • Selection of oncogenic mutant clones in normal human skin varies with body site.

    Fowler JC, King C, Bryant C, Hall M, Sood R, Ong SH, Earp E, Fernandez-Antoran D, Koeppel J, Dentro SC, Shorthouse D, Durrani A, Fife K, Rytina E, Milne D, Roshan A, Mahububani K, Saeb-Parsy K, Hall BA, Gerstung M and Jones PH

    Precancer Team, Wellcome Sanger Institute.

    Skin cancer risk varies substantially across the body, yet how this relates to the mutations found in normal skin is unknown. Here we mapped mutant clones in skin from high and low risk sites. The density of mutations varied by location. The prevalence of NOTCH1 and FAT1 mutations in forearm, trunk and leg skin was similar to that in keratinocyte cancers. Most mutations were caused by ultraviolet (UV) light, but mutational signature analysis suggested differences in DNA repair processes between sites. Eleven mutant genes were under positive selection, with TP53 preferentially selected in the head and FAT1 in the leg. Fine scale mapping revealed 10% of clones had copy number alterations. Analysis of hair follicles showed mutations in the upper follicle resembled adjacent skin, but the lower follicle was sparsely mutated. Normal skin is a dense patchwork of mutant clones arising from competitive selection that varies by location.

    Cancer discovery 2020

  • Global genome diversity of the Leishmania donovani complex.

    Franssen SU, Durrant C, Stark O, Moser B, Downing T, Imamura H, Dujardin JC, Sanders MJ, Mauricio I, Miles MA, Schnur LF, Jaffe CL, Nasereddin A, Schallig H, Yeo M, Bhattacharyya T, Alam MZ, Berriman M, Wirth T, Schönian G and Cotton JA

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom.

    Protozoan parasites of the <i>Leishmania donovani</i> complex - <i>L. donovani</i> and <i>L. infantum</i> - cause the fatal disease visceral leishmaniasis. We present the first comprehensive genome-wide global study, with 151 cultured field isolates representing most of the geographical distribution. <i>L. donovani</i> isolates separated into five groups that largely coincide with geographical origin but vary greatly in diversity. In contrast, the majority of <i>L. infantum</i> samples fell into one globally-distributed group with little diversity. This picture is complicated by several hybrid lineages. Identified genetic groups vary in heterozygosity and levels of linkage, suggesting different recombination histories. We characterise chromosome-specific patterns of aneuploidy and identify extensive structural variation, including known and suspected drug resistance loci. This study reveals greater genetic diversity than suggested by geographically-focused studies, provides a resource of genomic variation for future work and sets the scene for a new understanding of the evolution and genetics of the <i>Leishmania donovani</i> complex.

    Funded by: EU Framework Programme for Research and Innovation: FP7-222895; Wellcome Trust: Wellcome Sanger Institute core funding, WT098051, Wellcome Sanger Institute core funding, WT206194

    eLife 2020;9

  • A publicly accessible database for Clostridioides difficile genome sequences supports tracing of transmission chains and epidemics.

    Frentrup M, Zhou Z, Steglich M, Meier-Kolthoff JP, Göker M, Riedel T, Bunk B, Spröer C, Overmann J, Blaschitz M, Indra A, von Müller L, Kohl TA, Niemann S, Seyboldt C, Klawonn F, Kumar N, Lawley TD, García-Fernández S, Cantón R, Del Campo R, Zimmermann O, Groß U, Achtman M and Nübel U

    Leibniz Institute DSMZ, Braunschweig, Germany.

    <i>Clostridioides difficile</i> is the primary infectious cause of antibiotic-associated diarrhea. Local transmissions and international outbreaks of this pathogen have been previously elucidated by bacterial whole-genome sequencing, but comparative genomic analyses at the global scale were hampered by the lack of specific bioinformatic tools. Here we introduce a publicly accessible database within EnteroBase that automatically retrieves and assembles <i>C. difficile</i> short-reads from the public domain, and calls alleles for core-genome multilocus sequence typing (cgMLST). We demonstrate that comparable levels of resolution and precision are attained by EnteroBase cgMLST and single-nucleotide polymorphism analysis. EnteroBase currently contains 18,254 quality-controlled <i>C. difficile</i> genomes, which have been assigned to hierarchical sets of single-linkage clusters by cgMLST distances. This hierarchical clustering is used to identify and name populations of <i>C. difficile</i> at all epidemiological levels, from recent transmission chains through to epidemic and endemic strains. Moreover, it puts newly collected isolates into phylogenetic and epidemiological context by identifying related strains among all previously published genome data. For example, HC2 clusters (i.e. chains of genomes with pairwise distances of up to two cgMLST alleles) were statistically associated with specific hospitals (<i>P</i><10<sup>-4</sup>) or single wards (<i>P</i>=0.01) within hospitals, indicating they represented local transmission clusters. We also detected several HC2 clusters spanning more than one hospital that by retrospective epidemiological analysis were confirmed to be associated with inter-hospital patient transfers. In contrast, clustering at level HC150 correlated with <i>k</i>-mer-based classification and was largely compatible with PCR ribotyping, thus enabling comparisons to earlier surveillance data. EnteroBase enables contextual interpretation of a growing collection of assembled, quality-controlled <i>C. difficile</i> genome sequences and their associated metadata. Hierarchical clustering rapidly identifies database entries that are related at multiple levels of genetic distance, facilitating communication among researchers, clinicians and public-health officials who are combatting disease caused by <i>C. difficile</i>.
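
    The HC2 definition in the abstract above (chains of genomes linked by pairwise distances of up to two cgMLST alleles) corresponds to single-linkage clustering at a fixed distance threshold. The following is a minimal sketch of that idea only, not EnteroBase's implementation: the function names, the toy allele profiles and the handling of missing calls (None values are skipped) are all assumptions made for illustration.

```python
from itertools import combinations


def allele_distance(a, b):
    """Number of loci at which two profiles carry different alleles.

    Assumption: a missing call is encoded as None and is not counted.
    """
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def single_linkage_clusters(profiles, max_distance):
    """Group genomes so that each one is within max_distance alleles
    of at least one other member of its cluster (single linkage)."""
    parent = {name: name for name in profiles}

    def find(x):
        # Union-find lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if allele_distance(profiles[a], profiles[b]) <= max_distance:
            parent[find(a)] = find(b)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical six-locus profiles, invented for this example.
toy_profiles = {
    "g1": [1, 1, 1, 1, 1, 1],
    "g2": [1, 1, 1, 1, 2, 2],  # 2 alleles from g1
    "g3": [1, 1, 2, 2, 2, 2],  # 2 alleles from g2, 4 from g1
    "g4": [5, 5, 5, 5, 5, 5],  # 6 alleles from every other genome
}

hc2 = single_linkage_clusters(toy_profiles, max_distance=2)
print(hc2)
```

    In this toy data g1 and g3 differ at four loci yet fall in the same threshold-2 cluster because both are within two alleles of g2, which is the chaining behaviour that the phrase "chains of genomes" describes. Raising the threshold merges clusters, so the levels form a nested hierarchy.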

    Microbial genomics 2020

  • Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis.

    Fu Y, Jung AW, Torne RV, Gonzalez S, Vöhringer H, Shmatko A, Yates LR, Jimenez-Linan M, Moore L and Gerstung M

    We use deep transfer learning to quantify histopathological patterns across 17,355 hematoxylin and eosin-stained histopathology slide images from 28 cancer types and correlate these with matched genomic, transcriptomic and survival data. This approach accurately classifies cancer types and provides spatially resolved tumor and normal tissue distinction. Automatically learned computational histopathological features correlate with a large range of recurrent genetic aberrations across cancer types. This includes whole-genome duplications, which display universal features across cancer types, individual chromosomal aneuploidies, focal amplifications and deletions, as well as driver gene mutations. There are widespread associations between bulk gene expression levels and histopathology, which reflect tumor composition and enable the localization of transcriptomically defined tumor-infiltrating lymphocytes. Computational histopathology augments prognosis based on histopathological subtyping and grading, and highlights prognostically relevant areas such as necrosis or lymphocytic aggregates. These findings show the remarkable potential of computer vision in characterizing the molecular basis of tumor histopathology.

    Nature cancer 2020;1;8;800-810

  • Ancient Jomon genome sequence analysis sheds light on migration patterns of early East Asian populations.

    Gakuhari T, Nakagome S, Rasmussen S, Allentoft ME, Sato T, Korneliussen T, Chuinneagáin BN, Matsumae H, Koganebuchi K, Schmidt R, Mizushima S, Kondo O, Shigehara N, Yoneda M, Kimura R, Ishida H, Masuyama T, Yamada Y, Tajima A, Shibata H, Toyoda A, Tsurumoto T, Wakebe T, Shitara H, Hanihara T, Willerslev E, Sikora M and Oota H

    Center for Cultural Resource Studies, College of Human and Social Sciences, Kanazawa University, Kanazawa, Japan.

    Anatomically modern humans reached East Asia more than 40,000 years ago. However, key questions still remain unanswered with regard to the route(s) and the number of wave(s) in the dispersal into East Eurasia. Ancient genomes at the edge of the region may elucidate a more detailed picture of the peopling of East Eurasia. Here, we analyze the whole-genome sequence of a 2,500-year-old individual (IK002) from the main-island of Japan that is characterized with a typical Jomon culture. The phylogenetic analyses support multiple waves of migration, with IK002 forming a basal lineage to the East and Northeast Asian genomes examined, likely representing some of the earliest-wave migrants who went north from Southeast Asia to East Asia. Furthermore, IK002 shows strong genetic affinity with the indigenous Taiwan aborigines, which may support a coastal route of the Jomon-ancestry migration. This study highlights the power of ancient genomics to provide new insights into the complex history of human migration into East Eurasia.

    Communications biology 2020;3;1;437

  • Rapid and sensitive large-scale screening of low affinity extracellular receptor protein interactions by using reaction induced inhibition of Gaussia luciferase.

    Galaway F and Wright GJ

    Cell Surface Signalling Laboratory, Wellcome Sanger Institute, Cambridge, UK.

    Extracellular protein interactions mediated by cell surface receptors are essential for intercellular communication in multicellular organisms. Assays to detect extracellular interactions must account for their often weak binding affinities and also the biochemical challenges in solubilising membrane-embedded receptors in an active form. Methods based on detecting direct binding of soluble recombinant receptor ectodomains have been successful, but genome-scale screening is limited by the usual requirement of producing sufficient amounts of each protein in two different forms, usually a "bait" and "prey". Here, we show that oligomeric receptor ectodomains coupled to concatenated units of the light-generating Gaussia luciferase enzyme robustly detected low affinity interactions and reduced the amount of protein required by several orders of magnitude compared to other reporter enzymes. Importantly, we discovered that this flash-type luciferase exhibited a reaction-induced inhibition that permitted the use of a single protein preparation as both bait and prey thereby halving the number of expression plasmids and recombinant proteins required for screening. This approach was tested against a benchmarked set of quantified extracellular interactions and shown to detect extremely weak interactions (K<sub>D</sub>s ≥ μM). This method will facilitate large-scale receptor interaction screening and contribute to the goal of mapping networks of cellular communication.

    Funded by: Wellcome Trust: 206194

    Scientific reports 2020;10;1;10522

  • A New Pneumococcal Capsule Type, 10D, is the 100th Serotype and Has a Large cps Fragment from an Oral Streptococcus.

    Ganaie F, Saad JS, McGee L, van Tonder AJ, Bentley SD, Lo SW, Gladstone RA, Turner P, Keenan JD, Breiman RF and Nahm MH

    Department of Medicine, University of Alabama at Birmingham, Birmingham, Alabama, USA.

    <i>Streptococcus pneumoniae</i> (pneumococcus) is a major human pathogen producing structurally diverse capsular polysaccharides. Widespread use of highly successful pneumococcal conjugate vaccines (PCVs) targeting pneumococcal capsules has greatly reduced infections by the vaccine types but increased infections by nonvaccine serotypes. Herein, we report a new and the 100th capsule type, named serotype 10D, by determining its unique chemical structure and biosynthetic roles of all capsule synthesis locus (<i>cps</i>) genes. The name 10D reflects its serologic cross-reaction with serotype 10A and appearance of cross-opsonic antibodies in response to immunization with 10A polysaccharide in a 23-valent pneumococcal vaccine. Genetic analysis showed that 10D <i>cps</i> has three large regions syntenic to and highly homologous with <i>cps</i> loci from serotype 6C, serotype 39, and an oral streptococcus strain (<i>S. mitis</i> SK145). The 10D <i>cps</i> region syntenic to SK145 is about 6 kb and has a short gene fragment of <i>wciN</i>α at the 5' end. The presence of this nonfunctional <i>wciN</i>α fragment provides compelling evidence for a recent interspecies genetic transfer from oral streptococcus to pneumococcus. Since oral streptococci have a large repertoire of <i>cps</i> loci, widespread PCV usage could facilitate the appearance of novel serotypes through interspecies recombination. <b>IMPORTANCE</b> The polysaccharide capsule is essential for the pathogenicity of pneumococcus, which is responsible for millions of deaths worldwide each year. Currently available pneumococcal vaccines are designed to elicit antibodies to the capsule polysaccharides of the pneumococcal isolates commonly causing diseases, and the antibodies provide protection only against the pneumococcus expressing the vaccine-targeted capsules. Since pneumococci can produce different capsule polysaccharides and therefore reduce vaccine effectiveness, it is important to track the appearance of novel pneumococcal capsule types and how these new capsules are created. Herein, we describe a new and the 100th pneumococcal capsule type with unique chemical and serological properties. The capsule type was named 10D for its serologic similarity to 10A. Genetic studies provide strong evidence that pneumococcus created 10D capsule polysaccharide by capturing a large genetic fragment from an oral streptococcus. Such interspecies genetic exchanges could greatly increase diversity of pneumococcal capsules and complicate serotype shifts.

    mBio 2020;11;3

  • Functional Microbiomics Reveals Alterations of the Gut Microbiome and Host Co-Metabolism in Patients With Alcoholic Hepatitis.

    Gao B, Duan Y, Lang S, Barupal D, Wu TC, Valdiviez L, Roberts B, Choy YY, Shen T, Byram G, Zhang Y, Fan S, Wancewicz B, Shao Y, Vervier K, Wang Y, Zhou R, Jiang L, Nath S, Loomba R, Abraldes JG, Bataller R, Tu XM, Stärkel P, Lawley TD, Fiehn O and Schnabl B

    Department of Medicine University of California San Diego La Jolla CA.

    Alcohol-related liver disease is a major public health burden, and the gut microbiota is an important contributor to disease pathogenesis. The aim of the present study is to characterize functional alterations of the gut microbiota and test their performance for short-term mortality prediction in patients with alcoholic hepatitis. We integrated shotgun metagenomics with untargeted metabolomics to investigate functional alterations of the gut microbiota and host co-metabolism in a multicenter cohort of patients with alcoholic hepatitis. Profound changes were found in the gut microbial composition, functional metagenome, serum, and fecal metabolomes in patients with alcoholic hepatitis compared with nonalcoholic controls. We demonstrate that in comparison with single omics alone, the performance to predict 30-day mortality was improved when combining microbial pathways with respective serum metabolites in patients with alcoholic hepatitis. The area under the receiver operating curve was higher than 0.85 for the tryptophan, isoleucine, and methionine pathways as predictors for 30-day mortality, but achieved 0.989 for using the urea cycle pathway in combination with serum urea, with a bias-corrected prediction error of 0.083 when using leave-one-out cross validation. <i>Conclusion:</i> Our study reveals changes in key microbial metabolic pathways associated with disease severity that predict short-term mortality in our cohort of patients with alcoholic hepatitis.

    Hepatology communications 2020;4;8;1168-1182

  • Identification of slit3 as a locus affecting nicotine preference in zebrafish and human smoking behaviour.

    García-González J, Brock AJ, Parker MO, Riley RJ, Joliffe D, Sudwarts A, Teh MT, Busch-Nentwich EM, Stemple DL, Martineau AR, Kaprio J, Palviainen T, Kuan V, Walton RT and Brennan CH

    School of Biological and Chemical Sciences, Queen Mary, University of London, London, United Kingdom.

    To facilitate smoking genetics research we determined whether a screen of mutagenized zebrafish for nicotine preference could predict loci affecting smoking behaviour. From 30 screened F<sub>3</sub> sibling groups, where each was derived from an individual ethyl-nitrosourea mutagenized F<sub>0</sub> fish, two showed increased or decreased nicotine preference. Out of 25 inactivating mutations carried by the F<sub>3</sub> fish, one in the <i>slit3</i> gene segregated with increased nicotine preference in heterozygous individuals. Focussed SNP analysis of the human <i>SLIT3</i> locus in cohorts from UK (n=863) and Finland (n=1715) identified two variants associated with cigarette consumption and likelihood of cessation. Characterisation of <i>slit3</i> mutant larvae and adult fish revealed decreased sensitivity to the dopaminergic and serotonergic antagonist amisulpride, known to affect startle reflex that is correlated with addiction in humans, and increased <i>htr1aa</i> mRNA expression in mutant larvae. No effect on neuronal pathfinding was detected. These findings reveal a role for SLIT3 in development of pathways affecting responses to nicotine in zebrafish and smoking in humans.

    Funded by: Academy of Finland: 308248, 312073; Biotechnology and Biological Sciences Research Council: BB/M007863; Medical Research Council: G1000403; NIDA NIH HHS: U01 DA044400; NIH HHS: Project grant, U01 DA 044400-03; National Centre for the Replacement, Refinement and Reduction of Animals in Research: G1000053; National Institute for Health Research: NF-SI-0515-10076, NIHR PGfAR RP-PG-0407-10398, PGfAR RP-PG-0609-10181; Royal Society: Industry Fellows College; Wellcome Trust: Clinical research fellowship WT 110284/Z/15/Z

    eLife 2020;9

  • Detection of simple and complex de novo mutations with multiple reference sequences.

    Garimella KV, Iqbal Z, Krause MA, Campino S, Kekre M, Drury E, Kwiatkowski D, Sá JM, Wellems TE and McVean G

    Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA.

    The characterization of de novo mutations in regions of high sequence and structural diversity from whole-genome sequencing data remains highly challenging. Complex structural variants tend to arise in regions of high repetitiveness and low complexity, challenging both de novo assembly, in which short reads do not capture the long-range context required for resolution, and mapping approaches, in which improper alignment of reads to a reference genome that is highly diverged from that of the sample can lead to false or partial calls. Long-read technologies can potentially solve such problems but are currently unfeasible to use at scale. Here we present Corticall, a graph-based method that combines the advantages of multiple technologies and prior data sources to detect arbitrary classes of genetic variant. We construct multisample, colored de Bruijn graphs from short-read data for all samples, align long-read-derived haplotypes and multiple reference data sources to restore graph connectivity information, and call variants using graph path-finding algorithms and a model for simultaneous alignment and recombination. We validate and evaluate the approach using extensive simulations and use it to characterize the rate and spectrum of de novo mutation events in 119 progeny from four <i>Plasmodium falciparum</i> experimental crosses, using long-read data on the parents to inform reconstructions of the progeny and to detect several known and novel nonallelic homologous recombination events.
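
    The multisample, colored de Bruijn graph idea mentioned in this abstract can be illustrated with a toy sketch. This is not Corticall's implementation or API: the function names, the tiny reads and the choice of k = 3 are all hypothetical, edges are left implicit (two k-mers are adjacent when they overlap by k-1 bases), and k-mers private to the child are treated only as naive candidates for de novo variation.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_colored_graph(samples, k):
    """Map each k-mer (graph node) to the set of samples ("colors") containing it.

    samples: dict of sample name -> list of read strings (hypothetical input layout).
    """
    colors = defaultdict(set)
    for name, reads in samples.items():
        for read in reads:
            for kmer in kmers(read, k):
                colors[kmer].add(name)
    return colors


def private_kmers(colors, sample):
    """Return k-mers seen only in the given sample, e.g. a child but neither parent."""
    return {kmer for kmer, seen_in in colors.items() if seen_in == {sample}}


# Toy trio: the child read differs from both parental reads at one position.
trio = {
    "parent1": ["ACGTAC"],
    "parent2": ["ACGTTC"],
    "child": ["ACGGAC"],
}
graph = build_colored_graph(trio, k=3)
candidates = private_kmers(graph, "child")
```

    In real data, sequencing errors also produce sample-private k-mers, so a practical caller needs coverage thresholds and the path-finding and recombination modelling the abstract describes.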

    Genome research 2020;30;8;1154-1169

  • Long-term expansion, genomic stability and in vivo safety of adult human pancreas organoids.

    Georgakopoulos N, Prior N, Angres B, Mastrogiovanni G, Cagan A, Harrison D, Hindley CJ, Arnes-Benito R, Liau SS, Curd A, Ivory N, Simons BD, Martincorena I, Wurst H, Saeb-Parsy K and Huch M

    The Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK.

    Background: Pancreatic organoid systems have recently been described for the in vitro culture of pancreatic ductal cells from mouse and human. Mouse pancreatic organoids exhibit unlimited expansion potential, while previously reported human pancreas organoid (hPO) cultures do not expand efficiently long-term in a chemically defined, serum-free medium. We sought to generate a 3D culture system for long-term expansion of human pancreas ductal cells as hPOs to serve as the basis for studies of human pancreas ductal epithelium, exocrine pancreatic diseases and the development of a genomically stable replacement cell therapy for diabetes mellitus.

    Results: Our chemically defined, serum-free, human pancreas organoid culture medium supports the generation and expansion of hPOs with high efficiency from both fresh and cryopreserved primary tissue. hPOs can be expanded from a single cell, enabling their genetic manipulation and generation of clonal cultures. hPOs expanded for months in vitro maintain their ductal morphology, biomarker expression and chromosomal integrity. Xenografts of hPOs survive long-term in vivo when transplanted into the pancreas of immunodeficient mice. Notably, mouse orthotopic transplants show no signs of tumorigenicity. Crucially, our medium also supports the establishment and expansion of hPOs in a chemically defined, modifiable and scalable, biomimetic hydrogel.

    Conclusions: hPOs can be expanded long-term, from both fresh and cryopreserved human pancreas tissue in a chemically defined, serum-free medium with no detectable tumorigenicity. hPOs can be clonally expanded, genetically manipulated and are amenable to culture in a chemically defined hydrogel. hPOs therefore represent an abundant source of pancreas ductal cells that retain the characteristics of the tissue-of-origin, which opens up avenues for modelling diseases of the ductal epithelium and increasing understanding of human pancreas exocrine biology as well as for potentially producing insulin-secreting cells for the treatment of diabetes.

    Funded by: Cancer Research UK: C6946/A14492; Horizon 2020: ECH2020-668350; Wellcome Trust: 092096, 104151/Z/14/

    BMC developmental biology 2020;20;1;4

  • Transcription-coupled repair and mismatch repair contribute towards preserving genome integrity at mononucleotide repeat tracts.

    Georgakopoulos-Soares I, Koh G, Momen SE, Jiricny J, Hemberg M and Nik-Zainal S

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.

    The mechanisms that underpin how insertions or deletions (indels) become fixed in DNA have primarily been ascribed to replication-related and/or double-strand break (DSB)-related processes. Here, we introduce a method to evaluate indels, orientating them relative to gene transcription. In so doing, we reveal a number of surprising findings: First, there is a transcriptional strand asymmetry in the distribution of mononucleotide repeat tracts in the reference human genome. Second, there is a strong transcriptional strand asymmetry of indels across 2,575 whole genome sequenced human cancers. We suggest that this is due to the activity of transcription-coupled nucleotide excision repair (TC-NER). Furthermore, TC-NER interacts with mismatch repair (MMR) under physiological conditions to produce strand bias. Finally, we show how insertions and deletions differ in their dependencies on these repair pathways. Our analytical approach reveals insights into the contribution of DNA repair towards indel mutagenesis in human cells.

    Funded by: Cancer Research UK: A23916, A25274; Swiss National Science Foundation: 170267; Wellcome Trust

    Nature communications 2020;11;1;1980

  • Asymmetron: a toolkit for the identification of strand asymmetry patterns in biological sequences.

    Georgakopoulos-Soares I, Mouratidis I, Parada GE, Matharu N, Hemberg M and Ahituv N

    Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA.

    DNA strand asymmetries can have a major effect on several biological functions, including replication, transcription and transcription factor binding. As such, DNA strand asymmetries and mutational strand bias can provide information about biological function. However, a versatile tool to explore this does not exist. Here, we present Asymmetron, a user-friendly computational tool that performs statistical analysis and visualizations for the evaluation of strand asymmetries. Asymmetron takes as input DNA features provided with strand annotation and outputs strand asymmetries for consecutive occurrences of a single DNA feature or between pairs of features. We illustrate the use of Asymmetron by identifying transcriptional and replicative strand asymmetries of germline structural variant breakpoints. We also show that the orientation of the binding sites of 45% of human transcription factors analyzed have a significant DNA strand bias in transcribed regions, that is also corroborated in ChIP-seq analyses, and is likely associated with transcription. In summary, we provide a novel tool to assess DNA strand asymmetries and show how it can be used to derive new insights across a variety of biological disciplines.
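
    As a rough illustration of what a strand asymmetry between pairs of features means, the sketch below counts same-strand versus opposite-strand pairs and applies an exact two-sided binomial test against a 50:50 expectation. It is an assumption-laden toy, not Asymmetron's interface or its statistical procedure: the function names and the input layout (a list of strand pairs) are hypothetical.

```python
from math import comb


def strand_counts(pairs):
    """Count feature pairs annotated on the same strand versus opposite strands.

    pairs: list of (strand_a, strand_b) tuples with values "+" or "-".
    """
    same = sum(1 for a, b in pairs if a == b)
    return same, len(pairs) - same


def binomial_two_sided_p(k, n):
    """Exact two-sided binomial p-value for k successes in n trials under p = 0.5.

    The null distribution is symmetric, so the smaller tail is doubled and capped at 1.
    """
    if n == 0:
        return 1.0
    tail = min(k, n - k)
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


# Toy input: 9 same-strand pairs and 1 opposite-strand pair.
pairs = [("+", "+")] * 9 + [("+", "-")]
same, opposite = strand_counts(pairs)
p_value = binomial_two_sided_p(same, same + opposite)
```

    A genome-wide analysis would also need multiple-testing correction and a distance constraint between paired features, both omitted here.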

    Nucleic acids research 2020

  • Dynamic changes in the epigenomic landscape regulate human organogenesis and link to developmental disorders.

    Gerrard DT, Berry AA, Jennings RE, Birket MJ, Zarrineh P, Garstang MG, Withey SL, Short P, Jiménez-Gancedo S, Firbas PN, Donaldson I, Sharrocks AD, Hanley KP, Hurles ME, Gomez-Skarmeta JL, Bobola N and Hanley NA

    Faculty of Biology, Medicine & Health, Manchester Academic Health Sciences Centre, University of Manchester, Oxford Road, Manchester, M13 9PT, UK.

    How the genome activates or silences transcriptional programmes governs organ formation. Little is known in human embryos, undermining our ability to benchmark the fidelity of stem cell differentiation or cell programming, or interpret the pathogenicity of noncoding variation. Here, we study histone modifications across thirteen tissues during human organogenesis. We integrate the data with transcription to build an overview of how the human genome differentially regulates alternative organ fates including by repression. Promoters from nearly 20,000 genes partition into discrete states. Key developmental gene sets are actively repressed outside of the appropriate organ without obvious bivalency. Candidate enhancers, functional in zebrafish, allow imputation of tissue-specific and shared patterns of transcription factor binding. Overlaying more than 700 noncoding mutations from patients with developmental disorders allows correlation to unanticipated target genes. Taken together, the data provide a comprehensive genomic framework for investigating normal and abnormal human development.

    Funded by: Academy of Medical Sciences: Lecturer starter grant; RCUK | Medical Research Council (MRC): CRTF, MR/000638/1, MR/J003352/1, MR/L009986/1, MR/S036121/1, PhD studentship; Wellcome Trust (Wellcome): 088566, 097820, 105610

    Nature communications 2020;11;1;3920

  • The evolutionary history of 2,658 cancers.

    Gerstung M, Jolly C, Leshchiner I, Dentro SC, Gonzalez S, Rosebrock D, Mitchell TJ, Rubanova Y, Anur P, Yu K, Tarabichi M, Deshwar A, Wintersinger J, Kleinheinz K, Vázquez-García I, Haase K, Jerman L, Sengupta S, Macintyre G, Malikic S, Donmez N, Livitz DG, Cmero M, Demeulemeester J, Schumacher S, Fan Y, Yao X, Lee J, Schlesner M, Boutros PC, Bowtell DD, Zhu H, Getz G, Imielinski M, Beroukhim R, Sahinalp SC, Ji Y, Peifer M, Markowetz F, Mustonen V, Yuan K, Wang W, Morris QD, PCAWG Evolution & Heterogeneity Working Group, Spellman PT, Wedge DC, Van Loo P and PCAWG Consortium

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK.

    Cancer develops through a process of somatic evolution<sup>1,2</sup>. Sequencing data from a single biopsy represent a snapshot of this process that can reveal the timing of specific genomic aberrations and the changing influence of mutational processes<sup>3</sup>. Here, by whole-genome sequencing analysis of 2,658 cancers as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA)<sup>4</sup>, we reconstruct the life history and evolution of mutational processes and driver mutation sequences of 38 types of cancer. Early oncogenesis is characterized by mutations in a constrained set of driver genes, and specific copy number gains, such as trisomy 7 in glioblastoma and isochromosome 17q in medulloblastoma. The mutational spectrum changes significantly throughout tumour evolution in 40% of samples. A nearly fourfold diversification of driver genes and increased genomic instability are features of later stages. Copy number alterations often occur in mitotic crises, and lead to simultaneous gains of chromosomal segments. Timing analyses suggest that driver mutations often precede diagnosis by many years, if not decades. Together, these results determine the evolutionary trajectories of cancer, and highlight opportunities for early cancer detection.

    Funded by: Medical Research Council: MR/L016311; NCI NIH HHS: 1U24CA143799; NIH HHS: GM108308; NIMH NIH HHS: MH086633; Wellcome Trust: FC001202

    Nature 2020;578;7793;122-128

  • Common and rare variant prediction and penetrance of IBD in a large, multi-ethnic, health system-based biobank cohort.

    Gettler K, Levantovsky R, Moscati A, Giri M, Wu Y, Hsu NY, Chuang LS, Sazonovs A, Venkateswaran S, Korie U, Chasteau C, UK IBD Genetics Consortium, NIDDK IBDGC, Duerr RH, Silverberg MS, Snapper SB, Daly MJ, McGovern DP, Brant SR, Kugathasan S, Anderson CA, Itan Y and Cho JH

    Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

    Background and aims: Polygenic risk scores (PRS) may soon be used to predict inflammatory bowel disease (IBD) risk in prevention efforts. We leveraged exome-sequence and SNP array data from 29,358 individuals in the multi-ethnic, randomly-ascertained health system-based BioMe biobank to define effects of common and rare IBD variants on disease prediction and pathophysiology.

    Methods: PRS were calculated from European, African-American, and Ashkenazi Jewish (AJ) reference case-control studies, and a meta-GWAS run using all three association datasets. PRS were then combined using regression to assess which combination of scores best predicted IBD status in European, AJ, Hispanic, and African American cohorts in BioMe. Additionally, rare variants were assessed in genes associated with very early onset IBD (VEO-IBD), by estimating genetic penetrance in each BioMe population.

    Results: Combining risk scores based on association data from distinct ancestral populations improved IBD prediction for every population in BioMe and significantly improved prediction among European ancestry UK Biobank individuals. Lower predictive power for non-Europeans was observed, reflecting in part substantially lower African IBD case-control reference sizes. We replicated associations for two VEO-IBD genes, ADAM17 and LRBA, with high dominant model penetrance in BioMe. Autosomal recessive LRBA risk alleles are associated with severe, early-onset autoimmunity; we show that heterozygous carriage of an African-predominant LRBA protein-altering allele is associated with significantly decreased LRBA and CTLA-4 expression with T cell activation.

    Conclusions: Greater genetic diversity in African populations improves prediction across populations, and generalizes some VEO-IBD genes. Increasing African-American IBD case-collections should be prioritized to reduce health disparities and enhance pathophysiologic insight.

    Gastroenterology 2020

  • Investigating higher-order interactions in single-cell data with scHOT.

    Ghazanfar S, Lin Y, Su X, Lin DM, Patrick E, Han ZG, Marioni JC and Yang JYH

    Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK.

    Single-cell genomics has transformed our ability to examine cell fate choice. Examining cells along a computationally ordered 'pseudotime' offers the potential to unpick subtle changes in variability and covariation among key genes. We describe an approach, scHOT (single-cell higher-order testing), which provides a flexible and statistically robust framework for identifying changes in higher-order interactions among genes. scHOT can be applied for cells along a continuous trajectory or across space and accommodates various higher-order measurements including variability or correlation. We demonstrate the use of scHOT by studying coordinated changes in higher-order interactions during embryonic development of the mouse liver. Additionally, scHOT identifies subtle changes in gene-gene correlations across space using spatially resolved transcriptomics data from the mouse olfactory bulb. scHOT meaningfully adds to first-order differential expression testing and provides a framework for interrogating higher-order interactions using single-cell data.

    Funded by: Cancer Research UK (CRUK): 17197; Royal Society: NIF\R1\181950

    Nature methods 2020

  • Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics.

    Ghoussaini M, Mountjoy E, Carmona M, Peat G, Schmidt EM, Hercules A, Fumis L, Miranda A, Carvalho-Silva D, Buniello A, Burdett T, Hayhurst J, Baker J, Ferrer J, Gonzalez-Uriarte A, Jupp S, Karim MA, Koscielny G, Machlitt-Northen S, Malangone C, Pendlington ZM, Roncaglia P, Suveges D, Wright D, Vrousgou O, Papa E, Parkinson H, MacArthur JAL, Todd JA, Barrett JC, Schwartzentruber J, Hulcoop DG, Ochoa D, McDonagh EM and Dunham I

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    Open Targets Genetics is an open-access integrative resource that aggregates human GWAS and functional genomics data including gene expression, protein abundance, chromatin interaction and conformation data from a wide range of cell types and tissues to make robust connections between GWAS-associated loci, variants and likely causal genes. This enables systematic identification and prioritisation of likely causal variants and genes across all published trait-associated loci. In this paper, we describe the public resources we aggregate, the technology and analyses we use, and the functionality that the portal offers. Open Targets Genetics can be searched by variant, gene or study/phenotype. It offers tools that enable users to prioritise causal variants and genes at disease-associated loci and access systematic cross-disease and disease-molecular trait colocalization analysis across 92 cell types and tissues including the eQTL Catalogue. Data visualizations such as Manhattan-like plots, regional plots, credible sets overlap between studies and PheWAS plots enable users to explore GWAS signals in depth. The integrated data is made available through the web portal, for bulk download and via a GraphQL API, and the software is open source. Applications of this integrated data include identification of novel targets for drug discovery and drug repurposing.

    Nucleic acids research 2020

  • ACE inhibition and cardiometabolic risk factors, lung ACE2 and TMPRSS2 gene expression, and plasma ACE2 levels: a Mendelian randomization study.

    Gill D, Arvanitis M, Carter P, Hernández Cordero AI, Jo B, Karhunen V, Larsson SC, Li X, Lockhart SM, Mason A, Pashos E, Saha A, Tan VY, Zuber V, Bossé Y, Fahle S, Hao K, Jiang T, Joubert P, Lunt AC, Ouwehand WH, Roberts DJ, Timens W, van den Berge M, Watkins NA, Battle A, Butterworth AS, Danesh J, Di Angelantonio E, Engelhardt BE, Peters JE, Sin DD and Burgess S

    Department of Epidemiology and Biostatistics, St Mary's Hospital, Imperial College London, Medical School Building, London, UK.

    Angiotensin-converting enzyme 2 (ACE2) and serine protease TMPRSS2 have been implicated in cell entry for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus responsible for coronavirus disease 2019 (COVID-19). The expression of <i>ACE2</i> and <i>TMPRSS2</i> in the lung epithelium might have implications for the risk of SARS-CoV-2 infection and severity of COVID-19. We use human genetic variants that proxy angiotensin-converting enzyme (ACE) inhibitor drug effects and cardiovascular risk factors to investigate whether these exposures affect lung <i>ACE2</i> and <i>TMPRSS2</i> gene expression and circulating ACE2 levels. We observed no consistent evidence of an association of genetically predicted serum ACE levels with any of our outcomes. There was weak evidence for an association of genetically predicted serum ACE levels with <i>ACE2</i> gene expression in the Lung eQTL Consortium (<i>p</i> = 0.014), but this finding did not replicate. There was evidence of a positive association of genetic liability to type 2 diabetes mellitus with lung <i>ACE2</i> gene expression in the Gene-Tissue Expression (GTEx) study (<i>p</i> = 4 × 10<sup>-4</sup>) and with circulating plasma ACE2 levels in the INTERVAL study (<i>p</i> = 0.03), but not with lung <i>ACE2</i> expression in the Lung eQTL Consortium study (<i>p</i> = 0.68). There were no associations of genetically proxied liability to the other cardiometabolic traits with any outcome. This study does not provide consistent evidence to support an effect of serum ACE levels (as a proxy for ACE inhibitors) or cardiometabolic risk factors on lung <i>ACE2</i> and <i>TMPRSS2</i> expression or plasma ACE2 levels.

    Royal Society open science 2020;7;11;200958

  • Whole-genome sequencing analysis of the cardiometabolic proteome.

    Gilly A, Park YC, Png G, Barysenka A, Fischer I, Bjørnland T, Southam L, Suveges D, Neumeyer S, Rayner NW, Tsafantakis E, Karaleftheri M, Dedoussis G and Zeggini E

    Institute of Translational Genomics, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany.

    The human proteome is a crucial intermediate between complex diseases and their genetic and environmental components, and an important source of drug development targets and biomarkers. Here, we comprehensively assess the genetic architecture of 257 circulating protein biomarkers of cardiometabolic relevance through high-depth (22.5×) whole-genome sequencing (WGS) in 1328 individuals. We discover 131 independent sequence variant associations (P < 7.45 × 10<sup>-11</sup>) across the allele frequency spectrum, all of which replicate in an independent cohort (n = 1605, 18.4× WGS). We identify for the first time replicating evidence for rare-variant cis-acting protein quantitative trait loci for five genes, involving both coding and noncoding variation. We construct and validate polygenic scores that explain up to 45% of protein level variation. We find causal links between protein levels and disease risk, identifying high-value biomarkers and drug development targets.

    Funded by: Wellcome Trust: 098051

    Nature communications 2020;11;1;6336

  • Mass drug administration with azithromycin for trachoma elimination and the population structure of Streptococcus pneumoniae in the nasopharynx.

    Gladstone RA, Bojang E, Hart J, Harding-Esch EM, Mabey D, Sillah A, Bailey RL, Burr SE, Roca A, Bentley SD and Holland MJ

    Parasites and microbes, Wellcome Sanger Institute, Hinxton, England, UK.

    Objectives: Mass drug administration (MDA) with azithromycin for trachoma elimination reduces nasopharyngeal carriage of Streptococcus pneumoniae in the short term. We evaluated S. pneumoniae carried in the nasopharynx before and after a round of azithromycin MDA to determine whether MDA was associated with changes in pneumococcal population structure and resistance.

    Methods: We analyzed 514 pneumococcal whole genomes randomly selected from nasopharyngeal samples collected in two Gambian villages that received 3 annual rounds of MDA for trachoma elimination. The 514 samples represented 293 participants, of which 75% were children aged 0-9 years, isolated during three cross-sectional surveys conducted before the third round of MDA (CSS-1) and at one (CSS-2) and six (CSS-3) months after MDA. Bayesian Analysis of Population Structure (BAPS) was used to cluster related isolates by capturing variation in the core genome. Serotype and multi-locus sequence type were inferred from the genotype. Antimicrobial resistance determinants were identified from assemblies, including known macrolide resistance genes.

    Results: Twenty-seven BAPS clusters were assigned. These consisted of 81 sequence types (STs). Two BAPS clusters not observed in CSS-1 (n=109) or CSS-2 (n=69), increased in frequency in CSS-3 (n=126); BAPS20 (8.73%, p=0.016) and BAPS22 (7.14%, p=0.032) but were not associated with antimicrobial resistance. Macrolide resistance within BAPS17 increased after treatment (CSS-1 n=0/6, CSS-2/3 n=5/5, p=0.002) and was carried on a mobile transposable element that also conferred resistance to tetracycline.

    Conclusions: Limited changes in pneumococcal population structure were observed after the third round of MDA suggesting treatment had little effect on the circulating lineages. An increase in macrolide resistance within one BAPS highlights the need for antimicrobial resistance surveillance in treated villages.

    Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases 2020

  • Visualizing variation within Global Pneumococcal Sequence Clusters (GPSCs) and country population snapshots to contextualize pneumococcal isolates.

    Gladstone RA, Lo SW, Goater R, Yeats C, Taylor B, Hadfield J, Lees JA, Croucher NJ, van Tonder AJ, Bentley LJ, Quah FX, Blaschke AJ, Pershing NL, Byington CL, Balaji V, Hryniewicz W, Sigauque B, Ravikumar KL, Almeida SCG, Ochoa TJ, Ho PL, du Plessis M, Ndlangisa KM, Cornick JE, Kwambana-Adams B, Benisty R, Nzenze SA, Madhi SA, Hawkins PA, Pollard AJ, Everett DB, Antonio M, Dagan R, Klugman KP, von Gottberg A, Metcalf BJ, Li Y, Beall BW, McGee L, Breiman RF, Aanensen DM, Bentley SD and The Global Pneumococcal Sequencing Consortium

    Parasites and microbes, Wellcome Sanger Institute, Hinxton, UK.

    Knowledge of pneumococcal lineages, their geographic distribution and antibiotic resistance patterns, can give insights into global pneumococcal disease. We provide interactive bioinformatic outputs to explore such topics, aiming to increase dissemination of genomic insights to the wider community, without the need for specialist training. We prepared 12 country-specific phylogenetic snapshots, and international phylogenetic snapshots of 73 common Global Pneumococcal Sequence Clusters (GPSCs) previously defined using PopPUNK, and present them in Microreact. Gene presence and absence defined using Roary, and recombination profiles derived from Gubbins are presented in Phandango for each GPSC. Temporal phylogenetic signal was assessed for each GPSC using BactDating. We provide examples of how such resources can be used. In our example use of a country-specific phylogenetic snapshot we determined that serotype 14 was observed in nine unrelated genetic backgrounds in South Africa. The international phylogenetic snapshot of GPSC9, in which most serotype 14 isolates from South Africa were observed, highlights that there were three independent sub-clusters represented by South African serotype 14 isolates. We estimated from the GPSC9-dated tree that the sub-clusters were each established in South Africa during the 1980s. We show how recombination plots allowed the identification of a 20 kb recombination spanning the capsular polysaccharide locus within GPSC97. This was consistent with a switch from serotype 6A to 19A estimated to have occurred in the 1990s from the GPSC97-dated tree. Plots of gene presence/absence of resistance genes (<i>tet</i>, <i>erm</i>, <i>cat</i>) across the GPSC23 phylogeny were consistent with acquisition of a composite transposon. We estimated from the GPSC23-dated tree that the acquisition occurred between 1953 and 1975. Finally, we demonstrate the assignment of GPSC31 to 17 externally generated pneumococcal serotype 1 assemblies from Utah via Pathogenwatch. Most of the Utah isolates clustered within GPSC31 in a USA-specific clade with the most recent common ancestor estimated between 1958 and 1981. The resources we have provided can be used to explore data, test hypotheses and generate new hypotheses. The accessible assignment of GPSCs allows others to contextualize their own collections beyond the data presented here.

    Microbial genomics 2020

  • Development and validation of a universal blood donor genotyping platform: a multinational prospective study.

    Gleadall NS, Veldhuisen B, Gollub J, Butterworth AS, Ord J, Penkett CJ, Timmer TC, Sauer CM, van der Bolt N, Brown C, Brugger K, Dilthey AT, Duarte D, Grimsley S, van den Hurk K, Jongerius JM, Luken J, Megy K, Miflin G, Nelson CS, Prinsze FJ, Sambrook J, Simeoni I, Sweeting M, Thornton N, Trompeter S, Tuna S, Varma R, Walker MR, NIHR BioResource, Danesh J, Roberts DJ, Ouwehand WH, Stirrups KE, Rendon A, Westhoff CM, Di Angelantonio E, van der Schoot CE, Astle WJ, Watkins NA and Lane WJ

    Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom.

    Each year, blood transfusions save millions of lives. However, under current blood-matching practices, sensitization to non-self-antigens is an unavoidable adverse side effect of transfusion. We describe a universal donor typing platform that could be adopted by blood services worldwide to facilitate a universal extended blood-matching policy and reduce sensitization rates. This DNA-based test is capable of simultaneously typing most clinically relevant red blood cell (RBC), human platelet (HPA), and human leukocyte (HLA) antigens. Validation was performed, using samples from 7927 European, 27 South Asian, 21 East Asian, and 9 African blood donors enrolled in 2 national biobanks. We illustrated the usefulness of the platform by analyzing antibody data from patients sensitized with multiple RBC alloantibodies. Genotyping results demonstrated concordance of 99.91%, 99.97%, and 99.03% with RBC, HPA, and HLA clinically validated typing results in 89 371, 3016, and 9289 comparisons, respectively. Genotyping increased the total number of antigen typing results available from 110 980 to >1 200 000. Dense donor typing allowed identification of 2 to 6 times more compatible donors to serve 3146 patients with multiple RBC alloantibodies, providing at least 1 match for 176 individuals for whom previously no blood could be found among the same donors. This genotyping technology is already being used to type thousands of donors taking part in national genotyping studies. Extraction of dense antigen-typing data from these cohorts provides blood supply organizations with the opportunity to implement a policy of genomics-based precision matching of blood.

    Funded by: Wellcome Trust

    Blood advances 2020;4;15;3495-3506

  • High-throughput genotyping of high-homology mutant mouse strains by next-generation sequencing.

    Gleeson D, Sethi D, Platte R, Burvill J, Barrett D, Akhtar S, Bruntraeger M, Bottomley J, Bussell J and Ryder E

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Genotyping of knockout alleles in mice is commonly performed by end-point PCR or gene-specific / universal cassette qPCR. Both have advantages and limitations in terms of assay design and interpretation of results. As an alternative method for high-throughput genotyping, we investigated next generation sequencing (NGS) of PCR amplicons, with a focus on CRISPR-mediated exon deletions where antibiotic selection markers are not present. By multiplexing the wild type and mutant-specific PCR reactions, the genotype can be called by the relative sequence counts of each product. The system is highly scalable and can be applied to a variety of different allele types, including those produced by the International Mouse Phenotyping Consortium and associated projects. One potential challenge with any assay design is locating unique areas of the genome, especially when working with gene families or regions of high homology. These can result in misleading or ambiguous genotypes for either qPCR or end-point assays. Here, we show that genotyping by NGS can negate these issues by simple, automated filtering of undesired sequences. Analysis and genotype calls can also be fully automated, using FASTQ or FASTA input files and an in-house Perl script and SQL database.
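
    The idea of calling a genotype from the relative sequence counts of the multiplexed wild-type and mutant products, after filtering out undesired sequences, can be sketched as follows. This is a minimal Python illustration, not the authors' in-house Perl script: the function names, the exact-substring matching, the minimum read depth and the heterozygous band (20% to 80% mutant reads) are all hypothetical choices made for the example.

```python
from collections import Counter


def count_products(reads, wt_seq, mut_seq):
    """Count reads carrying the wild-type or mutant-specific sequence.

    Reads matching neither (e.g. amplicons from homologous loci) are tallied
    as "other" and play no part in the genotype call.
    """
    counts = Counter()
    for read in reads:
        if wt_seq in read:
            counts["wt"] += 1
        elif mut_seq in read:
            counts["mut"] += 1
        else:
            counts["other"] += 1
    return counts


def call_genotype(wt_reads, mut_reads, min_reads=50, het_band=(0.2, 0.8)):
    """Call "wt", "het" or "hom" from the fraction of mutant reads.

    min_reads and het_band are illustrative thresholds, not published values.
    """
    total = wt_reads + mut_reads
    if total < min_reads:
        return "fail"  # too few informative reads to call
    mut_fraction = mut_reads / total
    low, high = het_band
    if mut_fraction < low:
        return "wt"
    if mut_fraction > high:
        return "hom"
    return "het"


# Toy sample: 60 wild-type reads, 55 mutant reads, 30 reads from a homologous region.
reads = ["AAACCCGGG"] * 60 + ["AAATTTGGG"] * 55 + ["AAACCAGGG"] * 30
counts = count_products(reads, wt_seq="ACCCG", mut_seq="ATTTG")
genotype = call_genotype(counts["wt"], counts["mut"])
```

    Because the call depends only on the ratio of the two expected products, reads from a homologous region lower the usable depth but do not shift the genotype.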

    Methods (San Diego, Calif.) 2020

  • Genomic profiling of T-cell activation suggests increased sensitivity of memory T cells to CD28 costimulation.

    Glinos DA, Soskic B, Williams C, Kennedy A, Jostins L, Sansom DM and Trynka G

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.

    T-cell activation is a critical driver of immune responses. CD28 costimulation is an essential regulator of CD4 T-cell responses; however, its relative importance in naive and memory T cells is not fully understood. Using different model systems, we observe that human memory T cells are more sensitive to CD28 costimulation than naive T cells. To deconvolute how the T-cell receptor (TCR) and CD28 orchestrate activation of human T cells, we stimulated cells using varying intensities of TCR and CD28 and profiled gene expression. We show that genes involved in cell cycle progression and division are CD28-driven in memory cells, but under TCR control in naive cells. We further demonstrate that T-helper differentiation and cytokine expression are controlled by CD28. Using chromatin accessibility profiling, we observe that AP1 transcriptional regulation is enriched when both TCR and CD28 are engaged, whereas open chromatin near CD28-sensitive genes is enriched for NF-kB motifs. Lastly, we show that CD28-sensitive genes are enriched in GWAS regions associated with immune diseases, implicating a role for CD28 in disease development. Our study provides important insights into the differential role of costimulation in naive and memory T-cell responses and disease susceptibility.

    Funded by: Arthritis Research UK: 21147; Royal Society: Z/17/Z; Wellcome Trust (Wellcome): WT204798, WT206194, WT208750

    Genes and immunity 2020

  • Comprehensive molecular comparison of BRCA1 hypermethylated and BRCA1 mutated triple negative breast cancers.

    Glodzik D, Bosch A, Hartman J, Aine M, Vallon-Christersson J, Reuterswärd C, Karlsson A, Mitra S, Niméus E, Holm K, Häkkinen J, Hegardt C, Saal LH, Larsson C, Malmberg M, Rydén L, Ehinger A, Loman N, Kvist A, Ehrencrona H, Nik-Zainal S, Borg Å and Staaf J

    Division of Oncology, Department of Clinical Sciences Lund, Lund University, Medicon Village, SE-22381, Lund, Sweden.

    Homologous recombination deficiency (HRD) is a defining characteristic in BRCA-deficient breast tumors caused by genetic or epigenetic alterations in key pathway genes. We investigated the frequency of BRCA1 promoter hypermethylation in 237 triple-negative breast cancers (TNBCs) from a population-based study using reported whole genome and RNA sequencing data, complemented with analyses of genetic, epigenetic, transcriptomic and immune infiltration phenotypes. We demonstrate that BRCA1 promoter hypermethylation is twice as frequent as BRCA1 pathogenic variants in early-stage TNBC and that hypermethylated and mutated cases have similarly improved prognosis after adjuvant chemotherapy. BRCA1 hypermethylation confers an HRD, immune cell type, genome-wide DNA methylation, and transcriptional phenotype similar to TNBC tumors with BRCA1-inactivating variants, and it can be observed in matched peripheral blood of patients with tumor hypermethylation. Hypermethylation may be an early event in tumor development that progresses along a common pathway with BRCA1-mutated disease, representing a promising DNA-based biomarker for early-stage TNBC.

    Funded by: Cancerfonden (Swedish Cancer Society): CAN 2018/685; Crafoordska Stiftelsen (Crafoord Foundation): 20180543; Fru Berta Kamprads Stiftelse (Mrs. Berta Kamprad Foundation): FBKS 2017-34-199, FBKS-2018-4-146; Gunnar Nilssons Cancerstiftelse (Gunnar Nilsson Cancer Foundation): GN-2018-5

    Nature communications 2020;11;1;3747

  • Insights into the intracellular localization, protein associations and artemisinin resistance properties of Plasmodium falciparum K13.

    Gnädig NF, Stokes BH, Edwards RL, Kalantarov GF, Heimsch KC, Kuderjavy M, Crane A, Lee MCS, Straimer J, Becker K, Trakht IN, Odom John AR, Mok S and Fidock DA

    Department of Microbiology & Immunology, Columbia University Irving Medical Center, New York, NY, United States of America.

    The emergence of artemisinin (ART) resistance in Plasmodium falciparum intra-erythrocytic parasites has led to increasing treatment failure rates with first-line ART-based combination therapies in Southeast Asia. Decreased parasite susceptibility is caused by K13 mutations, which are associated clinically with delayed parasite clearance in patients and in vitro with an enhanced ability of ring-stage parasites to survive brief exposure to the active ART metabolite dihydroartemisinin. Herein, we describe a panel of K13-specific monoclonal antibodies and gene-edited parasite lines co-expressing epitope-tagged versions of K13 in trans. By applying an analytical quantitative imaging pipeline, we localize K13 to the parasite endoplasmic reticulum, Rab-positive vesicles, and sites adjacent to cytostomes. These latter structures form at the parasite plasma membrane and traffic hemoglobin to the digestive vacuole wherein artemisinin-activating heme moieties are released. We also provide evidence of K13 partially localizing near the parasite mitochondria upon treatment with dihydroartemisinin. Immunoprecipitation data generated with K13-specific monoclonal antibodies identify multiple putative K13-associated proteins, including endoplasmic reticulum-resident molecules, mitochondrial proteins, and Rab GTPases, in both K13 mutant and wild-type isogenic lines. We also find that mutant K13-mediated resistance is reversed upon co-expression of wild-type or mutant K13. These data help define the biological properties of K13 and its role in mediating P. falciparum resistance to ART treatment.

    Funded by: NIAID NIH HHS: R01 AI103280, R01 AI109023, R21 AI123808, R21 AI130584, R21 AI144472, T32 AI106711; Wellcome Trust: 206194

    PLoS pathogens 2020;16;4;e1008482

  • Epstein-Barr virus reactivation in sepsis due to community-acquired pneumonia is associated with increased morbidity and an immunosuppressed host transcriptomic endotype.

    Goh C, Burnham KL, Ansari MA, de Cesare M, Golubchik T, Hutton P, Overend LE, Davenport EE, Hinds CJ, Bowden R and Knight JC

    Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK.

    Epstein-Barr virus (EBV) reactivation is common in sepsis patients but the extent and nature of this remains unresolved. We sought to determine the incidence and correlates of EBV-positivity in a large sepsis cohort. We also hypothesised that EBV reactivation would be increased in patients in whom relative immunosuppression was the major feature of their sepsis response. To identify such patients we aimed to use knowledge of sepsis response subphenotypes based on transcriptomic studies of circulating leukocytes, specifically patients with a Sepsis Response Signature endotype (SRS1) that we have previously shown to be associated with increased mortality and features of immunosuppression. We assayed EBV from the plasma of intensive care unit (ICU) patients with sepsis due to community-acquired pneumonia. In total 730 patients were evaluated by targeted metagenomics (n = 573 patients), digital droplet PCR (n = 565), or both (n = 408). We had previously analysed gene expression in peripheral blood leukocytes for a subset of individuals (n = 390). We observed a 37% incidence of EBV-positivity. EBV reactivation was associated with longer ICU stay (12.9 vs 9.2 days; p = 0.004) and increased organ failure (day 1 SOFA score 6.9 vs 5.9; p = 0.00011). EBV reactivation was associated with the relatively immunosuppressed SRS1 endotype (p = 0.014) and differential expression of a small number of biologically relevant genes. These findings are consistent with the hypothesis that viral reactivation in sepsis is a consequence of immune compromise and is associated with increasing severity of illness although further mechanistic studies are required to definitively illustrate cause and effect.
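
    The cohort sizes quoted above are consistent under inclusion-exclusion: patients assayed by either method equal the two method totals minus those assayed by both. A short check, using only the numbers stated in the abstract (the variable names are illustrative):

```python
# Patient counts as stated in the abstract.
metagenomics = 573   # evaluated by targeted metagenomics
ddpcr = 565          # evaluated by digital droplet PCR
both = 408           # evaluated by both methods

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
total_evaluated = metagenomics + ddpcr - both

# Patients assayed by exactly one method.
metagenomics_only = metagenomics - both
ddpcr_only = ddpcr - both

print(total_evaluated, metagenomics_only, ddpcr_only)  # 730 165 157
```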

    Scientific reports 2020;10;1;9838

  • Genomic evolution of Neisseria gonorrhoeae since the preantibiotic era (1928-2013): antimicrobial use/misuse selects for resistance and drives evolution.

    Golparian D, Harris SR, Sánchez-Busó L, Hoffmann S, Shafer WM, Bentley SD, Jensen JS and Unemo M

    WHO Collaborating Centre for Gonorrhoea and other Sexually Transmitted Infections, Department of Laboratory Medicine, Microbiology, Faculty of Medicine and Health, Örebro University, SE-710 85, Örebro, Sweden.

    Background: Multidrug-resistant Neisseria gonorrhoeae strains are prevalent, threatening gonorrhoea treatment globally, and understanding of emergence, evolution, and spread of antimicrobial resistance (AMR) in gonococci remains limited. We describe the genomic evolution of gonococci and their AMR, related to the introduction of antimicrobial therapies, examining isolates from 1928 (preantibiotic era) to 2013 in Denmark. This is, to our knowledge, the oldest gonococcal collection globally.

    Methods: Lyophilised isolates were revived and examined using Etest (18 antimicrobials) and whole-genome sequencing (WGS). Quality-assured genome sequences were obtained for 191 viable and 40 non-viable isolates and analysed with multiple phylogenomic approaches.

    Results: Gonococcal AMR, including an accumulation of multiple AMR determinants, started to emerge particularly in the 1950s-1970s. By the twenty-first century, resistance to most antimicrobials was common. Although some AMR determinants affect many physiological functions and fitness, AMR determinants were mainly selected by the use/misuse of gonorrhoea therapeutic antimicrobials. Most AMR developed in strains belonging to one multidrug-resistant (MDR) clade with a close to three times higher genomic mutation rate. Modern N. gonorrhoeae was inferred to have emerged in the late-1500s and its genome became increasingly conserved over time.

    Conclusions: WGS of gonococci from 1928 to 2013 showed that no AMR determinants, except penB, were in detectable frequency before the introduction of gonorrhoea therapeutic antimicrobials. The modern gonococcus is substantially younger than previously hypothesized and has been evolving into a more clonal species, driven by the use/misuse of antimicrobials. The MDR gonococcal clade should be further investigated for early detection of strains with predispositions to develop and maintain MDR and for initiation of public health interventions.

    Funded by: BLRD VA: IK6 BX004470; Foundation for Medical Research at Örebro University Hospital: 2012; Wellcome Trust: 098051

    BMC genomics 2020;21;1;116

  • An antiviral response beyond immune cells.

    Gomes T and Teichmann SA

    Nature 2020;583;7815;206-207

  • Drug mechanism-of-action discovery through the integration of pharmacological and CRISPR screens.

    Gonçalves E, Segura-Cabrera A, Pacini C, Picco G, Behan FM, Jaaks P, Coker EA, van der Meer D, Barthorpe A, Lightfoot H, Mironenko T, Beck A, Richardson L, Yang W, Lleshi E, Hall J, Tolley C, Hall C, Mali I, Thomas F, Morris J, Leach AR, Lynch JT, Sidders B, Crafter C, Iorio F, Fawell S and Garnett MJ

    Wellcome Sanger Institute, Hinxton, UK.

    Low success rates during drug development are due, in part, to the difficulty of defining drug mechanism-of-action and molecular markers of therapeutic activity. Here, we integrated 199,219 drug sensitivity measurements for 397 unique anti-cancer drugs with genome-wide CRISPR loss-of-function screens in 484 cell lines to systematically investigate cellular drug mechanism-of-action. We observed an enrichment for positive associations between the profile of drug sensitivity and knockout of a drug's nominal target, and by leveraging protein-protein networks, we identified pathways underpinning drug sensitivity. This revealed an unappreciated positive association between mitochondrial E3 ubiquitin-protein ligase MARCH5 dependency and sensitivity to MCL1 inhibitors in breast cancer cell lines. We also estimated drug on-target and off-target activity, informing on specificity, potency and toxicity. Linking drug and gene dependency together with genomic data sets uncovered contexts in which molecular networks when perturbed mediate cancer cell loss-of-fitness and thereby provide independent and orthogonal evidence of biomarkers for drug development. This study illustrates how integrating cell line drug sensitivity with CRISPR loss-of-function screens can elucidate mechanism-of-action to advance drug development.

    Funded by: AstraZeneca; Wellcome Trust (WT): 206194

    Molecular systems biology 2020;16;7;e9405

  • Reply.

    Goode EC, Hirschfield GM and Rushbrook SM

    Norfolk and Norwich University Hospital, Norwich, United Kingdom.

    Hepatology (Baltimore, Md.) 2020;71;1;399-400

  • Comparative host-coronavirus protein interaction networks reveal pan-viral disease mechanisms.

    Gordon DE, Hiatt J, Bouhaddou M, Rezelj VV, Ulferts S, Braberg H, Jureka AS, Obernier K, Guo JZ, Batra J, Kaake RM, Weckstein AR, Owens TW, Gupta M, Pourmal S, Titus EW, Cakir M, Soucheray M, McGregor M, Cakir Z, Jang G, O'Meara MJ, Tummino TA, Zhang Z, Foussard H, Rojc A, Zhou Y, Kuchenov D, Hüttenhain R, Xu J, Eckhardt M, Swaney DL, Fabius JM, Ummadi M, Tutuncuoglu B, Rathore U, Modak M, Haas P, Haas KM, Naing ZZC, Pulido EH, Shi Y, Barrio-Hernandez I, Memon D, Petsalaki E, Dunham A, Marrero MC, Burke D, Koh C, Vallet T, Silvas JA, Azumaya CM, Billesbølle C, Brilot AF, Campbell MG, Diallo A, Dickinson MS, Diwanji D, Herrera N, Hoppe N, Kratochvil HT, Liu Y, Merz GE, Moritz M, Nguyen HC, Nowotny C, Puchades C, Rizo AN, Schulze-Gahmen U, Smith AM, Sun M, Young ID, Zhao J, Asarnow D, Biel J, Bowen A, Braxton JR, Chen J, Chio CM, Chio US, Deshpande I, Doan L, Faust B, Flores S, Jin M, Kim K, Lam VL, Li F, Li J, Li YL, Li Y, Liu X, Lo M, Lopez KE, Melo AA, Moss FR, Nguyen P, Paulino J, Pawar KI, Peters JK, Pospiech TH, Safari M, Sangwan S, Schaefer K, Thomas PV, Thwin AC, Trenker R, Tse E, Tsui TKM, Wang F, Whitis N, Yu Z, Zhang K, Zhang Y, Zhou F, Saltzberg D, QCRG Structural Biology Consortium, Hodder AJ, Shun-Shion AS, Williams DM, White KM, Rosales R, Kehrer T, Miorin L, Moreno E, Patel AH, Rihn S, Khalid MM, Vallejo-Gracia A, Fozouni P, Simoneau CR, Roth TL, Wu D, Karim MA, Ghoussaini M, Dunham I, Berardi F, Weigang S, Chazal M, Park J, Logue J, McGrath M, Weston S, Haupt R, Hastie CJ, Elliott M, Brown F, Burness KA, Reid E, Dorward M, Johnson C, Wilkinson SG, Geyer A, Giesel DM, Baillie C, Raggett S, Leech H, Toth R, Goodman N, Keough KC, Lind AL, Zoonomia Consortium, Klesh RJ, Hemphill KR, Carlson-Stevermer J, Oki J, Holden K, Maures T, Pollard KS, Sali A, Agard DA, Cheng Y, Fraser JS, Frost A, Jura N, Kortemme T, Manglik A, Southworth DR, Stroud RM, Alessi DR, Davies P, Frieman MB, Ideker T, Abate C, Jouvenet N, Kochs G, Shoichet B, Ott M, Palmarini M, 
    Shokat KM, García-Sastre A, Rassen JA, Grosse R, Rosenberg OS, Verba KA, Basler CF, Vignuzzi M, Peden AA, Beltrao P and Krogan NJ

    QBI COVID-19 Research Group (QCRG), San Francisco, CA 94158, USA.

    The COVID-19 (Coronavirus disease-2019) pandemic, caused by the SARS-CoV-2 coronavirus, is a significant threat to public health and the global economy. SARS-CoV-2 is closely related to the more lethal but less transmissible coronaviruses SARS-CoV-1 and MERS-CoV. Here, we have carried out comparative viral-human protein-protein interaction and viral protein localization analysis for all three viruses. Subsequent functional genetic screening identified host factors that functionally impinge on coronavirus proliferation, including Tom70, a mitochondrial chaperone protein that interacts with both SARS-CoV-1 and SARS-CoV-2 Orf9b, an interaction we structurally characterized using cryo-EM. Combining genetically-validated host factors with both COVID-19 patient genetic data and medical billing records identified important molecular mechanisms and potential drug treatments that merit further molecular and clinical study.

    Science (New York, N.Y.) 2020

  • Quantifying acquisition and transmission of Enterococcus faecium using genomic surveillance.

    Gouliouris T, Coll F, Ludden C, Blane B, Raven KE, Naydenova P, Crawley C, Török ME, Enoch DA, Brown NM, Harrison EM, Parkhill J and Peacock SJ

    University of Cambridge, Cambridge, UK.

    Nosocomial acquisition and transmission of vancomycin-resistant Enterococcus faecium (VREfm) is the driver for E. faecium carriage in hospitalized patients, which, in turn, is a risk factor for invasive infection in immunocompromised patients. In the present study, we provide a comprehensive picture of E. faecium transmission in an entire sampled patient population using a sequence-driven approach. We prospectively identified and followed 149 haematology patients admitted to a hospital in England for 6 months. Patient stools (n = 376) and environmental swabs (n = 922) were taken at intervals and cultured for E. faecium. We sequenced 1,560 isolates (1,001 stool, 559 environment) and focused our genomic analyses on 1,477 isolates (95%) in the hospital-adapted clade A1. Of 101 patients who provided two or more stool samples, 40 (40%) developed E. faecium carriage after admission based on culture, compared with 64 patients (63%) based on genomic analysis (73% VREfm). Half of 922 environmental swabs (447, 48%) were positive for VREfm. Network analysis showed that, of 111 patients positive for the A1 clade, 67 had strong epidemiological and genomic links with at least one other patient and/or their direct environment, supporting nosocomial transmission. Six patients (3.4%) developed an invasive E. faecium infection from their own gut-colonizing strain, which was preceded by nosocomial acquisition of the infecting isolate in half of these. Two informatics approaches (subtype categorization to define phylogenetic clusters and the development of an SNP cut-off for transmission) were central to our analyses, both of which will inform the future translation of E. faecium sequencing into routine outbreak detection and investigation. In conclusion, we showed that carriage and environmental contamination by the hospital-adapted E. faecium lineage were hyperendemic in our study population and that improved infection control measures will be needed to reduce hospital acquisition rates.
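
    The notion of an SNP cut-off for transmission can be sketched as follows: count pairwise differences between aligned isolate sequences and flag pairs at or below a threshold as putative genomic links. This is a toy illustration only; the cut-off value, the isolate names and the sequences are hypothetical, and the study combined genomic links with epidemiological evidence, which this sketch does not model.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences differ (A/C/G/T only)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a in "ACGT" and b in "ACGT"
    )


def putative_links(isolates, cutoff):
    """Return (id_a, id_b, distance) for isolate pairs within the SNP cut-off."""
    links = []
    for (id_a, seq_a), (id_b, seq_b) in combinations(sorted(isolates.items()), 2):
        distance = snp_distance(seq_a, seq_b)
        if distance <= cutoff:
            links.append((id_a, id_b, distance))
    return links


# Hypothetical toy alignment: three patient isolates and one environmental swab.
toy_isolates = {
    "P1": "ACGTACGTAC",
    "P2": "ACGTACGTAT",
    "ENV1": "ACGAACGTAT",
    "P3": "TTGAACCTGA",
}

# The cut-off of 2 SNPs is an assumption for this example, not the published value.
toy_links = putative_links(toy_isolates, cutoff=2)
print(toy_links)
```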

    Funded by: Wellcome Trust (Wellcome): 103387/Z/13/Z, 110243/Z/15/Z, 201344/Z/16/Z, WT098600

    Nature microbiology 2020

  • Association between bacterial homoplastic variants and radiological pathology in tuberculosis.

    Grandjean L, Monteserin J, Gilman R, Pauschardt J, Rokadiya S, Bonilla C, Ritacco V, Vidal JR, Parkhill J, Peacock S, Moore DA and Balloux F

    Department of Medicine, Imperial College London, London, UK.

    Background: Understanding how pathogen genetic factors contribute to pathology in TB could enable tailored treatments to the most pathogenic and infectious strains. New strategies are needed to control drug-resistant TB, which requires longer and costlier treatment. We hypothesised that the severity of radiological pathology on the chest radiograph in TB disease was associated with variants arising independently, multiple times (homoplasies) in the Mycobacterium tuberculosis genome.

    Methods: We performed whole genome sequencing (Illumina HiSeq2000 platform) on M. tuberculosis isolates from 103 patients with drug-resistant TB in Lima between 2010 and 2013. Variables including age, sex, HIV status, previous TB disease and the percentage of lung involvement on the pretreatment chest radiograph were collected from health posts of the national TB programme. Genomic variants were identified using standard pipelines.

    Results: Two mutations were significantly associated with more widespread radiological pathology in a multivariable regression model controlling for confounding variables (Rv2828c.141, RR 1.3, 95% CI 1.21 to 1.39, p<0.01; rpoC.1040, RR 1.9, 95% CI 1.77 to 2.16, p<0.01). The rpoB.450 mutation was associated with less extensive radiological pathology (RR 0.81, 95% CI 0.69 to 0.94, p=0.03), suggestive of a bacterial fitness cost for this mutation in vivo. Patients with a previous episode of TB disease and those between 10 and 30 years of age also had significantly increased radiological pathology.

    Conclusions: This study is the first to compare the M. tuberculosis genome to radiological pathology on the chest radiograph. We identified two variants significantly positively associated with more widespread radiological pathology and one with reduced pathology. Prospective studies are warranted to determine whether mutations associated with increased pathology also predict the spread of drug-resistant TB.

    Thorax 2020

  • Evolution of the Insecticide Target Rdl in African Anopheles Is Driven by Interspecific and Interkaryotypic Introgression.

    Grau-Bové X, Tomlinson S, O'Reilly AO, Harding NJ, Miles A, Kwiatkowski D, Donnelly MJ, Weetman D and Anopheles gambiae 1000 Genomes Consortium

    Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool, United Kingdom.

    The evolution of insecticide resistance mechanisms in natural populations of Anopheles malaria vectors is a major public health concern across Africa. Using genome sequence data, we study the evolution of resistance mutations in the resistance to dieldrin locus (Rdl), a GABA receptor targeted by several insecticides, but most notably by the long-discontinued cyclodiene, dieldrin. The two Rdl resistance mutations (296G and 296S) spread across West and Central African Anopheles via two independent hard selective sweeps that included likely compensatory nearby mutations, and were followed by a rare combination of introgression across species (from A. gambiae and A. arabiensis to A. coluzzii) and across nonconcordant karyotypes of the 2La chromosomal inversion. Rdl resistance evolved in the 1950s as the first known adaptation to a large-scale insecticide-based intervention, but the evolutionary lessons from this system highlight contemporary and future dangers for management strategies designed to combat development of resistance in malaria vectors.

    Funded by: Medical Research Council: MR/M006212/1, MR/P02520X/1; NIAID NIH HHS: R01 AI116811; Wellcome Trust: 090532/Z/09/Z, 090770/Z/09/Z, 098051

    Molecular biology and evolution 2020;37;10;2900-2917

  • Personalized and graph genomes reveal missing signal in epigenomic data.

    Groza C, Kwan T, Soranzo N, Pastinen T and Bourque G

    Human Genetics, McGill University, Montreal, QC, Canada.

    Background: Epigenomic studies that use next generation sequencing experiments typically rely on the alignment of reads to a reference sequence. However, because of genetic diversity and the diploid nature of the human genome, we hypothesize that using a generic reference could lead to incorrectly mapped reads and bias downstream results.

    Results: We show that accounting for genetic variation using a modified reference genome or a de novo assembled genome can alter histone H3K4me1 and H3K27ac ChIP-seq peak calls either by creating new personal peaks or by the loss of reference peaks. Using permissive cutoffs, modified reference genomes are found to alter approximately 1% of peak calls while de novo assembled genomes alter up to 5% of peaks. We also show statistically significant differences in the number of reads observed in regions associated with the new, altered, and unchanged peaks. We report that short insertions and deletions (indels), followed by single nucleotide variants (SNVs), have the highest probability of modifying peak calls. We show that using a graph personalized genome represents a reasonable compromise between modified reference genomes and de novo assembled genomes. We demonstrate that altered peaks have a genomic distribution typical of other peaks.

    Conclusions: Analyzing epigenomic datasets with personalized and graph genomes allows the recovery of new peaks enriched for indels and SNVs. These altered peaks are more likely to differ between individuals and, as such, could be relevant in the study of various human phenotypes.

    Funded by: CIHR: CEE-151618, EP1-120608, EP2-120609

    Genome biology 2020;21;1;124

  • Identifying and removing haplotypic duplication in primary genome assemblies.

    Guan D, McCarthy SA, Wood J, Howe K, Wang Y and Durbin R

    Department of Computer Science and Technology, Center for Bioinformatics, Harbin Institute of Technology, Harbin 150001, China.

    Motivation: Rapid development in long-read sequencing and scaffolding technologies is accelerating the production of reference-quality assemblies for large eukaryotic genomes. However, haplotype divergence in regions of high heterozygosity often results in assemblers creating two copies rather than one copy of a region, leading to breaks in contiguity and compromising downstream steps such as gene annotation. Several tools have been developed to resolve this problem. However, they either focus only on removing contained duplicate regions, also known as haplotigs, or fail to use all the relevant information and hence make errors.

    Results: Here we present a novel tool, purge_dups, that uses sequence similarity and read depth to automatically identify and remove both haplotigs and heterozygous overlaps. In comparison with current tools, we demonstrate that purge_dups can reduce heterozygous duplication and increase assembly continuity while maintaining completeness of the primary assembly. Moreover, purge_dups is fully automatic and can easily be integrated into assembly pipelines.

    Availability and implementation: The source code is written in C and is available at

    Supplementary information: Supplementary data are available at Bioinformatics online.

    Funded by: Wellcome Trust: 207492/Z/17/Z

    Bioinformatics (Oxford, England) 2020;36;9;2896-2898

  • Diverse Routes toward Early Somites in the Mouse Embryo.

    Guibentif C, Griffiths JA, Imaz-Rosshandler I, Ghazanfar S, Nichols J, Wilson V, Göttgens B and Marioni JC

    Department of Haematology, University of Cambridge, CB2 0AW Cambridge, UK; Wellcome-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, CB2 0AW Cambridge, UK; Sahlgrenska Center for Cancer Research, Department of Microbiology and Immunology, University of Gothenburg, 413 90 Gothenburg, Sweden.

    Somite formation is foundational to creating the vertebrate segmental body plan. Here, we describe three transcriptional trajectories toward somite formation in the early mouse embryo. Precursors of the anterior-most somites ingress through the primitive streak before E7 and migrate anteriorly by E7.5, while a second wave of more posterior somites develops in the vicinity of the streak. Finally, neuromesodermal progenitors (NMPs) are set aside for subsequent trunk somitogenesis. Single-cell profiling of T<sup>-/-</sup> chimeric embryos shows that the anterior somites develop in the absence of T and suggests a cell-autonomous function of T as a gatekeeper between paraxial mesoderm production and the building of the NMP pool. Moreover, we identify putative regulators of early T-independent somites and challenge the T-Sox2 cross-antagonism model in early NMPs. Our study highlights the concept of molecular flexibility during early cell-type specification, with broad relevance for pluripotent stem cell differentiation and disease modeling.

    Developmental cell 2020

  • Environmental shaping of the bacterial and fungal community in infant bed dust and correlations with the airway microbiota.

    Gupta S, Hjelmsø MH, Lehtimäki J, Li X, Mortensen MS, Russel J, Trivedi U, Rasmussen MA, Stokholm J, Bisgaard H and Sørensen SJ

    Section of Microbiology, Department of Biology, University of Copenhagen, Universitetsparken 15, bldg. 1, DK2100, Copenhagen, Denmark.

    Background: From early life, children are exposed to a multitude of environmental exposures, which may be of crucial importance for healthy development. Here, the environmental microbiota may be of particular interest as it represents the interface between environmental factors and the child. As infants in modern societies spend a considerable amount of time indoors, we hypothesize that the indoor bed dust microbiota might be an important factor for the child and for the early colonization of the airway microbiome. To explore this hypothesis, we analyzed the influence of environmental exposures on 577 dust samples from the beds of infants together with 542 airway samples from the Copenhagen Prospective Studies on Asthma in Childhood<sub>2010</sub> cohort.

    Results: Both bacterial and fungal communities were profiled from the bed dust. Bacterial and fungal diversity in the bed dust were positively correlated with each other. Bacterial bed dust microbiota was influenced by multiple environmental factors, such as type of home (house or apartment), living environment (rural or urban), sex of siblings, and presence of pets (cat and/or dog), whereas fungal bed dust microbiota was mainly influenced by the type of home (house or apartment) and sampling season. We further observed a minor correlation between bed dust and airway microbiota compositions among infants. We also analyzed the transfer of microbiota from bed dust to the airway, but we did not find evidence of transfer of individual taxa.

    Conclusions: The current study explores the influence of environmental factors on bed dust microbiota (both bacterial and fungal) and its correlation with airway microbiota (bacterial) in early life using high-throughput sequencing. Our findings demonstrate that bed dust microbiota is influenced by multiple environmental exposures and could represent an interface between environment and child.

    Funded by: Lundbeckfonden: R16-A1694; Ministeriet Sundhed Forebyggelse: 903516; Strategiske Forskningsråd: 0603-00280B

    Microbiome 2020;8;1;115

  • Mutagenicity of acrylamide and glycidamide in human TP53 knock-in (Hupki) mouse embryo fibroblasts.

    Hölzl-Armstrong L, Kucab JE, Moody S, Zwart EP, Loutkotová L, Duffy V, Luijten M, Gamboa da Costa G, Stratton MR, Phillips DH and Arlt VM

    Department of Analytical, Environmental and Forensic Sciences, MRC-PHE Centre for Environment and Health, King's College London, London, SE1 9NH, UK.

    Acrylamide is a suspected human carcinogen formed during high-temperature cooking of starch-rich foods. It is metabolised by cytochrome P450 2E1 to its reactive metabolite glycidamide, which forms pre-mutagenic DNA adducts. Using the human TP53 knock-in (Hupki) mouse embryo fibroblasts (HUFs) immortalisation assay (HIMA), acrylamide- and glycidamide-induced mutagenesis was studied in the tumour suppressor gene TP53. Selected immortalised HUF clones were also subjected to next-generation sequencing to determine mutations across the whole genome. The TP53-mutant frequency after glycidamide exposure (1.1 mM for 24 h, n = 198) was 9% compared with 0% in cultures treated with acrylamide [1.5 (n = 24) or 3 mM (n = 6) for 48 h] and untreated vehicle (water) controls (n = 36). Most glycidamide-induced mutations occurred at adenines with A > T/T > A and A > G/T > C mutations being the most common types. Mutations induced by glycidamide occurred at specific TP53 codons that have also been found to be mutated in human tumours (i.e., breast, ovary, colorectal, and lung) previously associated with acrylamide exposure. The spectrum of TP53 mutations was further reflected by the mutations detected by whole-genome sequencing (WGS) and a distinct WGS mutational signature was found in HUF clones treated with glycidamide that was again characterised by A > G/T > C and A > T/T > A mutations. The WGS mutational signature showed similarities with COSMIC mutational signatures SBS3 and 25 previously found in human tumours (e.g., breast and ovary), while the adenine component was similar to COSMIC SBS4 found mostly in smokers' lung cancer. In contrast, in acrylamide-treated HUF clones, only culture-related background WGS mutational signatures were observed. In summary, the results of the present study suggest that glycidamide may be involved in the development of breast, ovarian, and lung cancer.

    Funded by: Cancer Research UK Grand Challenge Award: C98/A24032; MRC Centre for Environment and Health: PhD Studentship Lisa Hölzl-Armstrong

    Archives of toxicology 2020

  • Mutagenicity of 2-hydroxyamino-1-methyl-6-phenylimidazo[4,5-b]pyridine (N-OH-PhIP) in human TP53 knock-in (Hupki) mouse embryo fibroblasts.

    Hölzl-Armstrong L, Moody S, Kucab JE, Zwart EP, Bellamri M, Luijten M, Turesky RJ, Stratton MR, Arlt VM and Phillips DH

    Department of Analytical, Environmental and Forensic Sciences, MRC-PHE Centre for Environment and Health, King's College London, London, SE1 9NH, UK.

    2-Amino-1-methyl-6-phenylimidazo[4,5-b]pyridine (PhIP) is a possible human carcinogen formed in cooked fish and meat. PhIP is bioactivated by cytochrome P450 enzymes to form 2-hydroxyamino-1-methyl-6-phenylimidazo[4,5-b]pyridine (N-OH-PhIP), a genotoxic metabolite that reacts with DNA leading to the mutation-prone DNA adduct N-(deoxyguanosin-8-yl)-PhIP (dG-C8-PhIP). Here, we studied N-OH-PhIP-induced whole genome mutagenesis in human TP53 knock-in (Hupki) mouse embryo fibroblasts (HUFs) immortalised and subjected to whole genome sequencing (WGS). In addition, mutagenicity of N-OH-PhIP in the TP53 and the lacZ reporter genes was assessed. TP53 mutant frequency in HUF cultures treated with N-OH-PhIP (2.5 μM for 24 h, n = 90) was 10% while no TP53 mutations were found in untreated controls (DMSO for 24 h, n = 6). All N-OH-PhIP-induced TP53 mutations occurred at G:C base pairs with G > T/C > A transversions accounting for 58% of them. TP53 mutations characteristic of those induced by N-OH-PhIP have been found in human tumours including breast and colorectal, which are cancer types that have been associated with PhIP exposure. LacZ mutant frequency increased 25-fold at 5 μM N-OH-PhIP and up to ∼350 dG-C8-PhIP adducts/10<sup>8</sup> nucleosides were detected by ultra-performance liquid chromatography-electrospray ionisation multistage scan mass spectrometry (UPLC-ESI-MS<sup>3</sup>) at this concentration. In addition, a WGS mutational signature defined by G > T/C > A transversions was present in N-OH-PhIP-treated immortalised clones, which showed similarity to COSMIC SBS4, 18 and 29 signatures found in human tumours.

    Food and chemical toxicology : an international journal published for the British Industrial Biological Research Association 2020;111855

  • A Genetic History of the Near East from an aDNA Time Course Sampling Eight Points in the Past 4,000 Years.

    Haber M, Nassar J, Almarri MA, Saupe T, Saag L, Griffith SJ, Doumet-Serhal C, Chanteau J, Saghieh-Beydoun M, Xue Y, Scheib CL and Tyler-Smith C

    Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham B15 2TT, UK; Centre for Computational Biology, University of Birmingham, Birmingham B15 2TT, UK; Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.

    The Iron and Classical Ages in the Near East were marked by population expansions carrying cultural transformations that shaped human history, but the genetic impact of these events on the people who lived through them is little-known. Here, we sequenced the whole genomes of 19 individuals who each lived during one of four time periods between 800 BCE and 200 CE in Beirut on the Eastern Mediterranean coast at the center of the ancient world's great civilizations. We combined these data with published data to traverse eight archaeological periods and observed any genetic changes as they arose. During the Iron Age (∼1000 BCE), people with Anatolian and South-East European ancestry admixed with people in the Near East. The region was then conquered by the Persians (539 BCE), who facilitated movement exemplified in Beirut by an ancient family with Egyptian-Lebanese admixed members. But the genetic impact at a population level does not appear until the time of Alexander the Great (beginning 330 BCE), when a fusion of Asian and Near Easterner ancestry can be seen, paralleling the cultural fusion that appears in the archaeological records from this period. The Romans then conquered the region (31 BCE) but had little genetic impact over their 600 years of rule. Finally, during the Ottoman rule (beginning 1516 CE), Caucasus-related ancestry penetrated the Near East. Thus, in the past 4,000 years, three limited admixture events detectably impacted the population, complementing the historical records of this culturally complex region dominated by the elite with genetic insights from the general population.

    Funded by: Wellcome Trust

    American journal of human genetics 2020;107;1;149-157

  • A Southeast Asian origin for present-day non-African human Y chromosomes.

    Hallast P, Agdzhoyan A, Balanovsky O, Xue Y and Tyler-Smith C

    Institute of Biomedicine and Translational Medicine, University of Tartu, 50411, Tartu, Estonia.

    The genomes of present-day humans outside Africa originated almost entirely from a single out-migration ~ 50,000-70,000 years ago, followed by mixture with Neanderthals contributing ~ 2% to all non-Africans. However, the details of this initial migration remain poorly understood because no ancient DNA analyses are available from this key time period, and interpretation of present-day autosomal data is complicated due to subsequent population movements/reshaping. One locus, however, does retain male-specific information from this early period: the Y chromosome, where a detailed calibrated phylogeny has been constructed. Three present-day Y lineages were carried by the initial migration: the rare haplogroup D, the moderately rare C, and the very common FT lineage which now dominates most non-African populations. Here, we show that phylogenetic analyses of haplogroup C, D and FT sequences, including very rare deep-rooting lineages, together with phylogeographic analyses of ancient and present-day non-African Y chromosomes, all point to East/Southeast Asia as the origin 50,000-55,000 years ago of all known surviving non-African male lineages (apart from recent migrants). This observation contrasts with the expectation of a West Eurasian origin predicted by a simple model of expansion from a source near Africa, and can be interpreted as resulting from extensive genetic drift in the initial population or replacement of early western Y lineages from the east, thus informing and constraining models of the initial expansion.

    Funded by: Eesti Teadusagentuur: PUT1036; Wellcome Trust: 098051

    Human genetics 2020

  • Genetic diversity and neutral selection in Plasmodium vivax erythrocyte binding protein correlates with patient antigenicity.

    Han JH, Cho JS, Ong JJY, Park JH, Nyunt MH, Sutanto E, Trimarsanto H, Petros B, Aseffa A, Getachew S, Sriprawat K, Anstey NM, Grigg MJ, Barber BE, William T, Qi G, Liu Y, Pearson RD, Auburn S, Price RN, Nosten F, Rénia L, Russell B and Han ET

    Department of Microbiology and Immunology, University of Otago, Dunedin, New Zealand.

    Plasmodium vivax is the most widespread and difficult to treat cause of human malaria. The development of vaccines against the blood stages of P. vivax remains a key objective for the control and elimination of vivax malaria. Erythrocyte binding-like (EBL) protein family members such as Duffy binding protein (PvDBP) are of critical importance to erythrocyte invasion and have been the major target for vivax malaria vaccine development. In this study, we focus on another member of the EBL protein family, P. vivax erythrocyte binding protein (PvEBP). PvEBP was first identified in Cambodian (C127) field isolates and has subsequently been shown to bind reticulocytes preferentially, an interaction that is directly inhibited by antibodies. We analysed PvEBP sequences from 316 vivax clinical isolates from eight countries including China (n = 4), Ethiopia (n = 24), Malaysia (n = 53), Myanmar (n = 10), Papua New Guinea (n = 16), Republic of Korea (n = 10), Thailand (n = 174), and Vietnam (n = 25). The PvEBP gene exhibited four different phenotypic clusters based on insertion/deletion (indel) variation. PvEBP-RII (179-479 aa.) showed the highest polymorphism, similar to other EBL family proteins in various Plasmodium species. By contrast, PvEBP-RIII-V (480-690 aa.), although the most conserved domain, showed strong neutral selection pressure for gene purifying with significant population expansion. The antigenicity of both the PvEBP-RII (16.1%) and PvEBP-RIII-V (21.5%) domains was comparatively lower than that of other P. vivax antigens expected to be associated with merozoite invasion. The total IgG recognition level was stronger for the PvEBP-RII domain than for PvEBP-RIII-V, whereas the total IgG inducing level was stronger for the PvEBP-RIII-V domain. These results suggest that PvEBP-RII is mainly recognized by natural IgG for innate protection, whereas PvEBP-RIII-V stimulates IgG production by B cells for acquired immunity.
    Overall, the low antigenicity of both regions in patients with vivax malaria likely reflects genetic polymorphism for strong positive selection in PvEBP-RII and purifying selection in the PvEBP-RIII-V domain. These observations pose challenging questions to the selection of EBP and point out the importance of immune pressure and polymorphism required for inclusion of PvEBP as a vaccine candidate.

    PLoS neglected tropical diseases 2020;14;7;e0008202

  • Muzlifah Haniffa-a new era for collaborative and supportive medical research.

    Haniffa M

    Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK.

    Nature medicine 2020;26;2;155

  • The architecture and stabilisation of flagellotropic tailed bacteriophages.

    Hardy JM, Dunstan RA, Grinter R, Belousoff MJ, Wang J, Pickard D, Venugopal H, Dougan G, Lithgow T and Coulibaly F

    Infection & Immunity Program, Biomedicine Discovery Institute & Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC, Australia.

    Flagellotropic bacteriophages engage flagella to reach the bacterial surface as an effective means to increase the capture radius for predation. Structural details of these viruses are of great interest given the substantial drag forces and torques they face when moving down the spinning flagellum. We show that the main capsid and auxiliary proteins form two nested chainmails that ensure the integrity of the bacteriophage head. Core stabilising structures are conserved in herpesviruses suggesting their ancestral origin. The structure of the tail also reveals a robust yet pliable assembly. Hexameric rings of the tail-tube protein are braced by the N-terminus and a β-hairpin loop, and interconnected along the tail by the splayed β-hairpins. By contrast, we show that the β-hairpin has an inhibitory role in the tail-tube precursor, preventing uncontrolled self-assembly. Dyads of acidic residues inside the tail-tube present regularly-spaced motifs well suited to DNA translocation into bacteria through the tail.

    Funded by: Department of Health | National Health and Medical Research Council (NHMRC): 1092262

    Nature communications 2020;11;1;3748

  • WormBase: a modern Model Organism Information Resource.

    Harris TW, Arnaboldi V, Cain S, Chan J, Chen WJ, Cho J, Davis P, Gao S, Grove CA, Kishore R, Lee RYN, Muller HM, Nakamura C, Nuin P, Paulini M, Raciti D, Rodgers FH, Russell M, Schindelman G, Auken KV, Wang Q, Williams G, Wright AJ, Yook K, Howe KL, Schedl T, Stein L and Sternberg PW

    Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada.

    WormBase is a mature Model Organism Information Resource supporting researchers using the nematode Caenorhabditis elegans as a model system for studies across a broad range of basic biological processes. Toward this mission, WormBase efforts are arranged in three primary facets: curation, user interface and architecture. In this update, we describe progress in each of these three areas. In particular, we discuss the status of literature curation and recently added data, detail new features of the web interface and options for users wishing to conduct data mining workflows, and discuss our efforts to build a robust and scalable architecture by leveraging commercial cloud offerings. We conclude with a description of WormBase's role as a founding member of the nascent Alliance of Genome Resources.

    Funded by: Medical Research Council: MR/S000453/1

    Nucleic acids research 2020;48;D1;D762-D767

  • Structural basis for RIFIN-mediated activation of LILRB1 in malaria.

    Harrison TE, Mørch AM, Felce JH, Sakoguchi A, Reid AJ, Arase H, Dustin ML and Higgins MK

    Department of Biochemistry, University of Oxford, Oxford, UK.

    The Plasmodium species that cause malaria are obligate intracellular parasites, and disease symptoms occur as they replicate within human blood. Despite risking immune detection, the parasite delivers proteins that bind host receptors to infected erythrocyte surfaces. In the causative agent of the most deadly human malaria, Plasmodium falciparum, RIFINs form the largest erythrocyte surface protein family<sup>1</sup>. Some RIFINs can bind inhibitory immune receptors, acting as targets for unusual antibodies containing a LAIR1 ectodomain<sup>2-4</sup>, or as ligands for LILRB1<sup>5</sup>. RIFINs stimulate LILRB1 activation and signalling<sup>5</sup>, thereby potentially dampening human immune responses. To understand this process, we determined a structure of a RIFIN bound to LILRB1. We show that the RIFIN mimics the natural activating ligand of LILRB1, MHC class I, in its LILRB1-binding mode. A single RIFIN mutation disrupts the complex, blocks LILRB1 binding by all tested RIFINs and abolishes signalling in a reporter assay. In a supported lipid bilayer system, which mimics NK cell activation by antibody-dependent cell-mediated cytotoxicity, both RIFIN and MHC are recruited to the NK cell immunological synapse and reduce cell activation, as measured by perforin mobilisation. Therefore, LILRB1-binding RIFINs mimic the binding mode of the natural ligand of LILRB1 and suppress NK cell function.

    Nature 2020

  • Structure of the Plasmodium-interspersed repeat proteins of the malaria parasite.

    Harrison TE, Reid AJ, Cunningham D, Langhorne J and Higgins MK

    Department of Biochemistry, University of Oxford, Oxford OX1 3QU, United Kingdom.

    The deadly symptoms of malaria occur as <i>Plasmodium</i> parasites replicate within blood cells. Members of several variant surface protein families are expressed on infected blood cell surfaces. Of these, the largest and most ubiquitous are the <i>Plasmodium</i>-interspersed repeat (PIR) proteins, with more than 1,000 variants in some genomes. Their functions are mysterious, but differential <i>pir</i> gene expression associates with acute or chronic infection in a mouse malaria model. The membership of the PIR superfamily, and whether the family includes <i>Plasmodium falciparum</i> variant surface proteins, such as RIFINs and STEVORs, is controversial. Here we reveal the structure of the extracellular domain of a PIR from <i>Plasmodium chabaudi</i>. We use structure-guided sequence analysis and molecular modeling to show that this fold is found across PIR proteins from mouse- and human-infective malaria parasites. Moreover, we show that RIFINs and STEVORs are not PIRs. This study provides a structure-guided definition of the PIRs and a molecular framework to understand their evolution.

    Proceedings of the National Academy of Sciences of the United States of America 2020

  • The role of haematological traits in risk of ischaemic stroke and its subtypes.

    Harshfield EL, Sims MC, Traylor M, Ouwehand WH and Markus HS

    Stroke Research Group, Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK.

    Thrombosis and platelet activation play a central role in stroke pathogenesis, and antiplatelet and anticoagulant therapies are central to stroke prevention. However, whether haematological traits contribute equally to all ischaemic stroke subtypes is uncertain. Furthermore, identification of associations with new traits may offer novel treatment opportunities. The aim of this research was to ascertain causal relationships between a wide range of haematological traits and ischaemic stroke and its subtypes. We obtained summary statistics from 27 published genome-wide association studies of haematological traits involving over 375 000 individuals, and genetic associations with stroke from the MEGASTROKE Consortium (n = 67 000 stroke cases). Using two-sample Mendelian randomization we analysed the association of genetically elevated levels of 36 blood cell traits (platelets, mature/immature red cells, and myeloid/lymphoid/compound white cells) and 49 haemostasis traits (including clotting cascade factors and markers of platelet function) with risk of developing ischaemic (AIS), cardioembolic (CES), large artery (LAS), and small vessel stroke (SVS). Several factors on the intrinsic clotting pathway were significantly associated (P < 3.85 × 10<sup>-4</sup>) with CES and LAS, but not with SVS (e.g. reduced factor VIII activity with AIS/CES/LAS; raised factor VIII antigen with AIS/CES; and increased factor XI activity with AIS/CES). On the common pathway, increased gamma (γ') fibrinogen was significantly associated with AIS/CES. Furthermore, elevated plateletcrit was significantly associated with AIS/CES, eosinophil percentage of white cells with LAS, and thrombin-activatable fibrinolysis inhibitor activation peptide antigen with AIS. We also conducted a follow-up analysis in UK Biobank, which showed that amongst individuals with atrial fibrillation, those with genetically lower levels of factor XI are at reduced risk of AIS compared to those with normal levels of factor XI.
    These results implicate components of the intrinsic and common pathways of the clotting cascade, as well as several other haematological traits, in the pathogenesis of CES and possibly LAS, but not SVS. The lack of associations with SVS suggests thrombosis may be less important for this stroke subtype. Plateletcrit and factor XI are potentially tractable new targets for secondary prevention of ischaemic stroke, while factor VIII and γ' fibrinogen require further population-based studies to ascertain their possible aetiological roles.

    Funded by: Medical Research Council: MC_PC_17228, MC_QA137853

    Brain : a journal of neurology 2020;143;1;210-221

  • Souporcell: robust clustering of single-cell RNA-seq data by genotype without reference genotypes.

    Heaton H, Talman AM, Knights A, Imaz M, Gaffney DJ, Durbin R, Hemberg M and Lawniczak MKN

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    Methods to deconvolve single-cell RNA-sequencing (scRNA-seq) data are necessary for samples containing a mixture of genotypes, whether they are natural or experimentally combined. Multiplexing across donors is a popular experimental design that can avoid batch effects, reduce costs and improve doublet detection. By using variants detected in scRNA-seq reads, it is possible to assign cells to their donor of origin and identify cross-genotype doublets that may have highly similar transcriptional profiles, precluding detection by transcriptional profile. More subtle cross-genotype variant contamination can be used to estimate the amount of ambient RNA. Ambient RNA is caused by cell lysis before droplet partitioning and is an important confounder of scRNA-seq analysis. Here we develop souporcell, a method to cluster cells using the genetic variants detected within the scRNA-seq reads. We show that it achieves high accuracy on genotype clustering, doublet detection and ambient RNA estimation, as demonstrated across a range of challenging scenarios.

    Funded by: British Heart Foundation: RG/13/13/30194, RG/18/13/33946; Department of Health; Medical Research Council: G1100339, MR/L003120/1; Wellcome Trust: 206194/Z/17/Z, WT098503, WT207492

    Nature methods 2020;17;6;615-620

  • A microsporidian impairs Plasmodium falciparum transmission in Anopheles arabiensis mosquitoes.

    Herren JK, Mbaisi L, Mararo E, Makhulu EE, Mobegi VA, Butungi H, Mancini MV, Oundo JW, Teal ET, Pinaud S, Lawniczak MKN, Jabara J, Nattoh G and Sinkins SP

    International Centre of Insect Physiology and Ecology (ICIPE), Kasarani, Nairobi, Kenya.

    A possible malaria control approach involves the dissemination in mosquitoes of inherited symbiotic microbes to block Plasmodium transmission. However, in the Anopheles gambiae complex, the primary African vectors of malaria, there are limited reports of inherited symbionts that impair transmission. We show that a vertically transmitted microsporidian symbiont (Microsporidia MB) in the An. gambiae complex can impair Plasmodium transmission. Microsporidia MB is present at moderate prevalence in geographically dispersed populations of An. arabiensis in Kenya, localized to the mosquito midgut and ovaries, and is not associated with significant reductions in adult host fecundity or survival. Field-collected Microsporidia MB infected An. arabiensis tested negative for P. falciparum gametocytes and, on experimental infection with P. falciparum, sporozoites aren't detected in Microsporidia MB infected mosquitoes. As a microbe that impairs Plasmodium transmission that is non-virulent and vertically transmitted, Microsporidia MB could be investigated as a strategy to limit malaria transmission.

    Funded by: RCUK | Biotechnology and Biological Sciences Research Council (BBSRC): BB/R005338/1, sub-grant AV/PP015/1; Wellcome Trust (Wellcome): 107372, 200274

    Nature communications 2020;11;1;2187

  • Malat1 Suppresses Immunity to Infection through Promoting Expression of Maf and IL-10 in Th Cells.

    Hewitson JP, West KA, James KR, Rani GF, Dey N, Romano A, Brown N, Teichmann SA, Kaye PM and Lagos D

    York Biomedical Research Institute, University of York, York, YO10 5DD Yorkshire, United Kingdom.

    Despite extensive mapping of long noncoding RNAs in immune cells, their function in vivo remains poorly understood. In this study, we identify over 100 long noncoding RNAs that are differentially expressed within 24 h of Th1 cell activation. Among those, we show that suppression of <i>Malat1</i> is a hallmark of CD4<sup>+</sup> T cell activation, but its complete deletion results in more potent immune responses to infection. This is because <i>Malat1<sup>-/-</sup></i> Th1 and Th2 cells express lower levels of the immunosuppressive cytokine IL-10. In vivo, the reduced CD4<sup>+</sup> T cell IL-10 expression in <i>Malat1<sup>-/-</sup></i> mice underpins enhanced immunity and pathogen clearance in experimental visceral leishmaniasis (<i>Leishmania donovani</i>) but more severe disease in a model of malaria (<i>Plasmodium chabaudi chabaudi</i> AS). Mechanistically, <i>Malat1</i> regulates IL-10 through enhancing expression of Maf, a key transcriptional regulator of <i>IL-10</i>. Maf expression correlates with <i>Malat1</i> in single Ag-specific Th cells from <i>P. chabaudi chabaudi</i> AS-infected mice and is downregulated in <i>Malat1<sup>-/-</sup></i> Th1 and Th2 cells. The <i>Malat1</i> RNA is responsible for these effects, as antisense oligonucleotide-mediated inhibition of <i>Malat1</i> also suppresses Maf and IL-10 levels. Our results reveal that through promoting expression of the Maf/IL-10 axis in effector Th cells, <i>Malat1</i> is a nonredundant regulator of mammalian immunity.

    Journal of immunology (Baltimore, Md. : 1950) 2020

  • Innate Immune Mechanisms to Protect Against Infection at the Human Decidual-Placental Interface.

    Hoo R, Nakimuli A and Vento-Tormo R

    Wellcome Sanger Institute, Cambridge, United Kingdom.

    During pregnancy, the placenta forms the anatomical barrier between the mother and developing fetus. Infectious agents can potentially breach the placental barrier resulting in pathogenic transmission from mother to fetus. Innate immune responses, orchestrated by maternal and fetal cells at the decidual-placental interface, are the first line of defense to avoid vertical transmission. Here, we outline the anatomy of the human placenta and uterine lining, the decidua, and discuss the potential capacity of pathogen pattern recognition and other host defense strategies present in the innate immune cells at the placental-decidual interface. We consider major congenital infections that access the placenta from hematogenous or decidual route. Finally, we highlight the challenges in studying human placental responses to pathogens and vertical transmission using current experimental models and identify gaps in knowledge that need to be addressed. We further propose novel experimental strategies to address such limitations.

    Frontiers in immunology 2020;11;2070

  • Type II and type IV toxin-antitoxin systems show different evolutionary patterns in the global Klebsiella pneumoniae population.

    Horesh G, Fino C, Harms A, Dorman MJ, Parts L, Gerdes K, Heinz E and Thomson NR

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1RQ, UK.

    The Klebsiella pneumoniae species complex includes important opportunistic pathogens which have become public health priorities linked to major hospital outbreaks and the recent emergence of multidrug-resistant hypervirulent strains. Bacterial virulence and the spread of multidrug resistance have previously been linked to toxin-antitoxin (TA) systems. TA systems encode a toxin that disrupts essential cellular processes, and a cognate antitoxin which counteracts this activity. Whilst associated with the maintenance of plasmids, they also act in bacterial immunity and antibiotic tolerance. However, the evolutionary dynamics and distribution of TA systems in clinical pathogens are not well understood. Here, we present a comprehensive survey and description of the diversity of TA systems in 259 clinically relevant genomes of K. pneumoniae. We show that TA systems are highly prevalent with a median of 20 loci per strain. Importantly, these toxins differ substantially in their distribution patterns and in their range of cognate antitoxins. Classification along these properties suggests different roles of TA systems and highlights the association and co-evolution of toxins and antitoxins.

    Funded by: Wellcome Trust: 206194

    Nucleic acids research 2020;48;8;4357-4370

  • Reply to Jensen and Kowalik: Consideration of mixed infections is central to understanding HCMV intrahost diversity.

    Houldcroft CJ, Cudini J, Goldstein RA and Breuer J

    Department of Medicine, Addenbrookes Hospital, Cambridge University, Cambridge CB2 0QQ, United Kingdom.

    Proceedings of the National Academy of Sciences of the United States of America 2020;117;2;818-819

  • Differential regulation of the immune system in a brain-liver-fats organ network during short-term fasting.

    Huang SSY, Makhlouf M, AbouMoussa EH, Ruiz Tejada Segura ML, Mathew LS, Wang K, Leung MC, Chaussabel D, Logan DW, Scialdone A, Garand M and Saraiva LR

    Sidra Medicine, PO Box 26999, Doha, Qatar.

    Objective: Fasting regimens can promote health, mitigate chronic immunological disorders, and improve age-related pathophysiological parameters in animals and humans. Several ongoing clinical trials are using fasting as a potential therapy for various conditions. Fasting alters metabolism by acting as a reset for energy homeostasis, but the molecular mechanisms underlying the beneficial effects of short-term fasting (STF) are not well understood, particularly at the systems or multiorgan level.

    Methods: We performed RNA-sequencing in nine organs from mice fed ad libitum (0 h) or subjected to fasting five times (2-22 h). We applied a combination of multivariate analysis, differential expression analysis, gene ontology, and network analysis for an in-depth understanding of the multiorgan transcriptome. We used literature mining solutions, LitLab™ and Gene Retriever™, to identify the biological and biochemical terms significantly associated with our experimental gene set, which provided additional support and meaning to the experimentally derived gene and inferred protein data.

    Results: We cataloged the transcriptional dynamics within and between organs during STF and discovered differential temporal effects of STF among organs. Using gene ontology enrichment analysis, we identified an organ network sharing 37 common biological pathways perturbed by STF. This network incorporates the brain, liver, interscapular brown adipose tissue, and posterior-subcutaneous white adipose tissue; hence, we named it the brain-liver-fats organ network. Using Reactome pathways analysis, we identified the immune system, dominated by T cell regulation processes, as a central and prominent target of systemic modulations during STF in this organ network. The changes we identified in specific immune components point to the priming of adaptive immunity and parallel the fine-tuning of innate immune signaling.

    Conclusions: Our study provides a comprehensive multiorgan transcriptomic profiling of mice subjected to multiple periods of STF and provides new insights into the molecular modulators involved in the systemic immunotranscriptomic changes that occur during short-term energy loss.

    Molecular metabolism 2020;40;101038

  • Computational strategies to combat COVID-19: useful tools to accelerate SARS-CoV-2 and coronavirus research.

    Hufsky F, Lamkiewicz K, Almeida A, Aouacheria A, Arighi C, Bateman A, Baumbach J, Beerenwinkel N, Brandt C, Cacciabue M, Chuguransky S, Drechsel O, Finn RD, Fritz A, Fuchs S, Hattab G, Hauschild AC, Heider D, Hoffmann M, Hölzer M, Hoops S, Kaderali L, Kalvari I, von Kleist M, Kmiecinski R, Kühnert D, Lasso G, Libin P, List M, Löchel HF, Martin MJ, Martin R, Matschinske J, McHardy AC, Mendes P, Mistry J, Navratil V, Nawrocki EP, O'Toole ÁN, Ontiveros-Palacios N, Petrov AI, Rangel-Pineros G, Redaschi N, Reimering S, Reinert K, Reyes A, Richardson L, Robertson DL, Sadegh S, Singer JB, Theys K, Upton C, Welzel M, Williams L and Marz M

    Friedrich-Schiller-University Jena, Germany.

    SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need for fast detection, understanding and treatment of COVID-19. To control the ongoing COVID-19 pandemic, it is of utmost importance to get insight into the evolution and pathogenesis of the virus. In this review, we cover bioinformatics workflows and tools for the routine detection of SARS-CoV-2 infection, the reliable analysis of sequencing data, the tracking of the COVID-19 pandemic and evaluation of containment measures, the study of coronavirus evolution, the discovery of potential drug targets and development of therapeutic strategies. For each tool, we briefly describe its use case and how it advances research specifically for SARS-CoV-2. All tools are free to use and available online, either through web applications or public code repositories.

    Briefings in bioinformatics 2020

  • Pan-cancer analysis of whole genomes.

    ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium

    Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale<sup>1-3</sup>. Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4-5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. 
    A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter<sup>4</sup>; identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation<sup>5,6</sup>; analyses timings and patterns of tumour evolution<sup>7</sup>; describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity<sup>8,9</sup>; and evaluates a range of more-specialized features of cancer genomes<sup>8,10-18</sup>.

    Funded by: Medical Research Council: MC_UU_00007/16, MC_UU_12022/2; NCI NIH HHS: U24 CA211000; NHGRI NIH HHS: R01 HG007069; Wellcome Trust: 088177

    Nature 2020;578;7793;82-93

  • Evaluation of whole genome amplification and bioinformatic methods for the characterization of Leishmania genomes at a single cell level.

    Imamura H, Monsieurs P, Jara M, Sanders M, Maes I, Vanaerschot M, Berriman M, Cotton JA, Dujardin JC and Domagalska MA

    Institute of Tropical Medicine Antwerp, Molecular Parasitology Unit, Antwerp, Belgium.

    Here, we report a pilot study paving the way for further single cell genomics studies in Leishmania. First, the performances of two commercially available kits for Whole Genome Amplification (WGA), PicoPLEX and RepliG were compared on small amounts of Leishmania donovani DNA, testing their ability to preserve specific genetic variations, including aneuploidy levels and SNPs. We show here that the choice of WGA method should be determined by the planned downstream genetic analysis, PicoPLEX and RepliG performing better for aneuploidy and SNP calling, respectively. This comparison allowed us to evaluate and optimize corresponding bio-informatic methods. As PicoPLEX was shown to be the preferred method for studying single cell aneuploidy, this method was applied in a second step, on single cells of L. braziliensis, which were sorted by fluorescence activated cell sorting (FACS). Even sequencing depth was achieved in 28 single cells, allowing accurate somy estimation. A dominant karyotype with three aneuploid chromosomes was observed in 25 cells, while two different minor karyotypes were observed in the other cells. Our method thus allowed the detection of aneuploidy mosaicism, and provides a solid basis which can be further refined to concur with higher-throughput single cell genomic methods.

    Funded by: Wellcome Trust: 206194

    Scientific reports 2020;10;1;15043

  • Molecular epidemiology of resistance to antimalarial drugs in the Greater Mekong subregion: an observational study.

    Imwong M, Dhorda M, Myo Tun K, Thu AM, Phyo AP, Proux S, Suwannasin K, Kunasol C, Srisutham S, Duanguppama J, Vongpromek R, Promnarate C, Saejeng A, Khantikul N, Sugaram R, Thanapongpichat S, Sawangjaroen N, Sutawong K, Han KT, Htut Y, Linn K, Win AA, Hlaing TM, van der Pluijm RW, Mayxay M, Pongvongsa T, Phommasone K, Tripura R, Peto TJ, von Seidlein L, Nguon C, Lek D, Chan XHS, Rekol H, Leang R, Huch C, Kwiatkowski DP, Miotto O, Ashley EA, Kyaw MP, Pukrittayakamee S, Day NPJ, Dondorp AM, Smithuis FM, Nosten FH and White NJ

    Department of Molecular Tropical Medicine and Genetics, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand; Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand.

    Background: The Greater Mekong subregion is a recurrent source of antimalarial drug resistance in Plasmodium falciparum malaria. This study aimed to characterise the extent and spread of resistance across this entire region between 2007 and 2018.

    Methods: P falciparum isolates from Myanmar, Thailand, Laos, and Cambodia were obtained from clinical trials and epidemiological studies done between Jan 1, 2007, and Dec 31, 2018, and were genotyped for molecular markers (pfkelch, pfcrt, pfplasmepsin2, and pfmdr1) of antimalarial drug resistance. Genetic relatedness was assessed using microsatellite and single nucleotide polymorphism typing of flanking sequences around target genes.

    Findings: 10 632 isolates were genotyped. A single long pfkelch Cys580Tyr haplotype (from -50 kb to +31·5 kb) conferring artemisinin resistance (PfPailin) now dominates across the eastern Greater Mekong subregion. Piperaquine resistance associated with pfplasmepsin2 gene amplification and mutations in pfcrt downstream of the Lys76Thr chloroquine resistance locus has also developed. On the Thailand-Myanmar border a different pfkelch Cys580Tyr lineage rose to high frequencies before it was eliminated. Elsewhere in Myanmar the Cys580Tyr allele remains widespread at low allele frequencies. Meanwhile a single artemisinin-resistant pfkelch Phe446Ile haplotype has spread across Myanmar. Despite intense use of dihydroartemisinin-piperaquine in Kayin state, eastern Myanmar, both in treatment and mass drug administrations, no selection of piperaquine resistance markers was observed. pfmdr1 amplification, a marker of resistance to mefloquine, remains at low prevalence across the entire region.

    Interpretation: Artemisinin resistance in P falciparum is now prevalent across the Greater Mekong subregion. In the eastern Greater Mekong subregion a multidrug resistant P falciparum lineage (PfPailin) dominates. In Myanmar a long pfkelch Phe446Ile haplotype has spread widely but, by contrast with the eastern Greater Mekong subregion, there is no indication of artemisinin combination therapy (ACT) partner drug resistance from genotyping known markers, and no evidence of spread of ACT resistant P falciparum from the east to the west. There is still a window of opportunity to prevent global spread of ACT resistance.

    Funding: Thailand Science Research and Innovation, Initiative 5%, Expertise France, Wellcome Trust.

    The Lancet. Infectious diseases 2020

  • B Lymphocyte-Derived CCL7 Augments Neutrophil and Monocyte Recruitment, Exacerbating Acute Kidney Injury.

    Inaba A, Tuong ZK, Riding AM, Mathews RJ, Martin JL, Saeb-Parsy K and Clatworthy MR

    Molecular Immunity Unit, Department of Medicine, University of Cambridge, Cambridge CB2 0QH, United Kingdom.

    Acute kidney injury (AKI) is a serious condition affecting one fifth of hospital inpatients. B lymphocytes have immunological functions beyond Ab production and may produce cytokines and chemokines that modulate inflammation. In this study, we investigated leukocyte responses in a mouse model of AKI and observed an increase in circulating and kidney B cells, particularly a B220<sup>low</sup> subset, following AKI. We found that B cells produce the chemokine CCL7, with the potential to facilitate neutrophil and monocyte recruitment to the injured kidney. Siglec-G-deficient mice, which have increased numbers of B220<sup>low</sup> innate B cells and a lower B cell activation threshold, had increased <i>Ccl7</i> transcripts, increased neutrophil and monocyte numbers in the kidney, and more severe AKI. CCL7 blockade in mice reduced myeloid cell infiltration into the kidney and ameliorated AKI. In two independent cohorts of human patients with AKI, we observed significantly higher <i>CCL7</i> transcripts compared with controls, and in a third cohort, we observed an increase in urinary CCL7 levels in AKI, supporting the clinical importance of this pathway. Together, our data suggest that B cells contribute to early sterile inflammation in AKI via the production of leukocyte-recruiting chemokines.

    Journal of immunology (Baltimore, Md. : 1950) 2020

  • Functional analysis of candidate genes from genome-wide association studies of hearing.

    Ingham NJ, Rook V, Di Domenico F, James E, Lewis MA, Girotto G, Buniello A and Steel KP

    Wolfson Centre for Age-Related Diseases, King's College London, London, SE1 1UL, UK; Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, UK.

    The underlying causes of age-related hearing loss (ARHL) are not well understood, but it is clear from heritability estimates that genetics plays a role in addition to environmental factors. Genome-wide association studies (GWAS) in human populations can point to candidate genes that may be involved in ARHL, but follow-up analysis is needed to assess the role of these genes in the disease process. Some genetic variants may contribute a small amount to a disease, while other variants may have a large effect size, but the genetic architecture of ARHL is not yet well-defined. In this study, we asked if a set of 17 candidate genes highlighted by early GWAS reports of ARHL have detectable effects on hearing by knocking down expression levels of each gene in the mouse and analysing auditory function. We found two of the genes have an impact on hearing. Mutation of Dclk1 led to late-onset progressive increase in ABR thresholds and the A430005L14Rik (C1orf174) mutants showed worse recovery from noise-induced damage than controls. We did not detect any abnormal responses in the remaining 15 mutant lines either in thresholds or from our battery of suprathreshold ABR tests, and we discuss the possible reasons for this.

    Funded by: Action on Hearing Loss: G73, G86; Medical Research Council: MC_QA137918, MR/N012119/1; Wellcome Trust: 098051, 100669

    Hearing research 2020;387;107879

  • Effects of maternal high fat/high sucrose diet on hepatic lipid metabolism in rat offspring.

    Ingvorsen C, Lelliott CJ, Brix S and Hellgren LI

    Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark.

    Maternal obesity and/or high fat diet during pregnancy predispose the offspring to metabolic disease. It is however unclear how pre-natal and post-natal exposure respectively affect the risk of hepatic steatosis and the trajectory towards non-alcoholic steatohepatitis in the offspring. We investigate hepatic lipid metabolism and how these factors are related to metabolic outcome in new born and young rats. Rat dams were exposed to a high fat/high sucrose (HFHS) diet for 17 weeks prior to mating and during pregnancy. After birth, female offspring were euthanised and male offspring were cross-fostered, creating four groups: control born pups lactated by control (CC) or HFHS dams (CH) and HFHS born pups lactated by control (HC) or HFHS dams (HH). At 4 weeks of age, pups were euthanised and metabolic markers in plasma were assayed, together with hepatic lipid composition and expression of relevant genes. Female HFHS neonates had smaller livers at birth (p<0.05), a reduced hepatic lipid content (p<0.05) and altered lipid composition. The post-natal environment dominated the metabolic profile in the male offspring at 4 weeks of age. Offspring exposed to a HFHS environment post-natally had increased adiposity (p<0.0001), increased hepatic TAG accumulation (p<0.0001), and an altered lipid profile with elevated n-6 PUFA levels (p<0.0001) and a reduction in ceramide (p<0.001) and MUFA (p<0.0001). In summary, maternal HFHS diet during gestation affects the hepatic lipid profile in neonates. The pre-natal exposure becomes less pronounced in young male offspring at 4 weeks of age, where the post-natal diet has the largest impact.

    Clinical and experimental pharmacology & physiology 2020

  • Aberrant cell migration contributes to defective airway epithelial repair in childhood wheeze.

    Iosifidis T, Sutanto EN, Buckley AG, Coleman L, Gill EE, Lee AH, Ling KM, Hillas J, Looi K, Garratt LW, Martinovich KM, Shaw NC, Montgomery ST, Kicic-Starcevich E, Karpievitch YV, Le Souëf P, Laing IA, Vijayasekaran S, Lannigan FJ, Rigby PJ, Hancock RE, Knight DA, Stick SM, Kicic A, Western Australian Epithelial Research Program (WAERP) and Australian Respiratory Epithelium Consortium (AusREC)

    Division of Pediatrics and.

    Abnormal wound repair has been observed in the airway epithelium of patients with chronic respiratory diseases, including asthma. Therapies focusing on repairing vulnerable airways, particularly in early life, present a potentially novel treatment strategy. We report defective lower airway epithelial cell repair to strongly associate with common pre-school-aged and school-aged wheezing phenotypes, characterized by aberrant migration patterns and reduced integrin α5β1 expression. Next generation sequencing identified the PI3K/Akt pathway as the top upstream transcriptional regulator of integrin α5β1, where Akt activation enhanced repair and integrin α5β1 expression in primary cultures from children with wheeze. Conversely, inhibition of PI3K/Akt signaling in primary cultures from children without wheeze reduced α5β1 expression and attenuated repair. Importantly, the FDA-approved drug celecoxib - and its non-COX2-inhibiting analogue, dimethyl-celecoxib - stimulated the PI3K/Akt-integrin α5β1 axis and restored airway epithelial repair in cells from children with wheeze. When compared with published clinical data sets, the identified transcriptomic signature was also associated with viral-induced wheeze exacerbations highlighting the clinical potential of such therapy. Collectively, these results identify airway epithelial restitution via targeting the PI3K-integrin α5β1 axis as a potentially novel therapeutic avenue for childhood wheeze and asthma. We propose that the next step in the therapeutic development process should be a proof-of-concept clinical trial, since relevant animal models to test the crucial underlying premise are unavailable.

    JCI insight 2020;5;7

  • Systems genetics analysis identifies calcium-signaling defects as novel cause of congenital heart disease.

    Izarzugaza JMG, Ellesøe SG, Doganli C, Ehlers NS, Dalgaard MD, Audain E, Dombrowsky G, Banasik K, Sifrim A, Wilsdon A, Thienpont B, Breckpot J, Gewillig M, Competence Network for Congenital Heart Defects, Germany, Brook JD, Hitz MP, Larsen LA and Brunak S

    Department of Health Technology, Technical University of Denmark, Kemitorvet, DK-2800, Kgs. Lyngby, Denmark.

    Background: Congenital heart disease (CHD) occurs in almost 1% of newborn children and is considered a multifactorial disorder. CHD may segregate in families due to significant contribution of genetic factors in the disease etiology. The aim of the study was to identify pathophysiological mechanisms in families segregating CHD.

    Methods: We used whole exome sequencing to identify rare genetic variants in ninety consenting participants from 32 Danish families with recurrent CHD. We applied a systems biology approach to identify developmental mechanisms influenced by accumulation of rare variants. We used an independent cohort of 714 CHD cases and 4922 controls for replication and performed functional investigations using zebrafish as in vivo model.

    Results: We identified 1785 genes, in which rare alleles were shared between affected individuals within a family. These genes were enriched for known cardiac developmental genes, and 218 of these genes were mutated in more than one family. Our analysis revealed a functional cluster, enriched for proteins with a known participation in calcium signaling. Replication in an independent cohort confirmed increased mutation burden of calcium-signaling genes in CHD patients. Functional investigation of zebrafish orthologues of ITPR1, PLCB2, and ADCY2 verified a role in cardiac development and suggests a combinatorial effect of inactivation of these genes.

    Conclusions: The study identifies abnormal calcium signaling as a novel pathophysiological mechanism in human CHD and confirms the complex genetic architecture underlying CHD.

    Funded by: FWO: Postdoctoral Fellow number 12W7318N; Lundbeckfonden: R209-2015-2604; Novo Nordisk Fonden: NNF12OC0001790, NNF14CC0001; The Danish National Advanced Technology Foundation: 019-2011-2

    Genome medicine 2020;12;1;76

  • Germs and germlines: how "public" B-cell clones evolve in the gut.

    James KR and King HW

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.

    Chen et al. describe how B-cell clones observed in the gut of many different individuals (recurrent or "public" clonotypes) are shaped by the combined influences of common microbial antigens and underlying genomic recombination biases.

    Funded by: European Research Council: 646794; Sir Henry Wellcome Postdoctoral Fellowship: 213555/Z/18/Z

    Immunology and cell biology 2020

  • Increasing incidence of group B streptococcus neonatal infections in the Netherlands is associated with clonal expansion of CC17 and CC23.

    Jamrozy D, Bijlsma MW, de Goffau MC, van de Beek D, Kuijpers TW, Parkhill J, van der Ende A and Bentley SD

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    Group B streptococcus (GBS) is the leading cause of neonatal invasive disease worldwide. In the Netherlands incidence of the disease increased despite implementation of preventive guidelines. We describe a genomic analysis of 1345 GBS isolates from neonatal (age 0-89 days) invasive infections in the Netherlands reported between 1987 and 2016. Most isolates clustered into one of five major lineages: CC17 (39%), CC19 (25%), CC23 (18%), CC10 (9%) and CC1 (7%). There was a significant rise in the number of infections due to isolates from CC17 and CC23. Phylogenetic clustering analysis revealed that this was caused by expansion of specific sub-lineages, designated CC17-A1, CC17-A2 and CC23-A1. Dating of phylogenetic trees estimated that these clones diverged in the 1960s/1970s, representing historical rather than recently emerged clones. For CC17-A1 the expansion correlated with acquisition of a new phage carrying a gene encoding a putative cell-surface protein. Representatives of CC17-A1, CC17-A2 and CC23-A1 clones were identified in datasets from other countries, demonstrating their global distribution.

    Funded by: Wellcome Trust (Wellcome): 098051; ZonMw (Netherlands Organisation for Health Research and Development): 016.116.358

    Scientific reports 2020;10;1;9539

  • Reconstructing human DC, monocyte and macrophage development in utero using single cell technologies.

    Jardine L and Haniffa M

    Biosciences Institute, Newcastle University, Faculty of Medical Sciences, Newcastle upon Tyne, NE2 4HH, UK; Department of Haematology, Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne NE2 4LP, UK.

    The repertoire of dendritic cells (DCs), monocytes and macrophages in adult humans is diverse and we are appreciating this to a greater extent as high throughput methods, such as single-cell RNA sequencing, become widely adopted and scalable. This powerful lens of analysis is also beginning to shed light on prenatal immunology, allowing us to chart the emergence, tissue distribution and developmental regulation of DCs, monocytes and macrophages during early human life. In this review, we will integrate recent insights from studies of the developing immune system into our understanding of adult DC, monocyte and macrophage organization, illustrating where insights from early life both affirm and challenge current understanding.

    Molecular immunology 2020;123;1-6

  • Gene signatures from scRNA-seq accurately quantify mast cells in biopsies in asthma.

    Jiang J, Faiz A, Berg M, Carpaij OA, Vermeulen CJ, Brouwer S, Hesse L, Teichmann SA, Ten Hacken N, Timens W, van den Berge M and Nawijn MC

    Groningen Research Institute for Asthma and COPD (GRIAC), University of Groningen, Groningen, The Netherlands.

    Respiratory disease, characterized by changes in the cells of the lung, can affect the molecular phenotype of cells and the intercellular interactions, resulting in an imbalance in the relative proportions of individual cell types. Understanding these changes is essential to understand the pathophysiology of lung disease. Conventional 'bulk' RNA-sequencing (RNA-seq), analyzing the entire transcriptome of the tissue sample, provides information about average expression levels of each gene in the mixed cell population, but it does not consider the cellular heterogeneity in samples composed of more than one cell type<sup>1</sup>. Single-cell RNA-seq (scRNA-seq) assesses the transcriptome of a complex biological sample with single-cell resolution, allowing identification of the relative frequency of discrete cell types and analysis of their transcriptomes<sup>1</sup>. Nevertheless, analyzing the transcriptomic signature in large numbers of patients by scRNA-seq is currently limited by its high costs. Mast cells are key regulatory cells driving the inflammatory process in asthma<sup>2</sup>. Since they can be quantified by immunohistochemical staining for validation purposes, we used mast cells as an example of a rare cell population to assess the validity of our deconvolution approach. Recently, a number of bulk RNA-seq deconvolution methods have become available<sup>3</sup>, for instance support vector regression (SVR)<sup>4</sup>, the machine-learning method implemented in CIBERSORT, and Non-Negative Least Squares (NNLS)<sup>5</sup>, using a matrix of cell-type selective genes identified with AutoGeneSc<sup>6</sup>. Both approaches are designed to estimate the relative proportion of the main, common cell types present in the sample. When we used these methods to estimate the number of mast cells, we found a poor correlation with the number of mast cells stained by immunohistochemistry in the biopsies, suggesting that CIBERSORT and NNLS are less reliable in the case of rare cell types. We explored the possibility of using scRNA-seq data from small numbers of subjects to specifically interrogate the relative cell type frequency of a rare cell population in a bulk RNA-seq dataset obtained from a large asthma cohort.

    Clinical and experimental allergy : journal of the British Society for Allergy and Clinical Immunology 2020

  • Hypertension and renin-angiotensin system blockers are not associated with expression of angiotensin-converting enzyme 2 (ACE2) in the kidney.

    Jiang X, Eales JM, Scannali D, Nazgiewicz A, Prestes P, Maier M, Denniff M, Xu X, Saluja S, Cano-Gamez E, Wystrychowski W, Szulinska M, Antczak A, Byars S, Skrypnik D, Glyda M, Król R, Zywiec J, Zukowska-Szczechowska E, Burrell LM, Woolf AS, Greenstein A, Bogdanski P, Keavney B, Morris AP, Heagerty A, Williams B, Harrap SB, Trynka G, Samani NJ, Guzik TJ, Charchar FJ and Tomaszewski M

    Division of Cardiovascular Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK.

    Aims: Angiotensin-converting enzyme 2 (ACE2) is the cellular entry point for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of coronavirus disease 2019 (COVID-19). However, the effect of renin-angiotensin system (RAS)-inhibition on ACE2 expression in human tissues of key relevance to blood pressure regulation and COVID-19 infection has not previously been reported.

    Methods and results: We examined how hypertension, its major metabolic co-phenotypes, and antihypertensive medications relate to ACE2 renal expression using information from up to 436 patients whose kidney transcriptomes were characterized by RNA-sequencing. We further validated some of the key observations in other human tissues and/or a controlled experimental model. Our data reveal increasing expression of ACE2 with age in both human lungs and the kidney. We show no association between renal expression of ACE2 and either hypertension or common types of RAS inhibiting drugs. We demonstrate that renal abundance of ACE2 is positively associated with a biochemical index of kidney function and show a strong enrichment for genes responsible for kidney health and disease in ACE2 co-expression analysis.

    Conclusion: Our results indicate that neither hypertension nor antihypertensive treatment is likely to alter the expression of the key entry receptor for SARS-CoV-2 in the human kidney. Our data further suggest that in the absence of SARS-CoV-2 infection, kidney ACE2 is most likely nephro-protective but the age-related increase in its expression within lungs and kidneys may be relevant to the risk of SARS-CoV-2 infection.

    European heart journal 2020

  • A deep learning system accurately classifies primary and metastatic cancers using passenger mutation patterns.

    Jiao W, Atwal G, Polak P, Karlic R, Cuppen E, PCAWG Tumor Subtypes and Clinical Translation Working Group, Danyi A, de Ridder J, van Herpen C, Lolkema MP, Steeghs N, Getz G, Morris Q, Stein LD and PCAWG Consortium

    Ontario Institute for Cancer Research, Toronto, ON, M5G0A3, Canada.

    In cancer, the primary tumour's organ of origin and histopathology are the strongest determinants of its clinical behaviour, but in 3% of cases a patient presents with a metastatic tumour and no obvious primary. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we train a deep learning classifier to predict cancer type based on patterns of somatic passenger mutations detected in whole genome sequencing (WGS) of 2606 tumours representing 24 common cancer types produced by the PCAWG Consortium. Our classifier achieves an accuracy of 91% on held-out tumor samples and 88% and 83% respectively on independent primary and metastatic samples, roughly double the accuracy of trained pathologists when presented with a metastatic tumour without knowledge of the primary. Surprisingly, adding information on driver mutations reduced accuracy. Our results have clinical applicability, underscore how patterns of somatic passenger mutations encode the state of the cell of origin, and can inform future strategies to detect the source of circulating tumour DNA.

    Funded by: NCI NIH HHS: P01 CA240239; NIEHS NIH HHS: P30 ES010126

    Nature communications 2020;11;1;728

  • Macrophage metabolic reprogramming presents a therapeutic target in lupus nephritis.

    Jing C, Castro-Dopico T, Richoz N, Tuong ZK, Ferdinand JR, Lok LSC, Loudon KW, Banham GD, Mathews RJ, Cader Z, Fitzpatrick S, Bashant KR, Kaplan MJ, Kaser A, Johnson RS, Murphy MP, Siegel RM and Clatworthy MR

    Molecular Immunity Unit, Department of Medicine, Medical Research Council Laboratory of Molecular Biology, University of Cambridge, Cambridge CB2 0QH, United Kingdom.

    IgG antibodies cause inflammation and organ damage in autoimmune diseases such as systemic lupus erythematosus (SLE). We investigated the metabolic profile of macrophages isolated from inflamed tissues in immune complex (IC)-associated diseases, including SLE and rheumatoid arthritis, and following IgG Fcγ receptor cross-linking. We found that human and mouse macrophages undergo a switch to glycolysis in response to IgG IC stimulation, mirroring macrophage metabolic changes in inflamed tissue in vivo. This metabolic reprogramming was required to generate a number of proinflammatory mediators, including IL-1β, and was dependent on mTOR and hypoxia-inducible factor (HIF)1α. Inhibition of glycolysis, or genetic depletion of HIF1α, attenuated IgG IC-induced activation of macrophages in vitro, including primary human kidney macrophages. In vivo, glycolysis inhibition led to a reduction in kidney macrophage IL-1β and reduced neutrophil recruitment in a murine model of antibody-mediated nephritis. Together, our data reveal the molecular mechanisms underpinning FcγR-mediated metabolic reprogramming in macrophages and suggest a therapeutic strategy for autoantibody-induced inflammation, including lupus nephritis.

    Proceedings of the National Academy of Sciences of the United States of America 2020

  • PRL3-DDX21 Transcriptional Control of Endolysosomal Genes Restricts Melanocyte Stem Cell Differentiation.

    Johansson JA, Marie KL, Lu Y, Brombin A, Santoriello C, Zeng Z, Zich J, Gautier P, von Kriegsheim A, Brunsdon H, Wheeler AP, Dreger M, Houston DR, Dooley CM, Sims AH, Busch-Nentwich EM, Zon LI, Illingworth RS and Patton EE

    MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road South, Edinburgh EH4 2XU, UK; Cancer Research UK Edinburgh Centre, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK.

    Melanocytes, replenished throughout life by melanocyte stem cells (MSCs), play a critical role in pigmentation and melanoma. Here, we reveal a function for the metastasis-associated phosphatase of regenerating liver 3 (PRL3) in MSC regeneration. We show that PRL3 binds to the RNA helicase DDX21, thereby restricting productive transcription by RNAPII at master transcription factor (MITF)-regulated endolysosomal vesicle genes. In zebrafish, this mechanism controls premature melanoblast expansion and differentiation from MSCs. In melanoma patients, restricted transcription of this endolysosomal vesicle pathway is a hallmark of PRL3-high melanomas. Our work presents the conceptual advance that PRL3-mediated control of transcriptional elongation is a differentiation checkpoint mechanism for activated MSCs and has clinical relevance for the activity of PRL3 in regenerating tissue and cancer.

    Developmental cell 2020

  • The Nature and Extent of Plasmid Variation in Chlamydia trachomatis.

    Jones CA, Hadfield J, Thomson NR, Cleary DW, Marsh P, Clarke IN and O'Neill CE

    Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton General Hospital, Southampton SO166YD, UK.

    <i>Chlamydia trachomatis</i> is an obligate intracellular pathogen of humans, causing both the sexually transmitted infection, chlamydia, and the most common cause of infectious blindness, trachoma. The majority of sequenced <i>C. trachomatis</i> clinical isolates carry a 7.5-Kb plasmid, and it is becoming increasingly evident that this is a key determinant of pathogenicity. The discovery of the Swedish New Variant and the more recent Finnish variant highlight the importance of understanding the natural extent of variation in the plasmid. In this study we analysed 524 plasmid sequences from publicly available whole-genome sequence data. Single nucleotide polymorphisms (SNP) in each of the eight coding sequences (CDS) were identified and analysed. There were 224 base positions out of a total 7550 bp that carried a SNP, which equates to a SNP rate of 2.97%, nearly three times what was previously calculated. After normalising for CDS size, CDS8 had the highest SNP rate at 3.97% (i.e., number of SNPs per total number of nucleotides), whilst CDS6 had the lowest at 1.94%. CDS5 had the highest total number of SNPs across the 524 sequences analysed (2267 SNPs), whereas CDS6 had the least SNPs with only 85 SNPs. Calculation of the genetic distances identified CDS6 as the least variable gene at the nucleotide level (d = 0.001), and CDS5 as the most variable (d = 0.007); however, at the amino acid level CDS2 was the least variable (d = 0.001), whilst CDS5 remained the most variable (d = 0.013). This study describes the largest in-depth analysis of the <i>C. trachomatis</i> plasmid to date, through the analysis of plasmid sequence data mined from whole genome sequences spanning 50 years and from a worldwide distribution, providing insights into the nature and extent of existing variation within the plasmid as well as guidance for the design of future diagnostic assays. 
    This is crucial at a time when single-target diagnostic assays are failing to detect natural mutants, putting those infected at risk of a serious long-term and life-changing illness.

    Microorganisms 2020;8;3

  • Effective control of SARS-CoV-2 transmission between healthcare workers during a period of diminished community prevalence of COVID-19.

    Jones NK, Rivett L, Sparkes D, Forrest S, Sridhar S, Young J, Pereira-Dias J, Cormie C, Gill H, Reynolds N, Wantoch M, Routledge M, Warne B, Levy J, Córdova Jiménez WD, Samad FNB, McNicholas C, Ferris M, Gray J, Gill M, CITIID-NIHR COVID-19 BioResource Collaboration, Baker S, Bradley J, Dougan G, Goodfellow I, Gupta R, Lehner PJ, Lyons PA, Matheson NJ, Smith KG, Torok ME, Toshner M, Curran MD, Fuller S, Chaudhry A, Shaw A, Bradley JR, Hannon GJ, Goodfellow IG, Dougan G, Smith KG, Lehner PJ, Wright G, Matheson NJ, Baker S and Weekes MP

    Department of Infectious Diseases, Cambridge University NHS Hospitals Foundation Trust, Cambridge, United Kingdom.

    Previously, we showed that 3% (31/1032) of asymptomatic healthcare workers (HCWs) from a large teaching hospital in Cambridge, UK, tested positive for SARS-CoV-2 in April 2020. About 15% (26/169) of HCWs with symptoms of coronavirus disease 2019 (COVID-19) also tested positive for SARS-CoV-2 (Rivett et al., 2020). Here, we show that the proportion of both asymptomatic and symptomatic HCWs testing positive for SARS-CoV-2 rapidly declined to near-zero between 25th April and 24th May 2020, corresponding to a decline in patient admissions with COVID-19 during the ongoing UK 'lockdown'. These data demonstrate how infection prevention and control measures including staff testing may help prevent hospitals from becoming independent 'hubs' of SARS-CoV-2 transmission, and illustrate how, with appropriate precautions, organizations in other sectors may be able to resume on-site work safely.

    Funded by: Cancer Research UK: C38317/A24043; Medical Research Council: MR/P008801/1; NHS Blood and Transplant: WPA15-02; Wellcome: 108070/Z/15/Z, 200871/Z/16/Z, 206298/B/17/Z, 207498/Z/17/Z, 210688/Z/18/Z, 215515/Z/19/Z; Wellcome Trust

    eLife 2020;9

  • A cell atlas of human thymic development defines T cell repertoire formation.

    Jong-Eun Park, Rachel A. Botting, Cecilia Domínguez Conde, Dorin-Mirel Popescu, Marieke Lavaert, Daniel J. Kunz, Issac Goh, Emily Stephenson, Roberta Ragazzini, Elizabeth Tuck, Anna Wilbrey-Clark, Kenny Roberts, Veronika R. Kedlian, John R. Ferdinand, Xiaoling He, Simone Webb, Daniel Maunder, Niels Vandamme, Krishnaa T. Mahbubani, Krzysztof Polanski, Lira Mamanova, Liam Bolt, David Crossland, Fabrizio de Rita, Andrew Fuller, Andrew Filby, Gary Reynolds, David Dixon, Kourosh Saeb-Parsy, Steven Lisgo, Deborah Henderson, Roser Vento-Tormo, Omer A. Bayraktar, Roger A. Barker, Kerstin B. Meyer, Yvan Saeys, Paola Bonfanti, Sam Behjati, Menna R. Clatworthy, Tom Taghon, Muzlifah Haniffa and Sarah A. Teichmann

    The human thymus is the organ responsible for the maturation of many types of T cells, which are immune cells that protect us from infection. However, it is not well known how these cells develop with a full immune complement that contains the necessary variation to protect us from a variety of pathogens. By performing single-cell RNA sequencing on more than 250,000 cells, Park et al. examined the changes that occur in the thymus over the course of a human life. They found that development occurs in a coordinated manner among immune cells and with their developmental microenvironment. These data allowed for the creation of models of how T cells with different specific immune functions develop in humans. Science, this issue p. eaay3224.

    Introduction: The thymus is the critical organ for T cell development and T cell receptor (TCR) repertoire formation, which shapes the landscape of adaptive immunity. T cell development in the thymus is spatially coordinated, and this process is orchestrated by diverse cell types constituting the thymic microenvironment. Although the thymus has been extensively studied using diverse animal models, human immunity cannot be understood without a detailed atlas of the human thymus.

    Rationale: To provide a comprehensive atlas of thymic cells across human life, we performed single-cell RNA sequencing (scRNA-seq) using dissociated cells from human thymus during development, childhood, and adult life. We sampled 15 embryonic and fetal thymi spanning thymic developmental stages between 7 and 17 post-conception weeks, as well as nine postnatal thymi from pediatric and adult individuals. Diverse sorting schemes were applied to increase the coverage on underrepresented cell populations. Using the marker genes obtained from single-cell transcriptomes, we spatially localized cell states by single-molecule fluorescence in situ hybridization (smFISH). To provide a systematic comparison between human and mouse, we also generated single-cell data on postnatal mouse thymi and combined this with preexisting mouse datasets. Finally, to investigate the bias in the recombination and selection of human TCR repertoires, we enriched the TCR sequences for single-cell library generation.

    Results: We identified more than 50 different cell states in the human thymus. Human thymus cell states dynamically change in abundance and gene expression profiles across development and during pediatric and adult life. We identified novel subpopulations of human thymic fibroblasts and epithelial cells and located them in situ. We computationally predicted the trajectory of human T cell development from early progenitors in the hematopoietic fetal liver into diverse mature T cell types. Using this trajectory, we constructed a framework of putative transcription factors driving T cell fate determination. Among thymic unconventional T cells, we noted a distinct subset of CD8αα+ T cells, which is marked by GNG4 expression and located in the perimedullary region of the thymus. This subset expressed high levels of XCL1 and colocalized with XCR1+ dendritic cells. Comparison of human and mouse thymic cells revealed divergent gene expression profiles of these unconventional T cell types. Finally, we identified a strong bias in human VDJ usage shaped by recombination and multiple rounds of selection, including a TCRα V-J bias for CD8+ T cells.

    Conclusion: Our single-cell transcriptome profile of the thymus across the human lifetime and across species provides a high-resolution census of T cell development within the native tissue microenvironment. Systematic comparison between the human and mouse thymus highlights human-specific cell states and gene expression signatures. Our detailed cellular network of the thymic niche for T cell development will aid the establishment of in vitro organoid culture models that faithfully recapitulate human in vivo thymic tissue.

    Figure: Constructing the human thymus cell atlas. We analyzed human thymic cells across development and postnatal life using scRNA-seq and spatial methods to delineate the diversity of thymic-derived T cells and the localization of cells constituting the thymus microenvironment. With T cell development trajectory reconstituted at single-cell resolution combined with TCR sequence, we investigated the bias in the VDJ recombination and selection of human TCR repertoires. Finally, we provide a systematic comparison between human and mouse thymic cell atlases.

    The thymus provides a nurturing environment for the differentiation and selection of T cells, a process orchestrated by their interaction with multiple thymic cell types. We used single-cell RNA sequencing to create a cell census of the human thymus across the life span and to reconstruct T cell differentiation trajectories and T cell receptor (TCR) recombination kinetics. Using this approach, we identified and located in situ CD8αα+ T cell populations, thymic fibroblast subtypes, and activated dendritic cell states. In addition, we reveal a bias in TCR recombination and selection, which is attributed to genomic position and the kinetics of lineage commitment. Taken together, our data provide a comprehensive atlas of the human thymus across the life span with new insights into human T cell development.

    Science 2020;367;6480

  • Coevolving Plasmids Drive Gene Flow and Genome Plasticity in Host-Associated Intracellular Bacteria.

    Köstlbacher S, Collingro A, Halter T, Domman D and Horn M

    University of Vienna, Centre for Microbiology and Environmental Systems Science, Division of Microbial Ecology, Althanstrasse 14, Vienna 1090, Austria.

    Plasmids are important in microbial evolution and adaptation to new environments. Yet, carrying a plasmid can be costly, and long-term association of plasmids with their hosts is poorly understood. Here, we provide evidence that the Chlamydiae, a phylum of strictly host-associated intracellular bacteria, have coevolved with their plasmids since their last common ancestor. Current chlamydial plasmids are amalgamations of at least one ancestral plasmid and a bacteriophage. We show that the majority of plasmid genes are also found on chromosomes of extant chlamydiae. The most conserved plasmid gene families are predominantly vertically inherited, while accessory plasmid gene families show significantly increased mobility. We reconstructed the evolutionary history of plasmid gene content of an entire bacterial phylum over a period of around one billion years. Frequent horizontal gene transfer and chromosomal integration events illustrate the pronounced impact of coevolution with these extrachromosomal elements on bacterial genome dynamics in host-dependent microbes.

    Current biology : CB 2020

  • Expanding the genotype-phenotype correlation of de novo heterozygous missense variants in YWHAG as a cause of developmental and epileptic encephalopathy.

    Kanani F, Titheradge H, Cooper N, Elmslie F, Lees MM, Juusola J, Pisani L, McKenna C, Mignot C, Valence S, Keren B, Lachlan K, DDD Study and Balasubramanian M

    Sheffield Clinical Genetics Service, Sheffield Children's NHS Foundation Trust, Sheffield, UK.

    Developmental and Epileptic encephalopathies (DEE) describe heterogeneous epilepsy syndromes, characterized by early-onset, refractory seizures and developmental delay (DD). Several DEE associated genes have been reported. With increased access to whole exome sequencing (WES), new candidate genes are being identified although there are fewer large cohort papers describing the clinical phenotype in such patients. We describe 6 unreported individuals and provide updated information on an additional previously reported individual with heterozygous de novo missense variants in YWHAG. We describe a syndromal phenotype, report 5 novel, and a recurrent p.Arg132Cys YWHAG variant and compare developmental trajectory and treatment strategies in this cohort. We provide further evidence of causality in YWHAG variants. WES was performed in five patients via Deciphering Developmental Disorders Study and the remaining two were identified via Genematcher and AnnEX databases. De novo variants identified from exome data were validated using Sanger sequencing. Seven out of seven patients in the cohort have de novo, heterozygous missense variants in YWHAG including 2/7 patients with a recurrent c.394C > T, p.Arg132Cys variant; 1/7 has a second, pathogenic variant in STAG1. Characteristic features included: early-onset seizures, predominantly generalized tonic-clonic and absence type (7/7) with good response to standard anti-epileptic medications; moderate DD; Intellectual Disability (ID) (5/7) and Autism Spectrum Disorder (3/7). De novo YWHAG missense variants cause EE, characterized by early-onset epilepsy, ID and DD, supporting the hypothesis that YWHAG loss-of-function causes a neurological phenotype. Although the exact mechanism of disease resulting from alterations in YWHAG is not fully known, it is possible that haploinsufficiency of YWHAG in developing cerebral cortex may lead to abnormal neuronal migration resulting in DEE.

    American journal of medical genetics. Part A 2020;182;4;713-720

  • Evidence for 28 genetic disorders discovered by combining healthcare and research data.

    Kaplanis J, Samocha KE, Wiel L, Zhang Z, Arvai KJ, Eberhardt RY, Gallone G, Lelieveld SH, Martin HC, McRae JF, Short PJ, Torene RI, de Boer E, Danecek P, Gardner EJ, Huang N, Lord J, Martincorena I, Pfundt R, Reijnders MRF, Yeung A, Yntema HG, Deciphering Developmental Disorders Study, Vissers LELM, Juusola J, Wright CF, Brunner HG, Firth HV, FitzPatrick DR, Barrett JC, Hurles ME, Gilissen C and Retterer K

    Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    De novo mutations in protein-coding genes are a well-established cause of developmental disorders<sup>1</sup>. However, genes known to be associated with developmental disorders account for only a minority of the observed excess of such de novo mutations<sup>1,2</sup>. Here, to identify previously undescribed genes associated with developmental disorders, we integrate healthcare and research exome-sequence data from 31,058 parent-offspring trios of individuals with developmental disorders, and develop a simulation-based statistical test to identify gene-specific enrichment of de novo mutations. We identified 285 genes that were significantly associated with developmental disorders, including 28 that had not previously been robustly associated with developmental disorders. Although we detected more genes associated with developmental disorders, much of the excess of de novo mutations in protein-coding genes remains unaccounted for. Modelling suggests that more than 1,000 genes associated with developmental disorders have not yet been described, many of which are likely to be less penetrant than the currently known genes. Research access to clinical diagnostic datasets will be critical for completing the map of genes associated with developmental disorders.

    Funded by: Wellcome Trust

    Nature 2020;586;7831;757-762

  • ChemBioServer 2.0: an advanced web server for filtering, clustering and networking of chemical compounds facilitating both drug discovery and repurposing.

    Karatzas E, Zamora JE, Athanasiadis E, Dellis D, Cournia Z and Spyrou GM

    Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Ilisia, 15784 Athens, Greece.

    Summary: ChemBioServer 2.0 is the advanced sequel of a web server for filtering, clustering and networking of chemical compound libraries facilitating both drug discovery and repurposing. It provides researchers the ability to (i) browse and visualize compounds along with their physicochemical and toxicity properties, (ii) perform property-based filtering of compounds, (iii) explore compound libraries for lead optimization based on perfect match substructure search, (iv) re-rank virtual screening results to achieve selectivity for a protein of interest against different protein members of the same family, selecting only those compounds that score high for the protein of interest, (v) perform clustering among the compounds based on their physicochemical properties providing representative compounds for each cluster, (vi) construct and visualize a structural similarity network of compounds providing a set of network analysis metrics, (vii) combine a given set of compounds with a reference set of compounds into a single structural similarity network providing the opportunity to infer drug repurposing due to transitivity, (viii) remove compounds from a network based on their similarity with unwanted substances (e.g. failed drugs) and (ix) build custom compound mining pipelines.

    Availability and implementation:

    Bioinformatics (Oxford, England) 2020;36;8;2602-2604

  • The mutational constraint spectrum quantified from variation in 141,456 humans.

    Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, Collins RL, Laricchia KM, Ganna A, Birnbaum DP, Gauthier LD, Brand H, Solomonson M, Watts NA, Rhodes D, Singer-Berk M, England EM, Seaby EG, Kosmicki JA, Walters RK, Tashman K, Farjoun Y, Banks E, Poterba T, Wang A, Seed C, Whiffin N, Chong JX, Samocha KE, Pierce-Hoffman E, Zappala Z, O'Donnell-Luria AH, Minikel EV, Weisburd B, Lek M, Ware JS, Vittal C, Armean IM, Bergelson L, Cibulskis K, Connolly KM, Covarrubias M, Donnelly S, Ferriera S, Gabriel S, Gentry J, Gupta N, Jeandet T, Kaplan D, Llanwarne C, Munshi R, Novod S, Petrillo N, Roazen D, Ruano-Rubio V, Saltzman A, Schleicher M, Soto J, Tibbetts K, Tolonen C, Wade G, Talkowski ME, Genome Aggregation Database Consortium, Neale BM, Daly MJ and MacArthur DG

    Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.

    Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes<sup>1</sup>. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.

    Nature 2020;581;7809;434-443

  • Mining a GWAS of Severe Covid-19.

    Karim M, Ghoussaini M and Dunham I

    Wellcome Sanger Institute, Hinxton, United Kingdom

    The New England journal of medicine 2020;383;26

  • High relatedness of invasive multi-drug resistant non-typhoidal Salmonella genotypes among patients and asymptomatic carriers in endemic informal settlements in Kenya.

    Kariuki S, Mbae C, Van Puyvelde S, Onsare R, Kavai S, Wairimu C, Ngetich R, Clemens J and Dougan G

    Centre for Microbiology Research, Kenya Medical Research Institute, Nairobi, Kenya.

    Invasive Non-typhoidal Salmonella (iNTS) disease is a major public health challenge, especially in Sub-Saharan Africa (SSA). In Kenya, mortality rates are high (20-25%) unless prompt treatment is instituted. The most common serotypes are Salmonella enterica serotype Typhimurium (S. Typhimurium) and Salmonella enterica serotype Enteritidis (S. Enteritidis). In a 5-year case-control study in children residing in the Mukuru informal settlement in Nairobi, Kenya, a total of 4201 blood cultures from suspected iNTS cases and 6326 fecal samples from age-matched controls were studied. From the laboratory cultures we obtained a total of 133 S. Typhimurium isolates of which 83 (62.4%) came from cases (53 blood and 30 fecal) and 50 (37.6%) from controls (fecal). A total of 120 S. Enteritidis consisted of 70 (58.3%) from cases (43 blood and 27 fecal) and 50 (41.7%) from controls (fecal). The S. Typhimurium population fell into two distinct ST19 lineages constituting 36.1%, as well as ST313 lineage I (27.8%) and ST313 lineage II (36.1%) isolates. The S. Enteritidis isolates fell into the global epidemic lineage (46.6%), the Central/Eastern African lineage (30.5%), a novel Kenyan-specific lineage (12.2%) and a phylogenetically outlier lineage (10.7%). Detailed phylogenetic analysis revealed a high level of relatedness between NTS from blood and stool originating from cases and controls, indicating a common source pool. Multidrug resistance was common throughout, with 8.5% of such isolates resistant to extended spectrum beta lactams. The high rate of asymptomatic carriage in the population is a concern for transmission to vulnerable individuals and this group could be targeted for vaccination if an iNTS vaccine becomes available.

    PLoS neglected tropical diseases 2020;14;8;e0008440

  • Red blood cell tension protects against severe malaria in the Dantu blood group.

    Kariuki SN, Marin-Menendez A, Introini V, Ravenhill BJ, Lin YC, Macharia A, Makale J, Tendwa M, Nyamu W, Kotar J, Carrasquilla M, Rowe JA, Rockett K, Kwiatkowski D, Weekes MP, Cicuta P, Williams TN and Rayner JC

    Department of Epidemiology and Demography, KEMRI-Wellcome Trust Research Programme, Kilifi, Kenya.

    Malaria has had a major effect on the human genome, with many protective polymorphisms, such as the sickle-cell trait, having been selected to high frequencies in malaria-endemic regions<sup>1,2</sup>. The blood group variant Dantu provides 74% protection against all forms of severe malaria in homozygous individuals<sup>3-5</sup>, a similar degree of protection to that afforded by the sickle-cell trait and considerably greater than that offered by the best malaria vaccine. Until now, however, the protective mechanism has been unknown. Here we demonstrate the effect of Dantu on the ability of the merozoite form of the malaria parasite Plasmodium falciparum to invade red blood cells (RBCs). We find that Dantu is associated with extensive changes to the repertoire of proteins found on the RBC surface, but, unexpectedly, inhibition of invasion does not correlate with specific RBC-parasite receptor-ligand interactions. By following invasion using video microscopy, we find a strong link between RBC tension and merozoite invasion, and identify a tension threshold above which invasion rarely occurs, even in non-Dantu RBCs. Dantu RBCs have higher average tension than non-Dantu RBCs, meaning that a greater proportion resist invasion. These findings provide both an explanation for the protective effect of Dantu, and fresh insight into why the efficiency of P. falciparum invasion might vary across the heterogeneous populations of RBCs found both within and between individuals.

    Nature 2020

  • The gene-rich genome of the scallop Pecten maximus.

    Kenny NJ, McCarthy SA, Dudchenko O, James K, Betteridge E, Corton C, Dolucan J, Mead D, Oliver K, Omer AD, Pelan S, Ryan Y, Sims Y, Skelton J, Smith M, Torrance J, Weisz D, Wipat A, Aiden EL, Howe K and Williams ST

    Natural History Museum, Department of Life Sciences, Cromwell Road, London SW7 5BD, UK.

    Background: The king scallop, Pecten maximus, is distributed in shallow waters along the Atlantic coast of Europe. It forms the basis of a valuable commercial fishery and plays a key role in coastal ecosystems and food webs. Like other filter feeding bivalves it can accumulate potent phytotoxins, to which it has evolved some immunity. The molecular origins of this immunity are of interest to evolutionary biologists, pharmaceutical companies, and fisheries management.

    Findings: Here we report the genome assembly of this species, conducted as part of the Wellcome Sanger 25 Genomes Project. This genome was assembled from PacBio reads and scaffolded with 10X Chromium and Hi-C data. Its 3,983 scaffolds have an N50 of 44.8 Mb (longest scaffold 60.1 Mb), with 92% of the assembly sequence contained in 19 scaffolds, corresponding to the 19 chromosomes found in this species. The total assembly spans 918.3 Mb and is the best-scaffolded marine bivalve genome published to date, exhibiting 95.5% recovery of the metazoan BUSCO set. Gene annotation resulted in 67,741 gene models. Analysis of gene content revealed large numbers of gene duplicates, as previously seen in bivalves, with little gene loss, in comparison with the sequenced genomes of other marine bivalve species.

    Conclusions: The genome assembly of P. maximus and its annotated gene set provide a high-quality platform for studies on such disparate topics as shell biomineralization, pigmentation, vision, and resistance to algal toxins. As a result of our findings we highlight the sodium channel gene Nav1, known to confer resistance to saxitoxin and tetrodotoxin, as a candidate for further studies investigating immunity to domoic acid.
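
    The scaffold N50 quoted in the Findings is a standard contiguity statistic: the length L such that scaffolds of length L or longer together cover at least half of the assembly. A minimal sketch of the calculation follows; the scaffold lengths are toy values chosen for illustration, not the P. maximus data, and the function name is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    summed length of all scaffolds.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (hypothetical, arbitrary units); the total is 200.
toy_scaffolds = [60, 45, 40, 30, 10, 5, 5, 5]
print(n50(toy_scaffolds))  # prints 45, since 60 + 45 = 105 >= 100
```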

    GigaScience 2020;9;5

  • Clustered CTCF binding is an evolutionary mechanism to maintain topologically associating domains.

    Kentepozidou E, Aitken SJ, Feig C, Stefflova K, Ibarra-Soria X, Odom DT, Roller M and Flicek P

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, CB10 1SD, UK.

    Background: CTCF binding contributes to the establishment of a higher-order genome structure by demarcating the boundaries of large-scale topologically associating domains (TADs). However, despite the importance and conservation of TADs, the role of CTCF binding in their evolution and stability remains elusive.

    Results: We carry out an experimental and computational study that exploits the natural genetic variation across five closely related species to assess how CTCF binding patterns stably fixed by evolution in each species contribute to the establishment and evolutionary dynamics of TAD boundaries. We perform CTCF ChIP-seq in multiple mouse species to create genome-wide binding profiles and associate them with TAD boundaries. Our analyses reveal that CTCF binding is maintained at TAD boundaries by a balance of selective constraints and dynamic evolutionary processes. Regardless of their conservation across species, CTCF binding sites at TAD boundaries are subject to stronger sequence and functional constraints compared to other CTCF sites. TAD boundaries frequently harbor dynamically evolving clusters containing both evolutionarily old and young CTCF sites as a result of the repeated acquisition of new species-specific sites close to conserved ones. The overwhelming majority of clustered CTCF sites colocalize with cohesin and are significantly closer to gene transcription start sites than nonclustered CTCF sites, suggesting that CTCF clusters particularly contribute to cohesin stabilization and transcriptional regulation.

    Conclusions: Dynamic conservation of CTCF site clusters is an apparently important feature of CTCF binding evolution that is critical to the functional stability of a higher-order chromatin structure.

    Funded by: Cancer Research UK: 20412; European Research Council: 615584; Wellcome Trust: WT106563/Z/14, WT108749/Z/15/Z, WT202878/B/16/Z, WT202878/Z/16/Z

    Genome biology 2020;21;1;5

  • Pitfalls of Applying Mouse Markers to Human Adrenal Medullary Cells.

    Kildisiute G, Young MD and Behjati S

    Wellcome Sanger Institute, Hinxton CB10 1SA, UK.

    Cancer cell 2020

  • Genome-wide methylation patterns predict clinical benefit of immunotherapy in lung cancer.

    Kim JY, Choi JK and Jung H

    Department of Bio and Brain Engineering, KAIST, Daejeon, 34141, Republic of Korea.

    Background: It is crucial to unravel molecular determinants of responses to immune checkpoint blockade (ICB) therapy because only a small subset of advanced non-small cell lung cancer (NSCLC) patients responds to ICB therapy. Previous studies were concentrated on genomic and transcriptomic markers (e.g., mutation burden and immune gene expression). However, these markers are not sufficient to accurately predict a response to ICB therapy.

    Results: Here, we analyzed DNA methylomes of 141 advanced NSCLC samples subjected to ICB therapy (i.e., anti-programmed death-1) from two independent cohorts (60 and 81 patients from our and IDIBELL cohorts). Integrative analysis of patients with matched transcriptome data in our cohort (n = 28) at pathway level revealed significant overlaps between promoter hypermethylation and transcriptional repression in nonresponders relative to responders. Fifteen immune-related pathways, including interferon signaling, were identified to be enriched for both hypermethylation and repression. We built a reliable prognostic risk model based on eight genes using LASSO model and successfully validated the model in independent cohorts. Furthermore, we found 30 survival-associated molecular interaction networks, in which two or three hypermethylated genes showed significant mutual exclusion across nonresponders.

    Conclusions: Our study demonstrates that methylation patterns can provide insight into molecular determinants underlying the clinical benefit of ICB therapy.

    Clinical epigenetics 2020;12;1;119

  • Spatio-temporal dynamics of Plasmodium falciparum transmission within a spatial unit on the Colombian Pacific Coast.

    Knudson A, González-Casabianca F, Feged-Rivadeneira A, Pedreros MF, Aponte S, Olaya A, Castillo CF, Mancilla E, Piamba-Dorado A, Sanchez-Pedraza R, Salazar-Terreros MJ, Lucchi N, Udhayakumar V, Jacob C, Pance A, Carrasquilla M, Apráez G, Angel JA, Rayner JC and Corredor V

    Departamento de Microbiología, Facultad de Medicina, Universidad Nacional de Colombia, Bogotá, Colombia.

    As malaria control programmes concentrate their efforts towards malaria elimination, a better understanding of malaria transmission patterns at fine spatial resolution units becomes necessary. Defining spatial units that consider transmission heterogeneity, human movement and migration will help to set up achievable malaria elimination milestones and guide the creation of efficient operational administrative control units. Using a combination of genetic and epidemiological data we defined a malaria transmission unit as the area contributing 95% of malaria cases diagnosed at the catchment facility located in the town of Guapi in the South Pacific Coast of Colombia. We provide data showing that P. falciparum malaria transmission is heterogeneous in time and space and analysed, using topological data analysis, the spatial connectivity, at the micro epidemiological level, between parasite populations circulating within the unit. To illustrate the necessity to evaluate the efficacy of malaria control measures within the transmission unit in order to increase the efficiency of the malaria control effort, we provide information on the size of the asymptomatic reservoir, the nature of parasite genotypes associated with drug resistance as well as the frequency of the Pfhrp2/3 deletion associated with false negatives when using Rapid Diagnostic Tests.

    Funded by: U.S. Department of Health & Human Services | Centers for Disease Control and Prevention (CDC): 2017-503; Wellcome Trust (Wellcome): 206194/Z/17/Z

    Scientific reports 2020;10;1;3756

  • Exome Sequencing for Prenatal Detection of Genetic Abnormalities in Fetal Ultrasound Anomalies: An Economic Evaluation.

    Kodabuckus SS, Quinlan-Jones E, McMullan DJ, Maher ER, Hurles ME, Barton PM and Kilby MD

    Health Economics Unit, Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, United Kingdom.

    Introduction: In light of the prospective Prenatal Assessment of Genomes and Exomes (PAGE) study, this paper aimed to determine the additional costs of using exome sequencing (ES) alongside or in place of chromosomal microarray (CMA) in a fetus with an identified congenital anomaly.

    Methods: A decision tree was populated using data from a prospective cohort of women undergoing invasive diagnostic testing. Four testing strategies were evaluated: CMA, ES, CMA followed by ES ("stepwise"); CMA and ES combined.

    Results: When ES is priced at GBP 2,100 (EUR 2,407/USD 2,694), performing ES alone prenatally would cost a further GBP 31,410 (EUR 36,001/USD 40,289) per additional genetic diagnosis, whereas the stepwise strategy would cost a further GBP 24,657 (EUR 28,261/USD 31,627) per additional genetic diagnosis. When ES is priced at GBP 966 (EUR 1,107/USD 1,239), performing ES alone prenatally would cost a further GBP 11,532 (EUR 13,217/USD 14,792) per additional genetic diagnosis, whereas the stepwise strategy would cost a further GBP 11,639 (EUR 13,340/USD 14,929) per additional genetic diagnosis. The sub-group analysis suggests that performing stepwise testing on cases indicative of multiple anomalies at ultrasound scan (USS), compared to cases indicative of a single anomaly, is more cost-effective than using ES alone.

    Discussion/conclusion: Performing ES alongside CMA is more cost-effective than ES alone, which can potentially lead to improvements in pregnancy management. The direct effects of test results on pregnancy outcomes were not examined; therefore, further research is recommended to examine changes on the projected incremental cost-effectiveness ratios.
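
    The "cost per additional genetic diagnosis" figures in the Results are incremental ratios: the extra cost of a strategy over a comparator, divided by the extra diagnoses it yields. A minimal sketch of that general calculation follows. All input values are made-up illustrations, not data from the study, and the function name is hypothetical; the paper's own decision-tree model is not reproduced here.

```python
def incremental_cost_per_diagnosis(cost_new, cost_ref, diagnoses_new, diagnoses_ref):
    """Return the extra cost per extra diagnosis of a new strategy
    relative to a reference strategy."""
    extra_diagnoses = diagnoses_new - diagnoses_ref
    if extra_diagnoses <= 0:
        raise ValueError("the new strategy must yield more diagnoses than the reference")
    return (cost_new - cost_ref) / extra_diagnoses


# Hypothetical cohort: the reference strategy costs 50,000 and yields
# 10 diagnoses; the new strategy costs 200,000 and yields 20 diagnoses.
ratio = incremental_cost_per_diagnosis(200_000, 50_000, 20, 10)
print(ratio)  # prints 15000.0
```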

    Fetal diagnosis and therapy 2020;47;7;554-564
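
    The Results of the entry above are reported as incremental costs per additional genetic diagnosis: the extra cost of a strategy over its comparator, divided by the extra diagnoses it yields. A minimal Python sketch of that arithmetic follows. The function name and the cohort totals are hypothetical illustrations and are not taken from the paper.

```python
def incremental_cost_per_diagnosis(cost_new, cost_ref, diagnoses_new, diagnoses_ref):
    """Extra cost of the new strategy per extra genetic diagnosis it yields."""
    extra_diagnoses = diagnoses_new - diagnoses_ref
    if extra_diagnoses <= 0:
        raise ValueError("the new strategy yields no additional diagnoses")
    return (cost_new - cost_ref) / extra_diagnoses


# Hypothetical cohort totals in GBP, chosen only to make the arithmetic visible.
cma_only = {"cost": 100_000.0, "diagnoses": 10}
stepwise = {"cost": 250_000.0, "diagnoses": 16}

icer = incremental_cost_per_diagnosis(
    stepwise["cost"], cma_only["cost"],
    stepwise["diagnoses"], cma_only["diagnoses"],
)
print(f"GBP {icer:,.0f} per additional genetic diagnosis")
```

    A lower value means each extra diagnosis is bought more cheaply, which is the sense in which the abstract compares the stepwise and ES-alone strategies.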

  • Mutational signatures: experimental design and analytical framework.

    Koh G, Zou X and Nik-Zainal S

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.

    Mutational signatures provide a powerful alternative for understanding the pathophysiology of cancer. Currently, experimental efforts aimed at validating and understanding the etiologies of cancer-derived mutational signatures are underway. In this review, we highlight key aspects of mutational signature experimental design and describe the analytical framework. We suggest guidelines and quality control measures for handling whole-genome sequencing data for mutational signature analyses and discuss pitfalls in interpretation. We envision that improved next-generation sequencing technologies and molecular cell biology approaches will usher in the next generation of studies into the etiologies and mechanisms of mutational patterns uncovered in cancers.

    Funded by: Cancer Research UK: C60100/A23916; Medical Research Council: Grant-in-Aid MRC Cancer Unit; Wellcome Trust: 4-year PhD Studentship (Sanger Institute)

    Genome biology 2020;21;1;37

  • Challenges and opportunities for the adoption of molecular diagnostics for anthelmintic resistance.

    Kotze AC, Gilleard JS, Doyle SR and Prichard RK

    CSIRO Agriculture and Food, St. Lucia, Brisbane, 4072, QLD, Australia.

    Anthelmintic resistance is a significant threat to livestock production systems worldwide and is emerging as an important issue in companion animal parasite management. It is also an emerging concern for the control of human soil-transmitted helminths and filaria. An important aspect of managing anthelmintic resistance is the ability to utilise diagnostic tests to detect its emergence at an early stage. In host-parasite systems where resistance is already widespread, diagnostics have a potentially important role in determining those drugs that remain the most effective. The development of molecular diagnostics for anthelmintic resistance is one focus of the Consortium for Anthelmintic Resistance and Susceptibility (CARS) group. The present paper reflects discussions of this issue that occurred at the most recent meeting of the group in Wisconsin, USA, in July 2019. We compare molecular resistance diagnostics with in vivo and in vitro phenotypic methods, and highlight the advantages and disadvantages of each. We assess whether our knowledge on the identity of molecular markers for resistance towards the different drug classes is sufficient to provide some expectation that molecular tests for field use may be available in the short-to-medium term. We describe some practical aspects of such tests and how our current capabilities compare to the requirements of an 'ideal' test. Finally, we describe examples of drug class/parasite species interactions that provide the best opportunity for commercial use of molecular tests in the near future. We argue that while such prototype tests may not satisfy the requirements of an 'ideal' test, their potential to provide significant advances over currently-used phenotypic methods warrants their development as field diagnostics.

    International journal for parasitology. Drugs and drug resistance 2020;14;264-273

  • halSynteny: a fast, easy-to-use conserved synteny block construction method for multiple whole-genome alignments.

    Krasheninnikova K, Diekhans M, Armstrong J, Dievskii A, Paten B and O'Brien S

    Computer Technologies Laboratory, School of Translational Information Technologies, ITMO University, 49 Kronverkskiy Pr., St. Petersburg 197101, St. Petersburg, Russian Federation.

    Background: Large-scale sequencing projects provide high-quality full-genome data that can be used for reconstruction of chromosomal exchanges and rearrangements that disrupt conserved syntenic blocks. The highest resolution of cross-species homology can be obtained on the basis of whole-genome, reference-free alignments. Very large multiple alignments of full-genome sequence stored in a binary format demand an accurate and efficient computational approach for synteny block production.

    Findings: halSynteny performs efficient processing of pairwise alignment blocks for any pair of genomes in the alignment. The tool is part of the HAL comparative genomics suite and is targeted to build synteny blocks for multi-hundred-way, reference-free vertebrate alignments built with the Cactus system.

    Conclusions: halSynteny enables an accurate and rapid identification of synteny in multiple full-genome alignments. The method is implemented in C++11 as a component of the halTools software and released under MIT license. The package is available at

    GigaScience 2020;9;6

  • Diversification in immunogenicity genes caused by selective pressures in invasive meningococci.

    Kremer PHC, Lees JA, Ferwerda B, Bijlsma MW, MacAlasdair N, van der Ende A, Brouwer MC, Bentley SD and van de Beek D

    Amsterdam UMC, University of Amsterdam, Department of Neurology, Amsterdam Neuroscience, Amsterdam, The Netherlands.

    We studied population genomics of 486 <i>Neisseria meningitidis</i> isolates causing meningitis in the Netherlands during the period 1979-2003 and 2006-2013 using whole-genome sequencing to evaluate the impact of a hyperendemic period of serogroup B invasive disease. The majority of serogroup B isolates belonged to ST-41/44 (41 %) and ST-32 complex (16 %). Comparing the time periods, before and after the decline of serogroup B invasive disease, there was a decrease of ST-41/44 complex sequences (<i>P</i>=0.002). We observed the expansion of a sub-lineage within ST-41/44 complex sequences being associated with isolation from the 1979-2003 time period (<i>P</i>=0.014). Isolates belonging to this sub-lineage expansion within ST-41/44 complex were marked by four antigen allele variants. Presence of these allele variants was associated with isolation from the 1979-2003 time period after correction for multiple testing (Wald test, <i>P</i>=0.0043 for FetA 1-5; <i>P</i>=0.0035 for FHbp 14; <i>P</i>=0.012 for PorA 7-2.4 and <i>P</i>=0.0031 for NHBA two peptide allele). These sequences were associated with 4CMenB vaccine coverage (Fisher's exact test, <i>P</i><0.001). Outside of the sub-lineage expansion, isolates with markedly lower levels of predicted vaccine coverage clustered in phylogenetic groups showing a trend towards isolation in the 2006-2013 time period (<i>P</i>=0.08). In conclusion, we show the emergence and decline of a sub-lineage expansion within ST-41/44 complex isolates concurrent with a hyperendemic period in meningococcal meningitis. The expansion was marked by specific antigen peptide allele combinations. We observed preliminary evidence for decreasing 4CMenB vaccine coverage in the post-hyperendemic period.

    Microbial genomics 2020

  • Genetic Variation in Neisseria meningitidis Does Not Influence Disease Severity in Meningococcal Meningitis.

    Kremer PHC, Lees JA, Ferwerda B, van der Ende A, Brouwer MC, Bentley SD and van de Beek D

    Department of Neurology, Amsterdam Neuroscience, Amsterdam University Medical Center, University of Amsterdam, Amsterdam, Netherlands.

    <i>Neisseria meningitidis</i> causes sepsis and meningitis in humans. It has been suggested that pathogen genetic variation determines variance in disease severity. Here we report results of a genome-wide association study of 486 <i>N. meningitidis</i> genomes from meningococcal meningitis patients and their association with disease severity. Of 369 meningococcal meningitis patients for whom clinical data was available, 44 (12%) had unfavorable outcome and 24 (7%) died. To increase power, thrombocyte count was used as proxy marker for disease severity. Bacterial genetic variants were called as k-mers, SNPs, insertions and deletions and clusters of orthologous genes (COGs). Population-level meningococcal genetic variation did not explain variance in disease severity (unfavorable outcome or thrombocyte count) in this cohort (h<sup>2</sup> = 0.0%; 95% confidence interval: 0.0-0.9). Genetic variants in the bacterial <i>uppS</i> gene represented the top signal associated with thrombocyte count (<i>p</i>-value = 9.96e-07) but this did not reach statistical significance. We did not find an association between previously published variants in <i>lpxL1, fHbp</i>, and <i>tps</i> genes and unfavorable outcome or thrombocyte count. A power analysis using phenotypes simulated from real genetic data from 880 <i>N. meningitidis</i> genomes showed that we would be able to detect a continuous phenotype with h<sup>2</sup> ≥ 0.5 with the population size available in this study. This rules out a major contribution of pathogen genetic variation to disease severity in meningococcal meningitis, and shows that much larger sample sizes are required to find specific low-effect genetic variants modulating disease outcome in meningococcal meningitis.

    Frontiers in medicine 2020;7;594769

  • The prevalence and implications of single nucleotide polymorphisms in genes encoding the RNA polymerase of clinical isolates of Staphylococcus aureus.

    Krishna A, Liu B, Peacock SJ and Wigneshweraraj S

    MRC Centre for Molecular Bacteriology and Infection, Imperial College London, London, UK.

    Central to the regulation of bacterial gene expression is the multisubunit enzyme RNA polymerase (RNAP), which is responsible for catalyzing transcription. As all adaptive processes are underpinned by changes in gene expression, the RNAP can be considered the major mediator of any adaptive response in the bacterial cell. In bacterial pathogens, theoretically, single nucleotide polymorphisms (SNPs) in genes that encode subunits of the RNAP and associated factors could mediate adaptation and confer a selective advantage to cope with biotic and abiotic stresses. We investigated this possibility by undertaking a systematic survey of SNPs in genes encoding the RNAP and associated factors in a collection of 1,429 methicillin-resistant Staphylococcus aureus (MRSA) clinical isolates. We present evidence for the existence of several, hitherto unreported, nonsynonymous SNPs in genes encoding the RNAP and associated factors of MRSA ST22 clinical isolates and propose that the acquisition of amino acid substitutions in the RNAP could represent an adaptive strategy that contributes to the pathogenic success of MRSA.

    Funded by: Medical Research Council: G1000803; Wellcome Trust: WT100958MA

    MicrobiologyOpen 2020;9;7;e1058

  • Nasal microbiome research in ANCA-associated vasculitis: Strengths, limitations, and future directions.

    Kronbichler A, Harrison EM and Wagner J

    Department of Internal Medicine IV (Nephrology and Hypertension), Medical University Innsbruck, Innsbruck, Austria.

    The human nasal microbiome is characterized by biodiversity and undergoes changes during the span of life. In granulomatosis with polyangiitis (GPA), the persistent nasal colonization by <i>Staphylococcus aureus</i> (<i>S. aureus</i>) assessed by culture-based detection methods has been associated with increased relapse frequency. Different research groups have characterized the nasal microbiome in patients with GPA and found that patients have a distinct nasal microbiome compared to controls, but the reported results between studies differed. In order to increase comparability, there is a need to standardize patient selection, sample preparation, and analytical methodology, particularly as low biomass samples like those obtained by nasal swabbing are impacted by reagent contamination. Optimization in obtaining a sample and processing with the inclusion of critical controls is needed for consistent comparative studies. Ongoing studies will analyze the nasal microbiome in GPA longitudinally, and the results will inform whether targeted antimicrobial management should be pursued in a clinical trial. This review focuses on the proposed role of <i>S. aureus</i> in GPA, the (healthy) nasal microbiome, findings in the first pilot studies in GPA, and will discuss future strategies.

    Computational and structural biotechnology journal 2020;19;415-423

  • Evolution and lineage dynamics of a transmissible cancer in Tasmanian devils.

    Kwon YM, Gori K, Park N, Potts N, Swift K, Wang J, Stammnitz MR, Cannell N, Baez-Ortega A, Comte S, Fox S, Harmsen C, Huxtable S, Jones M, Kreiss A, Lawrence C, Lazenby B, Peck S, Pye R, Woods G, Zimmermann M, Wedge DC, Pemberton D, Stratton MR, Hamede R and Murchison EP

    Transmissible Cancer Group, Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom.

    Devil facial tumour 1 (DFT1) is a transmissible cancer clone endangering the Tasmanian devil. The expansion of DFT1 across Tasmania has been documented, but little is known of its evolutionary history. We analysed genomes of 648 DFT1 tumours collected throughout the disease range between 2003 and 2018. DFT1 diverged early into five clades, three spreading widely and two failing to persist. One clade has replaced others at several sites, and rates of DFT1 coinfection are high. DFT1 gradually accumulates copy number variants (CNVs), and its telomere lengths are short but constant. Recurrent CNVs reveal genes under positive selection, sites of genome instability, and repeated loss of a small derived chromosome. Cultured DFT1 cell lines have increased CNV frequency and undergo highly reproducible convergent evolution. Overall, DFT1 is a remarkably stable lineage whose genome illustrates how cancer cells adapt to diverse environments and persist in a parasitic niche.

    PLoS biology 2020;18;11;e3000926

  • Distinct microbial and immune niches of the human colon.

    Kylie R. James, Tomas Gomes, Rasa Elmentaite, Nitin Kumar, Emily L. Gulliver, Hamish W. King, Mark D. Stares, Bethany R. Bareham, John R. Ferdinand, Velislava N. Petrova, Krzysztof Polański, Samuel C. Forster, Lorna B. Jarvis, Ondrej Suchanek, Sarah Howlett, Louisa K. James, Joanne L. Jones, Kerstin B. Meyer, Menna R. Clatworthy, Kourosh Saeb-Parsy, Trevor D. Lawley and Sarah A. Teichmann

    Gastrointestinal microbiota and immune cells interact closely and display regional specificity; however, little is known about how these communities differ with location. Here, we simultaneously assess microbiota and single immune cells across the healthy, adult human colon, with paired characterization of immune cells in the mesenteric lymph nodes, to delineate colonic immune niches at steady state. We describe distinct helper T cell activation and migration profiles along the colon and characterize the transcriptional adaptation trajectory of regulatory T cells between lymphoid tissue and colon. Finally, we show increasing B cell accumulation, clonal expansion and mutational frequency from the cecum to the sigmoid colon and link this to the increasing number of reactive bacterial species. The gut microbiota and their proximate immune cells engage in a dialog of reciprocal regulation. James and colleagues describe how immune cell and microbiotal populations vary along the length of the human colon.

    Nature Immunology 2020;21;3;343

  • Eleven grand challenges in single-cell data science.

    Lähnemann D, Köster J, Szczurek E, McCarthy DJ, Hicks SC, Robinson MD, Vallejos CA, Campbell KR, Beerenwinkel N, Mahfouz A, Pinello L, Skums P, Stamatakis A, Attolini CS, Aparicio S, Baaijens J, Balvert M, Barbanson B, Cappuccio A, Corleone G, Dutilh BE, Florescu M, Guryev V, Holmer R, Jahn K, Lobo TJ, Keizer EM, Khatri I, Kielbasa SM, Korbel JO, Kozlov AM, Kuo TH, Lelieveldt BPF, Mandoiu II, Marioni JC, Marschall T, Mölder F, Niknejad A, Raczkowski L, Reinders M, Ridder J, Saliba AE, Somarakis A, Stegle O, Theis FJ, Yang H, Zelikovsky A, McHardy AC, Raphael BJ, Shah SP and Schönhuth A

    Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany.

    The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands, or even millions, of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years.

    Funded by: NHGRI NIH HHS: R01 HG007069

    Genome biology 2020;21;1;31

  • Targeted sequencing in DLBCL, molecular subtypes, and outcomes: a Haematological Malignancy Research Network report.

    Lacy SE, Barrans SL, Beer PA, Painter D, Smith AG, Roman E, Cooke SL, Ruiz C, Glover P, Van Hoppe SJL, Webster N, Campbell PJ, Tooze RM, Patmore R, Burton C, Crouch S and Hodson DJ

    Epidemiology and Cancer Statistics Group, Department of Health Sciences, University of York, York, United Kingdom.

    Based on the profile of genetic alterations occurring in tumor samples from selected diffuse large B-cell lymphoma (DLBCL) patients, 2 recent whole-exome sequencing studies proposed partially overlapping classification systems. Using clustering techniques applied to targeted sequencing data derived from a large unselected population-based patient cohort with full clinical follow-up (n = 928), we investigated whether molecular subtypes can be robustly identified using methods potentially applicable in routine clinical practice. DNA extracted from DLBCL tumors diagnosed in patients residing in a catchment population of ∼4 million (14 centers) was sequenced with a targeted 293-gene hematological-malignancy panel. Bernoulli mixture-model clustering was applied and the resulting subtypes analyzed in relation to their clinical characteristics and outcomes. Five molecular subtypes were resolved, termed MYD88, BCL2, SOCS1/SGK1, TET2/SGK1, and NOTCH2, along with an unclassified group. The subtypes characterized by genetic alterations of BCL2, NOTCH2, and MYD88 recapitulated recent studies showing good, intermediate, and poor prognosis, respectively. The SOCS1/SGK1 subtype showed biological overlap with primary mediastinal B-cell lymphoma and conferred excellent prognosis. Although not identified as a distinct cluster, NOTCH1 mutation was associated with poor prognosis. The impact of TP53 mutation varied with genomic subtypes, conferring no effect in the NOTCH2 subtype and poor prognosis in the MYD88 subtype. Our findings confirm the existence of molecular subtypes of DLBCL, providing evidence that genomic tests have prognostic significance in non-selected DLBCL patients. The identification of both good and poor risk subtypes in patients treated with R-CHOP (rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone) clearly shows the clinical value of the approach, confirming the need for a consensus classification.

    Funded by: Wellcome Trust

    Blood 2020;135;20;1759-1771
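
    The entry above names Bernoulli mixture-model clustering as its method. As a rough illustration of the general technique only, the Python sketch below fits a mixture of independent Bernoulli distributions to a binary "mutated / not mutated" matrix with expectation-maximisation. The function name, the smoothing constants, the fixed iteration count and the toy matrix are all assumptions made for this example; the study's actual pipeline, model selection and data are not reproduced here.

```python
import math
import random


def fit_bernoulli_mixture(data, n_clusters, n_iter=50, seed=0):
    """Fit a Bernoulli mixture to binary rows with EM.

    Returns (weights, probs, responsibilities); the function and its
    defaults are hypothetical choices for this sketch.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    weights = [1.0 / n_clusters] * n_clusters
    # Random start away from 0 and 1 so that every log() below is defined.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(n_clusters)]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each row.
        resp = []
        for row in data:
            log_p = []
            for k in range(n_clusters):
                ll = math.log(weights[k])
                for x, p in zip(row, probs[k]):
                    ll += math.log(p if x else 1.0 - p)
                log_p.append(ll)
            top = max(log_p)
            unnorm = [math.exp(v - top) for v in log_p]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: re-estimate mixing weights and per-feature probabilities.
        for k in range(n_clusters):
            mass = sum(r[k] for r in resp)
            weights[k] = max(mass / n, 1e-9)
            for j in range(d):
                hits = sum(r[k] * row[j] for r, row in zip(resp, data))
                # Light smoothing keeps each probability strictly inside (0, 1).
                probs[k][j] = (hits + 1e-3) / (mass + 2e-3)
    return weights, probs, resp


# Toy matrix: rows are hypothetical tumours, columns are hypothetical genes.
toy = [
    [1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0],
    [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1],
]
weights, probs, resp = fit_bernoulli_mixture(toy, n_clusters=2)
labels = [max(range(2), key=lambda k: r[k]) for r in resp]
print(labels)
```

    Cluster numbering is arbitrary, so only the grouping of rows is meaningful, not which group receives label 0.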

  • Nucleotide diversity of functionally different groups of immune response genes in Old World camels based on newly annotated and reference-guided assemblies.

    Lado S, Elbers JP, Rogers MF, Melo-Ferreira J, Yadamsuren A, Corander J, Horin P and Burger PA

    Department of Interdisciplinary Life Sciences, Research Institute of Wildlife Ecology, Vetmeduni Vienna, Vienna, Austria.

    Background: Immune-response (IR) genes have an important role in the defense against highly variable pathogens, and therefore, diversity in these genomic regions is essential for species' survival and adaptation. Although current genome assemblies from Old World camelids are very useful for investigating genome-wide diversity, demography and population structure, they have inconsistencies and gaps that limit analyses at local genomic scales. Improved and more accurate genome assemblies and annotations are needed to study complex genomic regions like adaptive and innate IR genes.

    Results: In this work, we improved the genome assemblies of the three Old World camel species - domestic dromedary and Bactrian camel, and the two-humped wild camel - via different computational methods. The newly annotated dromedary genome assembly CamDro3 served as reference to scaffold the NCBI RefSeq genomes of domestic Bactrian and wild camels. These upgraded assemblies were then used to assess nucleotide diversity of IR genes within and between species, and to compare the diversity found in immune genes and the rest of the genes in the genome. We detected differences in the nucleotide diversity among the three Old World camelid species and between IR gene groups, i.e., innate versus adaptive. Among the three species, domestic Bactrian camels showed the highest mean nucleotide diversity. Among the functionally different IR gene groups, the highest mean nucleotide diversity was observed in the major histocompatibility complex.

    Conclusions: The new camel genome assemblies were greatly improved in terms of contiguity and increased size with fewer scaffolds, which is of general value for the scientific community. This allowed us to perform in-depth studies on genetic diversity in immunity-related regions of the genome. Our results suggest that differences of diversity across classes of genes appear compatible with a combined role of population history and differential exposures to pathogens, and consequent different selective pressures.

    Funded by: Austrian Science Fund: P29623-B25; Fundação para a Ciência e a Tecnologia: CEECIND/00372/2018

    BMC genomics 2020;21;1;606
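
    The entry above compares nucleotide diversity between gene groups and species. A common textbook definition of nucleotide diversity is the average number of pairwise differences per site among aligned sequences, and the Python sketch below computes exactly that. It is an illustration under simplifying assumptions (equal-length, gap-free sequences, no handling of missing data or sample-size correction); the function name and the example sequences are hypothetical, and this is not the authors' pipeline.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean pairwise differences per site across aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(sequences, 2))
    # Count mismatching positions over every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical aligned haplotypes, invented for this example.
haplotypes = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]
pi = nucleotide_diversity(haplotypes)
print(round(pi, 4))
```

    Computing the same quantity separately over, for example, innate and adaptive immune-response gene regions would give the kind of per-group comparison the abstract describes.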

  • Schistosoma mansoni Vector Snails in Antigua and Montserrat, with Snail-Related Considerations Pertinent to a Declaration of Elimination of Human Schistosomiasis.

    Laidemitt MR, Buddenborg SK, Lewis LL, Michael LE, Sanchez MJ, Hewitt R and Loker ES

    Center for Evolutionary and Theoretical Immunology, Parasite Division, Museum of Southwestern Biology, Department of Biology, University of New Mexico, Albuquerque, New Mexico.

    Investigations leading to a WHO-validated declaration of elimination of schistosomiasis transmission are contemplated for several countries, including Caribbean island nations. With assistance from the Pan American Health Organization, we undertook freshwater snail surveys in two such nations, Antigua and Barbuda, and Montserrat in September and October 2017. Historically, the transmission of <i>Schistosoma mansoni</i> supported by the Neotropical vector snail <i>Biomphalaria glabrata</i> occurred in both countries. Transmission on the islands is thought to have been interrupted by the treatment of infected people, improved sanitation, introduction of competitor snails, and on Montserrat with the eruption of the Soufrière volcano which decimated known <i>B. glabrata</i> habitats. Guided by the available literature and local expertise, we found <i>Biomphalaria</i> snails in seven of 15 and one of 14 localities on Antigua and Montserrat, respectively, most of which were identified anatomically and molecularly as <i>Biomphalaria kuhniana</i>. Two localities on Antigua harbored <i>B. glabrata</i>, but no schistosome infections in snails were found. For snail-related aspects of validation of elimination, there are needs to undertake basic local training in medical malacology, be guided by historical literature and recent human schistosomiasis surveys, improve and validate sampling protocols for aquatic habitats, enlist local expertise to efficiently find potential transmission sites, use both anatomical and molecular identifications of schistosomes or putative vector snail species found, if possible determine the susceptibility of recovered <i>Biomphalaria</i> spp. to <i>S. mansoni</i>, publish survey results, and provide museum vouchers of collected snails and parasites as part of the historical record.

    The American journal of tropical medicine and hygiene 2020

  • Gene family information facilitates variant interpretation and identification of disease-associated genes in neurodevelopmental disorders.

    Lal D, May P, Perez-Palma E, Samocha KE, Kosmicki JA, Robinson EB, Møller RS, Krause R, Nürnberg P, Weckhuysen S, De Jonghe P, Guerrini R, Niestroj LM, Du J, Marini C, EuroEPINOMICS-RES Consortium, Ware JS, Kurki M, Gormley P, Tang S, Wu S, Biskup S, Poduri A, Neubauer BA, Koeleman BPC, Helbig KL, Weber YG, Helbig I, Majithia AR, Palotie A and Daly MJ

    Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH, USA.

    Background: Classifying pathogenicity of missense variants represents a major challenge in clinical practice during the diagnosis of rare and genetically heterogeneous neurodevelopmental disorders (NDDs). While orthologous gene conservation is commonly employed in variant annotation, approximately 80% of known disease-associated genes belong to gene families. The use of gene family information for disease gene discovery and variant interpretation has not yet been investigated on a genome-wide scale. We empirically evaluate whether paralog-conserved or non-conserved sites in human gene families are important in NDDs.

    Methods: Gene family information was collected from Ensembl. Paralog-conserved sites were defined based on paralog sequence alignments; 10,068 NDD patients and 2,078 controls were statistically evaluated for de novo variant burden in gene families.

    Results: We demonstrate that disease-associated missense variants are enriched at paralog-conserved sites across all disease groups and inheritance models tested. We developed a gene family de novo enrichment framework that identified 43 exome-wide enriched gene families including 98 de novo variant carrying genes in NDD patients of which 28 represent novel candidate genes for NDD which are brain expressed and under evolutionary constraint.

    Conclusion: This study represents the first method to incorporate gene family information into a statistical framework to interpret variant data for NDDs and to discover new NDD-associated genes.

    Genome medicine 2020;12;1;28

  • Subdividing Y-chromosome haplogroup R1a1 reveals Norse Viking dispersal lineages in Britain.

    Lall GM, Larmuseau MHD, Wetton JH, Batini C, Hallast P, Huszar TI, Zadik D, Aase S, Baker T, Balaresque P, Bodmer W, Børglum AD, de Knijff P, Dunn H, Harding SE, Løvvik H, Dupuy BM, Pamjav H, Tillmar AO, Tomaszewski M, Tyler-Smith C, Verdugo MP, Winney B, Vohra P, Story J, King TE and Jobling MA

    Department of Genetics & Genome Biology, University of Leicester, Leicester, UK.

    The influence of Viking-Age migrants to the British Isles is obvious in archaeological and place-names evidence, but their demographic impact has been unclear. Autosomal genetic analyses support Norse Viking contributions to parts of Britain, but show no signal corresponding to the Danelaw, the region under Scandinavian administrative control from the ninth to eleventh centuries. Y-chromosome haplogroup R1a1 has been considered as a possible marker for Viking migrations because of its high frequency in peninsular Scandinavia (Norway and Sweden). Here we select ten Y-SNPs to discriminate informatively among hg R1a1 sub-haplogroups in Europe, analyse these in 619 hg R1a1 Y chromosomes including 163 from the British Isles, and also type 23 short-tandem repeats (Y-STRs) to assess internal diversity. We find three specifically Western-European sub-haplogroups, two of which predominate in Norway and Sweden, and are also found in Britain; star-like features in the STR networks of these lineages indicate histories of expansion. We ask whether geographical distributions of hg R1a1 overall, and of the two sub-lineages in particular, correlate with regions of Scandinavian influence within Britain. Neither shows any frequency difference between regions that have higher (≥10%) or lower autosomal contributions from Norway and Sweden, but both are significantly overrepresented in the region corresponding to the Danelaw. These differences between autosomal and Y-chromosomal histories suggest either male-specific contribution, or the influence of patrilocality. Comparison of modern DNA with recently available ancient DNA data supports the interpretation that two sub-lineages of hg R1a1 spread with the Vikings from peninsular Scandinavia.

    Funded by: British Heart Foundation (BHF): PG/16/49/32176; Leverhulme Trust: F/00 212/AM; Wellcome Trust (Wellcome): 057559, 072974, 084060, 087576, 088262, 098051

    European journal of human genetics : EJHG 2020

  • Atypical, milder presentation in a child with CC2D2A and KIDINS220 variants.

    Lam Z, Albaba S, Study D and Balasubramanian M

    Yorkshire Regional Genetics Service, Leeds Teaching Hospitals NHS Trust, Leeds.

    With the increasing availability and clinical use of exome and whole-genome sequencing, reverse phenotyping is now becoming common practice in clinical genetics. Here, we report a patient identified through the Wellcome Trust Deciphering Developmental Disorders study who has homozygous pathogenic variants in CC2D2A and a de-novo heterozygous pathogenic variant in KIDINS220. He presents with developmental delay, intellectual disability, and oculomotor apraxia. Reverse phenotyping has demonstrated that he likely has a composite phenotype with contributions from both variants. The patient is much more mildly affected than those with Joubert Syndrome or Spastic paraplegia, intellectual disability, nystagmus, and obesity, the conditions associated with CC2D2A and KIDINS220 respectively, and therefore, contributes to the phenotypic variability associated with the two conditions.

    Clinical dysmorphology 2020;29;1;10-16

  • TMEM95 is a sperm membrane protein essential for mammalian fertilization.

    Lamas-Toranzo I, Hamze JG, Bianchi E, Fernández-Fuertes B, Pérez-Cerezales S, Laguna-Barraza R, Fernández-González R, Lonergan P, Gutiérrez-Adán A, Wright GJ, Jiménez-Movilla M and Bermejo-Álvarez P

    Animal Reproduction Department, INIA, Madrid, Spain.

    The fusion of gamete membranes during fertilization is an essential process for sexual reproduction. Despite its importance, only three proteins are known to be indispensable for sperm-egg membrane fusion: the sperm proteins IZUMO1 and SPACA6, and the egg protein JUNO. Here we demonstrate that another sperm protein, TMEM95, is necessary for sperm-egg interaction. TMEM95 ablation in mice caused complete male-specific infertility. Sperm lacking this protein were morphologically normal, exhibited normal motility, and could penetrate the zona pellucida and bind to the oolemma. However, once bound to the oolemma, TMEM95-deficient sperm were unable to fuse with the egg membrane or penetrate into the ooplasm, and fertilization could only be achieved by mechanical injection of one sperm into the ooplasm, thereby bypassing membrane fusion. These data demonstrate that TMEM95 is essential for mammalian fertilization.

    Funded by: Department of Agriculture, Food and the Marine: 11/S/104; Fundación Séneca-Agencia de Ciencia y Tecnología de Murcia: 20887/PI/18; Fundación Séneca: 20887/PI/18; H2020 European Research Council: StG 757886-ELONGAN; Medical Research Council: MR/M012468/1; Ministerio de Economía y Competitividad: AGL2014-58739-R, AGL2015-70159-P, AGL2016-71890-REDT, AGL2017-84908-R, RTI2018-093548-B-I00, RYC-2012-10193

    eLife 2020;9

  • Pan-active imidazolopiperazine antimalarials target the Plasmodium falciparum intracellular secretory pathway.

    LaMonte GM, Rocamora F, Marapana DS, Gnädig NF, Ottilie S, Luth MR, Worgall TS, Goldgof GM, Mohunlal R, Santha Kumar TR, Thompson JK, Vigil E, Yang J, Hutson D, Johnson T, Huang J, Williams RM, Zou BY, Cheung AL, Kumar P, Egan TJ, Lee MCS, Siegel D, Cowman AF, Fidock DA and Winzeler EA

    Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, 92093, USA.

    A promising new compound class for treating human malaria is the imidazolopiperazines (IZP) class. IZP compounds KAF156 (Ganaplacide) and GNF179 are effective against Plasmodium symptomatic asexual blood-stage infections, and are able to prevent transmission and block infection in animal models. But despite the identification of resistance mechanisms in P. falciparum, the mode of action of IZPs remains unknown. To investigate, we here combine in vitro evolution and genome analysis in Saccharomyces cerevisiae with molecular, metabolomic, and chemogenomic methods in P. falciparum. Our findings reveal that IZP-resistant S. cerevisiae clones carry mutations in genes involved in Endoplasmic Reticulum (ER)-based lipid homeostasis and autophagy. In Plasmodium, IZPs inhibit protein trafficking, block the establishment of new permeation pathways, and cause ER expansion. Our data highlight a mechanism for blocking parasite development that is distinct from those of standard compounds used to treat malaria, and demonstrate the potential of IZPs for studying ER-dependent protein processing.

    Funded by: Bill and Melinda Gates Foundation (Bill & Melinda Gates Foundation): OPP1054480, OPP1171497

    Nature communications 2020;11;1;1780

  • Multiple GYPB gene deletions associated with the U- phenotype in those of African ancestry.

    Lane WJ, Gleadall NS, Aeschlimann J, Vege S, Sanchis-Juan A, Stephens J, Sullivan JC, Mah HH, Aguad M, Smeland-Wagman R, Lebo MS, Vijay Kumar PK, Kaufman RM, Green RC, Ouwehand WH and Westhoff CM

    Department of Pathology, Brigham and Women's Hospital, Boston, Massachusetts.

    Background: The MNS blood group system is defined by three homologous genes: GYPA, GYPB, and GYPE. GYPB encodes for glycophorin B (GPB) carrying S/s and the "universal" antigen U. RBCs of approximately 1% of individuals of African ancestry are U- due to absence of GPB. The U- phenotype has long been attributed to a deletion encompassing GYPB exons 2 to 5 and GYPE exon 1 (GYPB*01N).

    Study design and methods: Samples from two U- individuals underwent Illumina short read whole genome sequencing (WGS) and Nanopore long read WGS. In addition, two existing WGS datasets, MedSeq (n = 110) and 1000 Genomes (1000G, n = 2535), were analyzed for GYPB deletions. Deletions were confirmed by Sanger sequencing. Twenty known U- donor samples were tested by a PCR assay to determine the specific deletion alleles present in African Americans.

    Results: Two large GYPB deletions in U- samples of African ancestry were identified: a 110 kb deletion extending left of GYPB (DEL_B_LEFT) and a 103 kb deletion extending right (DEL_B_RIGHT). DEL_B_LEFT and DEL_B_RIGHT were the most common GYPB deletions in the 1000 Genomes Project 669 African genomes (allele frequencies 0.04 and 0.02). Seven additional deletions involving GYPB were seen in African, Admixed American, and South Asian samples. No samples analyzed had GYPB*01N.

    Conclusions: The U- phenotype in those of African ancestry is primarily associated with two different complete deletions of GYPB (with intact GYPE). Seven additional less common GYPB deletion backgrounds were found. GYPB*01N, long assumed to be the allele commonly encoding U- phenotypes, appears to be rare.

    Funded by: Department of Defense; Doris Duke Charitable Foundation; NHGRI NIH HHS: U01-HG006500; NHS Blood and Transplant; NIH HHS; National Institute for Health Research

    Transfusion 2020

  • Analysis pipelines for cancer genome sequencing in mice.

    Lange S, Engleitner T, Mueller S, Maresch R, Zwiebel M, González-Silva L, Schneider G, Banerjee R, Yang F, Vassiliou GS, Friedrich MJ, Saur D, Varela I and Rad R

    Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, Munich, Germany.

    Mouse models of human cancer have transformed our ability to link genetics, molecular mechanisms and phenotypes. Both reverse and forward genetics in mice are currently gaining momentum through advances in next-generation sequencing (NGS). Methodologies to analyze sequencing data were, however, developed for humans and hence do not account for species-specific differences in genome structures and experimental setups. Here, we describe standardized computational pipelines specifically tailored to the analysis of mouse genomic data. We present novel tools and workflows for the detection of different alteration types, including single-nucleotide variants (SNVs), small insertions and deletions (indels), copy-number variations (CNVs), loss of heterozygosity (LOH) and complex rearrangements, such as in chromothripsis. Workflows have been extensively validated and cross-compared using multiple methodologies. We also give step-by-step guidance on the execution of individual analysis types, provide advice on data interpretation and make the complete code available online. The protocol takes 2-7 d, depending on the desired analyses.

    Funded by: Deutsche Forschungsgemeinschaft (German Research Foundation): RA1629/2-1, SFB1243, SFB1321, SFB1335; Deutsche Krebshilfe (German Cancer Aid): 70112480; EC | EU Framework Programme for Research and Innovation H2020 | H2020 Priority Excellent Science | H2020 Marie Skłodowska-Curie Actions (H2020 Excellent Science - Marie Skłodowska-Curie Actions): PRECODE

    Nature protocols 2020;15;2;266-315

  • VarSite: Disease variants and protein structure.

    Laskowski RA, Stephenson JD, Sillitoe I, Orengo CA and Thornton JM

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK.

    VarSite is a web server mapping known disease-associated variants from UniProt and ClinVar, together with natural variants from gnomAD, onto protein 3D structures in the Protein Data Bank. The analyses are primarily image-based and provide both an overview for each human protein, as well as a report for any specific variant of interest. The information can be useful in assessing whether a given variant might be pathogenic or benign. The structural annotations for each position in the protein include protein secondary structure, interactions with ligand, metal, DNA/RNA, or other protein, and various measures of a given variant's possible impact on the protein's function. The 3D locations of the disease-associated variants can be viewed interactively via the 3dmol.js JavaScript viewer, as well as in RasMol and PyMOL. Users can search for specific variants, or sets of variants, by providing the DNA coordinates of the base change(s) of interest. Additionally, various agglomerative analyses are given, such as the mapping of disease and natural variants onto specific Pfam or CATH domains. The server is freely accessible to all at:

    Protein science : a publication of the Protein Society 2020;29;1;111-119

  • Whole genome sequencing of Herpes Simplex Virus 1 directly from human cerebrospinal fluid reveals selective constraints in neurotropic viruses.

    Lassalle F, Beale MA, Bharucha T, Williams CA, Williams RJ, Cudini J, Goldstein R, Haque T, Depledge DP and Breuer J

    Department of Infectious Disease Epidemiology, Imperial College London, St-Mary's Hospital campus, Praed Street, London W2 1NY, UK.

    Herpes Simplex Virus type 1 (HSV-1) chronically infects over 70 per cent of the global population. Clinical manifestations are largely restricted to recurrent epidermal vesicles. However, HSV-1 also leads to encephalitis, the infection of the brain parenchyma, with high associated rates of mortality and morbidity. In this study, we performed target enrichment followed by direct sequencing of HSV-1 genomes from the cerebrospinal fluid (CSF) of clinical encephalitis patients and from skin swabs of epidermal vesicles on non-encephalopathic patients. Phylogenetic analysis revealed high inter-host diversity and little population structure. In contrast, samples from different lesions in the same patient clustered with similar patterns of allelic variants. Comparison of consensus genome sequences shows HSV-1 has been freely recombining, except for distinct islands of linkage disequilibrium (LD). This suggests functional constraints prevent recombination between certain genes, notably those encoding pairs of interacting proteins. Distinct LD patterns characterised subsets of viruses recovered from CSF and skin lesions, which may reflect different evolutionary constraints in different body compartments. Functions of genes under differential constraint relate to immunity or tropism and provide new hypotheses on tissue-specific mechanisms of viral infection and latency.

    Virus evolution 2020;6;1;veaa012

  • Detecting extra-ocular Chlamydia trachomatis in a trachoma-endemic community in Ethiopia: Identifying potential routes of transmission.

    Last A, Versteeg B, Shafi Abdurahman O, Robinson A, Dumessa G, Abraham Aga M, Shumi Bejiga G, Negussu N, Greenland K, Czerniewska A, Thomson N, Cairncross S, Sarah V, Macleod D, Solomon AW, Logan J and Burton MJ

    Clinical Research Department, London School of Hygiene & Tropical Medicine, London, United Kingdom.

    Background: Trachoma elimination efforts are hampered by limited understanding of Chlamydia trachomatis (Ct) transmission routes. Here we aimed to detect Ct DNA at non-ocular sites and on eye-seeking flies.

    Methods: A population-based household survey was conducted in Oromia Region, Ethiopia. Ocular and non-ocular (faces, hands, clothing, water containers and sleeping surfaces) swabs were collected from all individuals. Flies were caught from faces of children. Flies, ocular swabs and non-ocular swabs were tested for Ct by quantitative PCR.

    Results: In total, 1220 individuals in 247 households were assessed. Active trachoma (trachomatous inflammation-follicular) and ocular Ct were detected in 10% and 2% of all-ages, and 21% and 3% of 1-9-year-olds, respectively. Ct was detected in 12% (95% CI:8-15%) of tested non-ocular swabs from ocular-positive households, but in none of the non-ocular swabs from ocular-negative households. Ct was detected on 24% (95% CI:18-32%) of flies from ocular-positive households and 3% (95% CI:1-6%) of flies from ocular-negative households.

    Conclusion: Ct DNA was detected on hands, faces and clothing of individuals living in ocular-positive households suggesting that this might be a route of transmission within Ct infected households. In addition, we detected Ct on flies from ocular-positive households and occasionally in ocular-negative households suggesting that flies might be a vector for transmission within and between Ct infected and uninfected households. These potential transmission routes may need to be simultaneously addressed to suppress transmission.

    PLoS neglected tropical diseases 2020;14;3;e0008120

  • Integrated scRNA-Seq Identifies Human Postnatal Thymus Seeding Progenitors and Regulatory Dynamics of Differentiating Immature Thymocytes.

    Lavaert M, Liang KL, Vandamme N, Park JE, Roels J, Kowalczyk MS, Li B, Ashenberg O, Tabaka M, Dionne D, Tickle TL, Slyper M, Rozenblatt-Rosen O, Vandekerckhove B, Leclercq G, Regev A, Van Vlierberghe P, Guilliams M, Teichmann SA, Saeys Y and Taghon T

    Faculty of Medicine and Health Sciences, Department of Diagnostic Sciences, Ghent University, C. Heymanslaan 10, MRB2, Entrance 38, 9000 Ghent, Belgium.

    During postnatal life, thymopoiesis depends on the continuous colonization of the thymus by bone-marrow-derived hematopoietic progenitors that migrate through the bloodstream. The current understanding of the nature of thymic immigrants is largely based on data from pre-clinical models. Here, we employed single-cell RNA sequencing (scRNA-seq) to examine the immature postnatal thymocyte population in humans. Integration of bone marrow and peripheral blood precursor datasets identified two putative thymus seeding progenitors that varied in expression of CD7; CD10; and the homing receptors CCR7, CCR9, and ITGB7. Whereas both precursors supported T cell development, only one contributed to intrathymic dendritic cell (DC) differentiation, predominantly of plasmacytoid dendritic cells. Trajectory inference delineated the transcriptional dynamics underlying early human T lineage development, enabling prediction of transcription factor (TF) modules that drive stage-specific steps of human T cell development. This comprehensive dataset defines the expression signature of immature human thymocytes and provides a resource for the further study of human thymopoiesis.

    Immunity 2020;52;6;1088-1104.e6

  • Extensive heterogeneity in somatic mutation and selection in the human bladder.

    Lawson ARJ, Abascal F, Coorens THH, Hooks Y, O'Neill L, Latimer C, Raine K, Sanders MA, Warren AY, Mahbubani KTA, Bareham B, Butler TM, Harvey LMR, Cagan A, Menzies A, Moore L, Colquhoun AJ, Turner W, Thomas B, Gnanapragasam V, Williams N, Rassl DM, Vöhringer H, Zumalave S, Nangalia J, Tubío JMC, Gerstung M, Saeb-Parsy K, Stratton MR, Campbell PJ, Mitchell TJ and Martincorena I

    Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Hinxton CB10 1SA, UK.

    The extent of somatic mutation and clonal selection in the human bladder remains unknown. We sequenced 2097 bladder microbiopsies from 20 individuals using targeted (<i>n</i> = 1914 microbiopsies), whole-exome (<i>n</i> = 655), and whole-genome (<i>n</i> = 88) sequencing. We found widespread positive selection in 17 genes. Chromatin remodeling genes were frequently mutated, whereas mutations were absent in several major bladder cancer genes. There was extensive interindividual variation in selection, with different driver genes dominating the clonal landscape across individuals. Mutational signatures were heterogeneous across clones and individuals, which suggests differential exposure to mutagens in the urine. Evidence of APOBEC mutagenesis was found in 22% of the microbiopsies. Sequencing multiple microbiopsies from five patients with bladder cancer enabled comparisons with cancer-free individuals and across histological features. This study reveals a rich landscape of mutational processes and selection in normal urothelium with large heterogeneity across clones and individuals.

    Funded by: Cancer Research UK; Wellcome Trust

    Science (New York, N.Y.) 2020;370;6512;75-82

  • Genomic heterogeneity in myeloproliferative neoplasms and applications to clinical practice.

    Lee J, Godfrey AL and Nangalia J

    Wellcome Sanger Institute, Hinxton, Cambridgeshire, UK; Cambridge Stem Cell Institute, Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, Puddicombe Way, Cambridge, UK; Department of Haematology, University of Cambridge, Cambridge, UK.

    The myeloproliferative neoplasms (MPN) polycythaemia vera, essential thrombocythaemia and primary myelofibrosis are chronic myeloid disorders associated most often with mutations in JAK2, MPL and CALR, and in some patients with additional acquired genomic lesions. Whilst the molecular mechanisms downstream of these mutations are now clearer, it is apparent that clinical phenotype in MPN is a product of complex interactions, acting between individual mutations, between disease subclones, and between the tumour and background host factors. In this review we first discuss MPN phenotypic driver mutations and the factors that interact with them to influence phenotype. We consider the importance of ongoing studies of clonal haematopoiesis, which may inform a better understanding of why MPN develop in specific individuals. We then consider how best to deploy genomic testing in a clinical environment and the challenges as well as opportunities that may arise from more routine, comprehensive genomic analysis of patients with MPN.

    Blood reviews 2020;100708

  • Mutations in FAM50A suggest that Armfield XLID syndrome is a spliceosomopathy.

    Lee YR, Khan K, Armfield-Uhas K, Srikanth S, Thompson NA, Pardo M, Yu L, Norris JW, Peng Y, Gripp KW, Aleck KA, Li C, Spence E, Choi TI, Kwon SJ, Park HM, Yu D, Do Heo W, Mooney MR, Baig SM, Wentzensen IM, Telegrafi A, McWalter K, Moreland T, Roadhouse C, Ramsey K, Lyons MJ, Skinner C, Alexov E, Katsanis N, Stevenson RE, Choudhary JS, Adams DJ, Kim CH, Davis EE and Schwartz CE

    Department of Biology, Chungnam National University, Daejeon, Korea.

    Intellectual disability (ID) is a heterogeneous clinical entity and includes an excess of males who harbor variants on the X-chromosome (XLID). We report rare FAM50A missense variants in the original Armfield XLID syndrome family localized in Xq28 and four additional unrelated males with overlapping features. Our fam50a knockout (KO) zebrafish model exhibits abnormal neurogenesis and craniofacial patterning, and in vivo complementation assays indicate that the patient-derived variants are hypomorphic. RNA sequencing analysis from fam50a KO zebrafish show dysregulation of the transcriptome, with augmented spliceosome mRNAs and depletion of transcripts involved in neurodevelopment. Zebrafish RNA-seq datasets show a preponderance of 3' alternative splicing events in fam50a KO, suggesting a role in the spliceosome C complex. These data are supported with transcriptomic signatures from cell lines derived from affected individuals and FAM50A protein-protein interaction data. In sum, Armfield XLID syndrome is a spliceosomopathy associated with aberrant mRNA processing during development.

    Funded by: U.S. Department of Health & Human Services | NIH | National Institute of Neurological Disorders and Stroke (NINDS): R01NS073854

    Nature communications 2020;11;1;3698

  • Tracking hematopoietic stem cells and their progeny using whole-genome sequencing.

    Lee-Six H and Kent DG

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom.

    Despite decades of progress in our understanding of hematopoiesis through the study of animal models and transplantation in humans, investigating physiological human hematopoiesis directly has remained challenging. Questions on the clonal structure of the human hematopoietic stem cell (HSC) pool, such as "how many HSCs are there?" and "do all HSC clones actively produce all blood cell types in equal proportions?" remain open. These questions have inherent value for understanding normal human physiology, but also directly inform our comprehension of the process by which the system is subverted to drive diseases of the blood, in particular blood cancers and bone marrow failure syndromes. The critical link between normal and abnormal hematopoiesis is perhaps best illustrated by the recent discovery of clonal hematopoiesis in healthy people with no abnormal blood parameters. In such individuals, large clones derived from single cells are present and are dominant relative to their normal counterparts, but their presence does not necessitate abnormal blood cell production. Intriguingly, however, these individuals are also at a significantly greater risk of developing leukemias and of cardiovascular events, underscoring the importance of understanding how blood stem cell clones compete against each other.

    Experimental hematology 2020

  • Improved Prediction of Bacterial Genotype-Phenotype Associations Using Interpretable Pangenome-Spanning Regressions.

    Lees JA, Mai TT, Galardini M, Wheeler NE, Horsfield ST, Parkhill J and Corander J

    MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom.

    Discovery of genetic variants underlying bacterial phenotypes and the prediction of phenotypes such as antibiotic resistance are fundamental tasks in bacterial genomics. Genome-wide association study (GWAS) methods have been applied to study these relations, but the plastic nature of bacterial genomes and the clonal structure of bacterial populations creates challenges. We introduce an alignment-free method which finds sets of loci associated with bacterial phenotypes, quantifies the total effect of genetics on the phenotype, and allows accurate phenotype prediction, all within a single computationally scalable joint modeling framework. Genetic variants covering the entire pangenome are compactly represented by extended DNA sequence words known as unitigs, and model fitting is achieved using elastic net penalization, an extension of standard multiple regression. Using an extensive set of state-of-the-art bacterial population genomic data sets, we demonstrate that our approach performs accurate phenotype prediction, comparable to popular machine learning methods, while retaining both interpretability and computational efficiency. Compared to those of previous approaches, which test each genotype-phenotype association separately for each variant and apply a significance threshold, the variants selected by our joint modeling approach overlap substantially. <b>IMPORTANCE</b> Being able to identify the genetic variants responsible for specific bacterial phenotypes has been the goal of bacterial genetics since its inception and is fundamental to our current level of understanding of bacteria. This identification has been based primarily on painstaking experimentation, but the availability of large data sets of whole genomes with associated phenotype metadata promises to revolutionize this approach, not least for important clinical phenotypes that are not amenable to laboratory analysis. These models of phenotype-genotype association can in the future be used for rapid prediction of clinically important phenotypes such as antibiotic resistance and virulence by rapid-turnaround or point-of-care tests. However, despite much effort being put into adapting genome-wide association study (GWAS) approaches to cope with bacterium-specific problems, such as strong population structure and horizontal gene exchange, current approaches are not yet optimal. We describe a method that advances methodology for both association and generation of portable prediction models.

    mBio 2020;11;4

  • Urbanized microbiota in infants, immune constitution and later risk of atopic diseases.

    Lehtimäki J, Thorsen J, Rasmussen MA, Hjelmsø M, Shah S, Mortensen MS, Trivedi U, Vestergaard G, Bønnelykke K, Chawes BL, Brix S, Sørensen SJ, Bisgaard H and Stokholm J

    COPSAC, Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital, University of Copenhagen, Ledreborg Alle 34, 2820, Gentofte, Denmark.

    Background: Urbanization is linked with an increased burden of asthma and atopic traits. A putative mechanism is insufficient exposure to beneficial microbes early in life leading to immune dysregulation as previously shown for indoor microbial exposures.

    Objective: To investigate whether urbanization is associated with the microbiota composition in the infants' body and early immune function, and whether these contribute to the later risk of asthma and atopic traits.

    Methods: We studied the prospective COPSAC<sub>2010</sub> mother-child cohort of 700 children growing up in areas with different degrees of urbanization. During their first year of life, airway and gut microbiota as well as immune marker concentrations were defined. At six years of age, asthma and atopic traits were diagnosed by pediatricians.

    Results: In adjusted analyses, the risks of asthma and aeroallergen sensitization were increased in urban infants. The composition of especially airway, but also gut microbiota differed between urban and rural infants. The living environment related structure of the airway microbiota associated with immune mediator concentrations already at one month of age. An urbanized structure of airway and gut microbiota associated with an increased risk of asthma coherently during multiple time points, and also with the risks of eczema and sensitization.

    Conclusion: Our findings suggest that urbanization related changes in the infant microbiota may elevate the risk of asthma and atopic traits, probably via crosstalk with the developing immune system. The airways may facilitate this effect as they are open for colonization by environmental, airborne microbes and serve as immune interface.

    The Journal of allergy and clinical immunology 2020

  • Horizontal gene transfer rate is not the primary determinant of observed antibiotic resistance frequencies in Streptococcus pneumoniae.

    Lehtinen S, Chewapreecha C, Lees J, Hanage WP, Lipsitch M, Croucher NJ, Bentley SD, Turner P, Fraser C and Mostowy RJ

    Big Data Institute, University of Oxford, Oxford, UK.

    The extent to which evolution is constrained by the rate at which horizontal gene transfer (HGT) allows DNA to move between genetic lineages is an open question, which we address in the context of antibiotic resistance in <i>Streptococcus pneumoniae</i>. We analyze microbiological, genomic, and epidemiological data from the largest-to-date sequenced pneumococcal carriage study in 955 infants from a refugee camp on the Thailand-Myanmar border. Using a unified framework, we simultaneously test prior hypotheses on rates of HGT and a key evolutionary covariate (duration of carriage) as determinants of resistance frequencies. We conclude that in this setting, there is little evidence of HGT playing a major role in determining resistance frequencies. Instead, observed resistance frequencies are best explained as the outcome of selection acting on a pool of variants, irrespective of the rate at which resistance determinants move between genetic lineages.

    Science advances 2020;6;21;eaaz6137

  • Hearing impairment due to Mir183/96/182 mutations suggests both loss and gain of function effects.

    Lewis MA, Di Domenico F, Ingham NJ, Prosser HM and Steel KP

    Wolfson Centre for Age-Related Diseases, King's College London, London, SE1 1UL, UK.

    The microRNA miR-96 is important for hearing, as point mutations in humans and mice result in dominant progressive hearing loss. <i>Mir96</i> is expressed in sensory cells along with <i>Mir182</i> and <i>Mir183</i>, but the roles of these closely-linked microRNAs are as yet unknown. Here we analyse mice carrying null alleles of <i>Mir182</i>, and of <i>Mir183</i> and <i>Mir96</i> together to investigate their roles in hearing. We found that <i>Mir183</i>/<i>96</i> heterozygous mice had normal hearing and homozygotes were completely deaf with abnormal hair cell stereocilia bundles and reduced numbers of inner hair cell synapses at four weeks old. <i>Mir182</i> knockout mice developed normal hearing then exhibited progressive hearing loss. Our transcriptional analyses revealed significant changes in a range of other genes, but surprisingly there were fewer genes with altered expression in the organ of Corti of <i>Mir183/96</i> null mice compared with our previous findings in <i>Mir96</i><sup> <i>Dmdo</i> </sup> mutants, which have a point mutation in the miR-96 seed region. This suggests the more severe phenotype of <i>Mir96</i><sup> <i>Dmdo</i> </sup> mutants compared with <i>Mir183</i>/<i>96</i> mutants, including progressive hearing loss in <i>Mir96</i><sup> <i>Dmdo</i> </sup> heterozygotes, is likely to be mediated by the gain of novel target genes in addition to the loss of its normal targets. We propose three mechanisms of action of mutant miRNAs: loss of targets that are normally completely repressed, loss of targets whose transcription is normally buffered by the miRNA, and gain of novel targets. Any of these mechanisms could lead to a partial loss of a robust cellular identity and consequent dysfunction.

    Disease models & mechanisms 2020

  • Draft Genome Sequence of a New Delhi Metallo-β-Lactamase (NDM-1)-Producing Providencia stuartii Strain Isolated in Lima, Peru.

    Lezameta L, Cuicapuza D, Dávila-Barclay A, Torres S, Salvatierra G, Tsukayama P and Tamariz J

    Laboratorio de Resistencia Antimicrobiana e Inmunopatología, Universidad Peruana Cayetano Heredia, Lima, Peru.

    <i>Providencia stuartii</i> is an opportunistic pathogen of the <i>Enterobacteriales</i> order. Here, we report the 4,594,658-bp draft genome sequence of a New Delhi metallo-β-lactamase (NDM-1)-producing <i>Providencia stuartii</i> strain that was isolated from an emergency patient in a private clinic in Lima, Peru.

    Microbiology resource announcements 2020;9;39

  • Genome-wide Association Analysis in Humans Links Nucleotide Metabolism to Leukocyte Telomere Length.

    Li C, Stoma S, Lotta LA, Warner S, Albrecht E, Allione A, Arp PP, Broer L, Buxton JL, Da Silva Couto Alves A, Deelen J, Fedko IO, Gordon SD, Jiang T, Karlsson R, Kerrison N, Loe TK, Mangino M, Milaneschi Y, Miraglio B, Pervjakova N, Russo A, Surakka I, van der Spek A, Verhoeven JE, Amin N, Beekman M, Blakemore AI, Canzian F, Hamby SE, Hottenga JJ, Jones PD, Jousilahti P, Mägi R, Medland SE, Montgomery GW, Nyholt DR, Perola M, Pietiläinen KH, Salomaa V, Sillanpää E, Suchiman HE, van Heemst D, Willemsen G, Agudo A, Boeing H, Boomsma DI, Chirlaque MD, Fagherazzi G, Ferrari P, Franks P, Gieger C, Eriksson JG, Gunter M, Hägg S, Hovatta I, Imaz L, Kaprio J, Kaaks R, Key T, Krogh V, Martin NG, Melander O, Metspalu A, Moreno C, Onland-Moret NC, Nilsson P, Ong KK, Overvad K, Palli D, Panico S, Pedersen NL, Penninx BWJH, Quirós JR, Jarvelin MR, Rodríguez-Barranco M, Scott RA, Severi G, Slagboom PE, Spector TD, Tjonneland A, Trichopoulou A, Tumino R, Uitterlinden AG, van der Schouw YT, van Duijn CM, Weiderpass E, Denchi EL, Matullo G, Butterworth AS, Danesh J, Samani NJ, Wareham NJ, Nelson CP, Langenberg C and Codd V

    MRC Epidemiology Unit, University of Cambridge, CB2 0SL, United Kingdom; NIHR Leicester Biomedical Research Centre, Glenfield Hospital, Leicester, LE3 9QP, United Kingdom.

    Leukocyte telomere length (LTL) is a heritable biomarker of genomic aging. In this study, we perform a genome-wide meta-analysis of LTL by pooling densely genotyped and imputed association results across large-scale European-descent studies including up to 78,592 individuals. We identify 49 genomic regions at a false discovery rate (FDR) < 0.05 threshold and prioritize genes at 31, with five highlighting nucleotide metabolism as an important regulator of LTL. We report six genome-wide significant loci in or near SENP7, MOB1B, CARMIL1, PRRC2A, TERF2, and RFWD3, and our results support recently identified PARP1, POT1, ATM, and MPHOSPH6 loci. Phenome-wide analyses in >350,000 UK Biobank participants suggest that genetically shorter telomere length increases the risk of hypothyroidism and decreases the risk of thyroid cancer, lymphoma, and a range of proliferative conditions. Our results replicate previously reported associations with increased risk of coronary artery disease and lower risk for multiple cancer types. Our findings substantially expand current knowledge on genes that regulate LTL and their impact on human health and disease.

    American journal of human genetics 2020;106;3;389-404

  • Patterns of somatic structural variation in human cancer genomes.

    Li Y, Roberts ND, Wala JA, Shapira O, Schumacher SE, Kumar K, Khurana E, Waszak S, Korbel JO, Haber JE, Imielinski M, PCAWG Structural Variation Working Group, Weischenfeldt J, Beroukhim R, Campbell PJ and PCAWG Consortium

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK.

    A key mutational process in cancer is structural variation, in which rearrangements delete, amplify or reorder genomic segments that range in size from kilobases to whole chromosomes<sup>1-7</sup>. Here we develop methods to group, classify and describe somatic structural variants, using data from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), which aggregated whole-genome sequencing data from 2,658 cancers across 38 tumour types<sup>8</sup>. Sixteen signatures of structural variation emerged. Deletions have a multimodal size distribution, assort unevenly across tumour types and patients, are enriched in late-replicating regions and correlate with inversions. Tandem duplications also have a multimodal size distribution, but are enriched in early-replicating regions, as are unbalanced translocations. Replication-based mechanisms of rearrangement generate varied chromosomal structures with low-level copy-number gains and frequent inverted rearrangements. One prominent structure consists of 2-7 templates copied from distinct regions of the genome strung together within one locus. Such cycles of templated insertions correlate with tandem duplications and, in liver cancer, frequently activate the telomerase gene TERT. A wide variety of rearrangement processes are active in cancer, which generate complex configurations of the genome upon which selection can act.

    Funded by: NCI NIH HHS: R01 CA095175, R01 CA217991, R01 CA218668; NIGMS NIH HHS: R35 GM127029; Wellcome Trust: 088340, 206194, WT088340MA

    Nature 2020;578;7793;112-121

  • Trappc9 deficiency causes parent-of-origin dependent microcephaly and obesity.

    Liang ZS, Cimino I, Yalcin B, Raghupathy N, Vancollie VE, Ibarra-Soria X, Firth HV, Rimmington D, Farooqi IS, Lelliott CJ, Munger SC, O'Rahilly S, Ferguson-Smith AC, Coll AP and Logan DW

    Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom.

    Some imprinted genes exhibit parental origin specific expression bias rather than being transcribed exclusively from one copy. The physiological relevance of this remains poorly understood. In an analysis of brain-specific allele-biased expression, we identified that Trappc9, a cellular trafficking factor, was expressed predominantly (~70%) from the maternally inherited allele. Loss-of-function mutations in human TRAPPC9 cause a rare neurodevelopmental syndrome characterized by microcephaly and obesity. By studying Trappc9 null mice we discovered that homozygous mutant mice showed a reduction in brain size, exploratory activity and social memory, as well as a marked increase in body weight. A role for Trappc9 in energy balance was further supported by increased ad libitum food intake in a child with TRAPPC9 deficiency. Strikingly, heterozygous mice lacking the maternal allele (70% reduced expression) had pathology similar to homozygous mutants, whereas mice lacking the paternal allele (30% reduction) were phenotypically normal. Taken together, we conclude that Trappc9 deficient mice recapitulate key pathological features of TRAPPC9 mutations in humans and identify a role for Trappc9 and its imprinting in controlling brain development and metabolism.

    Funded by: Department of Health; Medical Research Council: MC_UU_00014/1, MC_UU_00014/5, MC_UU_12012/1, MR/J001597/1; Wellcome Trust: 207462/Z/17/Z, WT095606, WT098051, WT206194

    PLoS genetics 2020;16;9;e1008916

  • Functional studies of GWAS variants are gaining momentum.

    Lichou F and Trynka G

    Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK.

    Funded by: Wellcome Trust: WT206194

    Nature communications 2020;11;1;6283

  • The Deep Genome Project.

    Lloyd KCK, Adams DJ, Baynam G, Beaudet AL, Bosch F, Boycott KM, Braun RE, Caulfield M, Cohn R, Dickinson ME, Dobbie MS, Flenniken AM, Flicek P, Galande S, Gao X, Grobler A, Heaney JD, Herault Y, de Angelis MH, Lupski JR, Lyonnet S, Mallon AM, Mammano F, MacRae CA, McInnes R, McKerlie C, Meehan TF, Murray SA, Nutter LMJ, Obata Y, Parkinson H, Pepper MS, Sedlacek R, Seong JK, Shiroishi T, Smedley D, Tocchini-Valentini G, Valle D, Wang CL, Wells S, White J, Wurst W, Xu Y and Brown SDM

    Department of Surgery, School of Medicine, and Mouse Biology Program, University of California, Davis, CA, 95618, USA.

    Funded by: British Heart Foundation: FS/12/82/29736; Medical Research Council: G9521010, MC_EX_MR/M009203/1, MC_PC_14089, MC_U142684172, MR/M009203/1; NHGRI NIH HHS: UM1 HG006348

    Genome biology 2020;21;1;18

  • Genomics and epidemiological surveillance.

    Lo SW and Jamrozy D

    Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    Nature reviews. Microbiology 2020

  • A mosaic tetracycline resistance gene tet(S/M) detected in an MDR pneumococcal CC230 lineage that underwent capsular switching in South Africa.

    Lo SW, Gladstone RA, van Tonder AJ, Du Plessis M, Cornick JE, Hawkins PA, Madhi SA, Nzenze SA, Kandasamy R, Ravikumar KL, Elmdaghri N, Kwambana-Adams B, Almeida SCG, Skoczynska A, Egorova E, Titov L, Saha SK, Paragi M, Everett DB, Antonio M, Klugman KP, Li Y, Metcalf BJ, Beall B, McGee L, Breiman RF, Bentley SD, von Gottberg A and Global Pneumococcal Sequencing Consortium

    Parasites and Microbes Programme, The Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Objectives: We reported tet(S/M) in Streptococcus pneumoniae and investigated its temporal spread in relation to nationwide clinical interventions.

    Methods: We whole-genome sequenced 12 254 pneumococcal isolates from 29 countries on an Illumina HiSeq sequencer. Serotype, multilocus ST and antibiotic resistance were inferred from genomes. An SNP tree was built using Gubbins. Temporal spread was reconstructed using a birth-death model.

    Results: We identified tet(S/M) in 131 pneumococcal isolates and none carried other known tet genes. Tetracycline susceptibility testing results were available for 121 tet(S/M)-positive isolates and all were resistant. A majority (74%) of tet(S/M)-positive isolates were from South Africa and caused invasive diseases among young children (59% HIV positive, where HIV status was available). All but two tet(S/M)-positive isolates belonged to clonal complex (CC) 230. A global phylogeny of CC230 (n=389) revealed that tet(S/M)-positive isolates formed a sublineage predicted to exhibit resistance to penicillin, co-trimoxazole, erythromycin and tetracycline. The birth-death model detected an unrecognized outbreak of this sublineage in South Africa between 2000 and 2004 with expected secondary infections (effective reproductive number, R) of ∼2.5. R declined to ∼1.0 in 2005 and <1.0 in 2012. The declining epidemic could be related to improved access to ART in 2004 and introduction of pneumococcal conjugate vaccine (PCV) in 2009. Capsular switching from vaccine serotype 14 to non-vaccine serotype 23A was observed within the sublineage.

    Conclusions: The prevalence of tet(S/M) in pneumococci was low and its dissemination was due to an unrecognized outbreak of CC230 in South Africa. Capsular switching in this MDR sublineage highlighted its potential to continue to cause disease in the post-PCV13 era.

    Funded by: Wellcome Trust

    The Journal of antimicrobial chemotherapy 2020;75;3;512-520

  • Genomic and Phenotypic Analyses of Acinetobacter baumannii Isolates From Three Tertiary Care Hospitals in Thailand.

    Loraine J, Heinz E, Soontarach R, Blackwell GA, Stabler RA, Voravuthikunchai SP, Srimanote P, Kiratisin P, Thomson NR and Taylor PW

    School of Pharmacy, University College London, London, United Kingdom.

    Antibiotic resistant strains of <i>Acinetobacter baumannii</i> are responsible for a large and increasing burden of nosocomial infections in Thailand and other countries of Southeast Asia. New approaches to their control and treatment are urgently needed and an attractive strategy is to remove the bacterial polysaccharide capsule, and thus the protection from the host's immune system. To examine phylogenetic relationships, distribution of capsule chemotypes, acquired antibiotic resistance determinants, susceptibility to complement and other traits associated with systemic infection, we sequenced 191 isolates from three tertiary referral hospitals in Thailand and used phenotypic assays to characterize key aspects of infectivity. Several distinct lineages were circulating in three hospitals and the majority belonged to global clonal group 2 (GC2). Very high levels of resistance to carbapenems and other front-line antibiotics were found, as were a number of widespread plasmid replicons. A high diversity of capsule genotypes was encountered, with only three of these (KL6, KL10, and KL47) showing more than 10% frequency. Almost 90% of GC2 isolates belonged to the most common capsule genotypes and were fully resistant to the bactericidal action of human serum complement, most likely protected by their polysaccharide capsule, which represents a key determinant of virulence for systemic infection. Our study further highlights the importance of developing therapeutic strategies to remove the polysaccharide capsule from extensively drug-resistant <i>A. baumannii</i> during the course of systemic infection.

    Frontiers in microbiology 2020;11;548

  • Influence of past climate change on phylogeography and demographic history of narwhals, Monodon monoceros.

    Louis M, Skovrind M, Samaniego Castruita JA, Garilao C, Kaschner K, Gopalakrishnan S, Haile JS, Lydersen C, Kovacs KM, Garde E, Heide-Jørgensen MP, Postma L, Ferguson SH, Willerslev E and Lorenzen ED

    Globe Institute, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.

    The Arctic is warming at an unprecedented rate, with unknown consequences for endemic fauna. However, Earth has experienced severe climatic oscillations in the past, and understanding how species responded to them might provide insight into their resilience to near-future climatic predictions. Little is known about the responses of Arctic marine mammals to past climatic shifts, but narwhals (<i>Monodon monoceros</i>) are considered one of the endemic Arctic species most vulnerable to environmental change. Here, we analyse 121 complete mitochondrial genomes from narwhals sampled across their range and use them in combination with species distribution models to elucidate the influence of past and ongoing climatic shifts on their population structure and demographic history. We find low levels of genetic diversity and limited geographic structuring of genetic clades. We show that narwhals experienced a long-term low effective population size, which increased after the Last Glacial Maximum, when the amount of suitable habitat expanded. Similar post-glacial habitat release has been a key driver of population size expansion of other polar marine predators. Our analyses indicate that habitat availability has been critical to the success of narwhals, raising concerns for their fate in an increasingly warming Arctic.

    Proceedings. Biological sciences 2020;287;1925;20192964

  • Structural variation of the malaria-associated human glycophorin A-B-E region.

    Louzada S, Algady W, Weyell E, Zuccherato LW, Brajer P, Almalki F, Scliar MO, Naslavsky MS, Yamamoto GL, Duarte YAO, Passos-Bueno MR, Zatz M, Yang F and Hollox EJ

    Wellcome Sanger Institute, Hinxton, Cambridge, UK.

    Background: Approximately 5% of the human genome shows common structural variation, which is enriched for genes involved in the immune response and cell-cell interactions. A well-established region of extensive structural variation is the glycophorin gene cluster, comprising three tandemly-repeated regions about 120 kb in length and carrying the highly homologous genes GYPA, GYPB and GYPE. Glycophorin A (encoded by GYPA) and glycophorin B (encoded by GYPB) are glycoproteins present at high levels on the surface of erythrocytes, and they have been suggested to act as decoy receptors for viral pathogens. They are receptors for the invasion of the protist parasite Plasmodium falciparum, a causative agent of malaria. A particular complex structural variant, called DUP4, creates a GYPB-GYPA fusion gene known to confer resistance to malaria. Many other structural variants exist across the glycophorin gene cluster, and they remain poorly characterised.

    Results: Here, we analyse sequences from 3234 diploid genomes from across the world for structural variation at the glycophorin locus, confirming 15 variants in the 1000 Genomes project cohort, discovering 9 new variants, and characterising a selection of these variants using fibre-FISH and breakpoint mapping at the sequence level. We identify variants predicted to create novel fusion genes and a common inversion duplication variant at appreciable frequencies in West Africans. We show that almost all variants can be explained by non-allelic homologous recombination and by comparing the structural variant breakpoints with recombination hotspot maps, confirm the importance of a particular meiotic recombination hotspot on structural variant formation in this region.

    Conclusions: We identify and validate large structural variants in the human glycophorin A-B-E gene cluster which may be associated with different clinical aspects of malaria.

    Funded by: Saudi Arabia Cultural Bureau in London: n/a; Wellcome Trust: WT098051

    BMC genomics 2020;21;1;446

  • Genome-wide discovery, and computational and transcriptional characterization of an AIG gene family in the freshwater snail Biomphalaria glabrata, a vector for Schistosoma mansoni.

    Lu L, Loker ES, Zhang SM, Buddenborg SK and Bu L

    Center for Evolutionary and Theoretical Immunology, Department of Biology, University of New Mexico, Albuquerque, NM, 87131, USA.

    Background: The AIG (avrRpt2-induced gene) family of GTPases, characterized by the presence of a distinctive AIG1 domain, is mysterious in having a peculiar phylogenetic distribution, a predilection for undergoing expansion and loss, and an uncertain functional role, especially in invertebrates. AIGs are frequently represented as GIMAPs (GTPase of the immunity associated protein family), characterized by presence of the AIG1 domain along with coiled-coil domains. Here we provide an overview of the remarkably expanded AIG repertoire of the freshwater gastropod Biomphalaria glabrata, compare it with AIGs in other organisms, and detail patterns of expression in B. glabrata susceptible or resistant to infection with Schistosoma mansoni, responsible for the neglected tropical disease of intestinal schistosomiasis.

    Results: We define the 7 conserved motifs that comprise the AIG1 domain in B. glabrata and detail its association with at least 7 other domains, indicative of functional versatility of B. glabrata AIGs. AIG genes were usually found in tandem arrays in the B. glabrata genome, suggestive of an origin by segmental gene duplication. We found 91 genes with complete AIG1 domains, including 64 GIMAPs and 27 AIG genes without coiled-coils, more than known for any other organism except Danio (with > 100). We defined expression patterns of AIG genes in 12 different B. glabrata organs and characterized whole-body AIG responses to microbial PAMPs, and of schistosome-resistant or -susceptible strains of B. glabrata to S. mansoni exposure. Biomphalaria glabrata AIG genes clustered with expansions of AIG genes from other heterobranch gastropods yet showed unique lineage-specific subclusters. Other gastropods and bivalves had separate but also diverse expansions of AIG genes, whereas cephalopods seem to lack AIG genes.

    Conclusions: The AIG genes of B. glabrata exhibit expansion in both numbers and potential functions, differ markedly in expression between strains varying in susceptibility to schistosomes, and are responsive to immune challenge. These features provide strong impetus to further explore the functional role of AIG genes in the defense responses of B. glabrata, including to suppress or support the development of medically relevant S. mansoni parasites.

    Funded by: NIAID NIH HHS: R37 AI101438; NIGMS NIH HHS: P30 GM110907

    BMC genomics 2020;21;1;190

  • Tumor necrosis factor receptor family costimulation increases regulatory T-cell activation and function via NF-κB.

    Lubrano di Ricco M, Ronin E, Collares D, Divoux J, Grégoire S, Wajant H, Gomes T, Grinberg-Bleyer Y, Baud V, Marodon G and Salomon BL

    Sorbonne Université, INSERM, CNRS, Centre d'Immunologie et des Maladies Infectieuses (CIMI-Paris), Paris, France.

    Several drugs targeting members of the TNF superfamily or TNF receptor superfamily (TNFRSF) are widely used in medicine or are currently being tested in therapeutic trials. However, their mechanism of action remains poorly understood. Here, we explored the effects of TNFRSF co-stimulation on murine Foxp3<sup>+</sup> regulatory T cell (Treg) biology, as they are pivotal modulators of immune responses. We show that engagement of TNFR2, 4-1BB, GITR, and DR3, but not OX40, increases Treg proliferation and survival. Triggering these TNFRSF in Tregs induces similar changes in gene expression patterns, suggesting that they engage common signal transduction pathways. Among them, we identified a major role of canonical NF-κB. Importantly, TNFRSF co-stimulation improves the ability of Tregs to suppress colitis. Our data demonstrate that stimulation of discrete TNFRSF members enhances Treg activation and function through a shared mechanism. Consequently, therapeutic effects of drugs targeting TNFRSF or their ligands may be mediated by their effect on Tregs.

    Funded by: Agence Nationale de la Recherche: ANR-15-CE15-0015-01, ANR-17-CE15-0030-01; Deutsche Forschungsgemeinschaft: 324392634; ENLIGHT-TEN: 675395; ENLIGHT-TEN program; European Commission: 675395; European Union's: H2020; Fondation pour la Recherche Médicale

    European journal of immunology 2020;50;7;972-985

  • Genomic surveillance of Escherichia coli ST131 identifies local expansion and serial replacement of subclones.

    Ludden C, Decano AG, Jamrozy D, Pickard D, Morris D, Parkhill J, Peacock SJ, Cormican M and Downing T

    London School of Hygiene & Tropical Medicine, Keppel Street, London WC1E 7HT, UK.

    <i>Escherichia coli</i> sequence type 131 (ST131) is a pandemic clone that is evolving rapidly with increasing levels of antimicrobial resistance. Here, we investigated an outbreak of <i>E. coli</i> ST131 producing extended spectrum β-lactamases (ESBLs) in a long-term care facility (LTCF) in Ireland by combining data from this LTCF (<i>n</i>=69) with other Irish (<i>n</i>=35) and global (<i>n</i>=690) ST131 genomes to reconstruct the evolutionary history and understand changes in population structure and genome architecture over time. This required a combination of short- and long-read genome sequencing, <i>de novo</i> assembly, read mapping, ESBL gene screening, plasmid alignment and temporal phylogenetics. We found that Clade C was the most prevalent (686 out of 794 isolates, 86 %) of the three major ST131 clades circulating worldwide (A with <i>fimH41</i>, B with <i>fimH22</i>, C with <i>fimH30</i>), and was associated with the presence of different ESBL alleles, diverse plasmids and transposable elements. Clade C was estimated to have emerged in <i>c</i>. 1985 and subsequently acquired different ESBL gene variants (<i>bla</i><sub>CTX-M-14</sub> vs <i>bla</i><sub>CTX-M-15</sub>). An ISEcp<i>1-</i>mediated transposition of the <i>bla</i><sub>CTX-M-15</sub> gene further increased the diversity within Clade C. We discovered a local clonal expansion of a rare C2 lineage (C2_8) with a chromosomal insertion of <i>bla</i><sub>CTX-M-15</sub> at the <i>mppA</i> gene. This was acquired from an IncFIA plasmid. The C2_8 lineage clonally expanded in the Irish LTCF from 2006, displacing the existing C1 strain (C1_10), highlighting the potential for novel ESBL-producing ST131 with a distinct genetic profile to cause outbreaks strongly associated with specific healthcare environments.

    Funded by: Department of Health; Wellcome Trust: 110243/Z/15/Z, WT098600

    Microbial genomics 2020;6;4

  • A One Health Study of the Genetic Relatedness of Klebsiella pneumoniae and Their Mobile Elements in the East of England.

    Ludden C, Moradigaravand D, Jamrozy D, Gouliouris T, Blane B, Naydenova P, Hernandez-Garcia J, Wood P, Hadjirin N, Radakovic M, Crawley C, Brown NM, Holmes M, Parkhill J and Peacock SJ

    Department of Pathogen Molecular Biology, London School of Hygiene & Tropical Medicine, Hinxton.

    Background: Klebsiella pneumoniae is a human, animal, and environmental commensal and a leading cause of nosocomial infections, which are often caused by multiresistant strains. We evaluate putative sources of K. pneumoniae that are carried by and infect hospital patients.

    Methods: We conducted a 6-month survey on 2 hematology wards at Addenbrooke's Hospital, Cambridge, United Kingdom, in 2015 to isolate K. pneumoniae from stool, blood, and the environment. We conducted cross-sectional surveys of K. pneumoniae from 29 livestock farms, 97 meat products, the hospital sewer, and 20 municipal wastewater treatment plants in the East of England between 2014 and 2015. Isolates were sequenced and their genomes compared.

    Results: Klebsiella pneumoniae was isolated from stool of 17/149 (11%) patients and 18/922 swabs of their environment, together with 1 bloodstream infection during the study and 4 others over a 24-month period. Each patient carried 1 or more lineages that were unique to them, but 2 broad environmental contamination events and patient-environment transmission were identified. Klebsiella pneumoniae was isolated from cattle, poultry, hospital sewage, and 12/20 wastewater treatment plants. There was low genetic relatedness between isolates from patients/their hospital environment vs isolates from elsewhere. Identical genes encoding cephalosporin resistance were carried by isolates from humans/environment and elsewhere but were carried on different plasmids.

    Conclusion: We identified no patient-to-patient transmission and no evidence for livestock as a source of K. pneumoniae infecting humans. However, our findings reaffirm the importance of the hospital environment as a source of K. pneumoniae associated with serious human infection.

    Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 2020;70;2;219-226

  • Diverse variola virus (smallpox) strains were widespread in northern Europe in the Viking Age.

    Mühlemann B, Vinner L, Margaryan A, Wilhelmson H, de la Fuente Castro C, Allentoft ME, de Barros Damgaard P, Hansen AJ, Holtsmark Nielsen S, Strand LM, Bill J, Buzhilova A, Pushkina T, Falys C, Khartanovich V, Moiseyev V, Jørkov MLS, Østergaard Sørensen P, Magnusson Y, Gustin I, Schroeder H, Sutter G, Smith GL, Drosten C, Fouchier RAM, Smith DJ, Willerslev E, Jones TC and Sikora M

    Centre for Pathogen Evolution, Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, UK.

    Smallpox, one of the most devastating human diseases, killed between 300 million and 500 million people in the 20th century alone. We recovered viral sequences from 13 northern European individuals, including 11 dated to ~600-1050 CE, overlapping the Viking Age, and reconstructed near-complete variola virus genomes for four of them. The samples predate the earliest confirmed smallpox cases by ~1000 years, and the sequences reveal a now-extinct sister clade of the modern variola viruses that were in circulation before the eradication of smallpox. We date the most recent common ancestor of variola virus to ~1700 years ago. Distinct patterns of gene inactivation in the four near-complete sequences show that different evolutionary paths of genotypic host adaptation resulted in variola viruses that circulated widely among humans.

    Science (New York, N.Y.) 2020;369;6502

  • A panel of recombinant proteins from human-infective Plasmodium species for serological surveillance.

    Müller-Sienerth N, Shilts J, Kadir KA, Yman V, Homann MV, Asghar M, Ngasala B, Singh B, Färnert A and Wright GJ

    Cell Surface Signalling Laboratory, Wellcome Sanger Institute, Cambridge, UK.

    Background: Malaria remains a global health problem and accurate surveillance of Plasmodium parasites that are responsible for this disease is required to guide the most effective distribution of control measures. Serological surveillance will be particularly important in areas of low or periodic transmission because patient antibody responses can provide a measure of historical exposure. While methods for detecting host antibody responses to Plasmodium falciparum and Plasmodium vivax are well established, development of serological assays for Plasmodium knowlesi, Plasmodium ovale and Plasmodium malariae