Sanger Institute - Publications 2018

Number of papers published in 2018: 568

  • Loose ends: almost one in five human genes still have unresolved coding status.

    Abascal F, Juan D, Jungreis I, Martinez L, Rigau M, Rodriguez JM, Vazquez J and Tress ML

    Wellcome Trust Sanger Institute, Hinxton CB10 1SA, Cambridgeshire, UK.

    Seventeen years after the sequencing of the human genome, the human proteome is still under revision. One in eight of the 22 210 coding genes listed by the Ensembl/GENCODE, RefSeq and UniProtKB reference databases are annotated differently across the three sets. We have carried out an in-depth investigation on the 2764 genes classified as coding by one or more sets of manual curators and not coding by others. Data from large-scale genetic variation analyses suggests that most are not under protein-like purifying selection and so are unlikely to code for functional proteins. A further 1470 genes annotated as coding in all three reference sets have characteristics that are typical of non-coding genes or pseudogenes. These potential non-coding genes also appear to be undergoing neutral evolution and have considerably less supporting transcript and protein evidence than other coding genes. We believe that the three reference databases currently overestimate the number of human coding genes by at least 2000, complicating and adding noise to large-scale biomedical experiments. Determining which potential non-coding genes do not code for proteins is a difficult but vitally important task since the human reference proteome is a fundamental pillar of most basic research and supports almost all large-scale biomedical projects.

    Nucleic acids research 2018;46;14;7070-7084

  • Prediction of acute myeloid leukaemia risk in healthy individuals.

    Abelson S, Collord G, Ng SWK, Weissbrod O, Mendelson Cohen N, Niemeyer E, Barda N, Zuzarte PC, Heisler L, Sundaravadanam Y, Luben R, Hayat S, Wang TT, Zhao Z, Cirlan I, Pugh TJ, Soave D, Ng K, Latimer C, Hardy C, Raine K, Jones D, Hoult D, Britten A, McPherson JD, Johansson M, Mbabaali F, Eagles J, Miller JK, Pasternack D, Timms L, Krzyzanowski P, Awadalla P, Costa R, Segal E, Bratman SV, Beer P, Behjati S, Martincorena I, Wang JCY, Bowles KM, Quirós JR, Karakatsani A, La Vecchia C, Trichopoulou A, Salamanca-Fernández E, Huerta JM, Barricarte A, Travis RC, Tumino R, Masala G, Boeing H, Panico S, Kaaks R, Krämer A, Sieri S, Riboli E, Vineis P, Foll M, McKay J, Polidoro S, Sala N, Khaw KT, Vermeulen R, Campbell PJ, Papaemmanuil E, Minden MD, Tanay A, Balicer RD, Wareham NJ, Gerstung M, Dick JE, Brennan P, Vassiliou GS and Shlush LI

    Princess Margaret Cancer Centre, University Health Network (UHN), Toronto, Ontario, Canada.

    The incidence of acute myeloid leukaemia (AML) increases with age and mortality exceeds 90% when diagnosed after age 65. Most cases arise without any detectable early symptoms and patients usually present with the acute complications of bone marrow failure<sup>1</sup>. The onset of such de novo AML cases is typically preceded by the accumulation of somatic mutations in preleukaemic haematopoietic stem and progenitor cells (HSPCs) that undergo clonal expansion<sup>2,3</sup>. However, recurrent AML mutations also accumulate in HSPCs during ageing of healthy individuals who do not develop AML, a phenomenon referred to as age-related clonal haematopoiesis (ARCH)<sup>4-8</sup>. Here we use deep sequencing to analyse genes that are recurrently mutated in AML to distinguish between individuals who have a high risk of developing AML and those with benign ARCH. We analysed peripheral blood cells from 95 individuals that were obtained on average 6.3 years before AML diagnosis (pre-AML group), together with 414 unselected age- and gender-matched individuals (control group). Pre-AML cases were distinct from controls and had more mutations per sample, higher variant allele frequencies, indicating greater clonal expansion, and showed enrichment of mutations in specific genes. Genetic parameters were used to derive a model that accurately predicted AML-free survival; this model was validated in an independent cohort of 29 pre-AML cases and 262 controls. Because AML is rare, we also developed an AML predictive model using a large electronic health record database that identified individuals at greater risk. Collectively our findings provide proof-of-concept that it is possible to discriminate ARCH from pre-AML many years before malignant transformation. This could in future enable earlier detection and monitoring, and may help to inform intervention.

    Funded by: Cancer Research UK: 14136; Medical Research Council: G0401527, G1000143, MC_PC_12009, MC_UU_12015/1; Wellcome Trust

    Nature 2018;559;7714;400-404

  • Recommendations for interpreting the loss of function PVS1 ACMG/AMP variant criterion.

    Abou Tayoun AN, Pesaran T, DiStefano MT, Oza A, Rehm HL, Biesecker LG, Harrison SM and ClinGen Sequence Variant Interpretation Working Group (ClinGen SVI)

    The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania.

    The 2015 ACMG/AMP sequence variant interpretation guideline provided a framework for classifying variants based on several benign and pathogenic evidence criteria, including a pathogenic criterion (PVS1) for predicted loss of function variants. However, the guideline did not elaborate on specific considerations for the different types of loss of function variants, nor did it provide decision-making pathways assimilating information about variant type, its location, or any additional evidence for the likelihood of a true null effect. Furthermore, this guideline did not take into account the relative strengths for each evidence type and the final outcome of their combinations with respect to PVS1 strength. Finally, criteria specifying the genes for which PVS1 can be applied are still missing. Here, as part of the ClinGen Sequence Variant Interpretation (SVI) Workgroup's goal of refining ACMG/AMP criteria, we provide recommendations for applying the PVS1 criterion using detailed guidance addressing the above-mentioned gaps. Evaluation of the refined criterion by seven disease-specific groups using heterogeneous types of loss of function variants (n = 56) showed 89% agreement with the new recommendation, while discrepancies in six variants (11%) were appropriately due to disease-specific refinements. Our recommendations will facilitate consistent and accurate interpretation of predicted loss of function variants.

    Funded by: NHGRI NIH HHS: U41 HG006834; National Human Genome Research Institute: U41HG006834

    Human mutation 2018

  • Whole-body single-cell sequencing reveals transcriptional domains in the annelid larval body.

    Achim K, Eling N, Vergara HM, Bertucci PY, Musser J, Vopalensky P, Brunet T, Collier P, Benes V, Marioni JC and Arendt D

    Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany.

    Animal bodies comprise diverse arrays of cells. To characterise cellular identities across an entire body, we have compared the transcriptomes of single cells randomly picked from dissociated whole larvae of the marine annelid Platynereis dumerilii. We identify five transcriptionally distinct groups of differentiated cells, each expressing a unique set of transcription factors and effector genes that implement cellular phenotypes. Spatial mapping of cells into a cellular expression atlas, and wholemount in situ hybridisation of group-specific genes reveals spatially coherent transcriptional domains in the larval body, comprising e.g. apical sensory-neurosecretory cells vs. neural/epidermal surface cells. These domains represent new, basic subdivisions of the annelid body based entirely on differential gene expression, and are composed of multiple, transcriptionally similar cell types. They do not represent clonal domains, as revealed by developmental lineage analysis. We propose that the transcriptional domains that subdivide the annelid larval body represent families of related cell types that have arisen by evolutionary diversification. Their possible evolutionary conservation makes them a promising tool for evo-devo research.

    Molecular biology and evolution 2018

  • Development and evaluation of a novel LAMP assay for the diagnosis of Cutaneous and Visceral Leishmaniasis.

    Adams ER, Schoone G, Versteeg I, Gomez MA, Diro E, Mori Y, Perlee D, Downing T, Saravia N, Assaye A, Hailu A, Albertini A, Ndung'u JM and Schallig H

    Research Centre for Drugs and Diagnostics, Parasitology Department, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK.

    Introduction: A novel <i>Pan-Leishmania</i> LAMP assay was developed for diagnosis of Cutaneous and Visceral Leishmaniasis (CL & VL) which can be used in near-patient settings.

    Methods: Primers were designed on the 18S rDNA and the conserved region of minicircle kDNA selected on the basis of high copy number. LAMP assays were evaluated for CL in a prospective cohort trial of 105 patients in South-West Colombia. Lesion swab samples from CL suspects were collected and tested using LAMP and compared to a composite reference of microscopy and/or culture to calculate diagnostic accuracy. LAMP assays were tested on 50 VL suspected patients from Ethiopia, including whole blood, peripheral blood mononuclear cells, and buffy coat. Diagnostic accuracy was calculated against a reference standard of microscopy of splenic or bone marrow aspirates. To calculate analytical specificity, 100 clinical samples and isolates with fever-causing pathogens including malaria, arboviruses and bacterial infections were tested.

    Results & conclusions: The LAMP assay had a sensitivity of 95% (95% CI: 87.2-98.5%) and a specificity of 86% (95% CI: 67.3-95.9%) for the diagnosis of CL. On VL suspects the sensitivity was 92% (95% CI: 74.9-99.1%) and the specificity was 100% (95% CI: 85.8-100%) in whole blood. For CL, LAMP is a sensitive tool for diagnosis and requires less equipment, time and expertise than alternative CL diagnostics. For VL, LAMP is sensitive using a minimally invasive sample as compared to the gold standard. The analytical specificity was 100%.

    Journal of clinical microbiology 2018

  • CTCF maintains regulatory homeostasis of cancer pathways.

    Aitken SJ, Ibarra-Soria X, Kentepozidou E, Flicek P, Feig C, Marioni JC and Odom DT

    Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK.

    Background: CTCF binding to DNA helps partition the mammalian genome into discrete structural and regulatory domains. Complete removal of CTCF from mammalian cells causes catastrophic genome dysregulation, likely due to widespread collapse of 3D chromatin looping and alterations to inter- and intra-TAD interactions within the nucleus. In contrast, Ctcf hemizygous mice with lifelong reduction of CTCF expression are viable, albeit with increased cancer incidence. Here, we exploit chronic Ctcf hemizygosity to reveal its homeostatic roles in maintaining genome function and integrity.

    Results: We find that Ctcf hemizygous cells show modest but robust changes in almost a thousand sites of genomic CTCF occupancy; these are enriched for lower affinity binding events with weaker evolutionary conservation across the mouse lineage. Furthermore, we observe dysregulation of the expression of several hundred genes, which are concentrated in cancer-related pathways, and are caused by changes in transcriptional regulation. Chromatin structure is preserved but some loop interactions are destabilized; these are often found around differentially expressed genes and their enhancers. Importantly, the transcriptional alterations identified in vitro are recapitulated in mouse tumors and also in human cancers.

    Conclusions: This multi-dimensional genomic and epigenomic profiling of a Ctcf hemizygous mouse model system shows that chronic depletion of CTCF dysregulates steady-state gene expression by subtly altering transcriptional regulation, changes which can also be observed in primary tumors.

    Funded by: Cancer Research UK: 20412; European Research Council: 615584; Pathological Society of Great Britain and Ireland (GB): SGS 2015/04/04; Wellcome Trust (GB): 106563/Z/14, 108438/Z/15, 108749/Z/15/Z, 202878/A/16/Z, 202878/B/16/Z

    Genome biology 2018;19;1;106

  • Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response.

    Alasoo K, Rodrigues J, Mukhopadhyay S, Knights AJ, Mann AL, Kundu K, HIPSCI Consortium, Hale C, Dougan G and Gaffney DJ

    Wellcome Trust Sanger Institute, Hinxton, UK.

    Regulatory variants are often context specific, modulating gene expression in a subset of possible cellular states. Although these genetic effects can play important roles in disease, the molecular mechanisms underlying context specificity are poorly understood. Here, we identified shared quantitative trait loci (QTLs) for chromatin accessibility and gene expression in human macrophages exposed to IFNγ, Salmonella and IFNγ plus Salmonella. We observed that ~60% of stimulus-specific expression QTLs with a detectable effect on chromatin altered the chromatin accessibility in naive cells, thus suggesting that they perturb enhancer priming. Such variants probably influence binding of cell-type-specific transcription factors, such as PU.1, which can then indirectly alter the binding of stimulus-specific transcription factors, such as NF-κB or STAT2. Thus, although chromatin accessibility assays are powerful for fine-mapping causal regulatory variants, detecting their downstream effects on gene expression will be challenging, requiring profiling of large numbers of stimulated cellular states and time points.

    Funded by: Medical Research Council: MC_PC_12026; Wellcome Trust: 098051, 098503

    Nature genetics 2018;50;3;424-431

  • The Malaria-Protective Human Glycophorin Structural Variant DUP4 Shows Somatic Mosaicism and Association with Hemoglobin Levels.

    Algady W, Louzada S, Carpenter D, Brajer P, Färnert A, Rooth I, Ngasala B, Yang F, Shaw MA and Hollox EJ

    Department of Genetics and Genome Biology, University of Leicester, Leicester LE1 7RH, UK.

    Glycophorin A and glycophorin B are red blood cell surface proteins and are both receptors for the parasite Plasmodium falciparum, which is the principal cause of malaria in sub-Saharan Africa. DUP4 is a complex structural genomic variant that carries extra copies of a glycophorin A-glycophorin B fusion gene and has a dramatic effect on malaria risk by reducing the risk of severe malaria by up to 40%. Using fiber-FISH and Illumina sequencing, we validate the structural arrangement of the glycophorin locus in the DUP4 variant and reveal somatic variation in copy number of the glycophorin B-glycophorin A fusion gene. By developing a simple, specific, PCR-based assay for DUP4, we show that the DUP4 variant reaches a frequency of 13% in the population of a malaria-endemic village in south-eastern Tanzania. We genotype a substantial proportion of that village and demonstrate an association of DUP4 genotype with hemoglobin levels, a phenotype related to malaria, using a family-based association test. Taken together, we show that DUP4 is a complex structural variant that may be susceptible to somatic variation and show that DUP4 is associated with a malarial-related phenotype in a longitudinally followed population.

    American journal of human genetics 2018;103;5;769-776

  • A cross-sectional analysis of ITN and IRS coverage in Namibia in 2013.

    Allcock SH, Young EH and Sandhu MS

    Department of Medicine, University of Cambridge, Cambridge, Cambridgeshire, UK.

    Background: Achieving vector control targets is a key step towards malaria elimination. Because of variations in reporting of progress towards vector control targets in 2013, the coverage of these vector control interventions in Namibia was assessed.

    Methods: Data on 9846 households, representing 41,314 people, collected in the 2013 nationally-representative Namibia Demographic and Health Survey were used to explore the coverage of two vector control methods: indoor residual spraying (IRS) and insecticide-treated nets (ITNs). Regional data on Plasmodium falciparum parasite rate in those aged 2-10 years (PfPR<sub>2-10</sub>), obtained from the Malaria Atlas Project, were used to provide information on malaria transmission intensity. Poisson regression analyses were carried out exploring the relationship between household interventions and PfPR<sub>2-10</sub>, with fully adjusted models adjusting for wealth and residence type and accounting for regional and enumeration area clustering. Additionally, the coverage as a function of government intervention zones was explored and models were compared using log-likelihood ratio tests.

    Results: Intervention coverage was greatest in the highest transmission areas (PfPR<sub>2-10</sub> ≥ 5%), but was still below target levels of 95% coverage in these regions, with 27.6% of households covered by IRS, 32.3% with an ITN and 49.0% with at least one intervention (ITN and/or IRS). In fully adjusted models, PfPR<sub>2-10</sub> ≥ 5% was strongly associated with IRS (RR 14.54; 95% CI 5.56-38.02; p < 0.001), ITN ownership (RR 5.70; 95% CI 2.84-11.45; p < 0.001) and ITN and/or IRS coverage (RR 5.32; 95% CI 3.09-9.16; p < 0.001).

    Conclusions: The prevalence of IRS and ITN interventions in 2013 did not reflect the Namibian government intervention targets. As such, there is a need to include quantitative monitoring of such interventions to reliably inform intervention strategies for malaria elimination in Namibia.

    Funded by: African Partnership for Chronic Disease Research (Medical Research Council UK partnership grant): MR/K013491/1; Wellcome Trust Sanger Institute: WT098051

    Malaria journal 2018;17;1;264

  • Predicting the mutations generated by repair of Cas9-induced double-strand breaks.

    Allen F, Crepaldi L, Alsinet C, Strong AJ, Kleshchevnikov V, De Angeli P, Páleníková P, Khodak A, Kiselev V, Kosicki M, Bassett AR, Harding H, Galanty Y, Muñoz-Martínez F, Metzakopian E, Jackson SP and Parts L

    Wellcome Sanger Institute, Hinxton, UK.

    The DNA mutation produced by cellular repair of a CRISPR-Cas9-generated double-strand break determines its phenotypic effect. It is known that the mutational outcomes are not random, but depend on DNA sequence at the targeted location. Here we systematically study the influence of flanking DNA sequence on repair outcome by measuring the edits generated by >40,000 guide RNAs (gRNAs) in synthetic constructs. We performed the experiments in a range of genetic backgrounds and using alternative CRISPR-Cas9 reagents. In total, we gathered data for >10<sup>9</sup> mutational outcomes. The majority of reproducible mutations are insertions of a single base, short deletions or longer microhomology-mediated deletions. Each gRNA has an individual cell-line-dependent bias toward particular outcomes. We uncover sequence determinants of the mutations produced and use these to derive a predictor of Cas9 editing outcomes. Improved understanding of sequence repair will allow better design of gene editing experiments.

    Nature biotechnology 2018

  • Genome watch: Keeping tally in the microbiome.

    Almeida A and Shao Y

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    This month's Genome Watch highlights how the development of new approaches for quantifying the human microbiome may pave the way for a better understanding of microbial shifts in the context of human health and disease.

    Nature reviews. Microbiology 2018

  • Benchmarking taxonomic assignments based on 16S rRNA gene profiling of the microbiota from commonly sampled environments.

    Almeida A, Mitchell AL, Tarkowska A and Finn RD

    EMBL-EBI European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

    Background: Taxonomic profiling of ribosomal RNA (rRNA) sequences has been the accepted norm for inferring the composition of complex microbial ecosystems. Quantitative Insights Into Microbial Ecology (QIIME) and mothur have been the most widely used taxonomic analysis tools for this purpose, with MAPseq and QIIME 2 being two recently released alternatives. However, no independent and direct comparison between these four main tools has been performed. Here, we compared the default classifiers of MAPseq, mothur, QIIME, and QIIME 2 using synthetic simulated datasets comprised of some of the most abundant genera found in the human gut, ocean, and soil environments. We evaluate their accuracy when paired with both different reference databases and variable sub-regions of the 16S rRNA gene.

    Findings: We show that QIIME 2 provided the best recall and F-scores at genus and family levels, together with the lowest distance estimates between the observed and simulated samples. However, MAPseq showed the highest precision, with miscall rates consistently <2%. Notably, QIIME 2 was the most computationally expensive tool, with CPU time and memory usage almost 2 and 30 times higher than MAPseq, respectively. Using the SILVA database generally yielded a higher recall than using Greengenes, while assignment results of different 16S rRNA variable sub-regions varied up to 40% between samples analysed with the same pipeline.

    Conclusions: Our results support the use of either QIIME 2 or MAPseq for optimal 16S rRNA gene profiling, and we suggest that the choice between the two should be based on the level of recall, precision, and/or computational performance required.

    GigaScience 2018;7;5

  • Consistent signatures of selection from genomic analysis of pairs of temporal and spatial Plasmodium falciparum populations from The Gambia.

    Amambua-Ngwa A, Jeffries D, Amato R, Worwui A, Karim M, Ceesay S, Nyang H, Nwakanma D, Okebe J, Kwiatkowski D, Conway DJ and D'Alessandro U

    Medical Research Council Unit The Gambia at LSHTM, Banjul, The Gambia.

    Genome sequences of 247 Plasmodium falciparum isolates collected in The Gambia in 2008 and 2014 were analysed to identify changes possibly related to the scale-up of antimalarial interventions that occurred during this period. Overall, there were 15 regions across the genomes with signatures of positive selection. Five of these were sweeps around known drug resistance and antigenic loci. Signatures at antigenic loci such as thrombospondin-related adhesive protein (Pftrap) were most frequent in eastern Gambia, where parasite prevalence and transmission remain high. There was a strong temporal differentiation at a non-synonymous SNP in a cysteine desulfurase (Pfnfs) involved in iron-sulphur complex biogenesis. During the 7-year period, the frequency of the lysine variant at codon 65 (Pfnfs-Q65K) increased by 22% (10% to 32%) in the Greater Banjul area. Between 2014 and 2015, the frequency of this variant increased by 6% (20% to 26%) in eastern Gambia. IC<sub>50</sub> for lumefantrine was significantly higher in Pfnfs-65K isolates. This is probably the first evidence of directional selection on Pfnfs or linked loci by lumefantrine. Given the declining malaria transmission, the consequent loss of population immunity, and sustained drug pressure, it is important to monitor Gambian P. falciparum populations for further signs of adaptation.

    Funded by: Medical Research Council (MRC): MC_EX_MR/K02440X/1

    Scientific reports 2018;8;1;9687

  • Genomic positional conservation identifies topological anchor point RNAs linked to developmental loci.

    Amaral PP, Leonardi T, Han N, Viré E, Gascoigne DK, Arias-Carrasco R, Büscher M, Pandolfini L, Zhang A, Pluchino S, Maracaja-Coutinho V, Nakaya HI, Hemberg M, Shiekhattar R, Enright AJ and Kouzarides T

    The Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK.

    Background: The mammalian genome is transcribed into large numbers of long noncoding RNAs (lncRNAs), but the definition of functional lncRNA groups has proven difficult, partly due to their low sequence conservation and lack of identified shared properties. Here we consider promoter conservation and positional conservation as indicators of functional commonality.

    Results: We identify 665 conserved lncRNA promoters in mouse and human that are preserved in genomic position relative to orthologous coding genes. These positionally conserved lncRNA genes are primarily associated with developmental transcription factor loci with which they are coexpressed in a tissue-specific manner. Over half of positionally conserved RNAs in this set are linked to chromatin organization structures, overlapping binding sites for the CTCF chromatin organiser and located at chromatin loop anchor points and borders of topologically associating domains (TADs). We define these RNAs as topological anchor point RNAs (tapRNAs). Characterization of these noncoding RNAs and their associated coding genes shows that they are functionally connected: they regulate each other's expression and influence the metastatic phenotype of cancer cells in vitro in a similar fashion. Furthermore, we find that tapRNAs contain conserved sequence domains that are enriched in motifs for zinc finger domain-containing RNA-binding proteins and transcription factors, whose binding sites are found mutated in cancers.

    Conclusions: This work leverages positional conservation to identify lncRNAs with potential importance in genome organization, development and disease. The evidence that many developmental transcription factors are physically and functionally connected to lncRNAs represents an exciting stepping-stone to further our understanding of genome regulation.

    Funded by: Cancer Research UK: C6/A18796, C6946/A14492; European Research Council: 268569; Wellcome Trust: 092096

    Genome biology 2018;19;1;32

  • Origins of the current outbreak of multidrug-resistant malaria in southeast Asia: a retrospective genetic study.

    Amato R, Pearson RD, Almagro-Garcia J, Amaratunga C, Lim P, Suon S, Sreng S, Drury E, Stalker J, Miotto O, Fairhurst RM and Kwiatkowski DP

    Wellcome Sanger Institute, Hinxton, UK; MRC Centre for Genomics and Global Health, Big Data Institute, Oxford University, Oxford, UK.

    Background: Antimalarial resistance is rapidly spreading across parts of southeast Asia where dihydroartemisinin-piperaquine is used as first-line treatment for Plasmodium falciparum malaria. The first published reports about resistance to antimalarial drugs came from western Cambodia in 2013. Here, we analyse genetic changes in the P falciparum population of western Cambodia in the 6 years before those reports.

    Methods: We analysed genome sequence data on 1492 P falciparum samples from 11 locations across southeast Asia, including 464 samples collected in western Cambodia between 2007 and 2013. Different epidemiological origins of resistance were identified by haplotypic analysis of the kelch13 artemisinin resistance locus and the plasmepsin 2-3 piperaquine resistance locus.

    Findings: We identified more than 30 independent origins of artemisinin resistance, of which the KEL1 lineage accounted for 140 (91%) of 154 parasites resistant to dihydroartemisinin-piperaquine. In 2008, KEL1 combined with PLA1, the major lineage associated with piperaquine resistance. By 2013, the KEL1/PLA1 co-lineage had reached a frequency of 63% (24/38) in western Cambodia and had spread to northern Cambodia.

    Interpretation: The KEL1/PLA1 co-lineage emerged in the same year that dihydroartemisinin-piperaquine became the first-line antimalarial drug in western Cambodia and spread rapidly thereafter, displacing other artemisinin-resistant parasite lineages. These findings have important implications for management of the global health risk associated with the current outbreak of multidrug-resistant malaria in southeast Asia.

    Funding: Wellcome Trust, Bill & Melinda Gates Foundation, Medical Research Council, UK Department for International Development, and the Intramural Research Program of the National Institute of Allergy and Infectious Diseases.

    Funded by: Medical Research Council: G0600718, MR/M006212/1; Wellcome Trust: 090770, 098051, 204911, 206194

    The Lancet. Infectious diseases 2018;18;3;337-345

  • Rearrangement bursts generate canonical gene fusions in bone and soft tissue tumors.

    Anderson ND, de Borja R, Young MD, Fuligni F, Rosic A, Roberts ND, Hajjar S, Layeghifard M, Novokmet A, Kowalski PE, Anaka M, Davidson S, Zarrei M, Id Said B, Schreiner LC, Marchand R, Sitter J, Gokgoz N, Brunga L, Graham GT, Fullam A, Pillay N, Toretsky JA, Yoshida A, Shibata T, Metzler M, Somers GR, Scherer SW, Flanagan AM, Campbell PJ, Schiffman JD, Shago M, Alexandrov LB, Wunder JS, Andrulis IL, Malkin D, Behjati S and Shlien A

    Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada.

    Sarcomas are cancers of the bone and soft tissue often defined by gene fusions. Ewing sarcoma involves fusions between <i>EWSR1</i>, a gene encoding an RNA binding protein, and E26 transformation-specific (ETS) transcription factors. We explored how and when <i>EWSR1-ETS</i> fusions arise by studying the whole genomes of Ewing sarcomas. In 52 of 124 (42%) tumors, the fusion gene arises by a sudden burst of complex, loop-like rearrangements, a process called chromoplexy, rather than by simple reciprocal translocations. These loops always contained the disease-defining fusion at the center, but they disrupted multiple additional genes. The loops occurred preferentially in early replicating and transcriptionally active genomic regions. Similar loops forming canonical fusions were found in three other sarcoma types. Chromoplexy-generated fusions appear to be associated with an aggressive form of Ewing sarcoma. These loops arise early, giving rise to both primary and relapse Ewing sarcoma tumors, which can continue to evolve in parallel.

    Funded by: Wellcome Trust: 110104

    Science (New York, N.Y.) 2018;361;6405

  • False signals induced by single-cell imputation.

    Andrews TS and Hemberg M

    Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK.

    Background: Single-cell RNA-seq is a powerful tool for measuring gene expression at the resolution of individual cells. A challenge in the analysis of this data is the large amount of zero values, representing either missing data or no expression. Several imputation approaches have been proposed to address this issue, but they generally rely on structure inherent to the dataset under consideration and so may not provide any additional information; hence, they are limited by the information contained therein and the validity of their assumptions.

    Methods: We evaluated the risk of generating false positive or irreproducible differential expression when imputing data with six different methods. We applied each method to a variety of simulated datasets as well as to permuted real single-cell RNA-seq datasets and considered the number of false positive gene-gene correlations and differentially expressed genes. Using matched 10X and Smart-seq2 data we examined whether cell-type specific markers were reproducible across datasets derived from the same tissue before and after imputation.

    Results: The extent of false-positives introduced by imputation varied considerably by method. Data smoothing based methods, MAGIC, knn-smooth and dca, generated many false-positives in both real and simulated data. Model-based imputation methods typically generated fewer false-positives but this varied greatly depending on the diversity of cell-types in the sample. All imputation methods decreased the reproducibility of cell-type specific markers, although this could be mitigated by selecting markers with large effect size and significance.

    Conclusions: Imputation of single-cell RNA-seq data introduces circularity that can generate false-positive results. Thus, statistical tests applied to imputed data should be treated with care. Additional filtering by effect size can reduce but not fully eliminate these effects. Of the methods we considered, SAVER was the least likely to generate false or irreproducible results, and thus should be favoured over alternatives if imputation is necessary.

    F1000Research 2018;7;1740

  • M3Drop: Dropout-based feature selection for scRNASeq.

    Andrews TS and Hemberg M

    Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, UK.

    Motivation: Most genomes contain thousands of genes, but for most functional responses, only a subset of those genes are relevant. To facilitate many single-cell RNASeq (scRNASeq) analyses the set of genes is often reduced through feature selection, i.e. by removing genes only subject to technical noise.

    Results: We present M3Drop, an R package that implements popular existing feature selection methods and two novel methods which take advantage of the prevalence of zeros (dropouts) in scRNASeq data to identify features. We show these new methods outperform existing methods on simulated and real datasets.

    Availability: M3Drop is freely available on GitHub as an R package and is compatible with other popular scRNASeq tools:

    Supplementary information: Supplementary data are available at Bioinformatics online.

    Bioinformatics (Oxford, England) 2018

  • Demographic history and genetic adaptation in the Himalayan region inferred from genome-wide SNP genotypes of 49 populations.

    Arciero E, Kraaijenbrink T, Asan, Haber M, Mezzavilla M, Ayub Q, Wang W, Pingcuo Z, Yang H, Wang J, Jobling MA, van Driem G, Xue Y, de Knijff P and Tyler-Smith C

    The Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    We genotyped 738 individuals belonging to 49 populations from Nepal, Bhutan, North India or Tibet at over 500,000 SNPs, and analysed the genotypes in the context of available worldwide population data in order to investigate the demographic history of the region and the genetic adaptations to the harsh environment. The Himalayan populations resembled other South and East Asians, but in addition displayed their own specific ancestral component and showed strong population structure and genetic drift. We also found evidence for multiple admixture events involving Himalayan populations and South/East Asians between 200 and 2,000 years ago. In comparisons with available ancient genomes, the Himalayans, like other East and South Asian populations, showed similar genetic affinity to Eurasian hunter-gatherers (a 24,000-year-old Upper Palaeolithic Siberian), and the related Bronze Age Yamnaya. The high-altitude Himalayan populations all shared a specific ancestral component, suggesting that genetic adaptation to life at high altitude originated only once in this region and subsequently spread. Combining four approaches to identifying specific positively-selected loci, we confirmed that the strongest signals of high-altitude adaptation were located near the Endothelial PAS domain-containing protein 1 (EPAS1) and Egl-9 Family Hypoxia Inducible Factor 1 (EGLN1) loci, and discovered eight additional robust signals of high-altitude adaptation, five of which have strong biological functional links to such adaptation. In conclusion, the demographic history of Himalayan populations is complex, with strong local differentiation, reflecting both genetic and cultural factors; these populations also display evidence of multiple genetic adaptations to high-altitude environments.

    Molecular biology and evolution 2018

  • Multi-Omics Factor Analysis-a framework for unsupervised integration of multi-omics data sets.

    Argelaguet R, Velten B, Arnol D, Dietrich S, Zenz T, Marioni JC, Buettner F, Huber W and Stegle O

    European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK.

    Multi-omics studies promise the improved characterization of biological processes across molecular layers. However, methods for the unsupervised integration of the resulting heterogeneous data sets are lacking. We present Multi-Omics Factor Analysis (MOFA), a computational method for discovering the principal sources of variation in multi-omics data sets. MOFA infers a set of (hidden) factors that capture biological and technical sources of variability. It disentangles axes of heterogeneity that are shared across multiple modalities and those specific to individual data modalities. The learnt factors enable a variety of downstream analyses, including identification of sample subgroups, data imputation and the detection of outlier samples. We applied MOFA to a cohort of 200 patient samples of chronic lymphocytic leukaemia, profiled for somatic mutations, RNA expression, DNA methylation and <i>ex vivo</i> drug responses. MOFA identified major dimensions of disease heterogeneity, including immunoglobulin heavy-chain variable region status, trisomy of chromosome 12 and previously underappreciated drivers, such as response to oxidative stress. In a second application, we used MOFA to analyse single-cell multi-omics data, identifying coordinated transcriptional and epigenetic changes along cell differentiation.

    Molecular systems biology 2018;14;6;e8124

  • Genome-wide interaction study of a proxy for stress-sensitivity and its prediction of major depressive disorder.

    Arnau-Soler A, Adams MJ, Generation Scotland, Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium, Hayward C and Thomson PA

    Medical Genetics Section, Centre for Genomic and Experimental Medicine, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, United Kingdom.

    Individual response to stress is correlated with neuroticism and is an important predictor of both neuroticism and the onset of major depressive disorder (MDD). Identification of the genetics underpinning individual differences in response to negative events (stress-sensitivity) may improve our understanding of the molecular pathways involved, and its association with stress-related illnesses. We sought to generate a proxy for stress-sensitivity through modelling the interaction between SNP allele and MDD status on neuroticism score in order to identify genetic variants that contribute to the higher neuroticism seen in individuals with a lifetime diagnosis of depression compared to unaffected individuals. Meta-analysis of genome-wide interaction studies (GWIS) in UK Biobank (N = 23,092) and Generation Scotland: Scottish Family Health Study (N = 7,155) identified no genome-wide significant SNP interactions. However, gene-based tests identified a genome-wide significant gene, ZNF366, a negative regulator of glucocorticoid receptor function implicated in alcohol dependence (p = 1.48x10<sup>-7</sup>; Bonferroni-corrected significance threshold p < 2.79x10<sup>-6</sup>). Using summary statistics from the stress-sensitivity term of the GWIS, SNP heritability for stress-sensitivity was estimated at 5.0%. In models fitting polygenic risk scores of both MDD and neuroticism derived from independent GWAS, we show that polygenic risk scores derived from the UK Biobank stress-sensitivity GWIS significantly improved the prediction of MDD in Generation Scotland. This study may improve interpretation of larger genome-wide association studies of MDD and other stress-related illnesses, and the understanding of the etiological mechanisms underpinning stress-sensitivity.

    Funded by: Wellcome Trust

    PloS one 2018;13;12;e0209160

  • mlplasmids: a user-friendly tool to predict plasmid- and chromosome-derived sequences for single species.

    Arredondo-Alonso S, Rogers MRC, Braat JC, Verschuuren TD, Top J, Corander J, Willems RJL and Schürch AC

    Department of Medical Microbiology, University Medical Center Utrecht, Utrecht, The Netherlands.

    Assembly of bacterial short-read whole-genome sequencing data frequently results in hundreds of contigs for which the origin, plasmid or chromosome, is unclear. Complete genomes resolved by long-read sequencing can be used to generate and label short-read contigs. These were used to train several popular machine learning methods to classify the origin of contigs from Enterococcus faecium, Klebsiella pneumoniae and Escherichia coli using pentamer frequencies. We selected support-vector machine (SVM) models as the best classifier for all three bacterial species (F1-score E. faecium=0.92, F1-score K. pneumoniae=0.90, F1-score E. coli=0.76), which outperformed other existing plasmid prediction tools using a benchmarking set of isolates. We demonstrated the scalability of our models by accurately predicting the plasmidome of a large collection of 1644 E. faecium isolates and illustrated their applicability by predicting the location of antibiotic-resistance genes in all three species. The SVM classifiers are publicly available as an R package and graphical user interface called 'mlplasmids'. We anticipate that this tool may significantly facilitate research on the dissemination of plasmids encoding antibiotic resistance and/or contributing to host adaptation.

    Microbial genomics 2018

  • Generation of gene-corrected human induced pluripotent stem cell lines derived from retinitis pigmentosa patient with Ser331Cysfs*5 mutation in MERTK.

    Artero Castro A, Long K, Bassett A, Machuca C, León M, Ávila-Fernandez A, Cortón M, Vidal-Puig T, Ayuso C, Lukovic D and Erceg S

    Stem Cells Therapies in Neurodegenerative Diseases Lab, Centro de Investigacion Principe Felipe (CIPF), Valencia, Spain.

    The human induced pluripotent stem cell (hiPSC) line RP1-FiPS4F1, generated from a patient with autosomal recessive retinitis pigmentosa (arRP) caused by a homozygous Ser331Cysfs*5 mutation in Mer tyrosine kinase receptor (MERTK), was genetically corrected using the CRISPR/Cas9 system. Two isogenic hiPSC lines, with heterozygous and homozygous correction of the c.992_993delCA mutation in the MERTK gene, were generated. These cell lines demonstrate normal karyotype, maintain a pluripotent state, and can differentiate toward three germ layers in vitro. These genetically corrected hiPSCs represent accurate controls to study the contribution of the specific genetic change to the disease, and potentially therapeutic material for cell-replacement therapy.

    Stem cell research 2018;34;101341

  • Streptococcus suis contains multiple phase-variable methyltransferases that show a discrete lineage distribution.

    Atack JM, Weinert LA, Tucker AW, Husna AU, Wileman TM, F Hadjirin N, Hoa NT, Parkhill J, Maskell DJ, Blackall PJ and Jennings MP

    Institute for Glycomics, Griffith University, Gold Coast, Queensland 4222, Australia.

    Streptococcus suis is a major pathogen of swine, responsible for a number of chronic and acute infections, and is also emerging as a major zoonotic pathogen, particularly in South-East Asia. Our study of a diverse population of S. suis shows that this organism contains both Type I and Type III phase-variable methyltransferases. In all previous examples, phase-variation of methyltransferases results in genome-wide methylation differences and in differential regulation of multiple genes, a system known as the phasevarion (phase-variable regulon). We hypothesized that each variant in the Type I and Type III systems encoded a methyltransferase with a unique specificity, and could therefore control a distinct phasevarion, either by recombination-driven shuffling between different specificities (Type I) or by biphasic on-off switching via simple sequence repeats (Type III). Here, we present the identification of the target specificities for each Type III allelic variant from S. suis using single-molecule, real-time methylome analysis. We demonstrate that phase-variation is occurring in both Type I and Type III methyltransferases, and show a distinct association between methyltransferase type and presence, and population clades. In addition, we show that the phase-variable Type I methyltransferase was likely acquired at the origin of a highly virulent zoonotic sub-population.

    Nucleic acids research 2018

  • Genomic analysis of a pre-elimination Malaysian Plasmodium vivax population reveals selective pressures and changing transmission dynamics.

    Auburn S, Benavente ED, Miotto O, Pearson RD, Amato R, Grigg MJ, Barber BE, William T, Handayuni I, Marfurt J, Trimarsanto H, Noviyanti R, Sriprawat K, Nosten F, Campino S, Clark TG, Anstey NM, Kwiatkowski DP and Price RN

    Global and Tropical Health Division, Menzies School of Health Research and Charles Darwin University, Darwin, NT, 0811, Australia.

    The incidence of Plasmodium vivax infection has declined markedly in Malaysia over the past decade despite evidence of high-grade chloroquine resistance. Here we investigate the genetic changes in a P. vivax population approaching elimination in 51 isolates from Sabah, Malaysia and compare these with data from 104 isolates from Thailand and 104 isolates from Indonesia. Sabah displays extensive population structure, mirroring that previously seen with the emergence of artemisinin-resistant P. falciparum founder populations in Cambodia. Fifty-four percent of the Sabah isolates have identical genomes, consistent with a rapid clonal expansion. Across Sabah, there is a high prevalence of loci known to be associated with antimalarial drug resistance. Measures of differentiation between the three countries reveal several gene regions under putative selection in Sabah. Our findings highlight important factors pertinent to parasite resurgence and molecular cues that can be used to monitor low-endemic populations at the end stages of P. vivax elimination.

    Funded by: Bill and Melinda Gates Foundation: OPP1164105; Department of Health | National Health and Medical Research Council (NHMRC): 1037304, 1042072, 1045156, 1074795, 1088738, 1131932, 1135820; Medical Research Council (MRC): M006212, MC_PC_15103, MR/K000551/1, MR/M01360X/1, MR/N010469/1; Wellcome Trust: 200909, 204911, 206194

    Nature communications 2018;9;1;2585

  • The impact of serotype-specific vaccination on phylodynamic parameters of Streptococcus pneumoniae and the pneumococcal pan-genome.

    Azarian T, Grant LR, Arnold BJ, Hammitt LL, Reid R, Santosham M, Weatherholtz R, Goklish N, Thompson CM, Bentley SD, O'Brien KL, Hanage WP and Lipsitch M

    Center for Communicable Disease Dynamics, Department of Epidemiology, T.H. Chan School of Public Health, Harvard University; Cambridge, Massachusetts, United States of America.

    In the United States, the introduction of the heptavalent pneumococcal conjugate vaccine (PCV) largely eliminated vaccine serotypes (VT); non-vaccine serotypes (NVT) subsequently increased in carriage and disease. Vaccination also disrupts the composition of the pneumococcal pangenome, which includes mobile genetic elements and polymorphic non-capsular antigens important for virulence, transmission, and pneumococcal ecology. Antigenic proteins are of interest for future vaccines; yet, little is known about how they are affected by PCV use. To investigate the evolutionary impact of vaccination, we assessed recombination, evolution, and pathogen demographic history of 937 pneumococci collected from 1998-2012 among Navajo and White Mountain Apache Native American communities. We analyzed changes in the pneumococcal pangenome, focusing on metabolic loci and 19 polymorphic protein antigens. We found the impact of PCV on the pneumococcal population could be observed in reduced diversity, a smaller pangenome, and changing frequencies of accessory clusters of orthologous groups (COGs). Post-PCV7, diversity rebounded through clonal expansion of NVT lineages and inferred in-migration of two previously unobserved lineages. Accessory COG frequencies trended toward pre-PCV7 values with increasing time since vaccine introduction. Contemporary frequencies of protein antigen variants are better predicted by pre-PCV7 values (1998-2000) than by the preceding period (2006-2008), suggesting balancing selection may have acted in maintaining variant frequencies in this population. Overall, we present the largest genomic analysis of pneumococcal carriage in the United States to date, which includes a snapshot of a true vaccine-naïve community prior to the introduction of PCV7. These data improve our understanding of pneumococcal evolution and emphasize the need to consider pangenome composition when inferring the impact of vaccination and developing future protein-based pneumococcal vaccines.

    PLoS pathogens 2018;14;4;e1006966

  • Global emergence and population dynamics of divergent serotype 3 CC180 pneumococci.

    Azarian T, Mitchell PK, Georgieva M, Thompson CM, Ghouila A, Pollard AJ, von Gottberg A, du Plessis M, Antonio M, Kwambana-Adams BA, Clarke SC, Everett D, Cornick J, Sadowy E, Hryniewicz W, Skoczynska A, Moïsi JC, McGee L, Beall B, Metcalf BJ, Breiman RF, Ho PL, Reid R, O'Brien KL, Gladstone RA, Bentley SD and Hanage WP

    Center for Communicable Disease Dynamics, Department of Epidemiology, T.H. Chan School of Public Health, Harvard University, Boston, Massachusetts, United States of America.

    Streptococcus pneumoniae serotype 3 remains a significant cause of morbidity and mortality worldwide, despite inclusion in the 13-valent pneumococcal conjugate vaccine (PCV13). Serotype 3 increased in carriage since the implementation of PCV13 in the USA, while invasive disease rates remain unchanged. We investigated the persistence of serotype 3 in carriage and disease, through genomic analyses of a global sample of 301 serotype 3 isolates of the Netherlands3-31 (PMEN31) clone CC180, combined with associated patient data and PCV utilization among countries of isolate collection. We assessed phenotypic variation between dominant clades in capsule charge (zeta potential), capsular polysaccharide shedding, and susceptibility to opsonophagocytic killing, which have previously been associated with carriage duration, invasiveness, and vaccine escape. We identified a recent shift in the CC180 population attributed to a lineage termed Clade II, which was estimated by Bayesian coalescent analysis to have first appeared in 1968 [95% HPD: 1939-1989] and increased in prevalence and effective population size thereafter. Clade II isolates are divergent from the pre-PCV13 serotype 3 population in non-capsular antigenic composition, competence, and antibiotic susceptibility, the last of which resulted from the acquisition of a Tn916-like conjugative transposon. Differences in recombination rates among clades correlated with variations in the ATP-binding subunit of Clp protease, as well as amino acid substitutions in the comCDE operon. Opsonophagocytic killing assays elucidated the low observed efficacy of PCV13 against serotype 3. Variation in PCV13 use among sampled countries was not independently correlated with the CC180 population shift; therefore, genotypic and phenotypic differences in protein antigens and, in particular, antibiotic resistance may have contributed to the increase of Clade II. Our analysis emphasizes the need for routine, representative sampling of isolates from disperse geographic regions, including historically under-sampled areas. We also highlight the value of genomics in resolving antigenic and epidemiological variations within a serotype, which may have implications for future vaccine development.

    PLoS pathogens 2018;14;11;e1007438

  • Complete avian malaria parasite genomes reveal features associated with lineage-specific evolution in birds and mammals.

    Böhme U, Otto TD, Cotton JA, Steinbiss S, Sanders M, Oyola SO, Nicot A, Gandon S, Patra KP, Herd C, Bushell E, Modrzynska KK, Billker O, Vinetz JM, Rivero A, Newbold CI and Berriman M

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom.

    Avian malaria parasites are prevalent around the world and infect a wide diversity of bird species. Here, we report the sequencing and analysis of high-quality draft genome sequences for two avian malaria species, <i>Plasmodium relictum</i> and <i>Plasmodium gallinaceum</i>. We identify 50 genes that are specific to avian malaria, located in an otherwise conserved core of the genome that shares gene synteny with all other sequenced malaria genomes. Phylogenetic analysis suggests that the avian malaria species form an outgroup to the mammalian <i>Plasmodium</i> species, and using amino acid divergence between species, we estimate the avian- and mammalian-infective lineages diverged in the order of 10 million years ago. Consistent with their phylogenetic position, we identify orthologs of genes that had previously appeared to be restricted to the clades of parasites containing <i>Plasmodium falciparum</i> and <i>Plasmodium vivax</i>, the species with the greatest impact on human health. From these orthologs, we explore differential diversifying selection across the genus and show that the avian lineage is remarkable in the extent to which invasion-related genes are evolving. The subtelomeres of the <i>P. relictum</i> and <i>P. gallinaceum</i> genomes contain several novel gene families, including an expanded <i>surf</i> multigene family. We also identify an expansion of reticulocyte binding protein homologs in <i>P. relictum</i>, and within these proteins, we detect distinct regions that are specific to nonhuman primate, human, rodent, and avian hosts. For the first time in the <i>Plasmodium</i> lineage, we find evidence of transposable elements, including several hundred fragments of LTR-retrotransposons in both species and an apparently complete LTR-retrotransposon in the genome of <i>P. gallinaceum</i>.

    Funded by: Wellcome Trust: 206194, 104792/Z/14/Z, WT099198MA

    Genome research 2018;28;4;547-560

  • A synthesis approach of mouse studies to identify genes and proteins in arterial thrombosis and bleeding.

    Baaten CCFMJ, Meacham S, de Witt SM, Feijge MAH, Adams DJ, Akkerman JN, Cosemans JMEM, Grassi L, Jupe S, Kostadima M, Mattheij NJA, Prins MH, Ramirez-Solis R, Soehnlein O, Swieringa F, Weber C, White JK, Ouwehand WH and Heemskerk JWM

    Department of Biochemistry, Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Maastricht, Netherlands.

    Antithrombotic therapies reduce cardiovascular diseases by preventing arterial thrombosis and thromboembolism, but at the expense of increased bleeding risks. Arterial thrombosis studies using genetically modified mice have been invaluable for identification of new molecular targets. Because of low sample sizes and heterogeneity in approaches or methodologies, a formal meta-analysis to compare studies of mice with single gene defects encountered major limitations. To overcome these, we developed a novel synthesis approach to quantitatively scale 1514 published studies of arterial thrombus formation (in vivo and in vitro), thromboembolism and tail bleeding of genetically modified mice. Using a newly defined consistency parameter (CP), indicating the strength of published data, comparisons were made of 431 mouse genes, of which 17 consistently contributed to thrombus formation without affecting hemostasis. Ranking analysis indicated high correlations between collagen-dependent thrombosis models in vivo (FeCl<sub>3</sub> injury or ligation/compression) and in vitro. Integration of scores and CP values resulted in a network of protein interactions in thrombosis and hemostasis (PITH), which was combined with databases of genetically linked human bleeding and thrombotic disorders. The network contained 2,946 nodes linked to modifying genes of thrombus formation, mostly with expression in megakaryocytes. Reactome pathway analysis and network characteristics revealed multiple novel genes with potential contribution to thrombosis/hemostasis. Studies with additional knockout mice revealed that 4 of 8 new genes (<i>Apoe</i>, <i>Fpr2</i>, <i>Ifnar1</i>, <i>Vps13a</i>) modified thrombus formation. The PITH network further: <i>(i)</i> revealed a high similarity of murine and human hemostatic and thrombotic processes, and <i>(ii)</i> identified multiple new candidate proteins regulating these processes.

    Blood 2018

  • Shared activity patterns arising at genetic susceptibility loci reveal underlying genomic and cellular architecture of human disease.

    Baillie JK, Bretherick A, Haley CS, Clohisey S, Gray A, Neyton LPA, Barrett J, Stahl EA, Tenesa A, Andersson R, Brown JB, Faulkner GJ, Lizio M, Schaefer U, Daub C, Itoh M, Kondo N, Lassmann T, Kawai J, IIBDGC Consortium, Mole D, Bajic VB, Heutink P, Rehli M, Kawaji H, Sandelin A, Suzuki H, Satsangi J, Wells CA, Hacohen N, Freeman TC, Hayashizaki Y, Carninci P, Forrest ARR and Hume DA

    Division of Genetics and Genomics, The Roslin Institute, University of Edinburgh, Edinburgh, United Kingdom.

    Genetic variants underlying complex traits, including disease susceptibility, are enriched within the transcriptional regulatory elements, promoters and enhancers. There is emerging evidence that regulatory elements associated with particular traits or diseases share similar patterns of transcriptional activity. Accordingly, shared transcriptional activity (coexpression) may help prioritise loci associated with a given trait, and help to identify underlying biological processes. Using cap analysis of gene expression (CAGE) profiles of promoter- and enhancer-derived RNAs across 1824 human samples, we have analysed coexpression of RNAs originating from trait-associated regulatory regions using a novel quantitative method (network density analysis; NDA). For most traits studied, phenotype-associated variants in regulatory regions were linked to tightly-coexpressed networks that are likely to share important functional characteristics. Coexpression provides a new signal, independent of phenotype association, to enable fine mapping of causative variants. The NDA coexpression approach identifies new genetic variants associated with specific traits, including an association between the regulation of the OCT1 cation transporter and genetic variants underlying circulating cholesterol levels. NDA strongly implicates particular cell types and tissues in disease pathogenesis. For example, distinct groupings of disease-associated regulatory regions implicate two distinct biological processes in the pathogenesis of ulcerative colitis; a further two separate processes are implicated in Crohn's disease. Thus, our functional analysis of genetic predisposition to disease defines new distinct disease endotypes. We predict that patients with a preponderance of susceptibility variants in each group are likely to respond differently to pharmacological therapy. Together, these findings enable a deeper biological understanding of the causal basis of complex traits.

    PLoS computational biology 2018;14;3;e1005934

  • Genomic epidemiology of Shigella in the United Kingdom shows transmission of pathogen sublineages and determinants of antimicrobial resistance.

    Baker KS, Dallman TJ, Field N, Childs T, Mitchell H, Day M, Weill FX, Lefèvre S, Tourdjman M, Hughes G, Jenkins C and Thomson N

    Institute for Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, United Kingdom.

    Shigella are globally important diarrhoeal pathogens that are endemic in low-to-middle income nations and also occur in high income nations, typically in travellers or community-based risk-groups. Shigella phylogenetics reveals population structures that are more reliable than those built with traditional typing methods, and has identified sublineages associated with specific geographical regions or patient groups. Genomic analyses reveal temporal increases in Shigella antimicrobial resistance (AMR) gene content, which is frequently encoded on mobile genetic elements. Here, we whole genome sequenced representative subsamples of S. flexneri 2a and S. sonnei (n = 366) from the United Kingdom from 2008 to 2014, and analysed these alongside publicly available data to make qualitative insights on the genomic epidemiology of shigellosis and its AMR within the broader global context. Combined phylogenetic, epidemiological and genomic analyses revealed the presence of domestically-circulating sublineages in patient risk-groups and the importation of travel-related sublineages from both Africa and Asia, including ciprofloxacin-resistant sublineages of both species from Asia. Genomic analyses revealed common AMR determinants among travel-related and domestically-acquired isolates, and the evolution of mutations associated with reduced quinolone susceptibility in domestically-circulating sublineages. Collectively, this study provides unprecedented insights on the contribution and mobility of endemic and travel-imported sublineages and AMR determinants responsible for disease in a high-income nation.

    Funded by: Wellcome Trust

    Scientific reports 2018;8;1;7389

  • Horizontal antimicrobial resistance transfer drives epidemics of multiple Shigella species.

    Baker KS, Dallman TJ, Field N, Childs T, Mitchell H, Day M, Weill FX, Lefèvre S, Tourdjman M, Hughes G, Jenkins C and Thomson N

    Institute for Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK.

    Horizontal gene transfer has played a role in developing the global public health crisis of antimicrobial resistance (AMR). However, the dynamics of AMR transfer through bacterial populations and its direct impact on human disease are poorly elucidated. Here, we study parallel epidemic emergences of multiple Shigella species, a priority AMR organism, in men who have sex with men to gain insight into AMR emergence and spread. Using genomic epidemiology, we show that repeated horizontal transfer of a single AMR plasmid among Shigella enhanced existing and facilitated new epidemics. These epidemic patterns contrasted with slighter, slower increases in disease caused by organisms with vertically inherited (chromosomally encoded) AMR. This demonstrates that horizontal transfer of AMR directly affects epidemiological outcomes of globally important AMR pathogens and highlights the need for integration of genomic analyses into all areas of AMR research, surveillance and management.

    Nature communications 2018;9;1;1462

  • An outbreak of a rare Shiga-toxin-producing Escherichia coli serotype (O117:H7) among men who have sex with men.

    Baker KS, Dallman TJ, Thomson NR and Jenkins C

    Institute for Integrative Biology, University of Liverpool, Liverpool, UK.

    Sexually transmissible enteric infections (STEIs) are commonly associated with transmission among men who have sex with men (MSM). In the past decade, the UK has experienced multiple parallel STEI emergences in MSM caused by a range of bacterial species of the genus Shigella, and an outbreak of an uncommon serotype (O117 : H7) of Shiga-toxin-producing Escherichia coli (STEC). Here, we used microbial genomics on 6 outbreak and 30 sporadic STEC O117 : H7 isolates to explore the origins and pathogenic drivers of the STEC O117 : H7 emergence in MSM. Using genomic epidemiology, we found that the STEC O117 : H7 outbreak lineage was potentially imported from Latin America and likely continues to circulate both in the UK MSM population and in Latin America. We found genomic relationships consistent with existing symptomatic evidence for chronic infection with this STEC serotype. Comparative genomic analysis indicated the existence of a novel Shiga toxin 1-encoding prophage in the outbreak isolates, and evidence of horizontal gene exchange among the STEC O117 : H7 outbreak lineage and other enteric pathogens. There was no evidence of increased virulence in the outbreak strains relative to contextual isolates, but the outbreak lineage was associated with azithromycin resistance. Comparing these findings with similar genomic investigations of emerging MSM-associated Shigella in the UK highlighted many parallels, the most striking of which was the importance of the azithromycin phenotype for STEI emergence in this patient group.

    Microbial genomics 2018

  • Genomic insights into the emergence and spread of antimicrobial-resistant bacterial pathogens.

    Baker S, Thomson N, Weill FX and Holt KE

    Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam.

    Whole-genome sequencing (WGS) has been vital for revealing the rapid temporal and spatial evolution of antimicrobial resistance (AMR) in bacterial pathogens. Some antimicrobial-resistant pathogens have outpaced us, with untreatable infections appearing in hospitals and the community. However, WGS has additionally provided us with enough knowledge to initiate countermeasures. Although we cannot stop bacterial adaptation, the predictability of many evolutionary processes in AMR bacteria offers us an opportunity to channel them using new control strategies. Furthermore, by using WGS for coordinating surveillance and to create a more fundamental understanding of the outcome of antimicrobial treatment and AMR mechanisms, we can use current and future antimicrobials more effectively and aim to extend their longevity.

    Funded by: Wellcome Trust

    Science (New York, N.Y.) 2018;360;6390;733-738

  • Targeting of NAT10 enhances healthspan in a mouse model of human accelerated aging syndrome.

    Balmus G, Larrieu D, Barros AC, Collins C, Abrudan M, Demir M, Geisler NJ, Lelliott CJ, White JK, Karp NA, Atkinson J, Kirton A, Jacobsen M, Clift D, Rodriguez R, Sanger Mouse Genetics Project, Adams DJ and Jackson SP

    The Wellcome Trust/Cancer Research UK Gurdon Institute and Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QN, UK.

    Hutchinson-Gilford Progeria Syndrome (HGPS) is a rare, but devastating genetic disease characterized by segmental premature aging, with cardiovascular disease being the main cause of death. Cells from HGPS patients accumulate progerin, a permanently farnesylated, toxic form of Lamin A, disrupting the nuclear shape and chromatin organization, leading to DNA-damage accumulation and senescence. Therapeutic approaches targeting farnesylation or aiming to reduce progerin levels have provided only partial health improvements. Recently, we identified Remodelin, a small-molecule agent that leads to amelioration of HGPS cellular defects through inhibition of the enzyme N-acetyltransferase 10 (NAT10). Here, we show the preclinical data demonstrating that targeting NAT10 in vivo, either via chemical inhibition or genetic depletion, significantly enhances the healthspan in a Lmna <sup>G609G</sup> HGPS mouse model. Collectively, the data provided here highlights NAT10 as a potential therapeutic target for HGPS.

    Funded by: Medical Research Council: MC_U105181010, MR/L019116/1; Wellcome Trust

    Nature communications 2018;9;1;1700

  • Sphingolipid dysregulation due to lack of functional KDSR impairs proplatelet formation causing thrombocytopenia.

    Bariana TK, Labarque V, Heremans J, Thys C, De Reys M, Greene D, Jenkins B, Grassi L, Seyres D, Burden F, Whitehorn D, Shamardina O, Papadia S, Gomez K, NIHR BioResource, Van Geet C, Koulman A, Ouwehand WH, Ghevaert C, Frontini M, Turro E and Freson K

    University College London.

    Sphingolipids are fundamental to membrane trafficking, apoptosis and cell differentiation and proliferation. KDSR or 3-keto-dihydrosphingosine reductase is an essential enzyme for de novo sphingolipid synthesis, and pathogenic mutations in KDSR result in the severe skin disorder erythrokeratodermia variabilis et progressiva-4. Four of the eight reported cases also had thrombocytopenia but the underlying mechanism has remained unexplored. Here we expand upon the phenotypic spectrum of KDSR deficiency with studies in two siblings with novel compound heterozygous variants associated with thrombocytopenia, anemia and minimal skin involvement. We report a novel phenotype of progressive juvenile myelofibrosis in the propositus, with spontaneous recovery of anemia and thrombocytopenia in the first decade of life. Examination of bone marrow biopsies showed megakaryocyte hyperproliferation and dysplasia. Megakaryocytes obtained by culture of CD34+ stem cells confirmed hyperproliferation and showed reduced proplatelet formation. The effect of KDSR insufficiency on the sphingolipid profile was unknown, and was explored in vivo and in vitro by a broad metabolomics screen that indicated activation of an in vivo compensatory pathway that leads to normalisation of downstream metabolites such as ceramide. Differentiation of propositus-derived induced pluripotent stem cells to megakaryocytes followed by expression of functional KDSR showed correction of the aberrant cellular and biochemical phenotypes, corroborating the critical role of KDSR in proplatelet formation. Finally, Kdsr depletion in zebrafish recapitulated the thrombocytopenia and showed biochemical changes similar to those observed in the affected siblings. These studies support an important role for sphingolipids as regulators of cytoskeletal organisation during megakaryopoiesis and proplatelet formation.

    Haematologica 2018

  • Objective measurement of physical activity: improving the evidence base to address non-communicable diseases in Africa.

    Barr AL, Young EH and Sandhu MS

    Department of Medicine, University of Cambridge, Cambridge, UK.

    Funded by: Wellcome Trust

    BMJ global health 2018;3;5;e001044

  • Delineating the HMGB1 and HMGB2 interactome in prostate and ovary epithelial cells and its relationship with cancer.

    Barreiro-Alonso A, Lamas-Maceiras M, García-Díaz R, Rodríguez-Belmonte E, Yu L, Pardo M, Choudhary JS and Cerdán ME

    EXPRELA Group, Centro de Investigacións Científicas Avanzadas, Departamento de Biología, Facultade de Ciencias, INIBIC-Universidade da Coruña, Campus de A Coruña, A Coruña, 15071, Spain.

    High Mobility Group B (HMGB) proteins are involved in cancer progression and in cellular responses to platinum compounds used in the chemotherapy of prostate and ovary cancer. Here we use affinity purification coupled to mass spectrometry (MS) and yeast two-hybrid (Y2H) screening to carry out an exhaustive study of HMGB1 and HMGB2 protein interactions in the context of prostate and ovary epithelia. We present a proteomic study of HMGB1 partners based on immunoprecipitation of HMGB1 from a non-cancerous prostate epithelial cell line. In addition, HMGB1 and HMGB2 were used as baits in yeast two-hybrid screening of libraries from prostate and ovary epithelial cell lines as well as from healthy ovary tissue. HMGB1 interacts with many nuclear proteins that control gene expression, but also with proteins that form part of the cytoskeleton, cell-adhesion structures and others involved in intracellular protein translocation, cellular migration, secretion, apoptosis and cell survival. HMGB2 interacts with proteins involved in apoptosis, cell motility and cellular proliferation. High confidence interactors, based on repeated identification in different cell types or in both MS and Y2H approaches, are discussed in relation to cancer. This study represents a useful resource for detailed investigation of the role of HMGB1 in cancer of epithelial origins, as well as potential alternative avenues of therapeutic intervention.

    Funded by: Wellcome Trust

    Oncotarget 2018;9;27;19050-19064

  • Microevolution and Patterns of Transmission of Shigella sonnei within Cyclic Outbreaks Shigellosis, Israel.

    Behar A, Baker KS, Bassal R, Ezernitchi A, Valinsky L, Thomson NR and Cohen D

    Whole-genome sequencing unveiled host and environment-related insights to Shigella sonnei transmission within cyclic epidemics during 2000-2012 in Israel. The Israeli reservoir contains isolates belonging to S. sonnei lineage III but of different origin, shows loss of tetracycline resistance genes, and little genetic variation within the O antigen: highly relevant for Shigella vaccine development.

    Emerging infectious diseases 2018;24;7;1335-1339

  • Mapping human development at single-cell resolution.

    Behjati S, Lindsay S, Teichmann SA and Haniffa M

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK

    Human development is regulated by spatiotemporally restricted molecular programmes and is pertinent to many areas of basic biology and human medicine, such as stem cell biology, reproductive medicine and childhood cancer. Mapping human development has presented significant technological, logistical and ethical challenges. The availability of established human developmental biorepositories and the advent of cutting-edge single-cell technologies provide new opportunities to study human development. Here, we present a working framework for the establishment of a human developmental cell atlas exploiting single-cell genomics and spatial analysis. We discuss how the development atlas will benefit the scientific and clinical communities to advance our understanding of basic biology, health and disease.

    Funded by: Medical Research Council: G0700089; Wellcome Trust: 110104/Z/15/Z

    Development (Cambridge, England) 2018;145;3

  • Single-cell transcriptomics reveals a new dynamical function of transcription factors during embryonic hematopoiesis.

    Bergiers I, Andrews T, Vargel Bölükbaşı Ö, Buness A, Janosz E, Lopez-Anguita N, Ganter K, Kosim K, Celen C, Itır Perçin G, Collier P, Baying B, Benes V, Hemberg M and Lancrin C

    European Molecular Biology Laboratory, EMBL Rome, Monterotondo, Italy.

    Recent advances in single-cell transcriptomics techniques have opened the door to the study of gene regulatory networks (GRNs) at the single-cell level. Here, we studied the GRNs controlling the emergence of hematopoietic stem and progenitor cells from mouse embryonic endothelium using a combination of single-cell transcriptome assays. We found that a heptad of transcription factors (Runx1, Gata2, Tal1, Fli1, Lyl1, Erg and Lmo2) is specifically co-expressed in an intermediate population expressing both endothelial and hematopoietic markers. Within the heptad, we identified two sets of factors of opposing functions: one (Erg/Fli1) promoting the endothelial cell fate, the other (Runx1/Gata2) promoting the hematopoietic fate. Surprisingly, our data suggest that even though Fli1 initially supports the endothelial cell fate, it acquires a pro-hematopoietic role when co-expressed with Runx1. This work demonstrates the power of single-cell RNA-sequencing for characterizing complex transcription factor dynamics.

    Funded by: EMBL Interdisciplinary Postdocs (EIPOD) Initiative: Post-doc fellowship; Wellcome Trust

    eLife 2018;7

  • Human Genetics: Busy Subway Networks in Remote Oceania?

    Bergström A and Tyler-Smith C

    The Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.

    Ancient human DNA from the Oceanian islands of Vanuatu reveals a surprisingly complex history of human settlement, featuring almost complete replacement shortly after initial colonisation, followed by mixing and a puzzling disconnect between genetic ancestry and language.

    Current biology : CB 2018;28;9;R549-R551

  • The SMAD2/3 interactome reveals that TGFβ controls m<sup>6</sup>A mRNA methylation in pluripotency.

    Bertero A, Brown S, Madrigal P, Osnato A, Ortmann D, Yiangou L, Kadiwala J, Hubner NC, de Los Mozos IR, Sadée C, Lenaerts AS, Nakanoh S, Grandy R, Farnell E, Ule J, Stunnenberg HG, Mendjan S and Vallier L

    Wellcome Trust-MRC Cambridge Stem Cell Institute, Anne McLaren Laboratory and Department of Surgery, University of Cambridge, Cambridge CB2 0SZ, UK.

    The TGFβ pathway has essential roles in embryonic development, organ homeostasis, tissue repair and disease. These diverse effects are mediated through the intracellular effectors SMAD2 and SMAD3 (hereafter SMAD2/3), whose canonical function is to control the activity of target genes by interacting with transcriptional regulators. Therefore, a complete description of the factors that interact with SMAD2/3 in a given cell type would have broad implications for many areas of cell biology. Here we describe the interactome of SMAD2/3 in human pluripotent stem cells. This analysis reveals that SMAD2/3 is involved in multiple molecular processes in addition to its role in transcription. In particular, we identify a functional interaction with the METTL3-METTL14-WTAP complex, which mediates the conversion of adenosine to N<sup>6</sup>-methyladenosine (m<sup>6</sup>A) on RNA. We show that SMAD2/3 promotes binding of the m<sup>6</sup>A methyltransferase complex to a subset of transcripts involved in early cell fate decisions. This mechanism destabilizes specific SMAD2/3 transcriptional targets, including the pluripotency factor gene NANOG, priming them for rapid downregulation upon differentiation to enable timely exit from pluripotency. Collectively, these findings reveal the mechanism by which extracellular signalling can induce rapid cellular responses through regulation of the epitranscriptome. These aspects of TGFβ signalling could have far-reaching implications in many other cell types and in diseases such as cancer.

    Nature 2018;555;7695;256-259

  • Conditional Manipulation of Gene Function in Human Cells with Optimized Inducible shRNA.

    Bertero A, Yiangou L, Brown S, Ortmann D, Pawlowski M and Vallier L

    Wellcome Trust-MRC Stem Cell Institute, Anne McLaren Laboratory, University of Cambridge, Cambridge, United Kingdom.

    The difficulties involved in conditionally perturbing complex gene expression networks represent major challenges toward defining the mechanisms controlling human development, physiology, and disease. We developed an OPTimized inducible KnockDown (OPTiKD) platform that addresses the limitations of previous approaches by allowing streamlined, tightly controlled, and potent loss-of-function experiments for both single and multiple genes. The method relies on single-step genetic engineering of the AAVS1 genomic safe harbor with an optimized tetracycline-responsive cassette driving one or more inducible short hairpin RNAs (shRNAs). OPTiKD provides homogeneous, dose-responsive, and reversible gene knockdown. When implemented in human pluripotent stem cells (hPSCs), the approach can then be applied to a broad range of hPSC-derived mature cell lineages that include neurons, cardiomyocytes, and hepatocytes. Generation of OPTiKD hPSCs in commonly used culture conditions is simple (plasmid based), rapid (two weeks), and highly efficient (>95%). Overall, this method facilitates the functional annotation of the human genome in health and disease.

    Current protocols in stem cell biology 2018;44;5C.4.1-5C.4.48

  • Chemical Synergy between Ionophore PBT2 and Zinc Reverses Antibiotic Resistance.

    Bohlmann L, De Oliveira DMP, El-Deeb IM, Brazel EB, Harbison-Price N, Ong CY, Rivera-Hernandez T, Ferguson SA, Cork AJ, Phan MD, Soderholm AT, Davies MR, Nimmo GR, Dougan G, Schembri MA, Cook GM, McEwan AG, von Itzstein M, McDevitt CA and Walker MJ

    School of Chemistry and Molecular Biosciences and Australian Infectious Diseases Research Centre, The University of Queensland, Brisbane, QLD, Australia.

    The World Health Organization reports that antibiotic-resistant pathogens represent an imminent global health disaster for the 21st century. Gram-positive superbugs threaten to breach last-line antibiotic treatment, and the pharmaceutical industry antibiotic development pipeline is waning. Here we report the synergy between ionophore-induced physiological stress in Gram-positive bacteria and antibiotic treatment. PBT2 is a safe-for-human-use zinc ionophore that has progressed to phase 2 clinical trials for Alzheimer's and Huntington's disease treatment. In combination with zinc, PBT2 exhibits antibacterial activity and disrupts cellular homeostasis in erythromycin-resistant group A Streptococcus (GAS), methicillin-resistant Staphylococcus aureus (MRSA), and vancomycin-resistant Enterococcus (VRE). We were unable to select for mutants resistant to PBT2-zinc treatment. While ineffective alone against resistant bacteria, several clinically relevant antibiotics act synergistically with PBT2-zinc to enhance killing of these Gram-positive pathogens. These data represent a new paradigm whereby disruption of bacterial metal homeostasis reverses antibiotic-resistant phenotypes in a number of priority human bacterial pathogens.

    Importance: The rise of bacterial antibiotic resistance coupled with a reduction in new antibiotic development has placed significant burdens on global health care. Resistant bacterial pathogens such as methicillin-resistant Staphylococcus aureus and vancomycin-resistant Enterococcus are leading causes of community- and hospital-acquired infection and present a significant clinical challenge. These pathogens have acquired resistance to broad classes of antimicrobials. Furthermore, Streptococcus pyogenes, a significant disease agent among Indigenous Australians, has now acquired resistance to several antibiotic classes. With a rise in antibiotic resistance and reduction in new antibiotic discovery, it is imperative to investigate alternative therapeutic regimens that complement the use of current antibiotic treatment strategies. As stated by the WHO Director-General, "On current trends, common diseases may become untreatable. Doctors facing patients will have to say, 'Sorry, there is nothing I can do for you.'"

    mBio 2018;9;6

  • Using WormBase ParaSite: An Integrated Platform for Exploring Helminth Genomic Data.

    Bolt BJ, Rodgers FH, Shafie M, Kersey PJ, Berriman M and Howe KL

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK.

    WormBase ParaSite is a comprehensive resource for the genomes of parasitic nematodes and flatworms (helminths). It currently includes genomic data for over 100 helminth species, adding value by way of consistent functional annotation, gene comparative analysis and gene expression analysis. We provide several ways of exploring the data including a choice of genome browsers, genome and gene summary pages, text and sequence searching, a query wizard, bulk downloads, and programmatic interfaces. WormBase ParaSite is released three to six times per year, and is developed in collaboration with WormBase and Ensembl Genomes.

    Methods in molecular biology (Clifton, N.J.) 2018;1757;471-491

  • Crumble: reference free lossy compression of sequence quality values.

    Bonfield JK, McCarthy SA and Durbin R

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.

    Motivation: The bulk of space taken up by NGS CRAM files consists of per-base quality values. Most of these are unnecessary for variant calling, offering an opportunity for space saving.

    Results: On the CHM1+CHM13 test set, a 17-fold reduction in the quality storage portion of a CRAM file can be achieved while maintaining variant calling accuracy. The size reduction of an entire CRAM file varied from 2.2- to 7.4-fold, depending on the non-quality content of the original file. See Supplementary Data section 6 for details.

    Availability: Crumble is OpenSource and can be obtained from

    Supplementary information: Supplementary data are available at Bioinformatics.

    Bioinformatics (Oxford, England) 2018

  • A single nucleotide polymorphism in the Plasmodium falciparum atg18 gene associates with artemisinin resistance and confers enhanced parasite survival under nutrient deprivation.

    Breglio KF, Amato R, Eastman R, Lim P, Sa JM, Guha R, Ganesan S, Dorward DW, Klumpp-Thomas C, McKnight C, Fairhurst RM, Roberts D, Thomas C and Simon AK

    National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD, USA.

    Background: Artemisinin-resistant Plasmodium falciparum has been reported throughout the Greater Mekong subregion and threatens to disrupt current malaria control efforts worldwide. Polymorphisms in kelch13 have been associated with clinical and in vitro resistance phenotypes; however, several studies suggest that the genetic determinants of resistance may involve multiple genes. Current proposed mechanisms of resistance conferred by polymorphisms in kelch13 hint at a connection to an autophagy-like pathway in P. falciparum.

    Results: A SNP in autophagy-related gene 18 (atg18) was associated with long parasite clearance half-life in patients following artemisinin-based combination therapy. This gene encodes PfAtg18, which is shown to be similar to the mammalian/yeast homologue WIPI/Atg18 in terms of structure, binding abilities, and ability to form puncta in response to stress. To investigate the contribution of this polymorphism, the atg18 gene was edited using CRISPR/Cas9 to introduce a T38I mutation into a k13-edited Dd2 parasite. The presence of this SNP confers a fitness advantage by enabling parasites to grow faster in nutrient-limited settings. The mutant and parent parasites were screened against drug libraries of 6349 unique compounds. While the SNP did not modulate the parasite's susceptibility to any of the anti-malarial compounds using a 72-h drug pulse, it did alter the parasite's susceptibility to 227 other compounds.

    Conclusions: These results suggest that the atg18 T38I polymorphism may provide additional resistance against artemisinin derivatives, but not partner drugs, even in the absence of kelch13 mutations, and may also be important in parasite survival during nutrient deprivation.

    Malaria journal 2018;17;1;391

  • Generating CRISPR/Cas9-Derived Mutant Mice by Zygote Cytoplasmic Injection Using an Automatic Microinjector.

    Doe B, Brown E and Boroviak K

    Methods and Protocols 2018;1;1;5

  • Rapid HIV disease progression following superinfection in an HLA-B*27:05/B*57:01-positive transmission recipient.

    Brener J, Gall A, Hurst J, Batorsky R, Lavandier N, Chen F, Edwards A, Bolton C, Dsouza R, Allen T, Pybus OG, Kellam P, Matthews PC and Goulder PJR

    Department of Paediatrics, University of Oxford, Oxford, UK.

    Background: The factors determining differential HIV disease outcome among individuals expressing protective HLA alleles such as HLA-B*27:05 and HLA-B*57:01 remain unknown. We here analyse two HIV-infected subjects expressing both HLA-B*27:05 and HLA-B*57:01. One subject maintained low-to-undetectable viral loads for more than a decade of follow up. The other progressed to AIDS in < 3 years.

    Results: The rapid progressor was the recipient within a known transmission pair, enabling virus sequences to be tracked from transmission. Progression was associated with a 12% Gag sequence change and 26% Nef sequence change at the amino acid level within 2 years. Although next generation sequencing from early timepoints indicated that multiple CD8+ cytotoxic T lymphocyte (CTL) escape mutants were being selected prior to superinfection, < 4% of the amino acid changes arising from superinfection could be ascribed to CTL escape. Analysis of an HLA-B*27:05/B*57:01 non-progressor, in contrast, demonstrated minimal virus sequence diversification (1.1% Gag amino acid sequence change over 10 years), and dominant HIV-specific CTL responses previously shown to be effective in control of viraemia were maintained. Clonal sequencing demonstrated that escape variants were generated within the non-progressor, but in many cases were not selected. In the rapid progressor, progression occurred despite substantial reductions in viral replicative capacity (VRC), and non-progression in the elite controller despite relatively high VRC.

    Conclusions: These data are consistent with previous studies demonstrating rapid progression in association with superinfection and that rapid disease progression can occur despite the relatively low VRC that is typically observed in the setting of multiple CTL escape mutants.

    Funded by: National Institutes of Health: RO1AI46995; Wellcome Trust: WT104748MA

    Retrovirology 2018;15;1;7

  • Laboratory and molecular surveillance of paediatric typhoidal Salmonella in Nepal: Antimicrobial resistance and implications for vaccine policy.

    Britto CD, Dyson ZA, Duchene S, Carter MJ, Gurung M, Kelly DF, Murdoch DR, Ansari I, Thorson S, Shrestha S, Adhikari N, Dougan G, Holt KE and Pollard AJ

    Oxford Vaccine Group, Department of Paediatrics, University of Oxford and the NIHR Oxford Biomedical Research Centre, Oxford, United Kingdom.

    Background: Children are substantially affected by enteric fever in most settings with a high burden of the disease, including Nepal. However, pathogen population structure and transmission dynamics are poorly delineated in young children, the proposed target group for immunization programs. Here we present whole genome sequencing and antimicrobial susceptibility data on 198 S. Typhi and 66 S. Paratyphi A isolated from children aged 2 months to 15 years during blood culture surveillance at Patan Hospital, Nepal, 2008-2016.

    Principal findings: S. Typhi was the dominant agent and comprised several distinct genotypes, dominated by 4.3.1 (H58). The heterogeneity of genotypes in children under five was reduced compared to data from 2005-2006, attributable to ongoing clonal expansion of H58. Most isolates (86%) were non-susceptible to fluoroquinolones, associated mainly with S. Typhi H58 Lineage II and S. Paratyphi A harbouring mutations in the quinolone resistance-determining region (QRDR); non-susceptible strains from these groups accounted for 50% and 25% of all isolates. Multi-drug resistance (MDR) was rare (3.5% of S. Typhi, 0 S. Paratyphi A) and restricted to chromosomal insertions of resistance genes in H58 Lineage I strains. Temporal analyses revealed a shift in dominance from H58 Lineage I to H58 Lineage II, with the latter being significantly more common after 2010. Comparison to global data sets showed the local S. Typhi and S. Paratyphi A strains had close genetic relatives in other South Asian countries, indicating regional strain circulation. Multiple imports from India of ciprofloxacin-resistant H58 Lineage II strains were identified, but these were rare and showed no evidence of clonal replacement of local S. Typhi.

    Significance: These data indicate that enteric fever in Nepal continues to be a major public health issue with ongoing inter- and intra-country transmission, and highlight the need for regional coordination of intervention strategies. The absence of an S. Paratyphi A vaccine is cause for concern, given its prevalence as a fluoroquinolone-resistant enteric fever agent in this setting.

    PLoS neglected tropical diseases 2018;12;4;e0006408

  • A systematic review of antimicrobial resistance in Salmonella enterica serovar Typhi, the etiological agent of typhoid.

    Britto CD, Wong VK, Dougan G and Pollard AJ

    Oxford Vaccine Group, Department of Paediatrics, University of Oxford and the NIHR Oxford Biomedical Research Centre, Oxford, United Kingdom.

    Background: Temporal and spatial changes in trends of antimicrobial resistance (AMR) in typhoid have not been systematically studied, and such information will be critical for defining intervention, as well as planning sustainable prevention strategies.

    Methodology and findings: To identify the phenotypic trends in AMR, 13,833 individual S. Typhi isolates, reported from 1973 to 2018 in 62 publications, were analysed to determine the AMR preponderance over time. Separate analyses of molecular resistance determinants present in over 4,000 isolates reported in 61 publications were also conducted. Multi-drug resistant (MDR) typhoid is in decline in Asia in a setting of high fluoroquinolone resistance while it is on the increase in Africa. Mutations in QRDRs in gyrA (S83F, D87N) and parC (S80I) are the most common mechanisms responsible for fluoroquinolone resistance. Cephalosporin-resistant S. Typhi, dubbed extensively drug-resistant (XDR), is a real threat and underscores the urgency of deploying the Vi-conjugate vaccines.

    Conclusion: From these observations, it appears that AMR in S. Typhi will continue to emerge leading to treatment failure, changes in antimicrobial policy and further resistance developing in S. Typhi isolates and other Gram-negative bacteria in endemic regions. The deployment of typhoid conjugate vaccines to control the disease in endemic regions may be the best defence.

    PLoS neglected tropical diseases 2018;12;10;e0006779

  • Whole genome sequencing and microsatellite analysis of the Plasmodium falciparum E5 NF54 strain show that the var, rifin and stevor gene families follow Mendelian inheritance.

    Bruske E, Otto TD and Frank M

    Institute of Tropical Medicine, University of Tuebingen, Wilhelmstr. 27, 72074, Tuebingen, Germany.

    Background: Plasmodium falciparum exhibits a high degree of inter-isolate genetic diversity in its variant surface antigen (VSA) families: P. falciparum erythrocyte membrane protein 1, repetitive interspersed family (RIFIN) and subtelomeric variable open reading frame (STEVOR). The role of recombination for the generation of this diversity is a subject of ongoing research. Here the genome of E5, a sibling of the 3D7 genome strain, is presented. Short and long read whole genome sequencing (WGS) techniques (Illumina, Pacific Biosciences) and a set of 84 microsatellites (MS) were employed to characterize the 3D7 and non-3D7 parts of the E5 genome. This is the first time that VSA genes in sibling parasites were analysed with long read sequencing technology.

    Results: Of the 5733 E5 genes only 278 genes, mostly var and rifin/stevor genes, had no orthologues in the 3D7 genome. WGS and MS analysis revealed that chromosomal crossovers occurred at a rate of 0-3 per chromosome. var, stevor and rifin genes were inherited within the respective non-3D7 or 3D7 chromosomal context. 54 of the 84 MS PCR fragments correctly identified the respective MS as 3D7- or non-3D7 and this correlated with var and rifin/stevor gene inheritance in the adjacent chromosomal regions. E5 had 61 var and 189 rifin/stevor genes. One large non-chromosomal recombination event resulted in a new var gene on chromosome 14. The remainder of the E5 3D7-type subtelomeric and central regions were identical to 3D7.

    Conclusions: The data show that the rifin/stevor and var gene families represent the most diverse compartments of the P. falciparum genome but that the majority of var genes are inherited without alterations within their respective parental chromosomal context. Furthermore, MS genotyping with 54 MS can successfully distinguish between two sibling progeny of a natural P. falciparum cross and thus can be used to investigate identity by descent in field isolates.

    Funded by: Bundesministerium für Bildung und Forschung: BMBF-grant 01KA110; Wellcome Trust: 098051

    Malaria journal 2018;17;1;376

  • Itraconazole targets cell cycle heterogeneity in colorectal cancer.

    Buczacki SJA, Popova S, Biggs E, Koukorava C, Buzzelli J, Vermeulen L, Hazelwood L, Francies H, Garnett MJ and Winton DJ

    Cancer Research UK (CRUK) Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge, England, UK

    Cellular dormancy and heterogeneity in cell cycle length provide important explanations for treatment failure after adjuvant therapy with S-phase cytotoxics in colorectal cancer (CRC), yet the molecular control of the dormant versus cycling state remains unknown. We sought to understand the molecular features of dormant CRC cells to facilitate rational identification of compounds to target both dormant and cycling tumor cells. Unexpectedly, we demonstrate that dormant CRC cells are differentiated, yet retain clonogenic capacity. Mouse organoid drug screening identifies that itraconazole generates spheroid collapse and loss of dormancy. Human CRC cell dormancy and tumor growth can also be perturbed by itraconazole, which is found to inhibit Wnt signaling through noncanonical hedgehog signaling. Preclinical validation shows itraconazole to be effective in multiple assays through Wnt inhibition, causing both cycling and dormant cells to switch to global senescence. These data provide preclinical evidence to support an early phase trial of itraconazole in CRC.

    The Journal of experimental medicine 2018

  • The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019.

    Buniello A, MacArthur JAL, Cerezo M, Harris LW, Hayhurst J, Malangone C, McMahon A, Morales J, Mountjoy E, Sollis E, Suveges D, Vrousgou O, Whetzel PL, Amode R, Guillen JA, Riat HS, Trevanion SJ, Hall P, Junkins H, Flicek P, Burdett T, Hindorff LA, Cunningham F and Parkinson H

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

    The GWAS Catalog delivers a high-quality curated collection of all published genome-wide association studies enabling investigations to identify causal variants, understand disease mechanisms, and establish targets for novel therapies. The scope of the Catalog has also expanded to targeted and exome arrays with 1000 new associations added for these technologies. As of September 2018, the Catalog contains 5687 GWAS comprising 71673 variant-trait associations from 3567 publications. New content includes 284 full P-value summary statistics datasets for genome-wide and new targeted array studies, representing 6 × 10<sup>9</sup> individual variant-trait statistics. In the last 12 months, the Catalog's user interface was accessed by ∼90000 unique users who viewed >1 million pages. We have improved data access with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database. Summary statistics provision is supported by a new format proposed as a community standard for summary statistics data representation. This format was derived from our experience in standardizing heterogeneous submissions, mapping formats and in harmonizing content. Availability:

    Nucleic acids research 2018

  • Association of LPA Variants With Risk of Coronary Disease and the Implications for Lipoprotein(a)-Lowering Therapies: A Mendelian Randomization Analysis.

    Burgess S, Ference BA, Staley JR, Freitag DF, Mason AM, Nielsen SF, Willeit P, Young R, Surendran P, Karthikeyan S, Bolton TR, Peters JE, Kamstrup PR, Tybjærg-Hansen A, Benn M, Langsted A, Schnohr P, Vedel-Krogh S, Kobylecki CJ, Ford I, Packard C, Trompet S, Jukema JW, Sattar N, Di Angelantonio E, Saleheen D, Howson JMM, Nordestgaard BG, Butterworth AS, Danesh J and European Prospective Investigation Into Cancer and Nutrition–Cardiovascular Disease (EPIC-CVD) Consortium

    Medical Research Council Biostatistics Unit, University of Cambridge, Cambridge, United Kingdom.

    Importance: Human genetic studies have indicated that plasma lipoprotein(a) (Lp[a]) is causally associated with the risk of coronary heart disease (CHD), but randomized trials of several therapies that reduce Lp(a) levels by 25% to 35% have not provided any evidence that lowering Lp(a) level reduces CHD risk.

    Objective: To estimate the magnitude of the change in plasma Lp(a) levels needed to have the same evidence of an association with CHD risk as a 38.67-mg/dL (ie, 1-mmol/L) change in low-density lipoprotein cholesterol (LDL-C) level, a change that has been shown to produce a clinically meaningful reduction in the risk of CHD.

    Design, setting, and participants: A mendelian randomization analysis was conducted using individual participant data from 5 studies and with external validation using summarized data from 48 studies. Population-based prospective cohort and case-control studies featured 20 793 individuals with CHD and 27 540 controls with individual participant data, whereas summarized data included 62 240 patients with CHD and 127 299 controls. Data were analyzed from November 2016 to March 2018.

    Exposures: Genetic LPA score and plasma Lp(a) mass concentration.

    Main outcomes and measures: Coronary heart disease.

    Results: Of the included study participants, 53% were men, all were of white European ancestry, and the mean age was 57.5 years. The association of genetically predicted Lp(a) with CHD risk was linearly proportional to the absolute change in Lp(a) concentration. A 10-mg/dL lower genetically predicted Lp(a) concentration was associated with a 5.8% lower CHD risk (odds ratio [OR], 0.942; 95% CI, 0.933-0.951; P = 3 × 10<sup>-37</sup>), whereas a 10-mg/dL lower genetically predicted LDL-C level estimated using an LDL-C genetic score was associated with a 14.5% lower CHD risk (OR, 0.855; 95% CI, 0.818-0.893; P = 2 × 10<sup>-12</sup>). Thus, a 101.5-mg/dL change (95% CI, 71.0-137.0) in Lp(a) concentration had the same association with CHD risk as a 38.67-mg/dL change in LDL-C level. The association of genetically predicted Lp(a) concentration with CHD risk appeared to be independent of changes in LDL-C level owing to genetic variants that mimic the relationship of statins, PCSK9 inhibitors, and ezetimibe with CHD risk.

    Conclusions and relevance: The clinical benefit of lowering Lp(a) is likely to be proportional to the absolute reduction in Lp(a) concentration. Large absolute reductions in Lp(a) of approximately 100 mg/dL may be required to produce a clinically meaningful reduction in the risk of CHD similar in magnitude to what can be achieved by lowering LDL-C level by 38.67 mg/dL (ie, 1 mmol/L).

    JAMA cardiology 2018
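The equivalence quoted in the LPA abstract above can be roughly reproduced from the two reported odds ratios. The sketch below is an illustration only, not the authors' analysis: it assumes the log odds of CHD scale linearly with the absolute change in each exposure (as the abstract states for Lp(a)), and the function name `equivalent_exposure_change` is hypothetical.

```python
import math


def equivalent_exposure_change(or_a, unit_a, or_b, unit_b, change_b):
    """Change in exposure A whose log-odds effect matches `change_b` of exposure B.

    Assumes a log-linear dose-response for both exposures (an assumption of
    this sketch): log(OR) per unit = log(OR reported per `unit`) / unit.
    """
    log_or_per_unit_a = math.log(or_a) / unit_a
    log_or_per_unit_b = math.log(or_b) / unit_b
    return change_b * log_or_per_unit_b / log_or_per_unit_a


# Odds ratios per 10 mg/dL lower exposure, as reported in the abstract.
lpa_change = equivalent_exposure_change(
    or_a=0.942, unit_a=10.0,   # Lp(a)
    or_b=0.855, unit_b=10.0,   # LDL-C
    change_b=38.67,            # 1 mmol/L of LDL-C expressed in mg/dL
)
print(f"Equivalent Lp(a) change: {lpa_change:.1f} mg/dL")
```

With the rounded odds ratios this lands close to the reported 101.5 mg/dL; the small difference is expected because the published estimate was derived from unrounded data.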

  • Insular Celtic population structure and genomic footprints of migration.

    Byrne RP, Martiniano R, Cassidy LM, Carrigan M, Hellenthal G, Hardiman O, Bradley DG and McLaughlin RL

    Complex Trait Genomics Laboratory, Smurfit Institute of Genetics, School of Genetics and Microbiology, Trinity College Dublin, College Green, Dublin, Republic of Ireland.

    Previous studies of the genetic landscape of Ireland have suggested homogeneity, with population substructure undetectable using single-marker methods. Here we have harnessed the haplotype-based method fineSTRUCTURE in an Irish genome-wide SNP dataset, identifying 23 discrete genetic clusters which segregate with geographical provenance. Cluster diversity is pronounced in the west of Ireland but reduced in the east where older structure has been eroded by historical migrations. Accordingly, when populations from the neighbouring island of Britain are included, a west-east cline of Celtic-British ancestry is revealed along with a particularly striking correlation between haplotypes and geography across both islands. A strong relationship is revealed between subsets of Northern Irish and Scottish populations, where discordant genetic and geographic affinities reflect major migrations in recent centuries. Additionally, Irish genetic proximity of all Scottish samples likely reflects older strata of communication across the narrowest inter-island crossing. Using GLOBETROTTER we detected Irish admixture signals from Britain and Europe and estimated dates for events consistent with the historical migrations of the Norse-Vikings, the Anglo-Normans and the British Plantations. The influence of the former is greater than previously estimated from Y chromosome haplotypes. In all, we paint a new picture of the genetic landscape of Ireland, revealing structure which should be considered in the design of studies examining rare genetic variation and its association with traits.

    PLoS genetics 2018;14;1;e1007152

  • Targeting MEK in vemurafenib-resistant hairy cell leukemia.

    Caeser R, Collord G, Yao WQ, Chen Z, Vassiliou GS, Beer PA, Du MQ, Scott MA, Follows GA and Hodson DJ

    Department of Haematology, University of Cambridge, Cambridge, UK.

    Funded by: Medical Research Council (MRC): MR/M008584/1; Wellcome Trust: WT098051

    Leukemia 2018

  • Morphological, genomic and transcriptomic responses of Klebsiella pneumoniae to the last-line antibiotic colistin.

    Cain AK, Boinett CJ, Barquist L, Dordel J, Fookes M, Mayho M, Ellington MJ, Goulding D, Pickard D, Wick RR, Holt KE, Parkhill J and Thomson NR

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Colistin remains one of the few antibiotics effective against multi-drug resistant (MDR) hospital pathogens, such as Klebsiella pneumoniae. Yet resistance to this last-line drug is rapidly increasing. Characterized mechanisms of col<sup>R</sup> in K. pneumoniae are largely due to chromosomal mutations in two-component regulators, although a plasmid-mediated col<sup>R</sup> mechanism has recently been uncovered. However, the effects of intrinsic colistin resistance are yet to be characterized on a whole-genome level. Here, we used a genomics-based approach to understand the mechanisms of adaptive col<sup>R</sup> acquisition in K. pneumoniae. In controlled directed-evolution experiments we observed two distinct paths to colistin resistance acquisition. Whole genome sequencing identified mutations in two colistin resistance genes: in the known col<sup>R</sup> regulator phoQ which became fixed in the population and resulted in a single amino acid change, and unstable minority variants in the recently described two-component sensor crrB. Through RNAseq and microscopy, we reveal the broad range of effects that colistin exposure has on the cell. This study is the first to use genomics to identify a population of minority variants with mutations in a col<sup>R</sup> gene in K. pneumoniae.

    Scientific reports 2018;8;1;9868

  • Increasing nursing capacity in genomics: Overview of existing global genomics resources.

    Calzone KA, Kirk M, Tonkin E, Badzek L, Benjamin C and Middleton A

    National Institutes of Health, National Cancer Institute, Center for Cancer Research, Genetics Branch, 37 Convent Drive, Building 37, RM 6002C, MSC 4256, Bethesda, MD 20892, USA.

    Background: Global genomic literacy of all health professions, including nurses, remains low despite an inundation of genomic information with established clinical and analytic validity and clinical utility. Genomic literacy and competency deficits contribute to lost opportunities to take advantage of the benefits that genomic information provides to improve health outcomes, reduce healthcare costs, and increase patient quality and safety. Nurses are essential to the integration of genomics into healthcare. The greatest challenges to realizing their potential in successful integration include education and awareness. Identification of resources, their focus, whether they are targeted at nursing, and how to access them, form the foundation for a global genomic resource initiative led by the Global Genomics Nursing Alliance.

    Objectives: The aim was to identify existing global genomic resources and competencies, identifying the source, type and accessibility.

    Design: Cross sectional online descriptive survey to ascertain existing genomic resources.

    Settings: Limited to eighteen countries and seven organizations represented by delegates attending the inaugural meeting in 2017 of the Global Genomics Nursing Alliance.

    Participants: A purposive sample of global nursing leaders and representatives of national and international nursing organizations.

    Methods: The primary method was by online survey administered following an orientation webinar. Given the small numbers of nurse leaders in genomics within our sample (and indeed within the world), results were analyzed and presented descriptively. Those identifying resources provided further detailed resource information. Additional data were collected during a face-to-face meeting using an electronic audience-response system.

    Results: Of the twenty-three global delegates responding, 9 identified existing genomic resources that could be used for academic or continuing genomics education. Three countries have competence frameworks to guide learning and 5 countries have national organizations for genetics nurses.

    Conclusions: The genomic resources that already exist are not readily accessible or discoverable to the international nursing community and as such are underutilized.

    Nurse education today 2018;69;53-59

  • A forward genetic screen reveals a primary role for Plasmodium falciparum Reticulocyte Binding Protein Homologue 2a and 2b in determining alternative erythrocyte invasion pathways.

    Campino S, Marin-Menendez A, Kemp A, Cross N, Drought L, Otto TD, Benavente ED, Ravenhall M, Schwach F, Girling G, Manske M, Theron M, Gould K, Drury E, Clark TG, Kwiatkowski DP, Pance A and Rayner JC

    Malaria Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom.

    Invasion of human erythrocytes is essential for Plasmodium falciparum parasite survival and pathogenesis, and is also a complex phenotype. While some later steps in invasion appear to be invariant and essential, the earlier steps of recognition are controlled by a series of redundant, and only partially understood, receptor-ligand interactions. Reverse genetic analysis of laboratory adapted strains has identified multiple genes that when deleted can alter invasion, but how the relative contributions of each gene translate to the phenotypes of clinical isolates is far from clear. We used a forward genetic approach to identify genes responsible for variable erythrocyte invasion by phenotyping the parents and progeny of previously generated experimental genetic crosses. Linkage analysis using whole genome sequencing data revealed a single major locus was responsible for the majority of phenotypic variation in two invasion pathways. This locus contained the PfRh2a and PfRh2b genes, members of one of the major invasion ligand gene families, but not widely thought to play such a prominent role in specifying invasion phenotypes. Variation in invasion pathways was linked to significant differences in PfRh2a and PfRh2b expression between parasite lines, and their role in specifying alternative invasion was confirmed by CRISPR-Cas9-mediated genome editing. Expansion of the analysis to a large set of clinical P. falciparum isolates revealed common deletions, suggesting that variation at this locus is a major cause of invasion phenotypic variation in the endemic setting. This work has implications for blood-stage vaccine development and will help inform the design and location of future large-scale studies of invasion in clinical isolates.

    Funded by: Medical Research Council: MR/M01360X/1, MR/M006212/1; Wellcome Trust: 090851

    PLoS pathogens 2018;14;11;e1007436

  • Homozygous loss-of-function mutations in SLC26A7 cause goitrous congenital hypothyroidism.

    Cangul H, Liao XH, Schoenmakers E, Kero J, Barone S, Srichomkwun P, Iwayama H, Serra EG, Saglam H, Eren E, Tarim O, Nicholas AK, Zvetkova I, Anderson CA, Frankl FEK, Boelaert K, Ojaniemi M, Jääskeläinen J, Patyra K, Löf C, Williams ED, UK10K Consortium, Soleimani M, Barrett T, Maher ER, Chatterjee VK, Refetoff S and Schoenmakers N

    Department of Medical Genetics, Istanbul Medipol University, International School of Medicine, Istanbul, Turkey.

    Defects in genes mediating thyroid hormone biosynthesis result in dyshormonogenic congenital hypothyroidism (CH). Here, we report homozygous truncating mutations in SLC26A7 in 6 unrelated families with goitrous CH and show that goitrous hypothyroidism also occurs in Slc26a7-null mice. In both species, the gene is expressed predominantly in the thyroid gland, and loss of function is associated with impaired availability of iodine for thyroid hormone synthesis, partially corrected in mice by iodine supplementation. SLC26A7 is a member of the same transporter family as SLC26A4 (pendrin), an anion exchanger with affinity for iodide and chloride (among others), whose gene mutations cause congenital deafness and dyshormonogenic goiter. However, in contrast to pendrin, SLC26A7 does not mediate cellular iodide efflux and hearing in affected individuals is normal. We delineate a hitherto unrecognized role for SLC26A7 in thyroid hormone biosynthesis, for which the mechanism remains unclear.

    JCI insight 2018;3;20

  • Virus discovery reveals frequent infection by diverse novel members of the Flaviviridae in wild lemurs.

    Canuti M, Williams CV, Sagan SM, Oude Munnink BB, Gadi S, Verhoeven JTP, Kellam P, Cotten M, Lang AS, Junge RE, Cullen JM and van der Hoek L

    Department of Biology, Memorial University of Newfoundland, 232 Elizabeth Ave., St. John's, NL, A1B 3X9, Canada.

    Lemurs are highly endangered mammals inhabiting the forests of Madagascar. In this study, we performed virus discovery on serum samples collected from 84 wild lemurs and identified viral sequence fragments from 4 novel viruses within the family Flaviviridae, including members of the genera Hepacivirus and Pegivirus. The sifaka hepacivirus (SifHV, two genotypes) and pegivirus (SifPgV, two genotypes) were discovered in the diademed sifaka (Propithecus diadema), while other pegiviral fragments were detected in samples from the indri (Indri indri, IndPgV) and the weasel sportive lemur (Lepilemur mustelinus, LepPgV). Although data are preliminary, each viral species appeared host species-specific and frequent infection was detected (18 of 84 individuals were positive for at least one virus). The complete coding sequence and partial 5' and 3' untranslated regions (UTRs) were obtained for SifHV and its genomic organization was consistent with that of other hepaciviruses, with one unique polyprotein and highly structured UTRs. Phylogenetic analyses showed the SifHV belonged to a clade that includes several viral species identified in rodents from Asia and North America, while SifPgV and IndPgV were more closely related to pegiviral species A and C, that include viruses found in humans as well as New- and Old-World monkeys. Our results support the current proposed model of virus-host co-divergence with frequent occurrence of cross-species transmission for these genera and highlight how the discovery of more members of the Flaviviridae can help clarify the ecology and evolutionary history of these viruses. Furthermore, this knowledge is important for conservation and captive management of lemurs.

    Funded by: European Community: EC grant 223498

    Archives of virology 2018

  • Evaluation of Protein-Ligand Docking by Cyscore.

    Cao Y, Dai W and Miao Z

    Center of Growth, Metabolism and Aging, Key Lab of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, People's Republic of China.

    Protein-ligand docking is a powerful method in drug discovery. The reliability of docking can be quantified by RMSD between a docking structure and an experimentally determined one. However, most experimentally determined structures are not available in practice. Evaluation by scoring functions is an alternative for assessing protein-ligand docking results. This chapter first provides a brief introduction to scoring methods used in docking. Then details are provided on how to use Cyscore programs. Finally it describes a case study for evaluation of protein-ligand docking.

    Methods in molecular biology (Clifton, N.J.) 2018;1762;233-243
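The RMSD criterion mentioned in the Cyscore abstract above can be sketched as follows. This is a minimal illustration, not part of Cyscore: it assumes both poses list the same ligand atoms in the same order and already share the receptor's coordinate frame, so no superposition is performed. The function name `ligand_rmsd` and the example coordinates are hypothetical.

```python
import math


def ligand_rmsd(docked, reference):
    """Root-mean-square deviation (same units as the input, e.g. angstroms)
    between two equally ordered lists of (x, y, z) atom coordinates."""
    if len(docked) != len(reference) or not docked:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared_sum = 0.0
    for (xd, yd, zd), (xr, yr, zr) in zip(docked, reference):
        squared_sum += (xd - xr) ** 2 + (yd - yr) ** 2 + (zd - zr) ** 2
    return math.sqrt(squared_sum / len(docked))


# Toy three-atom ligand: the docked pose is shifted 1 unit along z.
reference_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked_pose = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]
print(f"RMSD = {ligand_rmsd(docked_pose, reference_pose):.2f}")
```

A cutoff of about 2 Å is a common convention for calling a docked pose successful, though that threshold is an assumption here and not something the abstract specifies. When no experimental reference structure exists, this calculation is not possible, which is the gap that scoring functions are meant to fill.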

  • 'Basic and Applied Thermogenesis Research' Bridging the Gap.

    Carobbio S, Guénantin AC and Vidal-Puig A

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK; Metabolic Research Laboratories, Addenbrooke's Treatment Centre, Institute of Metabolic Science, Addenbrooke's Hospital, University of Cambridge, Cambridge, UK.

    Obesity is a major health problem without satisfactory pharmacological treatment. A promising strategy is to promote energy dissipation by activating brown/beige adipose tissue. However, for this strategy to succeed it requires improving the transferability amongst cellular, murine, and human systems and bridging the gap between basic and clinical research.

    Funded by: British Heart Foundation: PG/12/53/29714; Medical Research Council: MC_UU_12012/2

    Trends in endocrinology and metabolism: TEM 2018;29;1;5-7

  • In silico guided reconstruction and analysis of ICAM-1-binding var genes from Plasmodium falciparum.

    Carrington E, Otto TD, Szestak T, Lennartz F, Higgins MK, Newbold CI and Craig AG

    Liverpool School of Tropical Medicine, Pembroke Place, Liverpool, L3 5QA, UK.

    The Plasmodium falciparum variant surface antigen PfEMP1 expressed on the surface of infected erythrocytes is thought to play a major role in the pathology of severe malaria. As the sequence pool of the var genes encoding PfEMP1 expands there are opportunities, despite the high degree of sequence diversity demonstrated by this gene family, to reconstruct full-length var genes from small sequence tags generated from patient isolates. To test whether this is possible we have used a set of recently laboratory adapted ICAM-1-binding parasite isolates to generate sequence tags and, from these, to identify the full-length PfEMP1 being expressed by them. In a subset of the strains available we were able to produce validated, full-length var gene sequences and use these to conduct biophysical analyses of the ICAM-1 binding regions.

    Scientific reports 2018;8;1;3282

  • Open Targets Platform: new developments and updates two years on.

    Carvalho-Silva D, Pierleoni A, Pignatelli M, Ong C, Fumis L, Karamanis N, Carmona M, Faulconbridge A, Hercules A, McAuley E, Miranda A, Peat G, Spitzer M, Barrett J, Hulcoop DG, Papa E, Koscielny G and Dunham I

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.

    The Open Targets Platform integrates evidence from genetics, genomics, transcriptomics, drugs, animal models and scientific literature to score and rank target-disease associations for drug target identification. The associations are displayed in an intuitive user interface, and are available through a REST-API and a bulk download. In addition to target-disease associations, we also aggregate and display data at the target and disease levels to aid target prioritisation. Since our first publication two years ago, we have made eight releases, added new data sources for target-disease associations, started including causal genetic variants from non genome-wide targeted arrays, added new target and disease annotations, launched new visualisations and improved existing ones and released a new web tool for batch search of up to 200 targets. We have a new URL for the Open Targets Platform REST-API, new REST endpoints and also removed the need for authorisation for API fair use. Here, we present the latest developments of the Open Targets Platform, expanding the evidence and target-disease associations with new and improved data sources, refining data quality, enhancing website usability, and increasing our user base with our training workshops, user support, social media and bioinformatics forum engagement.

    Nucleic acids research 2018

  • A novel variant in <i>GLIS3</i> is associated with osteoarthritis.

    Casalone E, Tachmazidou I, Zengini E, Hatzikotoulas K, Hackinger S, Suveges D, Steinberg J, Rayner NW, arcOGEN Consortium, Wilkinson JM, Panoutsopoulou K and Zeggini E

    Department of Medical Sciences, University of Turin, Turin, Italy.

    Objectives: Osteoarthritis (OA) is a complex disease, but its genetic aetiology remains poorly characterised. To identify novel susceptibility loci for OA, we carried out a genome-wide association study (GWAS) in individuals from the largest UK-based OA collections to date.

    Methods: We carried out a discovery GWAS in 5414 OA individuals with knee and/or hip total joint replacement (TJR) and 9939 population-based controls. We followed up prioritised variants in OA subjects from the interim release of the UK Biobank resource (up to 12 658 cases and 50 898 controls) and our lead finding in operated OA subjects from the full release of UK Biobank (17 894 cases and 89 470 controls). We investigated its functional implications in methylation, gene expression and proteomics data in primary chondrocytes from 12 pairs of intact and degraded cartilage samples from patients undergoing TJR.

    Results: We detect a genome-wide significant association at rs10116772 with TJR (P=3.7×10<sup>-8</sup>; for allele A: OR (95% CI) 0.97 (0.96 to 0.98)), an intronic variant in <i>GLIS3</i>, which is expressed in cartilage. Variants in strong correlation with rs10116772 have been associated with elevated plasma glucose levels and diabetes.

    Conclusions: We identify a novel susceptibility locus for OA that has been previously implicated in diabetes and glycaemic traits.

    Funded by: Medical Research Council: MC_QA137853; Wellcome Trust

    Annals of the rheumatic diseases 2018;77;4;620-623

  • Multiplexed ChIP-Seq Using Direct Nucleosome Barcoding: A Tool for High-Throughput Chromatin Analysis.

    Chabbert CD, Adjalley SH, Steinmetz LM and Pelechano V

    Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany.

    Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) or microarray hybridization (ChIP-on-chip) are standard methods for the study of transcription factor binding sites and histone chemical modifications. However, these approaches only allow profiling of a single factor or protein modification at a time. In this chapter, we present Bar-ChIP, a higher throughput version of ChIP-Seq that relies on the direct ligation of molecular barcodes to chromatin fragments. Bar-ChIP enables the concurrent profiling of multiple DNA-protein interactions and is therefore amenable to experimental scale-up, without the need for any robotic instrumentation.

    Methods in molecular biology (Clifton, N.J.) 2018;1689;177-194

  • Latin Americans show wide-spread Converso ancestry and imprint of local Native ancestry on physical appearance.

    Chacón-Duque JC, Adhikari K, Fuentes-Guajardo M, Mendoza-Revilla J, Acuña-Alonzo V, Barquera R, Quinto-Sánchez M, Gómez-Valdés J, Everardo Martínez P, Villamil-Ramírez H, Hünemeier T, Ramallo V, Silva de Cerqueira CC, Hurtado M, Villegas V, Granja V, Villena M, Vásquez R, Llop E, Sandoval JR, Salazar-Granara AA, Parolin ML, Sandoval K, Peñaloza-Espinosa RI, Rangel-Villalobos H, Winkler CA, Klitz W, Bravi C, Molina J, Corach D, Barrantes R, Gomes V, Resende C, Gusmão L, Amorim A, Xue Y, Dugoujon JM, Moral P, González-José R, Schuler-Faccini L, Salzano FM, Bortolini MC, Canizales-Quinteros S, Poletti G, Gallo C, Bedoya G, Rothhammer F, Balding D, Hellenthal G and Ruiz-Linares A

    Department of Genetics, Evolution and Environment and UCL Genetics Institute, University College London, London, WC1E 6BT, UK.

    Historical records and genetic analyses indicate that Latin Americans trace their ancestry mainly to the intermixing (admixture) of Native Americans, Europeans and Sub-Saharan Africans. Using novel haplotype-based methods, here we infer sub-continental ancestry in over 6,500 Latin Americans and evaluate the impact of regional ancestry variation on physical appearance. We find that Native American ancestry components in Latin Americans correspond geographically to the present-day genetic structure of Native groups, and that sources of non-Native ancestry, and admixture timings, match documented migratory flows. We also detect South/East Mediterranean ancestry across Latin America, probably stemming mostly from the clandestine colonial migration of Christian converts of non-European origin (Conversos). Furthermore, we find that ancestry related to highland (Central Andean) versus lowland (Mapuche) Natives is associated with variation in facial features, particularly nose morphology, and detect significant differences in allele frequencies between these groups at loci previously associated with nose morphology in this sample.

    Nature communications 2018;9;1;5388

  • Single-Cell (Multi)omics Technologies.

    Chappell L, Russell AJC and Voet T

    Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom

    Single-cell multiomics technologies typically measure multiple types of molecule from the same individual cell, enabling more profound biological insight than can be inferred by analyzing each molecular layer from separate cells. These single-cell multiomics technologies can reveal cellular heterogeneity at multiple molecular layers within a population of cells and reveal how this variation is coupled or uncoupled between the captured omic layers. The data sets generated by these techniques have the potential to enable a deeper understanding of the key biological processes and mechanisms driving cellular heterogeneity and how they are linked with normal development and aging as well as disease etiology. This review details both established and novel single-cell mono- and multiomics technologies and considers their limitations, applications, and likely future developments.

    Funded by: Wellcome Trust: 105031/E/14/Z, 105045/Z/14/Z

    Annual review of genomics and human genetics 2018;19;15-41

  • Gimpute: An efficient genetic data imputation pipeline.

    Chen J, Lippold D, Frank J, Rayner W, Meyer-Lindenberg A and Schwarz E

    Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany.

    Motivation: Genotype imputation is essential for genome-wide association studies (GWAS) to retrieve information of untyped variants and facilitate comparability across studies. However, there is a lack of automated pipelines that perform all required processing steps prior to and following imputation.

    Results: Based on widely used and freely available tools, we have developed Gimpute, an automated processing and imputation pipeline for genome-wide association data. Gimpute includes processing steps for genotype liftOver, quality control, population outlier detection, haplotype pre-phasing, imputation, post imputation, data management and the extension to other existing pipelines.

    Availability: The Gimpute package is an open source R package and is freely available at

    Supplementary information: Supplementary data are available at Bioinformatics online.

    Bioinformatics (Oxford, England) 2018

  • A rapid and robust method for single cell chromatin accessibility profiling.

    Chen X, Miragaia RJ, Natarajan KN and Teichmann SA

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    The assay for transposase-accessible chromatin using sequencing (ATAC-seq) is widely used to identify regulatory regions throughout the genome. However, very few studies have been performed at the single cell level (scATAC-seq) due to technical challenges. Here we developed a simple and robust plate-based scATAC-seq method, combining upfront bulk Tn5 tagging with single-nuclei sorting. We demonstrate that our method works robustly across various systems, including fresh and cryopreserved cells from primary tissues. By profiling over 3000 splenocytes, we identify distinct immune cell types and reveal cell type-specific regulatory regions and related transcription factors.

    Funded by: Wellcome Trust

    Nature communications 2018;9;1;5345

  • Impact of carbohydrate substrate complexity on the diversity of the human colonic microbiota.

    Chung WSF, Walker AW, Vermeiren J, Sheridan PO, Bosscher D, Garcia-Campayo V, Parkhill J, Flint HJ and Duncan SH

    Gut Health Group, Rowett Institute, University of Aberdeen, Foresterhill, Aberdeen, Scotland, AB25 2ZD, UK.

    The diversity of the colonic microbial community has been linked with health in adults and diet composition is one possible determinant of diversity. We used carefully controlled conditions in vitro to determine how the complexity and multiplicity of growth substrates influence species diversity of the human colonic microbiota. In each experiment, five parallel anaerobic fermentors that received identical faecal inocula were supplied continuously with single carbohydrates (either arabinoxylan-oligosaccharides (AXOS), pectin or inulin) or with a '3-mix' of all three carbohydrates, or with a '6-mix' that additionally contained resistant starch, β-glucan and galactomannan as energy sources. Inulin supported less microbial diversity over the first six days than the other two single substrates or the 3- and 6-mixes, showing that substrate complexity is key to influencing microbiota diversity. The communities enriched in these fermentors did not differ greatly at the phylum and family level, but were markedly different at the species level. Certain species were promoted by single substrates, whilst others (such as Bacteroides ovatus, LEfSe p = 0.001) showed significantly greater success with the mixed substrate. The complex polysaccharides such as pectin and arabinoxylan-oligosaccharides promoted greater diversity than simple homopolymers, such as inulin. These findings suggest that dietary strategies intended to achieve health benefits by increasing gut microbiota diversity should employ complex non-digestible substrates and substrate mixtures.

    FEMS microbiology ecology 2018

  • RecQ helicases in the malaria parasite Plasmodium falciparum affect genome stability, gene expression patterns and DNA replication dynamics.

    Claessens A, Harris LM, Stanojcic S, Chappell L, Stanton A, Kuk N, Veneziano-Broccia P, Sterkers Y, Rayner JC and Merrick CJ

    London School of Hygiene and Tropical Medicine, London, United Kingdom.

    The malaria parasite Plasmodium falciparum has evolved an unusual genome structure. The majority of the genome is relatively stable, with mutation rates similar to most eukaryotic species. However, some regions are very unstable with high recombination rates, driving the generation of new immune evasion-associated var genes. The molecular factors controlling the inconsistent stability of this genome are not known. Here we studied the roles of the two putative RecQ helicases in P. falciparum, PfBLM and PfWRN. When PfWRN was knocked down, recombination rates increased four-fold, generating chromosomal abnormalities, a high rate of chimeric var genes and many microindels, particularly in known 'fragile sites'. This is the first identification of a gene involved in suppressing recombination and maintaining genome stability in Plasmodium. By contrast, no change in mutation rate appeared when the second RecQ helicase, PfBLM, was mutated. At the transcriptional level, however, both helicases evidently modulate the transcription of large cohorts of genes, with several hundred genes (including a large proportion of vars) showing deregulated expression in each RecQ mutant. Aberrant processing of stalled replication forks is a possible mechanism underlying elevated mutation rates and this was assessed by measuring DNA replication dynamics in the RecQ mutant lines. Replication forks moved slowly and stalled at elevated rates in both mutants, confirming that RecQ helicases are required for efficient DNA replication. Overall, this work identifies the Plasmodium RecQ helicases as major players in DNA replication, antigenic diversification and genome stability in the most lethal human malaria parasite, with important implications for genome evolution in this pathogen.

    PLoS genetics 2018;14;7;e1007490

  • scNMT-seq enables joint profiling of chromatin accessibility DNA methylation and transcription in single cells.

    Clark SJ, Argelaguet R, Kapourani CA, Stubbs TM, Lee HJ, Alda-Catalinas C, Krueger F, Sanguinetti G, Kelsey G, Marioni JC, Stegle O and Reik W

    Epigenetics Programme, Babraham Institute, Cambridge, CB22 3AT, UK.

    Parallel single-cell sequencing protocols represent powerful methods for investigating regulatory relationships, including epigenome-transcriptome interactions. Here, we report a single-cell method for parallel chromatin accessibility, DNA methylation and transcriptome profiling. scNMT-seq (single-cell nucleosome, methylation and transcription sequencing) uses a GpC methyltransferase to label open chromatin followed by bisulfite and RNA sequencing. We validate scNMT-seq by applying it to differentiating mouse embryonic stem cells, finding links between all three molecular layers and revealing dynamic coupling between epigenomic layers during differentiation.

    Funded by: Wellcome Trust

    Nature communications 2018;9;1;781

  • The genomics of insecticide resistance: insights from recent studies in African malaria vectors.

    Clarkson CS, Temple HJ and Miles A

    Wellcome Sanger Institute, Hinxton CB10 1SA, United Kingdom.

    Over 80% of the world's population is at risk from arthropod-vectored diseases, and arthropod crop pests are a significant threat to food security. Insecticides are our front-line response for controlling these disease vectors and pests, and consequently the increasing prevalence of insecticide resistance is of global concern. Here we provide a brief overview of how genomics can be used to implement effective insecticide resistance management (IRM), with a focus on recent advances in the study of Anopheles gambiae, the major vector of malaria in Africa. These advances unlock the potential for a predictive form of IRM, allowing tractable feedback for stakeholders, where the latest field data and well parameterised models can maximise the lifetime and effectiveness of available insecticides.

    Current opinion in insect science 2018;27;111-115

  • Pneumococcal vaccine impacts on the population genomics of non-typeable Haemophilus influenzae.

    Cleary D, Devine V, Morris D, Osman K, Gladstone R, Bentley S, Faust S and Clarke S

    NIHR Southampton Biomedical Research Centre, University Hospital Southampton Foundation NHS Trust, Southampton, UK.

    The implementation of pneumococcal conjugate vaccines (PCVs) has led to a decline in vaccine-type disease. However, there is evidence that the epidemiology of non-typeable Haemophilus influenzae (NTHi) carriage and disease can be altered as a consequence of PCV introduction. We explored the epidemiological shifts in NTHi carriage using whole genome sequencing over a 5-year period that included PCV13 replacement of PCV7 in the UK's National Immunization Programme in 2010. Between 2008/09 and 2012/13 (October to March), nasopharyngeal swabs were taken from children <5 years of age. Significantly increased carriage post-PCV13 was observed and lineage-specific associations with Streptococcus pneumoniae were seen before but not after PCV13 introduction. NTHi were characterized into 11 discrete, temporally stable lineages, congruent with current knowledge regarding the clonality of NTHi. The increased carriage could not be linked to the expansion of a particular clone and different co-carriage dynamics were seen before PCV13 implementation when NTHi co-carried with vaccine serotype pneumococci. In summary, PCV13 introduction has been shown to have an indirect effect on NTHi epidemiology and there exists both negative and positive, distinct associations between pneumococci and NTHi. This should be considered when evaluating the impacts of pneumococcal vaccine design and policy.

    Microbial genomics 2018

  • Genome-wide analysis of multi- and extensively drug-resistant Mycobacterium tuberculosis.

    Coll F, Phelan J, Hill-Cawthorne GA, Nair MB, Mallard K, Ali S, Abdallah AM, Alghamdi S, Alsomali M, Ahmed AO, Portelli S, Oppong Y, Alves A, Bessa TB, Campino S, Caws M, Chatterjee A, Crampin AC, Dheda K, Furnham N, Glynn JR, Grandjean L, Minh Ha D, Hasan R, Hasan Z, Hibberd ML, Joloba M, Jones-López EC, Matsumoto T, Miranda A, Moore DJ, Mocillo N, Panaiotov S, Parkhill J, Penha C, Perdigão J, Portugal I, Rchiad Z, Robledo J, Sheen P, Shesha NT, Sirgel FA, Sola C, Oliveira Sousa E, Streicher EM, Helden PV, Viveiros M, Warren RM, McNerney R, Pain A and Clark TG

    Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, UK.

    To characterize the genetic determinants of resistance to antituberculosis drugs, we performed a genome-wide association study (GWAS) of 6,465 Mycobacterium tuberculosis clinical isolates from more than 30 countries. A GWAS approach within a mixed-regression framework was followed by a phylogenetics-based test for independent mutations. In addition to mutations in established and recently described resistance-associated genes, novel mutations were discovered for resistance to cycloserine, ethionamide and para-aminosalicylic acid. The capacity to detect mutations associated with resistance to ethionamide, pyrazinamide, capreomycin, cycloserine and para-aminosalicylic acid was enhanced by inclusion of insertions and deletions. Odds ratios for mutations within candidate genes were found to reflect levels of resistance. New epistatic relationships between candidate drug-resistance-associated genes were identified. Findings also suggest the involvement of efflux pumps (drrA and Rv2688c) in the emergence of resistance. This study will inform the design of new diagnostic tests and expedite the investigation of resistance and compensatory epistatic mechanisms.

    Funded by: Medical Research Council: MR/K020420/1; Wellcome Trust: 098610

    Nature genetics 2018;50;2;307-316

  • Dietary trehalose enhances virulence of epidemic Clostridium difficile.

    Collins J, Robinson C, Danhof H, Knetsch CW, van Leeuwen HC, Lawley TD, Auchtung JM and Britton RA

    Baylor College of Medicine, Department of Molecular Virology and Microbiology, One Baylor Plaza, Houston, Texas 77030, USA.

    Clostridium difficile disease has recently increased to become a dominant nosocomial pathogen in North America and Europe, although little is known about what has driven this emergence. Here we show that two epidemic ribotypes (RT027 and RT078) have acquired unique mechanisms to metabolize low concentrations of the disaccharide trehalose. RT027 strains contain a single point mutation in the trehalose repressor that increases the sensitivity of this ribotype to trehalose by more than 500-fold. Furthermore, dietary trehalose increases the virulence of a RT027 strain in a mouse model of infection. RT078 strains acquired a cluster of four genes involved in trehalose metabolism, including a PTS permease that is both necessary and sufficient for growth on low concentrations of trehalose. We propose that the implementation of trehalose as a food additive into the human diet, shortly before the emergence of these two epidemic lineages, helped select for their emergence and contributed to hypervirulence.

    Nature 2018

  • An integrated genomic analysis of anaplastic meningioma identifies prognostic molecular signatures.

    Collord G, Tarpey P, Kurbatova N, Martincorena I, Moran S, Castro M, Nagy T, Bignell G, Maura F, Young MD, Berna J, Tubio JMC, McMurran CE, Young AMH, Sanders M, Noorani I, Price SJ, Watts C, Leipnitz E, Kirsch M, Schackert G, Pearson D, Devadass A, Ram Z, Collins VP, Allinson K, Jenkinson MD, Zakaria R, Syed K, Hanemann CO, Dunn J, McDermott MW, Kirollos RW, Vassiliou GS, Esteller M, Behjati S, Brazma A, Santarius T and McDermott U

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.

    Anaplastic meningioma is a rare and aggressive brain tumor characterised by intractable recurrences and dismal outcomes. Here, we present an integrated analysis of the whole genome, transcriptome and methylation profiles of primary and recurrent anaplastic meningioma. A key finding was the delineation of distinct molecular subgroups that were associated with diametrically opposed survival outcomes. Relative to lower grade meningiomas, anaplastic tumors harbored frequent driver mutations in SWI/SNF complex genes, which were confined to the poor prognosis subgroup. Aggressive disease was further characterised by transcriptional evidence of increased PRC2 activity, stemness and epithelial-to-mesenchymal transition. Our analyses discern biologically distinct variants of anaplastic meningioma with prognostic and therapeutic significance.

    Funded by: Wellcome Trust: WT098051

    Scientific reports 2018;8;1;13537

  • Computational pan-genomics: status, promises and challenges.

    Computational Pan-Genomics Consortium

    Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In the case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example of a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations as graphs. We outline how this and other challenges from different application domains translate into common computational problems, point out relevant bioinformatics techniques and identify open problems in computer science. With this review, we aim to increase awareness that a joint approach to computational pan-genomics can help address many of the problems currently faced in various domains.

    Funded by: NHGRI NIH HHS: U54 HG007990

    Briefings in bioinformatics 2018;19;1;118-135

  • PDX Finder: A portal for patient-derived tumor xenograft model discovery.

    Conte N, Mason JC, Halmagyi C, Neuhauser S, Mosaku A, Yordanova G, Chatzipli A, Begley DA, Krupke DM, Parkinson H, Meehan TF and Bult CC

    European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

    Patient-derived tumor xenograft (PDX) mouse models are a versatile oncology research platform for studying tumor biology and for testing chemotherapeutic approaches tailored to genomic characteristics of individual patients' tumors. PDX models are generated and distributed by a diverse group of academic labs, multi-institution consortia and contract research organizations. The distributed nature of PDX repositories and the use of different metadata standards for describing model characteristics present a significant challenge to identifying PDX models relevant to specific cancer research questions. The Jackson Laboratory and EMBL-EBI are addressing these challenges by co-developing PDX Finder, a comprehensive open global catalog of PDX models and their associated datasets. Within PDX Finder, model attributes are harmonized and integrated using a previously developed community minimal information standard to support consistent searching across the originating resources. Links to repositories are provided from the PDX Finder search results to facilitate model acquisition and/or collaboration. The PDX Finder resource currently contains information for 1985 PDX models of diverse cancers including those from large resources such as the Patient-Derived Models Repository, PDXNet and EurOPDX. Individuals or organizations that generate and distribute PDXs are invited to increase the 'findability' of their models by participating in the PDX Finder initiative at

    Nucleic acids research 2018

  • Formalising recall by genotype as an efficient approach to detailed phenotyping and causal inference.

    Corbin LJ, Tan VY, Hughes DA, Wade KH, Paul DS, Tansey KE, Butcher F, Dudbridge F, Howson JM, Jallow MW, John C, Kingston N, Lindgren CM, O'Donavan M, O'Rahilly S, Owen MJ, Palmer CNA, Pearson ER, Scott RA, van Heel DA, Whittaker J, Frayling T, Tobin MD, Wain LV, Smith GD, Evans DM, Karpe F, McCarthy MI, Danesh J, Franks PW and Timpson NJ

    MRC Integrative Epidemiology Unit at University of Bristol, Bristol, BS8 2BN, UK.

    Detailed phenotyping is required to deepen our understanding of the biological mechanisms behind genetic associations. In addition, the impact of potentially modifiable risk factors on disease requires analytical frameworks that allow causal inference. Here, we discuss the characteristics of Recall-by-Genotype (RbG) as a study design aimed at addressing both these needs. We describe two broad scenarios for the application of RbG: studies using single variants and those using multiple variants. We consider the efficacy and practicality of the RbG approach, provide a catalogue of UK-based resources for such studies and present an online RbG study planner.

    Funded by: Medical Research Council: G0600705, G0902313, MC_PC_15018, MC_UU_12012/1, MC_UU_12012/5, MC_UU_12013/1, MC_UU_12013/3, MC_UU_12013/4, MR/L003120/1, MR/L010305/1, MR/L020149/1; NIDDK NIH HHS: U01 DK105535

    Nature communications 2018;9;1;711

  • Draft Genome Sequences of Two Multidrug-Resistant Salmonella enterica Serovar Typhimurium Clinical Isolates from Uruguay.

    Cordeiro NF, D'Alessandro B, Iriarte A, Pickard D, Yim L, Chabalgoity JA, Betancor L and Vignoli R

    Departamento de Bacteriología y Virología, Instituto de Higiene, Facultad de Medicina, UDELAR, Montevideo, Uruguay.

    Multidrug-resistant Salmonella enterica isolates are an increasing problem worldwide; nevertheless, the mechanisms responsible for such resistance are rarely well defined. Multidrug-resistant S. enterica serovar Typhimurium isolates ST3224 and ST827 were collected from two patients. The characteristics of both genomes and antimicrobial resistance genes were determined using next-generation sequencing.

    Microbiology resource announcements 2018;7;4

  • PPARs and Metabolic Disorders Associated with Challenged Adipose Tissue Plasticity.

    Corrales P, Vidal-Puig A and Medina-Gómez G

    Área de Bioquímica y Biología Molecular, Departamento de Ciencias Básicas de la Salud, Facultad de Ciencias de la Salud, Universidad Rey Juan Carlos, Avda. de Atenas s/n. Alcorcón, 28922 Madrid, Spain.

    Peroxisome proliferator-activated receptors (PPARs) are members of a family of nuclear hormone receptors that exert their transcriptional control on genes harboring PPAR-responsive regulatory elements (PPRE) in partnership with retinoid X receptors (RXR). The activation of PPARs coordinated by specific coactivators/repressors regulates networks of genes controlling diverse homeostatic processes involving inflammation, adipogenesis, lipid metabolism, glucose homeostasis, and insulin resistance. Defects in PPARs have been linked to lipodystrophy, obesity, and insulin resistance as a result of the impairment of adipose tissue expandability and functionality. PPARs can act as lipid sensors, and when optimally activated, can rewire many of the metabolic pathways typically disrupted in obesity leading to an improvement of metabolic homeostasis. PPARs also contribute to the homeostasis of adipose tissue under challenging physiological circumstances, such as pregnancy and aging. Given their potential pathogenic role and their therapeutic potential, the benefits of PPARs activation should not only be considered relevant in the context of energy balance-associated pathologies and insulin resistance but also as potential relevant targets in the context of diabetic pregnancy and changes in body composition and metabolic stress associated with aging. Here, we review the rationale for the optimization of PPAR activation under these conditions.

    International journal of molecular sciences 2018;19;7

  • Sequence variation of Epstein-Barr virus: viral types, geography, codon usage and diseases.

    Correia S, Bridges R, Wegner F, Venturini C, Palser A, Middeldorp JM, Cohen JI, Lorenzetti MA, Bassano I, White RE, Kellam P, Breuer J and Farrell PJ

    Section of Virology, Faculty of Medicine, Norfolk Place, London W2 1PG, UK.

    138 new Epstein-Barr virus (EBV) genome sequences have been determined. 125 of these and 116 from previous reports were combined to produce a multiple sequence alignment of 241 EBV genomes, which we have used to analyze variation within the viral genome. The type 1/type 2 classification of EBV remains the major form of variation and is defined mostly by EBNA2 and EBNA3, but the type 2 SNPs at the EBNA3 locus extend into the adjacent gp350 and gp42 genes, whose products mediate infection of B cells by EBV. A small insertion within the BART miRNA region of the genome was present in 21 EBV strains. EBV from saliva of USA patients with chronic active EBV infection aligned with the wild type EBV genome, with no evidence of WZhet rearrangements. The V3 polymorphism in the Zp promoter for BZLF1 was found to be frequent in nasopharyngeal carcinoma cases both from Hong Kong and Indonesia. Codon usage was found to differ between latent and lytic cycle EBV genes and the main forms of variation of the EBNA1 protein have been identified.

    IMPORTANCE: Epstein-Barr virus causes most cases of infectious mononucleosis and post-transplant lymphoproliferative disease. It contributes to several types of cancer including Hodgkin's lymphoma, Burkitt's lymphoma, diffuse large B cell lymphoma, nasopharyngeal carcinoma and gastric carcinoma. EBV genome variation is important because some of the diseases associated with EBV have very different incidences in different populations and geographic regions - differences in the EBV genome might contribute to these diseases. Some specific EBV genome alterations that appear to be significant in EBV associated cancers are already known and current efforts to make an EBV vaccine and antiviral drugs should also take account of sequence differences in the proteins used as targets.

    Journal of virology 2018

  • Eradication genomics - lessons for parasite control.

    Cotton JA, Berriman M, Dalén L and Barnes I

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Science (New York, N.Y.) 2018;361;6398;130-131

  • Leishmania naiffi and Leishmania guyanensis reference genomes highlight genome structure and gene evolution in the Viannia subgenus.

    Coughlan S, Taylor AS, Feane E, Sanders M, Schonian G, Cotton JA and Downing T

    School of Mathematics, Applied Mathematics and Statistics, National University of Ireland, Galway, Republic of Ireland.

    The unicellular protozoan parasite Leishmania causes the neglected tropical disease leishmaniasis, affecting 12 million people in 98 countries. In South America, where the Viannia subgenus predominates, so far only L. (Viannia) braziliensis and L. (V.) panamensis have been sequenced, assembled and annotated as reference genomes. Addressing this deficit in molecular information can inform species typing, epidemiological monitoring and clinical treatment. Here, L. (V.) naiffi and L. (V.) guyanensis genomic DNA was sequenced to assemble these two genomes as draft references from short sequence reads. The methods used were tested using short sequence reads for L. braziliensis M2904 against its published reference as a comparison. This assembly and annotation pipeline identified 70 additional genes not annotated on the original M2904 reference. Phylogenetic and evolutionary comparisons of L. guyanensis and L. naiffi with 10 other Viannia genomes revealed four traits common to all Viannia: aneuploidy, 22 orthologous groups of genes absent in other Leishmania subgenera, elevated TATE transposon copies and a high NADH-dependent fumarate reductase gene copy number. Within the Viannia, there were limited structural changes in genome architecture specific to individual species: a 45 Kb amplification on chromosome 34 was present in all bar L. lainsoni, L. naiffi had a higher copy number of the virulence factor leishmanolysin, and laboratory isolate L. shawi M8408 had a possible minichromosome derived from the 3' end of chromosome 34. This combination of genome assembly, phylogenetics and comparative analysis across an extended panel of diverse Viannia has uncovered new insights into the origin and evolution of this subgenus and can help improve diagnostics for leishmaniasis surveillance.

    Royal Society open science 2018;5;4;172212

  • Mapping the malaria parasite druggable genome by using in vitro evolution and chemogenomics.

    Cowell AN, Istvan ES, Lukens AK, Gomez-Lorenzo MG, Vanaerschot M, Sakata-Kato T, Flannery EL, Magistrado P, Owen E, Abraham M, LaMonte G, Painter HJ, Williams RM, Franco V, Linares M, Arriaga I, Bopp S, Corey VC, Gnädig NF, Coburn-Flynn O, Reimer C, Gupta P, Murithi JM, Moura PA, Fuchs O, Sasaki E, Kim SW, Teng CH, Wang LT, Akidil A, Adjalley S, Willis PA, Siegel D, Tanaseichuk O, Zhong Y, Zhou Y, Llinás M, Ottilie S, Gamo FJ, Lee MCS, Goldberg DE, Fidock DA, Wirth DF and Winzeler EA

    School of Medicine, University of California San Diego (UCSD), 9500 Gilman Drive, La Jolla, CA 92093, USA.

    Chemogenetic characterization through in vitro evolution combined with whole-genome analysis can identify antimalarial drug targets and drug-resistance genes. We performed a genome analysis of 262 Plasmodium falciparum parasites resistant to 37 diverse compounds. We found 159 gene amplifications and 148 nonsynonymous changes in 83 genes associated with drug-resistance acquisition, where gene amplifications contributed to one-third of resistance acquisition events. Beyond confirming previously identified multidrug-resistance mechanisms, we discovered hitherto unrecognized drug target-inhibitor pairs, including thymidylate synthase and a benzoquinazolinone, farnesyltransferase and a pyrimidinedione, and a dipeptidylpeptidase and an arylurea. This exploration of the P. falciparum resistome and druggable genome will likely guide drug discovery and structural biology efforts, while also advancing our understanding of resistance mechanisms available to the malaria parasite.

    Funded by: NIAID NIH HHS: R01 AI050234, R01 AI090141, R01 AI099105, R01 AI103058, R37 AI050234, T32 AI007036; NIGMS NIH HHS: P50 GM085764, T32 GM007198, T32 GM008666

    Science (New York, N.Y.) 2018;359;6372;191-199

  • Transposon Insertion Sequencing Elucidates Novel Gene Involvement in Susceptibility and Resistance to Phages T4 and T7 in Escherichia coli O157.

    Cowley LA, Low AS, Pickard D, Boinett CJ, Dallman TJ, Day M, Perry N, Gally DL, Parkhill J, Jenkins C and Cain AK

    Gastrointestinal Bacterial Reference Unit, Public Health England, London, United Kingdom.

    Experiments using bacteriophage (phage) to infect bacterial strains have helped define some basic genetic concepts in microbiology, but our understanding of the complexity of bacterium-phage interactions is still limited. As the global threat of antibiotic resistance continues to increase, phage therapy has reemerged as an attractive alternative or supplement to treating antibiotic-resistant bacterial infections. Further, the long-used method of phage typing to classify bacterial strains is being replaced by molecular genetic techniques. Thus, there is a growing need for a complete understanding of the precise molecular mechanisms underpinning phage-bacterium interactions to optimize phage therapy for the clinic as well as for retrospectively interpreting phage typing data on the molecular level. In this study, a genomics-based fitness assay (TraDIS) was used to identify all host genes involved in phage susceptibility and resistance for a T4 phage infecting Shiga-toxigenic Escherichia coli O157. The TraDIS results identified both established and previously unidentified genes involved in phage infection, and a subset were confirmed by site-directed mutagenesis and phenotypic testing of 14 T4 and 2 T7 phages. For the first time, the entire sap operon was implicated in phage susceptibility and, conversely, the stringent starvation protein A gene (sspA) was shown to provide phage resistance. Identifying genes involved in phage infection and replication should facilitate the selection of bespoke phage combinations to target specific bacterial pathogens.

    IMPORTANCE: Antibiotic resistance has diminished treatment options for many common bacterial infections. Phage therapy is an alternative option that was once popularly used across Europe to kill bacteria within humans. Phage therapy acts by using highly specific viruses (called phages) that infect and lyse certain bacterial species to treat the infection. Whole-genome sequencing has allowed modernization of the investigations into phage-bacterium interactions. Here, using E. coli O157 and T4 bacteriophage as a model, we have exploited a genome-wide fitness assay to investigate all genes involved in defining phage resistance or susceptibility. This knowledge of the genetic determinants of phage resistance and susceptibility can be used to design bespoke phage combinations targeted to specific bacterial infections for successful infection eradication.

    mBio 2018;9;4

  • The Contribution of Genetic Variation of Streptococcus Pneumoniae to the Clinical Manifestation of Invasive Pneumococcal Disease.

    Cremers AJH, Mobegi FM, van der Gaast-de Jongh C, van Weert M, van Opzeeland FJ, Vehkala M, Knol MJ, Bootsma HJ, Välimäki N, Croucher NJ, Meis JF, Bentley S, van Hijum SAFT, Corander J, Zomer AL, Ferwerda G and de Jonge MI

    Laboratory of Pediatric Infectious Diseases, Radboudumc, Nijmegen, the Netherlands.

    Background: Different clinical manifestations of invasive pneumococcal disease (IPD) have thus far mainly been explained by patient characteristics. Here we studied the contribution of pneumococcal genetic variation to IPD phenotype.

    Methods: The index cohort consisted of 349 patients admitted to two Dutch hospitals between 2000 and 2011 with pneumococcal bacteraemia. We performed genome-wide association studies to identify pneumococcal lineages, genes and allelic variants associated with 23 clinical IPD phenotypes. The identified associations were validated in a nationwide (n=482) and a post-pneumococcal vaccination cohort (n=121). The contribution of confirmed pneumococcal genotypes to the clinical IPD phenotype, relative to known clinical predictors, was tested by regression analysis.

    Results: Among IPD patients, the presence of pneumococcal gene slaA was a nationwide confirmed independent predictor of meningitis (OR=10.5, p=0.001, 5% presence), as was sequence cluster 9 (predominant serotype 7F, OR=3.68, p=0.057, 11% presence). A set of 4 pneumococcal genes co-located on a prophage was a confirmed independent predictor of 30-day mortality (OR=3.4, p=0.003, 48% presence). We could detect the pneumococcal variants of concern in these patients' blood samples.

    Conclusions: In this study, knowledge of pneumococcal genotypic variants improved the clinical risk assessment for detrimental manifestations of IPD. This provides us with novel opportunities to target, anticipate or avert the pathogenic effects related to particular pneumococcal variants, and indicates that information on pneumococcal genotype is important for the diagnostic and treatment strategy in IPD. Ongoing surveillance is warranted to monitor the clinical value of information on pneumococcal variants in dynamic microbial and susceptible host populations.

    Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 2018

  • The evolutionary landscape of colorectal tumorigenesis.

    Cross W, Kovac M, Mustonen V, Temko D, Davis H, Baker AM, Biswas S, Arnold R, Chegwidden L, Gatenbee C, Anderson AR, Koelzer VH, Martinez P, Jiang X, Domingo E, Woodcock DJ, Feng Y, Kovacova M, Maughan T, S:CORT Consortium, Jansen M, Rodriguez-Justo M, Ashraf S, Guy R, Cunningham C, East JE, Wedge DC, Wang LM, Palles C, Heinimann K, Sottoriva A, Leedham SJ, Graham TA and Tomlinson IPM

    Evolution and Cancer Laboratory, Barts Cancer Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK.

    The evolutionary events that cause colorectal adenomas (benign) to progress to carcinomas (malignant) remain largely undetermined. Using multi-region genome and exome sequencing of 24 benign and malignant colorectal tumours, we investigate the evolutionary fitness landscape occupied by these neoplasms. Unlike carcinomas, advanced adenomas frequently harbour sub-clonal driver mutations (considered to be functionally important in the carcinogenic process) that have not swept to fixation, and have relatively high genetic heterogeneity. Carcinomas are distinguished from adenomas by widespread aneusomies that are usually clonal and often accrue in a 'punctuated' fashion. We conclude that adenomas evolve across an undulating fitness landscape, whereas carcinomas occupy a sharper fitness peak, probably owing to stabilizing selection.

    Nature ecology & evolution 2018

  • Pneumococcal Vaccines: Host Interactions, Population Dynamics, and Design Principles.

    Croucher NJ, Løchen A and Bentley SD

    Department of Infectious Disease Epidemiology, Imperial College London, London W2 1PG, United Kingdom.

    Streptococcus pneumoniae (the pneumococcus) is a nasopharyngeal commensal and respiratory pathogen. Most isolates express a capsule, the species-wide diversity of which has been immunologically classified into ∼100 serotypes. Capsule polysaccharides have been combined into multivalent vaccines widely used in adults, but the T cell independence of the antibody response means they are not protective in infants. Polysaccharide conjugate vaccines (PCVs) trigger a T cell-dependent response through attaching a carrier protein to capsular polysaccharides. The immune response stimulated by PCVs in infants inhibits carriage of vaccine serotypes (VTs), resulting in population-wide herd immunity. These were replaced in carriage by non-VTs. Nevertheless, PCVs drove reductions in infant pneumococcal disease, due to the lower mean invasiveness of the postvaccination bacterial population; age-varying serotype invasiveness resulted in a smaller reduction in adult disease. Alternative vaccines being tested in trials are designed to provide species-wide protection through stimulating innate and cellular immune responses, alongside antibodies to conserved antigens.

    Annual review of microbiology 2018;72;521-549

  • Preclinical Development of a Novel, Orally-Administered Anti-Tumour Necrosis Factor Domain Antibody for the Treatment of Inflammatory Bowel Disease.

    Crowe JS, Roberts KJ, Carlton TM, Maggiore L, Cubitt MF, Clare S, Harcourt K, Reckless J, MacDonald TT, Ray KP, Vossenkämper A and West MR

    VHsquared Ltd., 1 Lower Court, Copley Hill, Cambridge Road, Babraham, Cambridge, CB22 3GN, UK.

    TNFα is an important cytokine in inflammatory bowel disease. V565 is a novel anti-TNFα domain antibody developed for oral administration in IBD patients, derived from a llama domain antibody and engineered to enhance intestinal protease resistance. V565 activity was evaluated in TNFα-TNFα receptor-binding ELISAs as well as TNFα responsive cellular assays and demonstrated neutralisation of both soluble and membrane TNFα with potencies similar to those of adalimumab. Although sensitive to pepsin, V565 retained activity after lengthy incubations with trypsin, chymotrypsin, and pancreatin, as well as mouse small intestinal and human ileal and faecal supernatants. In orally dosed naïve and DSS colitis mice, high V565 concentrations were observed in intestinal contents and faeces and immunostaining revealed V565 localisation in mouse colon tissue. V565 was detected by ELISA in post-dose serum of colitis mice, but not naïve mice, demonstrating penetration of disrupted epithelium. In an ex vivo human IBD tissue culture model, V565 inhibition of tissue phosphoprotein levels and production of inflammatory cytokine biomarkers was similar to infliximab, demonstrating efficacy when present at the disease site. Taken together, results of these studies provide confidence that oral V565 dosing will be therapeutic in IBD patients where the mucosal epithelial barrier is compromised.

    Funded by: Wellcome Trust

    Scientific reports 2018;8;1;4941

  • Fluoroquinolone resistance in Salmonella: insights by whole-genome sequencing.

    Cuypers WL, Jacobs J, Wong V, Klemm EJ, Deborggraeve S and Van Puyvelde S

    Department of Mathematics and Computer Science, University of Antwerp, Antwerpen, Belgium.

    Fluoroquinolone (FQ)-resistant Salmonella spp. were listed by the WHO in 2017 as priority pathogens for which new antibiotics were urgently needed. The overall global burden of Salmonella infections is high, but differs per region. Whereas typhoid fever is most prevalent in South and South-East Asia, non-typhoidal salmonellosis is prevalent across the globe and associated with a mild gastroenteritis. By contrast, invasive non-typhoidal Salmonella cause bloodstream infections associated with high mortality, particularly in sub-Saharan Africa. Most Salmonella strains from clinical sources are resistant to first-line antibiotics, with FQs now being the antibiotic of choice for treatment of invasive Salmonella infections. However, FQ resistance is increasingly being reported in Salmonella, and multiple molecular mechanisms are already described. Whole-genome sequencing (WGS) is becoming more frequently used to analyse bacterial genomes for antibiotic-resistance markers, and to understand the phylogeny of bacteria in relation to their antibiotic-resistance profiles. This mini-review provides an overview of FQ resistance in Salmonella, guided by WGS studies that demonstrate that WGS is a valuable tool for global surveillance.

    Microbial genomics 2018

  • A novel prophage identified in strains from Salmonella enterica serovar Enteritidis is a phylogenetic signature of the lineage ST-1974.

    D'Alessandro B, Pérez Escanda V, Balestrazzi L, Iriarte A, Pickard D, Yim L, Chabalgoity JA and Betancor L

    Instituto de Higiene, Facultad de Medicina, UDELAR, Montevideo, Uruguay.

    Salmonella enterica serovar Enteritidis is a major agent of foodborne diseases worldwide. In Uruguay, this serovar was almost negligible until the mid 1990s but since then it has become the most prevalent. Previously, we characterized a collection of strains isolated from 1988 to 2005 and found that the two oldest strains were the most genetically divergent. In order to further characterize these strains, we sequenced and annotated eight genomes including those of the two oldest isolates. We report on the identification and characterization of a novel 44 kbp Salmonella prophage found exclusively in these two genomes. Sequence analysis reveals that the prophage is a mosaic, with homologous regions in different Salmonella prophages. It contains 60 coding sequences, including two genes, gogB and sseK3, involved in virulence and modulation of host immune response. Analysis of serovar Enteritidis genomes available in public databases confirmed that this prophage is absent in most of them, with the exception of a group of 154 genomes. All 154 strains carrying this prophage belong to the same sequence type (ST-1974), suggesting that its acquisition occurred in a common ancestor. We tested this by phylogenetic analysis of 203 genomes representative of the intraserovar diversity. The ST-1974 forms a distinctive monophyletic lineage, and the newly described prophage is a phylogenetic signature of this lineage that could be used as a molecular marker. The phylogenetic analysis also shows that the major ST (ST-11) is polyphyletic and might have given rise to almost all other STs, including ST-1974.

    Microbial genomics 2018

  • Fine-mapping of prostate cancer susceptibility loci in a large meta-analysis identifies candidate causal variants.

    Dadaev T, Saunders EJ, Newcombe PJ, Anokian E, Leongamornlert DA, Brook MN, Cieza-Borrella C, Mijuskovic M, Wakerell S, Olama AAA, Schumacher FR, Berndt SI, Benlloch S, Ahmed M, Goh C, Sheng X, Zhang Z, Muir K, Govindasami K, Lophatananon A, Stevens VL, Gapstur SM, Carter BD, Tangen CM, Goodman P, Thompson IM, Batra J, Chambers S, Moya L, Clements J, Horvath L, Tilley W, Risbridger G, Gronberg H, Aly M, Nordström T, Pharoah P, Pashayan N, Schleutker J, Tammela TLJ, Sipeky C, Auvinen A, Albanes D, Weinstein S, Wolk A, Hakansson N, West C, Dunning AM, Burnet N, Mucci L, Giovannucci E, Andriole G, Cussenot O, Cancel-Tassin G, Koutros S, Freeman LEB, Sorensen KD, Orntoft TF, Borre M, Maehle L, Grindedal EM, Neal DE, Donovan JL, Hamdy FC, Martin RM, Travis RC, Key TJ, Hamilton RJ, Fleshner NE, Finelli A, Ingles SA, Stern MC, Rosenstein B, Kerns S, Ostrer H, Lu YJ, Zhang HW, Feng N, Mao X, Guo X, Wang G, Sun Z, Giles GG, Southey MC, MacInnis RJ, FitzGerald LM, Kibel AS, Drake BF, Vega A, Gómez-Caamaño A, Fachal L, Szulkin R, Eklund M, Kogevinas M, Llorca J, Castaño-Vinyals G, Penney KL, Stampfer M, Park JY, Sellers TA, Lin HY, Stanford JL, Cybulski C, Wokolorczyk D, Lubinski J, Ostrander EA, Geybels MS, Nordestgaard BG, Nielsen SF, Weisher M, Bisbjerg R, Røder MA, Iversen P, Brenner H, Cuk K, Holleczek B, Maier C, Luedeke M, Schnoeller T, Kim J, Logothetis CJ, John EM, Teixeira MR, Paulo P, Cardoso M, Neuhausen SL, Steele L, Ding YC, De Ruyck K, De Meerleer G, Ost P, Razack A, Lim J, Teo SH, Lin DW, Newcomb LF, Lessel D, Gamulin M, Kulis T, Kaneva R, Usmani N, Slavov C, Mitev V, Parliament M, Singhal S, Claessens F, Joniau S, Van den Broeck T, Larkin S, Townsend PA, Aukim-Hastie C, Gago-Dominguez M, Castelao JE, Martinez ME, Roobol MJ, Jenster G, van Schaik RHN, Menegaux F, Truong T, Koudou YA, Xu J, Khaw KT, Cannon-Albright L, Pandha H, Michael A, Kierzek A, Thibodeau SN, McDonnell SK, Schaid DJ, Lindstrom S, Turman C, Ma J, Hunter DJ, Riboli E, Siddiq A, Canzian F, 
    Kolonel LN, Le Marchand L, Hoover RN, Machiela MJ, Kraft P, PRACTICAL (Prostate Cancer Association Group to Investigate Cancer-Associated Alterations in the Genome) Consortium, Freedman M, Wiklund F, Chanock S, Henderson BE, Easton DF, Haiman CA, Eeles RA, Conti DV and Kote-Jarai Z

    The Institute of Cancer Research, London, SW7 3RP, UK.

    Prostate cancer is a polygenic disease with a large heritable component. A number of common, low-penetrance prostate cancer risk loci have been identified through GWAS. Here we apply the Bayesian multivariate variable selection algorithm JAM to fine-map 84 prostate cancer susceptibility loci, using summary data from a large European ancestry meta-analysis. We observe evidence for multiple independent signals at 12 regions and 99 risk signals overall. Only 15 original GWAS tag SNPs remain among the catalogue of candidate variants identified; the remainder are replaced by more likely candidates. Biological annotation of our credible set of variants indicates significant enrichment within promoter and enhancer elements, and transcription factor-binding sites, including AR, ERG and FOXA1. In 40 regions at least one variant is colocalised with an eQTL in prostate cancer tissue. The refined set of candidate variants substantially increases the proportion of familial relative risk explained by these known susceptibility regions, which highlights the importance of fine-mapping studies and has implications for clinical risk profiling.

    Funded by: NCI NIH HHS: K07 CA187546

    Nature communications 2018;9;1;2256

  • 137 ancient human genomes from across the Eurasian steppes.

    Damgaard PB, Marchi N, Rasmussen S, Peyrot M, Renaud G, Korneliussen T, Moreno-Mayar JV, Pedersen MW, Goldberg A, Usmanova E, Baimukhanov N, Loman V, Hedeager L, Pedersen AG, Nielsen K, Afanasiev G, Akmatov K, Aldashev A, Alpaslan A, Baimbetov G, Bazaliiskii VI, Beisenov A, Boldbaatar B, Boldgiv B, Dorzhu C, Ellingvag S, Erdenebaatar D, Dajani R, Dmitriev E, Evdokimov V, Frei KM, Gromov A, Goryachev A, Hakonarson H, Hegay T, Khachatryan Z, Khaskhanov R, Kitov E, Kolbina A, Kubatbek T, Kukushkin A, Kukushkin I, Lau N, Margaryan A, Merkyte I, Mertz IV, Mertz VK, Mijiddorj E, Moiyesev V, Mukhtarova G, Nurmukhanbetov B, Orozbekova Z, Panyushkina I, Pieta K, Smrčka V, Shevnina I, Logvin A, Sjögren KG, Štolcová T, Taravella AM, Tashbaeva K, Tkachev A, Tulegenov T, Voyakin D, Yepiskoposyan L, Undrakhbold S, Varfolomeev V, Weber A, Wilson Sayres MA, Kradin N, Allentoft ME, Orlando L, Nielsen R, Sikora M, Heyer E, Kristiansen K and Willerslev E

    Center for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark.

    For thousands of years the Eurasian steppes have been a centre of human migrations and cultural change. Here we sequence the genomes of 137 ancient humans (about 1× average coverage), covering a period of 4,000 years, to understand the population history of the Eurasian steppes after the Bronze Age migrations. We find that the genetics of the Scythian groups that dominated the Eurasian steppes throughout the Iron Age were highly structured, with diverse origins comprising Late Bronze Age herders, European farmers and southern Siberian hunter-gatherers. Later, Scythians admixed with the eastern steppe nomads who formed the Xiongnu confederations, and moved westward in about the second or third century BC, forming the Hun traditions in the fourth-fifth century AD, and carrying with them plague that was basal to the Justinian plague. These nomads were further admixed with East Asian groups during several short-term khanates in the Medieval period. These historical events transformed the Eurasian steppes from being inhabited by Indo-European speakers of largely West Eurasian ancestry to the mostly Turkic-speaking groups of the present day, who are primarily of East Asian ancestry.

    Nature 2018;557;7705;369-374

  • Amino acid residues in five separate HLA genes can explain most of the known associations between the MHC and primary biliary cholangitis.

    Darlay R, Ayers KL, Mells GF, Hall LS, Liu JZ, Almarri MA, Alexander GJ, Jones DE, Sandford RN, Anderson CA and Cordell HJ

    Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom.

    Primary Biliary Cholangitis (PBC) is a chronic autoimmune liver disease characterised by progressive destruction of intrahepatic bile ducts. The strongest genetic association is with HLA-DQA1*04:01, but at least three additional independent HLA haplotypes contribute to susceptibility. We used dense single nucleotide polymorphism (SNP) data in 2861 PBC cases and 8514 controls to impute classical HLA alleles and amino acid polymorphisms using state-of-the-art methodologies. We then demonstrated through stepwise regression that association in the HLA region can be largely explained by variation at five separate amino acid positions. Three-dimensional modelling of protein structures and calculation of electrostatic potentials for the implicated HLA alleles/amino acid substitutions demonstrated a correlation between the electrostatic potential of pocket P6 in HLA-DP molecules and the HLA-DPB1 alleles/amino acid substitutions conferring PBC susceptibility/protection, highlighting potential new avenues for future functional investigation.

    PLoS genetics 2018;14;12;e1007833

  • NOTCH1 represses MCL-1 levels in GSI-resistant T-ALL, making them susceptible to ABT-263.

    Dastur A, Choi A, Costa C, Yin X, Williams AF, McClanaghan JD, Greenberg M, Roderick JE, Patel NU, Boisvert JL, McDermott U, Garnett MJ, Almenara J, Grant S, Rizzo K, Engelman JA, Kelliher MA, Faber AC and Benes CH

    Cancer Center, Massachusetts General Hospital.

    Purpose: Effective targeted therapies are lacking for refractory and relapsed T-cell Acute Lymphoblastic Leukemia (T-ALL). Suppression of the NOTCH pathway using gamma-secretase inhibitors (GSIs) is toxic and clinically not effective. The goal of this study was to identify alternative therapeutic strategies for T-ALL.

    Experimental design: We performed a comprehensive analysis of our high throughput drug screen across hundreds of human cell lines including fifteen T-ALL models. We validated and further studied the top hit, navitoclax (ABT-263). We used multiple human T-ALL cell lines as well as primary patient samples, and performed both in vitro experiments and in vivo studies on patient-derived xenograft models.

    Results: We found that T-ALL are hypersensitive to navitoclax, an inhibitor of the BCL2 family of anti-apoptotic proteins. Importantly, GSI-resistant T-ALL are also susceptible to navitoclax. Sensitivity to navitoclax is due to low levels of MCL-1 in T-ALL. We identify an unsuspected regulation of mTORC1 by the NOTCH pathway, resulting in increased MCL-1 upon GSI treatment. Finally, we show that pharmacological inhibition of mTORC1 lowers MCL-1 levels and further sensitizes cells to navitoclax in vitro and leads to tumor regressions in vivo.

    Conclusions: Our results support the development of navitoclax, as a single agent and in combination with mTOR inhibitors, as a new therapeutic strategy for T-ALL, including in the setting of GSI resistance.

    Clinical cancer research : an official journal of the American Association for Cancer Research 2018

  • Spatial structuring of a Legionella pneumophila population within the water system of a large occupational building.

    David S, Mentasti M, Lai S, Vaghji L, Ready D, Chalker VJ and Parkhill J

    Pathogen Genomics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.

    The diversity of Legionella pneumophila populations within single water systems is not well understood, particularly in those unassociated with cases of Legionnaires' disease. Here, we performed genomic analysis of 235 L. pneumophila isolates obtained from 28 water samples in 13 locations within a large occupational building. Despite regular treatment, the water system of this building is thought to have been colonized by L. pneumophila for at least 30 years without evidence of association with Legionnaires' disease cases. All isolates belonged to one of three sequence types (STs), ST27 (n=81), ST68 (n=122) and ST87 (n=32), all three of which have been recovered from Legionnaires' disease patients previously. Pairwise single nucleotide polymorphism differences amongst isolates of the same ST were low, ranging from 0 to 19 in ST27, from 0 to 30 in ST68 and from 0 to 7 in ST87, and no homologous recombination was observed in any lineage. However, there was evidence of horizontal transfer of a plasmid, which was found in all ST87 isolates and only one ST68 isolate. A single ST was found in 10/13 sampled locations, and isolates of each ST were also more similar to those from the same location compared with those from different locations, demonstrating spatial structuring of the population within the water system. These findings provide the first insights into the diversity and genomic evolution of a L. pneumophila population within a complex water system not associated with disease.

    Funded by: Wellcome Trust: 098051

    Microbial genomics 2018;4;10

  • Low genomic diversity of Legionella pneumophila within clinical specimens.

    David S, Mentasti M, Parkhill J and Chalker VJ

    Centre for Genomic Pathogen Surveillance, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom.

    Objectives: Legionella pneumophila is the leading cause of Legionnaires' disease, a severe form of pneumonia acquired from environmental sources. Investigations of both sporadic cases and outbreaks rely mostly on analysis of one or a few colony picks isolated from each patient. However, because of the lack of data describing diversity within single patients, the optimal number of picks is unknown. Here, we investigated diversity within individual patients using sequence-based typing (SBT) and whole-genome sequencing (WGS).

    Methods: Ten isolates of L. pneumophila were obtained from each of ten epidemiologically unrelated patients. SBT and WGS were undertaken, and single-nucleotide polymorphisms (SNPs) were identified between isolates from the same patient.

    Results: The same sequence type (ST) was obtained for each set of ten isolates. Using genomic analysis, zero SNPs were identified between isolates from seven patients, a maximum of one SNP was found between isolates from two patients, and a maximum of two SNPs was found amongst isolates from one patient. Assuming that the full within-host diversity has been captured with ten isolates, statistical analyses showed that, on average, analysis of one isolate would yield a 70% chance of capturing all observed genotypes, and seven isolates would yield a 90% chance.

    Conclusions: SBT and WGS analyses of multiple colony picks obtained from ten patients showed no, or very low, within-host genomic diversity in L. pneumophila, suggesting that analysis of one colony pick per patient will often be sufficient to obtain reliable typing data to aid investigation of cases of Legionnaires' disease.

    Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases 2018;24;9;1020.e1-1020.e4

  • Study of 300,486 individuals identifies 148 independent genetic loci influencing general cognitive function.

    Davies G, Lam M, Harris SE, Trampush JW, Luciano M, Hill WD, Hagenaars SP, Ritchie SJ, Marioni RE, Fawns-Ritchie C, Liewald DCM, Okely JA, Ahola-Olli AV, Barnes CLK, Bertram L, Bis JC, Burdick KE, Christoforou A, DeRosse P, Djurovic S, Espeseth T, Giakoumaki S, Giddaluru S, Gustavson DE, Hayward C, Hofer E, Ikram MA, Karlsson R, Knowles E, Lahti J, Leber M, Li S, Mather KA, Melle I, Morris D, Oldmeadow C, Palviainen T, Payton A, Pazoki R, Petrovic K, Reynolds CA, Sargurupremraj M, Scholz M, Smith JA, Smith AV, Terzikhan N, Thalamuthu A, Trompet S, van der Lee SJ, Ware EB, Windham BG, Wright MJ, Yang J, Yu J, Ames D, Amin N, Amouyel P, Andreassen OA, Armstrong NJ, Assareh AA, Attia JR, Attix D, Avramopoulos D, Bennett DA, Böhmer AC, Boyle PA, Brodaty H, Campbell H, Cannon TD, Cirulli ET, Congdon E, Conley ED, Corley J, Cox SR, Dale AM, Dehghan A, Dick D, Dickinson D, Eriksson JG, Evangelou E, Faul JD, Ford I, Freimer NA, Gao H, Giegling I, Gillespie NA, Gordon SD, Gottesman RF, Griswold ME, Gudnason V, Harris TB, Hartmann AM, Hatzimanolis A, Heiss G, Holliday EG, Joshi PK, Kähönen M, Kardia SLR, Karlsson I, Kleineidam L, Knopman DS, Kochan NA, Konte B, Kwok JB, Le Hellard S, Lee T, Lehtimäki T, Li SC, Liu T, Koini M, London E, Longstreth WT, Lopez OL, Loukola A, Luck T, Lundervold AJ, Lundquist A, Lyytikäinen LP, Martin NG, Montgomery GW, Murray AD, Need AC, Noordam R, Nyberg L, Ollier W, Papenberg G, Pattie A, Polasek O, Poldrack RA, Psaty BM, Reppermund S, Riedel-Heller SG, Rose RJ, Rotter JI, Roussos P, Rovio SP, Saba Y, Sabb FW, Sachdev PS, Satizabal CL, Schmid M, Scott RJ, Scult MA, Simino J, Slagboom PE, Smyrnis N, Soumaré A, Stefanis NC, Stott DJ, Straub RE, Sundet K, Taylor AM, Taylor KD, Tzoulaki I, Tzourio C, Uitterlinden A, Vitart V, Voineskos AN, Kaprio J, Wagner M, Wagner H, Weinhold L, Wen KH, Widen E, Yang Q, Zhao W, Adams HHH, Arking DE, Bilder RM, Bitsios P, Boerwinkle E, Chiba-Falek O, Corvin A, De Jager PL, Debette S, Donohoe G, Elliott P, 
    Fitzpatrick AL, Gill M, Glahn DC, Hägg S, Hansell NK, Hariri AR, Ikram MK, Jukema JW, Vuoksimaa E, Keller MC, Kremen WS, Launer L, Lindenberger U, Palotie A, Pedersen NL, Pendleton N, Porteous DJ, Räikkönen K, Raitakari OT, Ramirez A, Reinvang I, Rudan I, Rujescu D, Schmidt R, Schmidt H, Schofield PW, Schofield PR, Starr JM, Steen VM, Trollor JN, Turner ST, Van Duijn CM, Villringer A, Weinberger DR, Weir DR, Wilson JF, Malhotra A, McIntosh AM, Gale CR, Seshadri S, Mosley TH, Bressler J, Lencz T and Deary IJ

    Centre for Cognitive Ageing and Cognitive Epidemiology, Department of Psychology, School of Philosophy, Psychology and Language Sciences, The University of Edinburgh, Edinburgh, EH8 9JZ, UK.

    General cognitive function is a prominent and relatively stable human trait that is associated with many important life outcomes. We combine cognitive and genetic data from the CHARGE and COGENT consortia, and UK Biobank (total N = 300,486; age 16-102) and find 148 genome-wide significant independent loci (P < 5 × 10⁻⁸) associated with general cognitive function. Within the novel genetic loci are variants associated with neurodegenerative and neurodevelopmental disorders, physical and psychiatric illnesses, and brain structure. Gene-based analyses find 709 genes associated with general cognitive function. Expression levels across the cortex are associated with general cognitive function. Using polygenic scores, up to 4.3% of variance in general cognitive function is predicted in independent samples. We detect significant genetic overlap between general cognitive function, reaction time, and many health variables including eyesight, hypertension, and longevity. In conclusion we identify novel genetic loci and pathways contributing to the heritability of general cognitive function.

    Funded by: NIA NIH HHS: R01 AG033193, U01 AG049505, U01 AG052409

    Nature communications 2018;9;1;2098

  • New highly diverse hepatitis C strains detected in sub-Saharan Africa have unknown susceptibility to direct-acting antiviral treatments.

    Davis C, Mgomella GS, Filipe ADS, Frost EH, Giroux G, Hughes J, Hogan C, Kaleebu P, Asiki G, McLauchlan J, Niebel M, Ocama P, Pomila C, Pybus OG, Pépin J, Simmonds P, Singer JB, Sreenu VB, Wekesa C, Young EH, Murphy DG, Sandhu M and Thomson EC

    MRC-University of Glasgow Centre for Virus Research, 464 Bearsden Road, Glasgow, UK, G61 1QH.

    Background and rationale for the study: The global plan to eradicate hepatitis C (HCV) led by the World Health Organisation (WHO) outlines the use of highly effective direct-acting antiviral drugs (DAAs) to achieve elimination by 2030. Identifying individuals with active disease and investigating the breadth of diversity of the virus in sub-Saharan Africa (SSA) is essential, as genotypes in this region (where very few clinical trials have been carried out) are distinct from those found in other parts of the world. We undertook a population-based nested case-control study in Uganda and obtained additional samples from the Democratic Republic of Congo (DRC), to estimate the prevalence of HCV, assess strategies for disease detection using serological and molecular techniques, and characterise genetic diversity of the virus. Using next-generation sequencing (NGS) and Sanger sequencing, we aimed to identify strains circulating in East and Central Africa.

    Main results: A total of 7751 Ugandan patients were initially screened for HCV and 20 PCR-positive samples were obtained for sequencing. Serological assays were found to vary significantly in specificity for HCV. HCV strains detected in Uganda included genotypes (g) 4k, 4p, 4q and 4s and a new unassigned genotype 7 HCV strain. Two additional unassigned g7 strains were identified in patients originating from DRC (one partial and one full ORF sequence). These g4 and g7 strains contain NS3 and NS5A polymorphisms associated with resistance to DAAs in other genotypes. Clinical studies are therefore indicated to investigate treatment response in infected patients.

    Conclusion: While HCV prevalence and genotypes have been well characterised in patients in well-resourced countries, clinical trials are urgently required in SSA where highly diverse g4 and g7 strains circulate.

    Hepatology (Baltimore, Md.) 2018

  • The first horse herders and the impact of early Bronze Age steppe expansions into Asia.

    de Barros Damgaard P, Martiniano R, Kamm J, Moreno-Mayar JV, Kroonen G, Peyrot M, Barjamovic G, Rasmussen S, Zacho C, Baimukhanov N, Zaibert V, Merz V, Biddanda A, Merz I, Loman V, Evdokimov V, Usmanova E, Hemphill B, Seguin-Orlando A, Yediay FE, Ullah I, Sjögren KG, Iversen KH, Choin J, de la Fuente C, Ilardo M, Schroeder H, Moiseyev V, Gromov A, Polyakov A, Omura S, Senyurt SY, Ahmad H, McKenzie C, Margaryan A, Hameed A, Samad A, Gul N, Khokhar MH, Goriunova OI, Bazaliiskii VI, Novembre J, Weber AW, Orlando L, Allentoft ME, Nielsen R, Kristiansen K, Sikora M, Outram AK, Durbin R and Willerslev E

    Centre for GeoGenetics, Natural History Museum, University of Copenhagen, Copenhagen, Denmark.

    The Yamnaya expansions from the western steppe into Europe and Asia during the Early Bronze Age (~3000 BCE) are believed to have brought with them Indo-European languages and possibly horse husbandry. We analyzed 74 ancient whole-genome sequences from across Inner Asia and Anatolia and show that the Botai people associated with the earliest horse husbandry derived from a hunter-gatherer population deeply diverged from the Yamnaya. Our results also suggest distinct migrations bringing West Eurasian ancestry into South Asia before and after, but not at the time of, Yamnaya culture. We find no evidence of steppe ancestry in Bronze Age Anatolia from when Indo-European languages are attested there. Thus, in contrast to Europe, Early Bronze Age Yamnaya-related migrations had limited direct genetic impact in Asia.

    Funded by: NIGMS NIH HHS: T32 GM007197

    Science (New York, N.Y.) 2018;360;6396

  • Single-cell sequencing reveals the origin and the order of mutation acquisition in T-cell acute lymphoblastic leukemia.

    De Bie J, Demeyer S, Alberti-Servera L, Geerdens E, Segers H, Broux M, De Keersmaecker K, Michaux L, Vandenberghe P, Voet T, Boeckx N, Uyttebroeck A and Cools J

    Center for Human Genetics, KU Leuven, Leuven, Belgium.

    Next-generation sequencing has provided a detailed overview of the various genomic lesions implicated in the pathogenesis of T-cell acute lymphoblastic leukemia (T-ALL). Typically, 10-20 protein-altering lesions are found in T-ALL cells at diagnosis. However, it is currently unclear in which order these mutations are acquired and in which progenitor cells this is initiated. To address these questions, we used targeted single-cell sequencing of total bone marrow cells and CD34+CD38- multipotent progenitor cells for four T-ALL cases. Hierarchical clustering detected a dominant leukemia cluster at diagnosis, accompanied by a few smaller clusters harboring only a fraction of the mutations. We developed a graph-based algorithm to determine the order of mutation acquisition. Two of the four patients had an early event in a known oncogene (MED12, STAT5B) among various pre-leukemic events. Intermediate events included loss of 9p21 (CDKN2A/B) and acquisition of fusion genes, while NOTCH1 mutations were typically late events. Analysis of CD34+CD38- cells and myeloid progenitors revealed that in half of the cases somatic mutations were detectable in multipotent progenitor cells. We demonstrate that targeted single-cell sequencing can elucidate the order of mutation acquisition in T-ALL and that T-ALL development can start in a multipotent progenitor cell.

    Leukemia 2018

  • Recognizing the reagent microbiome.

    de Goffau MC, Lager S, Salter SJ, Wagner J, Kronbichler A, Charnock-Jones DS, Peacock SJ, Smith GCS and Parkhill J

    Wellcome Sanger Institute, Cambridge, UK.

    Funded by: Medical Research Council: MR/K021133/1

    Nature microbiology 2018;3;8;851-853

  • Genomic insights into the origin and diversification of late maritime hunter-gatherers from the Chilean Patagonia.

    de la Fuente C, Ávila-Arcos MC, Galimany J, Carpenter ML, Homburger JR, Blanco A, Contreras P, Cruz Dávalos D, Reyes O, San Roman M, Moreno-Estrada A, Campos PF, Eng C, Huntsman S, Burchard EG, Malaspinas AS, Bustamante CD, Willerslev E, Llop E, Verdugo RA and Moraga M

    Human Genetics Program, Institute of Biomedical Sciences, Faculty of Medicine, University of Chile, Santiago 8380453, Chile.

    Patagonia was the last region of the Americas reached by humans who entered the continent from Siberia ∼15,000-20,000 y ago. Despite recent genomic approaches to reconstruct the continental evolutionary history, regional characterization of ancient and modern genomes remains understudied. Exploring the genomic diversity within Patagonia is not just a valuable strategy to gain a better understanding of the history and diversification of human populations in the southernmost tip of the Americas, but it would also improve the representation of Native American diversity in global databases of human variation. Here, we present genome data from four modern populations from Central Southern Chile and Patagonia (n = 61) and four ancient maritime individuals from Patagonia (∼1,000 y old). Both the modern and ancient individuals studied in this work have a greater genetic affinity with other modern Native Americans than with any non-American population, showing within South America a clear structure between major geographical regions. Native Patagonian Kawéskar and Yámana showed the highest genetic affinity with the ancient individuals, indicating genetic continuity in the region during the past 1,000 y before present, together with an important agreement between the ethnic affiliation and historical distribution of both groups. Lastly, the ancient maritime individuals were genetically equidistant to a ∼200-y-old terrestrial hunter-gatherer from Tierra del Fuego, which supports a model with an initial separation of a common ancestral group to both maritime populations from a terrestrial population, with a later diversification of the maritime groups.

    Proceedings of the National Academy of Sciences of the United States of America 2018

  • Streptococcus bovimastitidis sp. nov., isolated from a dairy cow with mastitis.

    de Vries SPW, Hadjirin NF, Lay EM, Zadoks RN, Peacock SJ, Parkhill J, Grant AJ, McDougall S and Holmes MA

    Department of Veterinary Medicine, University of Cambridge, Cambridge, UK.

    Here we describe a new species of the genus Streptococcus that was isolated from a dairy cow with mastitis in New Zealand. Strain NZ1587<sup>T</sup> was Gram-positive, coccus-shaped and arranged as chains, catalase and coagulase negative, γ-haemolytic and negative for Lancefield carbohydrates (A-D, F and G). The 16S rRNA sequence did not match sequences in the NCBI 16S rRNA or GreenGenes databases. Taxonomic classification of strain NZ1587<sup>T</sup> was investigated using 16S rRNA and core genome phylogeny, genome-wide average nucleotide identity (ANI) and predicted DNA-DNA hybridisation (DDH) analyses. Phylogeny based on 16S rRNA was unable to resolve the taxonomic position of strain NZ1587<sup>T</sup>, however NZ1587<sup>T</sup> shared 99.4 % identity at the 16S rRNA level with a distinct branch of S. pseudoporcinus. Importantly, core genome phylogeny demonstrated that NZ1587<sup>T</sup> grouped amongst the 'pyogenic' streptococcal species and formed a distinct branch supported by a 100 % bootstrap value. In addition, average nucleotide identity and inferred DNA-DNA hybridisation analyses showed that NZ1587<sup>T</sup> represents a novel species. Biochemical profiling using the rapid ID 32 strep identification test enabled differentiation of strain NZ1587<sup>T</sup> from closely related streptococcal species. In conclusion, strain NZ1587<sup>T</sup> can be classified as a novel species, and we propose a novel taxon named Streptococcus bovimastitidis sp. nov.; the type strain is NZ1587<sup>T</sup>. NZ1587<sup>T</sup> has been deposited in the Culture Collection University of Gothenburg (CCUG 69277<sup>T</sup>) and the Belgian Co-ordinated Collections of Micro-organisms/LMG (LMG 29747).

    Funded by: Medical Research Council: G1001787

    International journal of systematic and evolutionary microbiology 2018;68;1;21-27

  • Comparative genomics reveals that loss of lunatic fringe (LFNG) promotes melanoma metastasis.

    Del Castillo Velasco-Herrera M, van der Weyden L, Nsengimana J, Speak AO, Sjöberg MK, Bishop DT, Jönsson G, Newton-Bishop J and Adams DJ

    Experimental Cancer Genetics, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.

    Metastasis is the leading cause of death in patients with advanced melanoma, yet the somatic alterations that aid tumour cell dissemination and colonisation are poorly understood. Here, we deploy comparative genomics to identify and validate clinically relevant drivers of melanoma metastasis. To do this, we identified a set of 976 genes whose expression level was associated with a poor outcome in patients from two large melanoma cohorts. Next, we characterised the genomes and transcriptomes of mouse melanoma cell lines defined as weakly metastatic, and their highly metastatic derivatives. By comparing expression data between species, we identified lunatic fringe (LFNG), among 28 genes whose expression level is predictive of poor prognosis and whose altered expression is associated with a prometastatic phenotype in mouse melanoma cells. CRISPR/Cas9-mediated knockout of Lfng dramatically enhanced the capability of weakly metastatic melanoma cells to metastasise in vivo, a phenotype that could be rescued with the Lfng cDNA. Notably, genomic alterations disrupting LFNG are found exclusively in human metastatic melanomas sequenced as part of The Cancer Genome Atlas. Using comparative genomics, we show that LFNG expression plays a functional role in regulating melanoma metastasis.

    Funded by: Cancer Research UK: 13031; Wellcome Trust

    Molecular oncology 2018;12;2;239-255

  • Outer membrane vesicles from Neisseria gonorrhoeae target PorB to mitochondria and induce apoptosis.

    Deo P, Chow SH, Hay ID, Kleifeld O, Costin A, Elgass KD, Jiang JH, Ramm G, Gabriel K, Dougan G, Lithgow T, Heinz E and Naderer T

    Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Clayton, Victoria, Australia.

    Neisseria gonorrhoeae causes the sexually transmitted disease gonorrhoea by evading innate immunity. Colonizing the mucosa of the reproductive tract depends on the bacterial outer membrane porin, PorB, which is essential for ion and nutrient uptake. PorB is also targeted to host mitochondria and regulates apoptosis pathways to promote infections. How PorB traffics from the outer membrane of N. gonorrhoeae to mitochondria and whether it modulates innate immune cells, such as macrophages, remains unclear. Here, we show that N. gonorrhoeae secretes PorB via outer membrane vesicles (OMVs). Purified OMVs contained primarily outer membrane proteins including oligomeric PorB. The porin was targeted to mitochondria of macrophages after exposure to purified OMVs and wild type N. gonorrhoeae. This was associated with loss of mitochondrial membrane potential, release of cytochrome c, activation of apoptotic caspases and cell death in a time-dependent manner. Consistent with this, OMV-induced macrophage death was prevented with the pan-caspase inhibitor, Q-VD-PH. This shows that N. gonorrhoeae utilizes OMVs to target PorB to mitochondria and to induce apoptosis in macrophages, thus affecting innate immunity.

    PLoS pathogens 2018;14;3;e1006945

  • Genome-wide haplotyping embryos developing from 0PN and 1PN zygotes increases transferrable embryos in PGT-M.

    Destouni A, Dimitriadou E, Masset H, Debrock S, Melotte C, Van Den Bogaert K, Zamani Esteki M, Ding J, Voet T, Denayer E, de Ravel T, Legius E, Meuleman C, Peeraer K and Vermeesch JR

    Laboratory for Cytogenetics and Genome Research, Center for Human Genetics, University of Leuven, O&N I Herestraat 49, KU Leuven, Leuven, Belgium.

    Study question: Can genome-wide haplotyping increase success following preimplantation genetic testing for a monogenic disorder (PGT-M) by including zygotes with absence of pronuclei (0PN) or the presence of only one pronucleus (1PN)?

    Summary answer: Genome-wide haplotyping 0PNs and 1PNs increases the number of PGT-M cycles reaching embryo transfer (ET) by 81% and the live-birth rate by 75%.

    What is known already: Although a significant subset of 0PN and 1PN zygotes can develop into balanced, diploid and developmentally competent embryos, they are usually discarded because parental diploidy detection is not part of the routine work-up of PGT-M.

    Study design, size, duration: This prospective cohort study evaluated the pronuclear number in 2229 zygotes from 2337 injected metaphase II (MII) oocytes in 268 cycles. PGT-M for 0PN and 1PN embryos developing into Day 5/6 blastocysts with adequate quality for vitrification was performed in 42 of the 268 cycles (15.7%). In these 42 cycles, we genome-wide haplotyped 216 good quality embryos corresponding to 49 0PNs, 15 1PNs and 152 2PNs. The reported outcomes include parental contribution to embryonic ploidy, embryonic aneuploidy, genetic diagnosis for the monogenic disorder, cycles reaching ETs, pregnancy and live birth rates (LBR) for unaffected offspring.

    Participants/materials, setting, methods: Blastomere DNA was whole-genome amplified and hybridized on the Illumina Human CytoSNP12V2.1.1 BeadChip arrays. Subsequently, genome-wide haplotyping and copy-number profiling was applied to investigate the embryonic genome architecture. Bi-parental, unaffected embryos were transferred regardless of their initial zygotic PN score.

    Main results and the role of chance: A staggering 75.51% of 0PN and 42.86% of 1PN blastocysts are diploid bi-parental allowing accurate genetic diagnosis for the monogenic disorder. In total, 31% (13/42) of the PGT-M cycles reached ET or could repeat ET with an unaffected 0PN or 1PN embryo. The LBR per initiated cycle increased from 9.52 to 16.67%.

    Limitations, reasons for caution: The clinical efficacy of the routine inclusion of 0PN and 1PN zygotes in PGT-M cycles should be confirmed in larger cohorts from multicenter studies.

    Wider implications of the findings: Genome-wide haplotyping allows the inclusion of 0PN and 1PN embryos and subsequently increases the cycles reaching ET following PGT-M and potentially PGT for aneuploidy (PGT-A) and chromosomal structural rearrangements (PGT-SR). Establishing measures of clinical efficacy could lead to an update of the ESHRE guidelines which advise against the use of these zygotes.

    Study funding/competing interest(s): SymBioSys (PFV/10/016 and C1/018 to J.R.V. and T.V.), the Horizon 2020 WIDENLIFE: 692065 to J.R.V., T.V., E.D., A.D. and M.Z.E. M.Z.E., T.V. and J.R.V. co-invented haplarithmisis ('Haplotyping and copy-number typing using polymorphic variant allelic frequencies'), which has been licensed to Agilent Technologies. H.M. is fully supported by the (FWO) (ZKD1543-ASP/16). The authors have no competing interests to declare.

    Human reproduction (Oxford, England) 2018

  • Shieldin complex promotes DNA end-joining and counters homologous recombination in BRCA1-null cells.

    Dev H, Chiang TW, Lescale C, de Krijger I, Martin AG, Pilger D, Coates J, Sczaniecka-Clift M, Wei W, Ostermaier M, Herzog M, Lam J, Shea A, Demir M, Wu Q, Yang F, Fu B, Lai Z, Balmus G, Belotserkovskaya R, Serra V, O'Connor MJ, Bruna A, Beli P, Pellegrini L, Caldas C, Deriano L, Jacobs JJL, Galanty Y and Jackson SP

    The Wellcome Trust/Cancer Research UK Gurdon Institute and Department of Biochemistry, University of Cambridge, Cambridge, UK.

    BRCA1 deficiencies cause breast, ovarian, prostate and other cancers, and render tumours hypersensitive to poly(ADP-ribose) polymerase (PARP) inhibitors. To understand the resistance mechanisms, we conducted whole-genome CRISPR-Cas9 synthetic-viability/resistance screens in BRCA1-deficient breast cancer cells treated with PARP inhibitors. We identified two previously uncharacterized proteins, C20orf196 and FAM35A, whose inactivation confers strong PARP-inhibitor resistance. Mechanistically, we show that C20orf196 and FAM35A form a complex, 'Shieldin' (SHLD1/2), with FAM35A interacting with single-stranded DNA through its C-terminal oligonucleotide/oligosaccharide-binding fold region. We establish that Shieldin acts as the downstream effector of 53BP1/RIF1/MAD2L2 to promote DNA double-strand break (DSB) end-joining by restricting DSB resection and to counteract homologous recombination by antagonizing BRCA2/RAD51 loading in BRCA1-deficient cells. Notably, Shieldin inactivation further sensitizes BRCA1-deficient cells to cisplatin, suggesting how defining the SHLD1/2 status of BRCA1-deficient tumours might aid patient stratification and yield new treatment opportunities. Highlighting this potential, we document reduced SHLD1/2 expression in human breast cancers displaying intrinsic or acquired PARP-inhibitor resistance.

    Nature cell biology 2018

  • Bayesian inference of ancestral dates on bacterial phylogenetic trees.

    Didelot X, Croucher NJ, Bentley SD, Harris SR and Wilson DJ

    Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, UK.

    The sequencing and comparative analysis of a collection of bacterial genomes from a single species or lineage of interest can lead to key insights into its evolution, ecology or epidemiology. The tool of choice for such a study is often to build a phylogenetic tree, and more specifically when possible a dated phylogeny, in which the dates of all common ancestors are estimated. Here, we propose a new Bayesian methodology to construct dated phylogenies which is specifically designed for bacterial genomics. Unlike previous Bayesian methods aimed at building dated phylogenies, we consider that the phylogenetic relationships between the genomes have been previously evaluated using a standard phylogenetic method, which makes our methodology much faster and scalable. This two-step approach also allows us to directly exploit existing phylogenetic methods that detect bacterial recombination, and therefore to account for the effect of recombination in the construction of a dated phylogeny. We analysed many simulated datasets in order to benchmark the performance of our approach in a wide range of situations. Furthermore, we present applications to three different real datasets from recent bacterial genomic studies. Our methodology is implemented in a R package called BactDating which is freely available for download at

    Nucleic acids research 2018

  • Comparative genomics of Czech vaccine strains of Bordetella pertussis.

    Dienstbier A, Pouchnik D, Wildung M, Amman F, Hofacker IL, Parkhill J, Holubova J, Sebo P and Vecerek B

    Institute of Microbiology v.v.i., Laboratory of post-transcriptional control of gene expression, 14220 Prague, Czech Republic.

    Bordetella pertussis is a strictly human pathogen causing the respiratory infectious disease called whooping cough or pertussis. B. pertussis adaptation to acellular pertussis vaccine pressure has been repeatedly highlighted, but recent data indicate that adaptation of circulating strains started already in the era of the whole cell pertussis vaccine (wP) use. We sequenced the genomes of five B. pertussis wP vaccine strains isolated in the former Czechoslovakia in the pre-wP (1954-1957) and early wP (1958-1965) eras, when only limited population travel into and out of the country was possible. Four isolates exhibit a similar genome organization and form a distinct phylogenetic cluster with a geographic signature. The fifth strain is rather distinct, both in genome organization and SNP-based phylogeny. Surprisingly, despite isolation of this strain before 1966, its closest sequenced relative appears to be a recent isolate from the US. On the genome content level, the five vaccine strains contained both new and already described regions of difference. One of the new regions contains duplicated genes potentially associated with transport across the membrane. The prevalence of this region in recent isolates indicates that its spread might be associated with selective advantage leading to increased strain fitness.

    Pathogens and disease 2018;76;7

  • Mutational Analysis Identifies Therapeutic Biomarkers in Inflammatory Bowel Disease-Associated Colorectal Cancers.

    Din S, Wong K, Mueller MF, Oniscu A, Hewinson J, Black CJ, Miller ML, Jiménez-Sánchez A, Rabbie R, Rashid M, Satsangi J, Adams DJ and Arends MJ

    NHS Lothian, Gastrointestinal Unit, Western General Hospital, Edinburgh, Scotland, United Kingdom.

    <b>Purpose:</b> Inflammatory bowel disease-associated colorectal cancers (IBD-CRC) are associated with a higher mortality than sporadic colorectal cancers. The poorly defined molecular pathogenesis of IBD-CRCs limits development of effective prevention, detection, and treatment strategies. We aimed to identify biomarkers using whole-exome sequencing of IBD-CRCs to guide individualized management. <b>Experimental Design:</b> Whole-exome sequencing was performed on 34 formalin-fixed paraffin-embedded primary IBD-CRCs and 31 matched normal lymph nodes. Computational methods were used to identify somatic point mutations, small insertions and deletions, mutational signatures, and somatic copy number alterations. Mismatch repair status was examined. <b>Results:</b> Hypermutation was observed in 27% of IBD-CRCs. All hypermutated cancers were from the proximal colon; all but one of the cancers with hypermutation had defective mismatch repair or somatic mutations in the proofreading domain of DNA <i>POLE</i>. Hypermutated IBD-CRCs had increased numbers of predicted neo-epitopes, which could be exploited using immunotherapy. We identified six distinct mutation signatures in IBD-CRCs, three of which corresponded to known mechanisms of mutagenesis. Driver genes were also identified. <b>Conclusions:</b> IBD-CRCs should be evaluated for hypermutation and defective mismatch repair to identify patients with a higher neo-epitope load who may benefit from immunotherapies. Prospective trials are required to determine whether IHC to detect loss of MLH1 expression in dysplastic colonic tissue could identify patients at increased risk of developing IBD-CRC. We identified mutations in genes in IBD-CRCs with hypermutation that might be targeted therapeutically. These approaches would complement and individualize surveillance and treatment programs. <i>Clin Cancer Res; 24(20); 5133-42. ©2018 AACR</i>.

    Funded by: Wellcome Trust

    Clinical cancer research : an official journal of the American Association for Cancer Research 2018;24;20;5133-5142

  • SRSF3 maintains transcriptome integrity in oocytes by regulation of alternative splicing and transposable elements.

    Do DV, Strauss B, Cukuroglu E, Macaulay I, Wee KB, Hu TX, Igor RLM, Lee C, Harrison A, Butler R, Dietmann S, Jernej U, Marioni J, Smith CWJ, Göke J and Surani MA

    Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN UK.

    The RNA-binding protein SRSF3 (also known as SRp20) has critical roles in the regulation of pre-mRNA splicing. Zygotic knockout of <i>Srsf3</i> results in embryo arrest at the blastocyst stage. However, SRSF3 is also present in oocytes, suggesting that it might be critical as a maternally inherited factor. Here we identify SRSF3 as an essential regulator of alternative splicing and of transposable elements to maintain transcriptome integrity in mouse oocyte. Using 3D time-lapse confocal live imaging, we show that conditional deletion of <i>Srsf3</i> in fully grown germinal vesicle oocytes substantially compromises the capacity of germinal vesicle breakdown (GVBD), and consequently entry into meiosis. By combining single cell RNA-seq, and oocyte micromanipulation with steric blocking antisense oligonucleotides and RNAse-H inducing gapmers, we found that the GVBD defect in mutant oocytes is due to both aberrant alternative splicing and derepression of B2 SINE transposable elements. Together, our study highlights how control of transcriptional identity of the maternal transcriptome by the RNA-binding protein SRSF3 is essential to the development of fertilized-competent oocytes.

    Cell discovery 2018;4;33

  • Defining endemic cholera at three levels of spatiotemporal resolution within Bangladesh.

    Domman D, Chowdhury F, Khan AI, Dorman MJ, Mutreja A, Uddin MI, Paul A, Begum YA, Charles RC, Calderwood SB, Bhuiyan TR, Harris JB, LaRocque RC, Ryan ET, Qadri F and Thomson NR

    Infection Genomics Programme, Wellcome Sanger Institute, Hinxton, UK.

    Although much focus is placed on cholera epidemics, the greatest burden occurs in settings in which cholera is endemic, including areas of South Asia, Africa and now Haiti<sup>1,2</sup>. Dhaka, Bangladesh is a megacity that is hyper-endemic for cholera, and experiences two regular seasonal outbreaks of cholera each year<sup>3</sup>. Despite this, a detailed understanding of the diversity of Vibrio cholerae strains circulating in this setting, and their relationships to annual outbreaks, has not yet been obtained. Here we performed whole-genome sequencing of V. cholerae across several levels of focus and scale, at the maximum possible resolution. We analyzed bacterial isolates to define cholera dynamics at multiple levels, ranging from infection within individuals, to disease dynamics at the household level, to regional and intercontinental cholera transmission. Our analyses provide a genomic framework for understanding cholera diversity and transmission in an endemic setting.

    Funded by: FIC NIH HHS: D43 TW005572, K43 TW010362; NIAID NIH HHS: R01 AI103055, R01 AI106878, R56 AI106878, U01 AI058935, U01 AI077883; NIDDK NIH HHS: P30 DK043351

    Nature genetics 2018;50;7;951-955

  • The Capsule Regulatory Network of <i>Klebsiella pneumoniae</i> Defined by density-TraDISort.

    Dorman MJ, Feltwell T, Goulding DA, Parkhill J and Short FL

    Wellcome Sanger Institute, Hinxton, Cambridgeshire, United Kingdom.

    <i>Klebsiella pneumoniae</i> infections affect infants and the immunocompromised, and the recent emergence of hypervirulent and multidrug-resistant <i>K. pneumoniae</i> lineages is a critical health care concern. Hypervirulence in <i>K. pneumoniae</i> is mediated by several factors, including the overproduction of extracellular capsule. However, the full details of how <i>K. pneumoniae</i> capsule biosynthesis is achieved or regulated are not known. We have developed a robust and sensitive procedure to identify genes influencing capsule production, density-TraDISort, which combines density gradient centrifugation with transposon insertion sequencing. We have used this method to explore capsule regulation in two clinically relevant <i>Klebsiella</i> strains, <i>K. pneumoniae</i> NTUH-K2044 (capsule type K1) and <i>K. pneumoniae</i> ATCC 43816 (capsule type K2). We identified multiple genes required for full capsule production in <i>K. pneumoniae</i>, as well as putative suppressors of capsule in NTUH-K2044, and have validated the results of our screen with targeted knockout mutants. Further investigation of several of the <i>K. pneumoniae</i> capsule regulators identified (ArgR, MprA/KvrB, SlyA/KvrA, and the Sap ABC transporter) revealed effects on capsule amount and architecture, serum resistance, and virulence. We show that capsule production in <i>K. pneumoniae</i> is at the center of a complex regulatory network involving multiple global regulators and environmental cues and that the majority of capsule regulatory genes are located in the core genome. Overall, our findings expand our understanding of how capsule is regulated in this medically important pathogen and provide a technology that can be easily implemented to study capsule regulation in other bacterial species. <b>IMPORTANCE</b> Capsule production is essential for <i>K. pneumoniae</i> to cause infections, but its regulation and mechanism of synthesis are not fully understood in this organism. We have developed and applied a new method for genome-wide identification of capsule regulators. Using this method, many genes that positively or negatively affect capsule production in <i>K. pneumoniae</i> were identified, and we use these data to propose an integrated model for capsule regulation in this species. Several of the genes and biological processes identified have not previously been linked to capsule synthesis. We also show that the methods presented here can be applied to other species of capsulated bacteria, providing the opportunity to explore and compare capsule regulatory networks in other bacterial strains and species.

    Funded by: Wellcome Trust: 106063/A/14/Z, 206194

    mBio 2018;9;6

  • Meeting the discovery challenge of drug-resistant infections: progress and focusing resources.

    Dougan G, Dowson C, Overington J and Next Generation Antibiotic Discovery Symposium Participants

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK; The Department of Medicine, University of Cambridge, UK.

    Following multiple warnings from governments and health organisations, there has been renewed investment, led by the public sector, in the discovery of novel antimicrobials to meet the challenge of rising levels of drug-resistant infection, particularly in the case of resistance to antibiotics. Initiatives have also been announced to support and enable the antibiotic discovery process. In January 2018, the Medicines Discovery Catapult, UK, hosted a symposium: Next Generation Antibiotics Discovery, to consider the latest initiatives and any remaining challenges to inform and guide the international research community and better focus resources to yield a novel class of antibiotic.

    Drug discovery today 2018

  • Malaria Vaccines: Recent Advances and New Horizons.

    Draper SJ, Sack BK, King CR, Nielsen CM, Rayner JC, Higgins MK, Long CA and Seder RA

    The Jenner Institute, University of Oxford, Old Road Campus Research Building, Oxford, OX3 7DQ, UK.

    The development of highly effective and durable vaccines against the human malaria parasites Plasmodium falciparum and P. vivax remains a key priority. Decades of endeavor have taught that achieving this goal will be challenging; however, recent innovation in malaria vaccine research and a diverse pipeline of novel vaccine candidates for clinical assessment provides optimism. With first-generation pre-erythrocytic vaccines aiming for licensure in the coming years, it is important to reflect on how next-generation approaches can improve on their success. Here we review the latest vaccine approaches that seek to prevent malaria infection, disease, and transmission and highlight some of the major underlying immunological and molecular mechanisms of protection. The synthesis of rational antigen selection, immunogen design, and immunization strategies to induce quantitatively and qualitatively improved immune effector mechanisms offers promise for achieving sustained high-level protection.

    Cell host & microbe 2018;24;1;43-56

  • Multi-population genomic analysis of malaria parasites indicates local selection and differentiation at the gdv1 locus regulating sexual development.

    Duffy CW, Amambua-Ngwa A, Ahouidi AD, Diakite M, Awandare GA, Ba H, Tarr SJ, Murray L, Stewart LB, D'Alessandro U, Otto TD, Kwiatkowski DP and Conway DJ

    Pathogen Molecular Biology Department, London School of Hygiene and Tropical Medicine, Keppel St, London, UK.

    Parasites infect hosts in widely varying environments, encountering diverse challenges for adaptation. To identify malaria parasite genes under locally divergent selection across a large endemic region with a wide spectrum of transmission intensity, genome sequences were obtained from 284 clinical Plasmodium falciparum infections from four newly sampled locations in Senegal, The Gambia, Mali and Guinea. Combining these with previous data from seven other sites in West Africa enabled a multi-population analysis to identify discrete loci under varying local selection. A genome-wide scan showed the most exceptional geographical divergence to be at the early gametocyte gene locus gdv1 which is essential for parasite sexual development and transmission. We identified a major structural dimorphism with alternative 1.5 kb and 1.0 kb sequence deletions at different positions of the 3'-intergenic region, in tight linkage disequilibrium with the most highly differentiated single nucleotide polymorphism, one of the alleles being very frequent in Senegal and The Gambia but rare in the other locations. Long non-coding RNA transcripts were previously shown to include the entire antisense of the gdv1 coding sequence and the portion of the intergenic region with allelic deletions, suggesting adaptive regulation of parasite sexual development and transmission in response to local conditions.

    Funded by: Biotechnology and Biological Sciences Research Council (BBSRC): LIDO studentship; EC | European Research Council (ERC): AdG-2011-294428; Medical Research Council (MRC): G1100123; Royal Society: AA110050; Wellcome Trust: 090770/Z/09/Z

    Scientific reports 2018;8;1;15763

  • Relationship Between Sequence Homology, Genome Architecture, and Meiotic Behavior of the Sex Chromosomes in North American Voles.

    Dumont BL, Williams CL, Ng BL, Horncastle V, Chambers CL, McGraw LA, Adams D, Mackay TFC and Breen M

    Initiative in Biological Complexity, North Carolina State University, Raleigh, North Carolina 04609

    In most mammals, the X and Y chromosomes synapse and recombine along a conserved region of homology known as the pseudoautosomal region (PAR). These homology-driven interactions are required for meiotic progression and are essential for male fertility. Although the PAR fulfills key meiotic functions in most mammals, several exceptional species lack PAR-mediated sex chromosome associations at meiosis. Here, we leveraged the natural variation in meiotic sex chromosome programs present in North American voles (<i>Microtus</i>) to investigate the relationship between meiotic sex chromosome dynamics and X/Y sequence homology. To this end, we developed a novel, reference-blind computational method to analyze sparse sequencing data from flow-sorted X and Y chromosomes isolated from vole species with sex chromosomes that always (<i>Microtus montanus</i>), never (<i>Microtus mogollonensis</i>), and occasionally synapse (<i>Microtus ochrogaster</i>) at meiosis. Unexpectedly, we find more shared X/Y homology in the two vole species with no and sporadic X/Y synapsis compared to the species with obligate synapsis. Sex chromosome homology in the asynaptic and occasionally synaptic species is interspersed along chromosomes and largely restricted to low-complexity sequences, including a striking enrichment for the telomeric repeat sequence, TTAGGG. In contrast, homology is concentrated in high complexity, and presumably euchromatic, sequence on the X and Y chromosomes of the synaptic vole species, <i>M. montanus</i>. Taken together, our findings suggest key conditions required to sustain the standard program of X/Y synapsis at meiosis and reveal an intriguing connection between heterochromatic repeat architecture and noncanonical, asynaptic mechanisms of sex chromosome segregation in voles.

    Funded by: NIGMS NIH HHS: K99 GM110332, R00 GM110332

    Genetics 2018;210;1;83-97

  • Important Extracellular Interactions between Plasmodium Sporozoites and Host Cells Required for Infection.

    Dundas K, Shears MJ, Sinnis P and Wright GJ

    Cell Surface Signalling Laboratory and Parasites and Microbes Programme, Wellcome Trust Sanger Institute, Cambridge, CB10 1SA, UK.

    Malaria is an infectious disease, caused by Plasmodium parasites, that remains a major global health problem. Infection begins when salivary gland sporozoites are transmitted through the bite of an infected mosquito. Once within the host, sporozoites navigate through the dermis, into the bloodstream, and eventually invade hepatocytes. While we have an increasingly sophisticated cellular description of this journey, our molecular understanding of the extracellular interactions between the sporozoite and mammalian host that regulate migration and invasion remains comparatively poor. Here, we review the current state of our understanding, highlight the technical limitations that have frustrated progress, and outline how new approaches will help to address this knowledge gap with the ultimate aim of improving malaria treatments.

    Trends in parasitology 2018

  • Alpha-v-containing integrins are host receptors for the <i>Plasmodium falciparum</i> sporozoite surface protein, TRAP.

    Dundas K, Shears MJ, Sun Y, Hopp CS, Crosnier C, Metcalf T, Girling G, Sinnis P, Billker O and Wright GJ

    Cell Surface Signalling Laboratory, Wellcome Trust Sanger Institute, CB10 1SA Cambridge, United Kingdom.

    Malaria-causing <i>Plasmodium</i> sporozoites are deposited in the dermis by the bite of an infected mosquito and move by gliding motility to the liver where they invade and develop within host hepatocytes. Although extracellular interactions between <i>Plasmodium</i> sporozoite ligands and host receptors provide important guidance cues for productive infection and are good vaccine targets, these interactions remain largely uncharacterized. Thrombospondin-related anonymous protein (TRAP) is a parasite cell surface ligand that is essential for both gliding motility and invasion because it couples the extracellular binding of host receptors to the parasite cytoplasmic actinomyosin motor; however, the molecular nature of the host TRAP receptors is poorly defined. Here, we use a systematic extracellular protein interaction screening approach to identify the integrin αvβ3 as a directly interacting host receptor for <i>Plasmodium falciparum</i> TRAP. Biochemical characterization of the interaction suggests a two-site binding model, requiring contributions from both the von Willebrand factor A domain and the RGD motif of TRAP for integrin binding. We show that TRAP binding to cells is promoted in the presence of integrin-activating proadhesive Mn<sup>2+</sup> ions, and that cells genetically targeted so that they lack cell surface expression of the integrin αv-subunit are no longer able to bind TRAP. <i>P. falciparum</i> sporozoites moved with greater speed in the dermis of <i>Itgb3</i>-deficient mice, suggesting that the interaction has a role in sporozoite migration. The identification of the integrin αvβ3 as the host receptor for TRAP provides an important demonstration of a sporozoite surface ligand that directly interacts with host receptors.

    Proceedings of the National Academy of Sciences of the United States of America 2018

  • Registered access: authorizing data access.

    Dyke SOM, Linden M, Lappalainen I, De Argila JR, Carey K, Lloyd D, Spalding JD, Cabili MN, Kerry G, Foreman J, Cutts T, Shabani M, Rodriguez LL, Haeussler M, Walsh B, Jiang X, Wang S, Perrett D, Boughtwood T, Matern A, Brookes AJ, Cupak M, Fiume M, Pandya R, Tulchinsky I, Scollen S, Törnroos J, Das S, Evans AC, Malin BA, Beck S, Brenner SE, Nyrönen T, Blomberg N, Firth HV, Hurles M, Philippakis AA, Rätsch G, Brudno M, Boycott KM, Rehm HL, Baudis M, Sherry ST, Kato K, Knoppers BM, Baker D and Flicek P

    Centre of Genomics and Policy, Faculty of Medicine, McGill University, Montreal, QC, Canada.

    The Global Alliance for Genomics and Health (GA4GH) proposes a data access policy model, "registered access", to increase and improve access to data requiring an agreement to basic terms and conditions, such as the use of DNA sequence and health data in research. A registered access policy would enable a range of categories of users to gain access, starting with researchers and clinical care professionals. It would also facilitate general use and reuse of data but within the bounds of consent restrictions and other ethical obligations. In piloting registered access with the Scientific Demonstration data sharing projects of GA4GH, we provide additional ethics, policy and technical guidance to facilitate the implementation of this access model in an international setting.

    European journal of human genetics : EJHG 2018

  • Epigenetic and Transcriptional Variability Shape Phenotypic Plasticity.

    Ecker S, Pancaldi V, Valencia A, Beck S and Paul DS

    UCL Cancer Institute, University College London, 72 Huntley Street, London, WC1E 6BT, UK.

    Epigenetic and transcriptional variability contribute to the vast diversity of cellular and organismal phenotypes and are key in human health and disease. In this review, we describe different types, sources, and determinants of epigenetic and transcriptional variability, enabling cells and organisms to adapt and evolve to a changing environment. We highlight the latest research and hypotheses on how chromatin structure and the epigenome influence gene expression variability. Further, we provide an overview of challenges in the analysis of biological variability. An improved understanding of the molecular mechanisms underlying epigenetic and transcriptional variability, at both the intra- and inter-individual level, provides great opportunity for disease prevention, better therapeutic approaches, and personalized medicine.

    Funded by: British Heart Foundation: RG/08/014/24067, RG/13/13/30194; Medical Research Council: MR/L003120/1

    BioEssays : news and reviews in molecular, cellular and developmental biology 2018;40;2

  • Dynamics of the epigenetic landscape during the maternal-to-zygotic transition.

    Eckersley-Maslin MA, Alda-Catalinas C and Reik W

    Epigenetics Programme, Babraham Institute, Cambridge, UK.

    A remarkable epigenetic remodelling process occurs shortly after fertilization, which restores totipotency to the zygote. This involves global DNA demethylation, chromatin remodelling, genome spatial reorganization and substantial transcriptional changes. Key to these changes is the transition from the maternal environment of the oocyte to an embryonic-driven developmental expression programme, a process termed the maternal-to-zygotic transition (MZT). Zygotic genome activation occurs predominantly at the two-cell stage in mice and the eight-cell stage in humans, yet the dynamics of its control are still mostly obscure. In recent years, partly due to single-cell and low-cell number epigenomic studies, our understanding of the epigenetic and chromatin landscape of preimplantation development has improved considerably. In this Review, we discuss the latest advances in the study of the MZT, focusing on DNA methylation, histone post-translational modifications, local chromatin structure and higher-order genome organization. We also discuss key mechanistic studies that investigate the mode of action of chromatin regulators, transcription factors and non-coding RNAs during preimplantation development. Finally, we highlight areas requiring additional research, as well as new technological advances that could assist in eventually completing our understanding of the MZT.

    Nature reviews. Molecular cell biology 2018

  • Inter-homologue repair in fertilized human eggs?

    Egli D, Zuccaro MV, Kosicki M, Church GM, Bradley A and Jasin M

    Department of Obstetrics and Gynecology and Department of Pediatrics, Columbia University, New York, NY, USA.

    Nature 2018;560;7717;E5-E7

  • Lysogenic conversion of atypical enteropathogenic Escherichia coli (aEPEC) from human, murine, and bovine origin with bacteriophage Φ3538 Δstx<sub>2</sub>::cat proves their enterohemorrhagic E. coli (EHEC) progeny.

    Eichhorn I, Heidemanns K, Ulrich RG, Schmidt H, Semmler T, Fruth A, Bethe A, Goulding D, Pickard D, Karch H and Wieler LH

    Institute for Microbiology and Epizootics, Freie Universität Berlin, Berlin, Germany.

    Bacteriophages play an important role in the evolution of bacterial pathogens. A phage-mediated transfer of stx-genes to atypical enteropathogenic E. coli (aEPEC), which are prevalent in different hosts, would convert them to enterohemorrhagic E. coli (EHEC). We decided to confirm this hypothesis experimentally to provide conclusive evidence that aEPEC isolated from different mammalian hosts are indeed progenitors of typical EHEC which gain the ability to produce Shiga-Toxin by lysogeny with stx-converting bacteriophages, utilizing the model phage Φ3538 Δstx<sub>2</sub>::cat. We applied a modified in vitro plaque-assay, using a high titer of a bacteriophage carrying a deletion in the stx<sub>2</sub> gene (Φ3538 Δstx<sub>2</sub>::cat) to increase the detection of lysogenic conversion events. Three wild-type aEPEC strains were chosen as acceptor strains: the murine aEPEC-strain IMT14505 (sequence type (ST)28, serotype Ont:H6), isolated from a striped field mouse (Apodemus agrarius) in the vicinity of a cattle shed, and the human aEPEC-strain 910#00 (ST28, Ont:H6). The close genomic relationship of both strains implies a high zoonotic potential. A third strain, the bovine aEPEC IMT19981, was of serotype O26:H11 and ST21 (STC29). All three aEPEC were successfully lysogenized with phage Φ3538 Δstx<sub>2</sub>::cat. Integration of the bacteriophage DNA into the aEPEC host genomes was confirmed by amplification of the chloramphenicol transferase (cat) marker gene and by Southern-Blot hybridization. Analysis of the whole genome sequence of each of the three lysogens showed that the bacteriophage was integrated into the known tRNA integration site argW, which is highly variable among E. coli. In conclusion, the successful lysogenic conversion of aEPEC with a stx-phage in vitro underlines the important role of aEPEC as progenitors of EHEC. Given the high prevalence and the wide host range of aEPEC acceptors, their high risk of zoonotic transmission should be recognized in infection control measures.

    International journal of medical microbiology : IJMM 2018;308;7;890-898

  • Microevolution of epidemiological highly relevant non-O157 enterohemorrhagic Escherichia coli of serogroups O26 and O111.

    Eichhorn I, Semmler T, Mellmann A, Pickard D, Anjum MF, Fruth A, Karch H and Wieler LH

    Institute of Microbiology and Epizootics, Freie Universität Berlin, Centre for Infection Medicine, Berlin, Germany.

    Enterohemorrhagic Escherichia coli (EHEC) are a cause of bloody diarrhea, hemorrhagic colitis (HC) and the potentially fatal hemolytic uremic syndrome (HUS). While O157:H7 is the dominant EHEC serotype, non-O157 EHEC have emerged as serious causes of disease. In Germany, the most important non-O157 O-serogroups causing one third of EHEC infections, including diarrhea as well as HUS, are O26, O103, O111 and O145. Interestingly, we identified EHEC O-serogroups O26 and O111 in one single sequence type complex, STC29, that also harbours atypical enteropathogenic E. coli (aEPEC). aEPEC differ from typical EHEC merely in the absence of stx-genes. These findings inspired us to unravel a putative microevolutionary scenario of these non-O157 EHEC by whole genome analyses. Analysis of single nucleotide polymorphisms (SNPs) of the maximum common genome (MCG) of 20 aEPEC (11 human/ 9 bovine) and 79 EHEC (42 human/ 36 bovine/ 1 food source) of STC29 identified three distinct clusters: Cluster 1 harboured strains of O-serogroup O111, the central Cluster 2 harboured only O26 aEPEC strains, while the more heterogeneous Cluster 3 contained both EHEC and aEPEC strains of O-serogroup O26. Further combined analyses of accessory virulence associated genes (VAGs) and insertion sites for mobile genetic elements suggested a parallel evolution of the MCG and the acquisition of virulence genes. The resulting microevolutionary model suggests the development of two distinct EHEC lineages from one common aEPEC ancestor of ST29 by lysogenic conversion with stx-converting bacteriophages, independent of the host species the strains had been isolated from. In conclusion, our cumulative data indicate that EHEC of O-serogroups O26 and O111 of STC29 originate from a common aEPEC ancestor and are bona fide zoonotic agents. The role of aEPEC in the emergence of O26 and O111 EHEC should be considered for infection control measures to prevent possible lysogenic conversion with stx-converting bacteriophages as major vehicle driving the emergence of EHEC lineages with direct Public Health consequences.

    International journal of medical microbiology : IJMM 2018;308;8;1085-1095

  • HIV treatment is associated with a two-fold higher probability of raised triglycerides: Pooled Analyses in 21 023 individuals in sub-Saharan Africa.

    Ekoru K, Young EH, Dillon DG, Gurdasani D, Stehouwer N, Faurholt-Jepsen D, Levitt NS, Crowther NJ, Nyirenda M, Njelekela MA, Ramaiya K, Nyan O, Adewole OO, Anastos K, Compostella C, Dave JA, Fourie CM, Friis H, Kruger IM, Longenecker CT, Maher DP, Mutimura E, Ndhlovu CE, Praygod G, Pefura Yone EW, Pujades-Rodriguez M, Range N, Sani MU, Sanusi M, Schutte AE, Sliwa K, Tien PC, Vorster EH, Walsh C, Gareta D, Mashili F, Sobngwi E, Adebamowo C, Kamali A, Seeley J, Smeeth L, Pillay D, Motala AA, Kaleebu P and Sandhu MS

    Department of Medicine, University of Cambridge, Cambridge, United Kingdom.

    Background: Anti-retroviral therapy (ART) regimes for HIV are associated with raised levels of circulating triglycerides (TG) in western populations. However, there are limited data on the impact of ART on cardiometabolic risk in sub-Saharan African (SSA) populations.

    Methods: Pooled analyses of 14 studies comprising 21 023 individuals, on whom relevant cardiometabolic risk factors (including TG), HIV and ART status were assessed between 2003 and 2014, in SSA. The association between ART and raised TG (>2.3 mmol/L) was analysed using regression models.

    Findings: Among 10 615 individuals, ART was associated with a two-fold higher probability of raised TG (RR 2.05, 95% CI 1.51-2.77, I<sup>2</sup>=45.2%). The associations between ART and raised blood pressure, glucose, HbA1c, and other lipids were inconsistent across studies.

    Interpretation: Evidence from this study confirms the association of ART with raised TG in SSA populations. Given the possible causal effect of raised TG on cardiovascular disease (CVD), the evidence highlights the need for prospective studies to clarify the impact of long term ART on CVD outcomes in SSA.

    Funded by: Medical Research Council: MR/K013491/1

    Global health, epidemiology and genomics 2018;3

  • Uncovering Natural Longevity Alleles from Intercrossed Pools of Aging Fission Yeast Cells.

    Ellis DA, Mustonen V, Rodríguez-López M, Rallis C, Malecki M, Jeffares DC and Bähler J

    University College London.

    Quantitative traits often show large variation caused by multiple genetic factors. One such trait is the chronological lifespan of non-dividing yeast cells, serving as a model for cellular aging. Screens for genetic factors involved in ageing typically assay mutants of protein-coding genes. To identify natural genetic variants contributing to cellular aging, we exploited two strains of the fission yeast, <i>Schizosaccharomyces pombe,</i> that differ in chronological lifespan. We generated segregant pools from these strains and subjected them to advanced intercrossing over multiple generations to break up linkage groups. We chronologically aged the intercrossed segregant pool, followed by genome sequencing at different times to detect genetic variants that became reproducibly enriched as a function of age. A region on Chromosome II showed strong positive selection during ageing. Based on expected functions, two candidate variants from this region in the long-lived strain were most promising to be causal: small insertions and deletions in the 5'-untranslated regions of <i>ppk31</i> and <i>SPBC409.08.</i> Ppk31 is an orthologue of Rim15, a conserved kinase controlling cell proliferation in response to nutrients, while SPBC409.08 is a predicted spermine transmembrane transporter. Both Rim15 and the spermine-precursor, spermidine, are implicated in ageing as they are involved in autophagy-dependent lifespan extension. Single and double allele replacement suggests that both variants, alone or combined, have subtle effects on cellular longevity. Furthermore, deletion mutants of both <i>ppk31</i> and <i>SPBC409.08</i> rescued growth defects caused by spermidine. We propose that Ppk31 and SPBC409.08 may function together to modulate lifespan, thus linking Rim15/Ppk31 with spermidine metabolism.

    Genetics 2018

  • A library of recombinant Babesia microti cell surface and secreted proteins for diagnostics discovery and reverse vaccinology.

    Elton CM, Rodriguez M, Ben Mamoun C, Lobo CA and Wright GJ

    Cell Surface Signalling Laboratory, Wellcome Trust Sanger Institute, Cambridge CB10 1SA, United Kingdom.

    Human babesiosis is an emerging tick-borne parasitic disease and blood transfusion-transmitted infection primarily caused by the apicomplexan parasite, Babesia microti. There is no licensed vaccine for B. microti and the development of a reliable serological screening test would contribute to ensuring the safety of the donated blood supply. The recent sequencing of the B. microti genome has revealed many novel genes encoding proteins that can now be tested for their suitability as subunit vaccine candidates and diagnostic serological markers. Extracellular proteins are considered excellent vaccine candidates and serological markers because they are directly exposed to the host humoral immune system, but can be challenging to express as soluble recombinant proteins. We have recently developed an approach based on a mammalian expression system that can produce large panels of functional recombinant cell surface and secreted parasite proteins. Here, we use the B. microti genome sequence to identify 54 genes that are predicted to encode surface-displayed and secreted proteins expressed during the blood stages, and show that 41 (76%) are expressed using our method at detectable levels. We demonstrate that the proteins contain conformational, heat-labile, epitopes and use them to serologically profile the kinetics of the humoral immune responses to two strains of B. microti in a murine infection model. Using sera from validated human infections, we show a concordance in the host antibody responses to B. microti infections in mouse and human hosts. Finally, we show that BmSA1 expressed in mammalian cells can elicit high antibody titres in vaccinated mice using a human-compatible adjuvant but these antibodies did not affect the pathology of infection in vivo. Our library of recombinant B. microti cell surface and secreted antigens constitutes a valuable resource that could contribute to the development of a serological diagnostic test, vaccines, and elucidate the molecular basis of host-parasite interactions.

    International journal for parasitology 2018

  • Analysis of predicted loss-of-function variants in UK Biobank identifies variants protective for disease.

    Emdin CA, Khera AV, Chaffin M, Klarin D, Natarajan P, Aragam K, Haas M, Bick A, Zekavat SM, Nomura A, Ardissino D, Wilson JG, Schunkert H, McPherson R, Watkins H, Elosua R, Bown MJ, Samani NJ, Baber U, Erdmann J, Gupta N, Danesh J, Chasman D, Ridker P, Denny J, Bastarache L, Lichtman JH, D'Onofrio G, Mattera J, Spertus JA, Sheu WH, Taylor KD, Psaty BM, Rich SS, Post W, Rotter JI, Chen YI, Krumholz H, Saleheen D, Gabriel S and Kathiresan S

    Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA.

    Less than 3% of protein-coding genetic variants are predicted to result in loss of protein function through the introduction of a stop codon, frameshift, or the disruption of an essential splice site; however, such predicted loss-of-function (pLOF) variants provide insight into effector transcript and direction of biological effect. In >400,000 UK Biobank participants, we conduct association analyses of 3759 pLOF variants with six metabolic traits, six cardiometabolic diseases, and twelve additional diseases. We identified 18 new low-frequency or rare (allele frequency < 5%) pLOF variant-phenotype associations. pLOF variants in the gene GPR151 protect against obesity and type 2 diabetes, in the gene IL33 against asthma and allergic disease, and in the gene IFIH1 against hypothyroidism. In the gene PDE3B, pLOF variants associate with elevated height, improved body fat distribution and protection from coronary artery disease. Our findings prioritize genes for which pharmacologic mimics of pLOF variants may lower risk for disease.

    Nature communications 2018;9;1;1613

  • SeroBA: rapid high-throughput serotyping of Streptococcus pneumoniae from whole genome sequence data.

    Epping L, van Tonder AJ, Gladstone RA, The Global Pneumococcal Sequencing Consortium, Bentley SD, Page AJ and Keane JA

    Pathogen Informatics, Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK.

    Streptococcus pneumoniae is responsible for 240 000-460 000 deaths in children under 5 years of age each year. Accurate identification of pneumococcal serotypes is important for tracking the distribution and evolution of serotypes following the introduction of effective vaccines. Recent efforts have been made to infer serotypes directly from genomic data but current software approaches are limited and do not scale well. Here, we introduce a novel method, SeroBA, which uses a k-mer approach. We compare SeroBA against real and simulated data and present results on the concordance and computational performance against a validation dataset, the robustness and scalability when analysing a large dataset, and the impact of varying the depth of coverage on sequence-based serotyping. SeroBA can predict serotypes, by identifying the cps locus, directly from raw whole genome sequencing read data with 98 % concordance using a k-mer-based method, can process 10 000 samples in just over 1 day using a standard server and can call serotypes at a coverage as low as 15-21×. SeroBA is implemented in Python3 and is freely available under an open source GPLv3 licence from:

    Microbial genomics 2018

  • Epistasis studies reveal redundancy among calcium-dependent protein kinases in motility and invasion of malaria parasites.

    Fang H, Gomes AR, Klages N, Pino P, Maco B, Walker EM, Zenonos ZA, Angrisano F, Baum J, Doerig C, Baker DA, Billker O and Brochet M

    Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, Geneva, CH-1211, Switzerland.

    In malaria parasites, evolution of parasitism has been linked to functional optimisation. Despite this optimisation, most members of a calcium-dependent protein kinase (CDPK) family show genetic redundancy during erythrocytic proliferation. To identify relationships between phospho-signalling pathways, we here screen 294 genetic interactions among protein kinases in Plasmodium berghei. This reveals a synthetic negative interaction between a hypomorphic allele of the protein kinase G (PKG) and CDPK4 to control erythrocyte invasion which is conserved in P. falciparum. CDPK4 becomes critical when PKG-dependent calcium signals are attenuated to phosphorylate proteins important for the stability of the inner membrane complex, which serves as an anchor for the acto-myosin motor required for motility and invasion. Finally, we show that multiple kinases functionally complement CDPK4 during erythrocytic proliferation and transmission to the mosquito. This study reveals how CDPKs are wired within a stage-transcending signalling network to control motility and host cell invasion in malaria parasites.

    Funded by: EC | European Research Council (ERC): 695596; Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (Swiss National Science Foundation): BSSGI0_155852; Wellcome Trust: 098051, 100993/Z/13/Z, 106240/Z/14/Z

    Nature communications 2018;9;1;4248

  • Telomerecat: A ploidy-agnostic method for estimating telomere length from whole genome sequencing data.

    Farmery JHR, Smith ML, NIHR BioResource - Rare Diseases and Lynch AG

    Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK.

    Telomere length is a risk factor in disease and the dynamics of telomere length are crucial to our understanding of cell replication and vitality. The proliferation of whole genome sequencing represents an unprecedented opportunity to glean new insights into telomere biology on a previously unimaginable scale. To this end, a number of approaches for estimating telomere length from whole-genome sequencing data have been proposed. Here we present Telomerecat, a novel approach to the estimation of telomere length. Previous methods have been dependent on the number of telomeres present in a cell being known, which may be problematic when analysing aneuploid cancer data and non-human samples. Telomerecat is designed to be agnostic to the number of telomeres present, making it suited for the purpose of estimating telomere length in cancer studies. Telomerecat also accounts for interstitial telomeric reads and presents a novel approach to dealing with sequencing errors. We show that Telomerecat performs well at telomere length estimation when compared to leading experimental and computational methods. Furthermore, we show that it detects expected patterns in longitudinal data, repeated measurements, and cross-species comparisons. We also apply the method to cancer cell data, uncovering an interesting relationship with the underlying telomerase genotype.

    Funded by: Wellcome Trust

    Scientific reports 2018;8;1;1300

  • Maturing Human CD127+ CCR7+ PDL1+ Dendritic Cells Express AIRE in the Absence of Tissue Restricted Antigens.

    Fergusson JR, Morgan MD, Bruchard M, Huitema L, Heesters BA, van Unen V, van Hamburg JP, van der Wel NN, Picavet D, Koning F, Tas SW, Anderson MS, Marioni JC, Holländer GA and Spits H

    Department of Experimental Immunology, Academic Medical Center, Amsterdam, Netherlands.

    Expression of the Autoimmune regulator (AIRE) outside of the thymus has long been suggested in both humans and mice, but the cellular source in humans has remained undefined. Here we identify AIRE expression in human tonsils and extensively analyze these "extra-thymic AIRE expressing cells" (eTACs) using combinations of flow cytometry, CyTOF and single cell RNA-sequencing. We identified AIRE+ cells as dendritic cells (DCs) with a mature and migratory phenotype including high levels of antigen presenting molecules and costimulatory molecules, and specific expression of CD127, CCR7, and PDL1. These cells also possessed the ability to stimulate and re-stimulate T cells and displayed reduced responses to toll-like receptor (TLR) agonists compared to conventional DCs. While expression of <i>AIRE</i> was enriched within CCR7+CD127+ DCs, single-cell RNA sequencing revealed expression of <i>AIRE</i> to be transient, rather than stable, and associated with the differentiation to a mature phenotype. The role of AIRE in central tolerance induction within the thymus is well established; however, our study shows that <i>AIRE</i> expression within the periphery is not associated with an enriched expression of tissue-restricted antigens (TRAs). This unexpected finding, suggestive of wider functions of AIRE, may provide an explanation for the non-autoimmune symptoms of APECED patients who lack functional AIRE.

    Frontiers in immunology 2018;9;2902

  • The epilepsy-associated protein TBC1D24 is required for normal development, survival and vesicle trafficking in mammalian neurons.

    Finelli MJ, Aprile D, Castroflorio E, Jeans A, Moschetta M, Chessum L, Degiacomi MT, Grasegger J, Lupien-Meilleur A, Bassett A, Rossignol E, Campeau PM, Bowl MR, Benfenati F, Fassio A and Oliver PL

    Department of Physiology, Anatomy and Genetics, University of Oxford, Parks Road, Oxford OX1 3PT, UK.

    Mutations in the Tre2/Bub2/Cdc16 (TBC)1 domain family member 24 (TBC1D24) gene are associated with a range of inherited neurological disorders, from drug-refractory lethal epileptic encephalopathy and DOORS syndrome (Deafness, Onychodystrophy, Osteodystrophy, mental Retardation, Seizures) to non-syndromic hearing loss. TBC1D24 has been implicated in neuronal transmission and maturation, although the molecular function of the gene and the cause of the apparently complex disease spectrum remain unclear. Importantly, heterozygous TBC1D24 mutation carriers have also been reported with seizures, suggesting that haploinsufficiency for TBC1D24 is significant clinically. Here we have systematically investigated an allelic series of disease-associated mutations in neurons alongside a new mouse model to investigate the consequences of TBC1D24 haploinsufficiency to mammalian neurodevelopment and synaptic physiology. The cellular studies reveal that disease-causing mutations that disrupt either of the conserved protein domains in TBC1D24 are implicated in neuronal development and survival and are likely acting as loss-of-function alleles. We then further investigated TBC1D24 haploinsufficiency in vivo and demonstrate that TBC1D24 is also critical for normal presynaptic function: genetic disruption of Tbc1d24 expression in the mouse leads to an impairment of endocytosis and an enlarged endosomal compartment in neurons with a decrease in spontaneous neurotransmission. These data reveal the essential role for TBC1D24 at the mammalian synapse and help to define common synaptic mechanisms that could underlie the varied effects of TBC1D24 mutations in neurological disease.

    Human molecular genetics 2018

  • Recurrent rearrangements of FOS and FOSB define osteoblastoma.

    Fittall MW, Mifsud W, Pillay N, Ye H, Strobl AC, Verfaillie A, Demeulemeester J, Zhang L, Berisha F, Tarabichi M, Young MD, Miranda E, Tarpey PS, Tirabosco R, Amary F, Grigoriadis AE, Stratton MR, Van Loo P, Antonescu CR, Campbell PJ, Flanagan AM and Behjati S

    The Francis Crick Institute, London, NW1 1AT, UK.

    The transcription factor FOS has long been implicated in the pathogenesis of bone tumours, following the discovery that the viral homologue, v-fos, caused osteosarcoma in laboratory mice. However, mutations of FOS have not been found in human bone-forming tumours. Here, we report recurrent rearrangement of FOS and its paralogue, FOSB, in the most common benign tumours of bone, osteoblastoma and osteoid osteoma. Combining whole-genome DNA and RNA sequences, we find rearrangement of FOS in five tumours and of FOSB in one tumour. Extending our findings into a cohort of 55 cases, using FISH and immunohistochemistry, provides evidence of ubiquitous mutation of FOS or FOSB in osteoblastoma and osteoid osteoma. Overall, our findings reveal a human bone tumour defined by mutations of FOS and FOSB.

    Funded by: NCI NIH HHS: P30 CA008748, P50 CA140146; Wellcome Trust

    Nature communications 2018;9;1;2150

  • Proteomic identification of Axc, a novel beta-lactamase with carbapenemase activity in a meropenem-resistant clinical isolate of Achromobacter xylosoxidans.

    Fleurbaaij F, Henneman AA, Corver J, Knetsch CW, Smits WK, Nauta ST, Giera M, Dragan I, Kumar N, Lawley TD, Verhoeven A, van Leeuwen HC, Kuijper EJ and Hensbergen PJ

    Department of Medical Microbiology, Leiden University Medical Center, 2333 ZA, Leiden, The Netherlands.

    The development of antibiotic resistance during treatment is a threat to patients and their environment. Insight into the mechanisms of resistance development is important for appropriate therapy and infection control. Here, we describe how, through the application of mass spectrometry-based proteomics, a novel beta-lactamase Axc was identified as an indicator of acquired carbapenem resistance in a clinical isolate of Achromobacter xylosoxidans. Comparative proteomic analysis of consecutively collected susceptible and resistant isolates from the same patient revealed that high Axc protein levels were only observed in the resistant isolate. Heterologous expression of Axc in Escherichia coli significantly increased the resistance towards carbapenems. Importantly, direct Axc mediated hydrolysis of imipenem was demonstrated using pH shift assays and <sup>1</sup>H-NMR, confirming Axc as a legitimate carbapenemase. Whole genome sequencing revealed that the susceptible and resistant isolates were remarkably similar. Together these findings provide a molecular context for the fast development of meropenem resistance in A. xylosoxidans during treatment and demonstrate the use of mass spectrometric techniques in identifying novel resistance determinants.

    Scientific reports 2018;8;1;8181

  • Interleukin-22 promotes phagolysosomal fusion to induce protection against <i>Salmonella enterica</i> Typhimurium in human epithelial cells.

    Forbester JL, Lees EA, Goulding D, Forrest S, Yeung A, Speak A, Clare S, Coomber EL, Mukhopadhyay S, Kraiczy J, Schreiber F, Lawley TD, Hancock REW, Uhlig HH, Zilbauer M, Powrie F and Dougan G

    Institute of Infection and Immunity, School of Medicine, Cardiff University, Cardiff CF14 4XN, United Kingdom.

    Intestinal epithelial cells (IECs) play a key role in regulating immune responses and controlling infection. However, the direct role of IECs in restricting pathogens remains incompletely understood. Here, we provide evidence that IL-22 primed intestinal organoids derived from healthy human induced pluripotent stem cells (hIPSCs) to restrict <i>Salmonella enterica</i> serovar Typhimurium SL1344 infection. A combination of transcriptomics, bacterial invasion assays, and imaging suggests that IL-22-induced antimicrobial activity is driven by increased phagolysosomal fusion in IL-22-pretreated cells. The antimicrobial phenotype was absent in hIPSCs derived from a patient harboring a homozygous mutation in the <i>IL10RB</i> gene that inactivates the IL-22 receptor but was restored by genetically complementing the IL10RB deficiency. This study highlights a mechanism through which the IL-22 pathway facilitates the human intestinal epithelium to control microbial infection.

    Funded by: Wellcome Trust

    Proceedings of the National Academy of Sciences of the United States of America 2018;115;40;10118-10123

  • Natural Genetic Variation in a Multigenerational Phenotype in C. elegans.

    Frézal L, Demoinet E, Braendle C, Miska E and Félix MA

    Institut de Biologie de l'Ecole Normale Supérieure, Centre National de la Recherche Scientifique, INSERM, École Normale Supérieure, Paris Sciences et Lettres, Paris, France; Wellcome Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK; Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK.

    Although heredity mostly relies on the transmission of DNA sequence, additional molecular and cellular features are heritable across several generations. In the nematode Caenorhabditis elegans, insights into such unconventional inheritance result from two lines of work. First, the mortal germline (Mrt) phenotype was defined as a multigenerational phenotype whereby a selfing lineage becomes sterile after several generations, implying multigenerational memory [1, 2]. Second, certain RNAi effects are heritable over several generations in the absence of the initial trigger [3-5]. Both lines of work converged when the subset of Mrt mutants that are heat sensitive were found to closely correspond to mutants defective in the RNAi-inheritance machinery, including histone modifiers [6-9]. Here, we report the surprising finding that several C. elegans wild isolates display a heat-sensitive mortal germline phenotype in laboratory conditions: upon chronic exposure to higher temperatures, such as 25°C, lines reproducibly become sterile after several generations. This phenomenon is reversible, as it can be suppressed by temperature alternations at each generation, suggesting a non-genetic basis for the sterility. We tested whether natural variation in the temperature-induced Mrt phenotype was of genetic nature by building recombinant inbred lines between the isolates MY10 (Mrt) and JU1395 (non-Mrt). Using bulk segregant analysis, we detected two quantitative trait loci. After further recombinant mapping and genome editing, we identified the major causal locus as a polymorphism in the set-24 gene, encoding a SET- and SPK-domain protein. We conclude that C. elegans natural populations may harbor natural genetic variation in epigenetic inheritance phenomena.

    Current biology : CB 2018;28;16;2588-2596.e8

  • Population size changes and selection drive patterns of parallel evolution in a host-virus system.

    Frickel J, Feulner PGD, Karakoc E and Becks L

    Community Dynamics Group, Department Evolutionary Ecology, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany.

    Predicting the repeatability of evolution remains elusive. Theory and empirical studies suggest that strong selection and large population sizes increase the probability for parallel evolution at the phenotypic and genotypic levels. However, selection and population sizes are not constant, but rather change continuously and directly affect each other even on short time scales. Here, we examine the degree of parallel evolution shaped through eco-evolutionary dynamics in an algal host population coevolving with a virus. We find high degrees of parallelism at the level of population size changes (ecology) and at the phenotypic level between replicated populations. At the genomic level, we find evidence for parallelism, as the same large genomic region was duplicated in all replicated populations, but also substantial novel sequence divergence between replicates. These patterns of genome evolution can be explained by considering population size changes as an important driver of rapid evolution.

    Nature communications 2018;9;1;1706

  • Surveillance and Epidemiology of Drug Resistant Infections Consortium (SEDRIC): Supporting the transition from strategy to action.

    Fukuda K, Limmathurotsakul D, Okeke IN, Shetty N, van Doorn R, Feasey NA, Chiara F, Zoubiane G, Jinks T, Parkhill J, Patel J, Reid SWJ, Holmes AH, Peacock SJ and Surveillance and Epidemiology of Drug Resistant Infections Consortium (SEDRIC)

    School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulum, Hong Kong.

    In recognition of the central importance of surveillance and epidemiology in the control of antimicrobial resistance and the need to strengthen surveillance at all levels, Wellcome has brought together a new international expert group SEDRIC (Surveillance and Epidemiology of Drug Resistant Infections Consortium). SEDRIC aims to advance and transform the ways of tracking, sharing and analysing rates of infection and drug resistance, burden of disease, information on antibiotic use, opportunities for preventative measures such as vaccines, and contamination of the environment. SEDRIC will strengthen the availability of information needed to monitor and track risks, including an evaluation of access to, and utility of data generated by pharma and research activities, and will support the translation of surveillance data into interventions, changes in policy and more effective practices. Ways of working will include the provision of independent scientific analysis, advocacy and expert advice to groups, such as the Wellcome Drug Resistant Infection Priority Programme. A priority for SEDRIC's first Working Group is to review mechanisms to strengthen the generation, collection, collation and dissemination of high quality data, together with the need for creativity in the use of existing data and proxy measures, and linking to existing in-country networking infrastructure. SEDRIC will also promote the translation of technological innovations into public health solutions.

    Funded by: Wellcome Trust

    Wellcome open research 2018;3;59

  • A CRISPR knockout screen identifies SETDB1-target retroelement silencing factors in embryonic stem cells.

    Fukuda K, Okuda A, Yusa K and Shinkai Y


    In mouse embryonic stem cells (mESCs), expression of provirus and endogenous retroelements is epigenetically repressed. Although many cellular factors involved in retroelement silencing have been identified, the complete molecular mechanism remains elusive. In this study, we performed a genome-wide CRISPR screen to advance our understanding of retroelement silencing in mESCs. The Moloney murine leukemia virus (MLV)-based retroviral vector MSCV-GFP, which is repressed by the SETDB1/KAP1 pathway in mESCs, was used as a reporter provirus, and we identified more than 80 genes involved in this process. In particular, ATF7IP and the BAF complex components are linked with the repression of most of the SETDB1 targets. We characterized two factors, MORC2A and DRES1, of which DRES1 is a novel molecule in retroelement silencing. Although both factors are recruited to repress provirus, their roles in repression are different. MORC2A appears to function dependent on repressive epigenetic modifications, while DRES1 regulates repressive epigenetic modifications associated with SETDB1. Our genome-wide CRISPR screen cataloged genes that function at different levels in silencing of SETDB1-target retroelements and provides a useful resource for further molecular studies.

    Genome research 2018

  • Glutaminolysis is a metabolic dependency in FLT3<sup>ITD</sup> acute myeloid leukemia unmasked by FLT3 tyrosine kinase inhibition.

    Gallipoli P, Giotopoulos G, Tzelepis K, Costa ASH, Vohra S, Medina-Perez P, Basheer F, Marando L, Di Lisio L, Dias JML, Yun H, Sasca D, Horton SJ, Vassiliou G, Frezza C and Huntly BJP

    Wellcome Trust-MRC Cambridge Stem Cell Institute, United Kingdom.

    FLT3 internal tandem duplications (FLT3<sup>ITD</sup>) are common mutations in acute myeloid leukemia (AML) associated with poor patient prognosis. Although new generation FLT3 tyrosine kinase inhibitors (TKI) have shown promising results, the outcome of FLT3<sup>ITD</sup> AML patients remains poor and demands the identification of novel, specific and validated therapeutic targets for this highly aggressive AML subtype. Utilizing an unbiased genome-wide CRISPR/Cas9 screen, we identify GLS, the first enzyme in glutamine metabolism, as synthetically lethal with FLT3-TKI treatment. Using complementary metabolomic and gene-expression analysis, we demonstrate that glutamine metabolism, through its ability to support both mitochondrial function and cellular redox metabolism, becomes a metabolic dependency of FLT3<sup>ITD</sup> AML, specifically unmasked by FLT3-TKI treatment. We extend these findings to AML subtypes driven by other tyrosine kinase (TK) activating mutations, and validate the role of GLS as a clinically actionable therapeutic target in both primary AML and <i>in vivo</i> models. Our work highlights the role of metabolic adaptations as a resistance mechanism to several TKI, and suggests glutaminolysis as a therapeutically targetable vulnerability when combined with specific TKI in FLT3<sup>ITD</sup> and other TK activating mutation driven leukemias.

    Blood 2018

  • Quantifying the Impact of Rare and Ultra-rare Coding Variation across the Phenotypic Spectrum.

    Ganna A, Satterstrom FK, Zekavat SM, Das I, Kurki MI, Churchhouse C, Alfoldi J, Martin AR, Havulinna AS, Byrnes A, Thompson WK, Nielsen PR, Karczewski KJ, Saarentaus E, Rivas MA, Gupta N, Pietiläinen O, Emdin CA, Lescai F, Bybjerg-Grauholm J, Flannick J, GoT2D/T2D-GENES Consortium, Mercader JM, Udler M, SIGMA Consortium Helmsley IBD Exome Sequencing Project, FinMetSeq Consortium, iPSYCH-Broad Consortium, Laakso M, Salomaa V, Hultman C, Ripatti S, Hämäläinen E, Moilanen JS, Körkkö J, Kuismin O, Nordentoft M, Hougaard DM, Mors O, Werge T, Mortensen PB, MacArthur D, Daly MJ, Sullivan PF, Locke AE, Palotie A, Børglum AD, Kathiresan S and Neale BM

    Analytic and Translational Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm 17176, Sweden.

    There is a limited understanding of the impact of rare protein-truncating variants across multiple phenotypes. We explore the impact of this class of variants on 13 quantitative traits and 10 diseases using whole-exome sequencing data from 100,296 individuals. Protein-truncating variants in genes intolerant to this class of mutations increased risk of autism, schizophrenia, bipolar disorder, intellectual disability, and ADHD. In individuals without these disorders, there was an association with shorter height, lower education, increased hospitalization, and reduced age at enrollment. Gene sets implicated from GWASs did not show a significant protein-truncating variant burden beyond what was captured by established Mendelian genes. In conclusion, we provide a thorough investigation of the impact of rare deleterious coding variants on complex traits, suggesting widespread pleiotropic risk.

    American journal of human genetics 2018

  • Alterations in sperm long RNA contribute to the epigenetic inheritance of the effects of postnatal trauma.

    Gapp K, van Steenwyk G, Germain PL, Matsushima W, Rudolph KLM, Manuella F, Roszkowski M, Vernaz G, Ghosh T, Pelczar P, Mansuy IM and Miska EA

    Gurdon Institute, University of Cambridge, Tennis Court Rd, Cambridge, CB2 1QN, UK.

    Psychiatric diseases have a strong heritable component known to not be restricted to DNA sequence-based genetic inheritance alone but to also involve epigenetic factors in germ cells. Initial evidence suggested that sperm RNA is causally linked to the transmission of symptoms induced by traumatic experiences. Here, we show that alterations in long RNA in sperm contribute to the inheritance of specific trauma symptoms. Injection of long RNA fraction from sperm of males exposed to postnatal trauma recapitulates the effects on food intake, glucose response to insulin and risk-taking in adulthood whereas the small RNA fraction alters body weight and behavioural despair. Alterations in long RNA are maintained after fertilization, suggesting a direct link between sperm and embryo RNA.

    Molecular psychiatry 2018

  • Alcohol and endogenous aldehydes damage chromosomes and mutate stem cells.

    Garaycoechea JI, Crossan GP, Langevin F, Mulderrig L, Louzada S, Yang F, Guilbaud G, Park N, Roerink S, Nik-Zainal S, Stratton MR and Patel KJ

    MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus, Francis Crick Avenue, Cambridge CB2 0QH, UK.

    Haematopoietic stem cells renew blood. Accumulation of DNA damage in these cells promotes their decline, while misrepair of this damage initiates malignancies. Here we describe the features and mutational landscape of DNA damage caused by acetaldehyde, an endogenous and alcohol-derived metabolite. This damage results in DNA double-stranded breaks that, despite stimulating recombination repair, also cause chromosome rearrangements. We combined transplantation of single haematopoietic stem cells with whole-genome sequencing to show that this damage occurs in stem cells, leading to deletions and rearrangements that are indicative of microhomology-mediated end-joining repair. Moreover, deletion of p53 completely rescues the survival of aldehyde-stressed and mutated haematopoietic stem cells, but does not change the pattern or the intensity of genome instability within individual stem cells. These findings characterize the mutation of the stem-cell genome by an alcohol-derived and endogenous source of DNA damage. Furthermore, we identify how the choice of DNA-repair pathway and a stringent p53 response limit the transmission of aldehyde-induced mutations in stem cells.

    Nature 2018;553;7687;171-177

  • Transcription Factor Activities Enhance Markers of Drug Sensitivity in Cancer.

    Garcia-Alonso L, Iorio F, Matchan A, Fonseca N, Jaaks P, Peat G, Pignatelli M, Falcone F, Benes CH, Dunham I, Bignell G, McDade SS, Garnett MJ and Saez-Rodriguez J

    European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, United Kingdom.

    Transcriptional dysregulation induced by aberrant transcription factors (TF) is a key feature of cancer, but its global influence on drug sensitivity has not been examined. Here, we infer the transcriptional activity of 127 TFs through analysis of RNA-seq gene expression data newly generated for 448 cancer cell lines, combined with publicly available datasets to survey a total of 1,056 cancer cell lines and 9,250 primary tumors. Predicted TF activities are supported by their agreement with independent shRNA essentiality profiles and homozygous gene deletions, and recapitulate mutant-specific mechanisms of transcriptional dysregulation in cancer. By analyzing cell line responses to 265 compounds, we uncovered numerous TFs whose activity interacts with anticancer drugs. Importantly, combining existing pharmacogenomic markers with TF activities often improves the stratification of cell lines in response to drug treatment. Our results, which can be queried freely at, offer a broad foundation for discovering opportunities to refine personalized cancer therapies. <b>Significance:</b> Systematic analysis of transcriptional dysregulation in cancer cell lines and patient tumor specimens offers a publicly searchable foundation to discover new opportunities to refine personalized cancer therapies. <i>Cancer Res; 78(3); 769-80. ©2017 AACR</i>.

    Funded by: Wellcome Trust: 102696

    Cancer research 2018;78;3;769-780

  • A graph-based approach to diploid genome assembly.

    Garg S, Rautiainen M, Novak AM, Garrison E, Durbin R and Marschall T

    Center for Bioinformatics, Saarland University, Saarland Informatics Campus E2.1, Saarbrücken, Germany.

    Motivation: Constructing high-quality haplotype-resolved de novo assemblies of diploid genomes is important for revealing the full extent of structural variation and its role in health and disease. Current assembly approaches often collapse the two sequences into one haploid consensus sequence and, therefore, fail to capture the diploid nature of the organism under study. Thus, building an assembler capable of producing accurate and complete diploid assemblies, while being resource-efficient with respect to sequencing costs, is a key challenge to be addressed by the bioinformatics community.

    Results: We present a novel graph-based approach to diploid assembly, which combines accurate Illumina data and long-read Pacific Biosciences (PacBio) data. We demonstrate the effectiveness of our method on a pseudo-diploid yeast genome and show that we require as little as 50× coverage Illumina data and 10× PacBio data to generate accurate and complete assemblies. Additionally, we show that our approach has the ability to detect and phase structural variants.

    Availability and implementation:

    Supplementary information: Supplementary data are available at Bioinformatics online.

    Bioinformatics (Oxford, England) 2018;34;13;i105-i114

  • Variation graph toolkit improves read mapping by representing genetic variation in the reference.

    Garrison E, Sirén J, Novak AM, Hickey G, Eizenga JM, Dawson ET, Jones W, Garg S, Markello C, Lin MF, Paten B and Durbin R

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.

    Reference genomes guide our interpretation of DNA sequence data. However, conventional linear references represent only one version of each locus, ignoring variation in the population. Poor representation of an individual's genome sequence impacts read mapping and introduces bias. Variation graphs are bidirected DNA sequence graphs that compactly represent genetic variation across a population, including large-scale structural variation such as inversions and duplications. Previous graph genome software implementations have been limited by scalability or topological constraints. Here we present vg, a toolkit of computational methods for creating, manipulating, and using these structures as references at the scale of the human genome. vg provides an efficient approach to mapping reads onto arbitrary variation graphs using generalized compressed suffix arrays, with improved accuracy over alignment to a linear reference, and effectively removing reference bias. These capabilities make using variation graphs as references for DNA sequencing practical at a gigabase scale, or at the topological complexity of de novo assemblies.

    Funded by: NHGRI NIH HHS: U41 HG007234, U54 HG007990

    Nature biotechnology 2018

  • Noncanonical secondary structures arising from non-B DNA motifs are determinants of mutagenesis.

    Georgakopoulos-Soares I, Morganella S, Jain N, Hemberg M and Nik-Zainal S

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom.

    Somatic mutations show variation in density across cancer genomes. Previous studies have shown that chromatin organization and replication time domains are correlated with, and thus predictive of, this variation. Here, we analyze 1809 whole-genome sequences from 10 cancer types to show that a subset of repetitive DNA sequences, called non-B motifs, that predict noncanonical secondary structure formation can independently account for variation in mutation density. Combined with epigenetic factors and replication timing, the variance explained can be improved to 43%-76%. Approximately twofold mutation enrichment is observed directly within non-B motifs, is focused on exposed structural components, and is dependent on physical properties that are optimal for secondary structure formation. Therefore, there is mounting evidence that secondary structures arising from non-B motifs are not simply associated with increased mutation density; they are possibly causally implicated. Our results suggest that they are determinants of mutagenesis and increase the likelihood of recurrent mutations in the genome. This analysis calls for caution in the interpretation of recurrent mutations and highlights the importance of taking non-B motifs, which can simply be inferred from the reference sequence, into consideration in future background models of mutability.

    Genome research 2018

  • Updated recommendation for the benign stand-alone ACMG/AMP criterion.

    Ghosh R, Harrison SM, Rehm HL, Plon SE, Biesecker LG and ClinGen Sequence Variant Interpretation Working Group

    Department of Pediatrics, Baylor College of Medicine, Houston, Texas.

    The Clinical Genome Resource (ClinGen) Sequence Variant Interpretation Working Group set out to refine the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) variant pathogenicity recommendations for stand-alone rule BA1 (a variant with minor allele frequency [MAF] > 0.05 is benign), by clarifying how it should be used and specifying a set of variants that should be exempted from this rule. We cross-referenced ClinVar and Exome Aggregation Consortium data to identify variants for which there was a plausible argument for pathogenicity and that exist in one or more population data sets at MAF > 0.05. We identified nine such variants that were present in these data sets and that may not be benign. Applying the ACMG/AMP criteria to these variants resulted in four pathogenic variants and five variants of uncertain significance. We have refined benign rule BA1 by clarifying terms used to describe its use, which databases we recommend using, and assumptions made about this rule. We also recognized an initial list of nine variants for which there was some evidence of pathogenicity even though the MAF was high for these variants. We specify processes whereby individuals can petition ClinGen for amendments to our variant-specific assertions and the criteria experts should use when setting a numerically lower threshold for BA1 for specific genes.
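    As an illustration only, the stand-alone BA1 rule with an exemption list might be sketched as below. The 0.05 threshold comes from the abstract; the `Variant` class, its field names, the function name and the example identifiers are hypothetical, and the actual nine exempted variants are not reproduced here.

```python
from dataclasses import dataclass

# MAF cutoff for BA1 as stated in the abstract.
BA1_THRESHOLD = 0.05


@dataclass(frozen=True)
class Variant:
    variant_id: str            # hypothetical identifier, not a real accession
    max_population_maf: float  # highest MAF observed in any population data set


def ba1_applies(variant, exempt_ids, threshold=BA1_THRESHOLD):
    """Return True if the variant meets stand-alone benign rule BA1.

    A variant on the exemption list never meets BA1, however common it is.
    The threshold is a parameter because the abstract notes that experts
    may set a numerically lower value for specific genes.
    """
    if variant.variant_id in exempt_ids:
        return False
    return variant.max_population_maf > threshold
```

    A caller would pass the set of exempted identifiers explicitly, so that amendments to the exemption list do not require changes to the rule itself.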

    Funded by: ClinGen; Eunice Kennedy Shriver National Institute of Child Health and Human Development; Intramural NIH HHS: ZIA HG200328-13, ZIA HG200359-09; Intramural Research Program of the National Human Genome Research Institute: HG200359 09; NHGRI NIH HHS: U01 HG007436, U01 HG007437, U41 HG006834, U41 HG009649; National Cancer Institute: 1U41HG009649, U01HG007436, U01HG007437, U41HG006834, U41HG009650; National Human Genome Research Institute: HG200359 09, U01HG007436, U01HG007437, U41HG006834, U41HG009649, U41HG009650

    Human mutation 2018;39;11;1525-1530

  • Plasmodium vivax-like genome sequences shed new insights into Plasmodium vivax biology and evolution.

    Gilabert A, Otto TD, Rutledge GG, Franzon B, Ollomo B, Arnathau C, Durand P, Moukodoum ND, Okouga AP, Ngoubangoye B, Makanga B, Boundenga L, Paupy C, Renaud F, Prugnolle F and Rougeron V

    MIVEGEC, IRD, CNRS, University of Montpellier, Montpellier, France.

    Although Plasmodium vivax is responsible for the majority of malaria infections outside Africa, little is known about its evolution and pathway to humans. Its closest genetic relative, P. vivax-like, was discovered in African great apes and is hypothesized to have given rise to P. vivax in humans. To unravel the evolutionary history and adaptation of P. vivax to different host environments, we used long- and short-read sequencing technologies to generate 2 new P. vivax-like reference genomes and 9 additional P. vivax-like genotypes. Analyses show that the genomes of P. vivax and P. vivax-like are highly similar and colinear within the core regions. Phylogenetic analyses clearly show that P. vivax-like parasites form a genetically distinct clade from P. vivax. Concerning the relative divergence dating, we show that the evolution of P. vivax in humans did not occur at the same time as the other agents of human malaria, thus suggesting that the transfer of Plasmodium parasites to humans happened several times independently over the history of the Homo genus. We further identify several key genes that exhibit signatures of positive selection exclusively in the human P. vivax parasites. Two of these genes have also been identified as being under positive selection in the other main human malaria agent, P. falciparum, thus suggesting their key role in the evolution of the ability of these parasites to infect humans or their anthropophilic vectors. Finally, we demonstrate that some gene families important for red blood cell (RBC) invasion (a key step of the life cycle of these parasites) have undergone lineage-specific evolution in the human parasite (e.g., reticulocyte-binding proteins [RBPs]).

    PLoS biology 2018;16;8;e2006035

  • Genetic diversity of Cryptosporidium hominis in a Bangladeshi community as revealed by whole genome sequencing.

    Gilchrist CA, Cotton JA, Burkey C, Arju T, Gilmartin A, Lin Y, Ahmed E, Steiner K, Alam M, Ahmed S, Robinson G, Zaman SU, Kabir M, Sanders M, Chalmers RM, Ahmed T, Ma JZ, Haque R, Faruque ASG, Berriman M and Petri WA

    Department of Medicine, University of Virginia, Charlottesville, Virginia, United States of America.

    We studied the genetic diversity of Cryptosporidium hominis infections in slum-dwelling infants from Dhaka over a two-year period. C. hominis infections were common during the monsoon, and were genetically diverse as measured by gp60 genotyping and whole genome resequencing. Recombination in the parasite was evidenced by the decay of linkage disequilibrium in the genome over less than 300 bp. Regions of the genome with high levels of polymorphism were also identified. It remains to be determined whether genomic diversity is responsible in part for the high rate of reinfection, seasonality and varied clinical presentations of cryptosporidiosis in this population.

    The Journal of infectious diseases 2018

  • Very low depth whole genome sequencing in complex trait association studies.

    Gilly A, Southam L, Suveges D, Kuchenbaecker K, Moore R, Melloni GEM, Hatzikotoulas K, Farmaki AE, Ritchie G, Schwartzentruber J, Danecek P, Kilian B, Pollard MO, Ge X, Tsafantakis E, Dedoussis G and Zeggini E

    Department of Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1HH, UK.

    Motivation: Very low depth sequencing has been proposed as a cost-effective approach to capture low-frequency and rare variation in complex trait association studies. However, a full characterisation of the genotype quality and association power for very low depth sequencing designs is still lacking.

    Results: We perform cohort-wide whole genome sequencing (WGS) at low depth in 1,239 individuals (990 at 1x depth and 249 at 4x depth) from an isolated population, and establish a robust pipeline for calling and imputing very low depth WGS genotypes from standard bioinformatics tools. Using genotyping chip, whole-exome sequencing (WES, 75x depth) and high-depth (22x) WGS data in the same samples, we examine in detail the sensitivity of this approach, and show that imputed 1x WGS recapitulates 95.2% of variants found by imputed GWAS with an average minor allele concordance of 97% for common and low-frequency variants. In our study, 1x further allowed the discovery of 140,844 true low-frequency variants with 73% genotype concordance when compared to high-depth WGS data. Finally, using association results for 57 quantitative traits, we show that very low depth WGS is an efficient alternative to imputed GWAS chip designs, allowing the discovery of up to twice as many true association signals as the classical imputed GWAS design.

    Availability: The HELIC genotype and WGS datasets have been deposited to the European Genome-phenome Archive: EGAD00010000518; EGAD00010000522; EGAD00010000610; EGAD00001001636, EGAD00001001637. The peakplotter software is available at, the transformPhenotype app can be downloaded at

    Supplementary information: Supplementary data are available at Bioinformatics online.

    Bioinformatics (Oxford, England) 2018

  • Cohort-wide deep whole genome sequencing and the allelic architecture of complex traits.

    Gilly A, Suveges D, Kuchenbaecker K, Pollard M, Southam L, Hatzikotoulas K, Farmaki AE, Bjornland T, Waples R, Appel EVR, Casalone E, Melloni G, Kilian B, Rayner NW, Ntalla I, Kundu K, Walter K, Danesh J, Butterworth A, Barroso I, Tsafantakis E, Dedoussis G, Moltke I and Zeggini E

    Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom.

    The role of rare variants in complex traits remains uncharted. Here, we conduct deep whole genome sequencing of 1457 individuals from an isolated population, and test for rare variant burdens across six cardiometabolic traits. We identify a role for rare regulatory variation, which has hitherto been missed. We find evidence of rare variant burdens that are independent of established common variant signals (ADIPOQ and adiponectin, P = 4.2 × 10<sup>-8</sup>; APOC3 and triglyceride levels, P = 1.5 × 10<sup>-26</sup>), and identify replicating evidence for a burden associated with triglyceride levels in FAM189B (P = 2.2 × 10<sup>-8</sup>), indicating a role for this gene in lipid metabolism.

    Funded by: Wellcome Trust

    Nature communications 2018;9;1;4674

  • Interferon lambda is required for interferon gamma-expressing NK cell responses but does not afford antiviral protection during acute and persistent murine cytomegalovirus infection.

    Gimeno Brias S, Marsden M, Forbester J, Clement M, Brandt C, Harcourt K, Kane L, Chapman L, Clare S and Humphreys IR

    Institute of Infection and Immunity, School of Medicine/Systems Immunity University Research Institute, Cardiff University, Cardiff, United Kingdom.

    Interferon lambda (IFNλ) is a group of cytokines that belong to the IL-10 family. They exhibit antiviral activities against certain viruses during infection of the liver and mucosal tissues. Here we report that IFNλ restricts in vitro replication of the β-herpesvirus murine cytomegalovirus (mCMV). However, IFNλR1-deficient (Ifnλr1-/-) mice were not preferentially susceptible to mCMV infection in vivo during acute infection after systemic or mucosal challenge, or during virus persistence in the mucosa. Instead, our studies revealed that IFNλ influences NK cell responses during mCMV infection. Ifnλr1-/- mice exhibited defective development of conventional interferon-gamma (IFNγ)-expressing NK cells in the spleen during mCMV infection whereas accumulation of granzyme B-expressing NK cells was unaltered. In vitro, development of splenic IFNγ+ NK cells following stimulation with IL-12 or, to a lesser extent, IL-18 was abrogated by IFNλR1-deficiency. Thus, IFNλ regulates NK cell responses during mCMV infection and restricts virus replication in vitro but is redundant in the control of acute and persistent mCMV replication within mucosal and non-mucosal tissues.

    PloS one 2018;13;5;e0197596

  • scanPAV: a pipeline for extracting presence-absence variations in genome pairs.

    Giordano F, Stammnitz MR, Murchison EP and Ning Z

    The Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.

    Motivation: Recent technological advances in genome sequencing have resulted in an exponential increase in the number of sequenced human and non-human genomes. The ever-increasing number of assemblies generated by novel de novo pipelines and strategies demands the development of new software to evaluate assembly quality and completeness. One way to determine the completeness of an assembly is by detecting its presence-absence variations (PAVs) with respect to a reference, where PAVs between two assemblies are defined as the sequences present in one assembly but entirely missing in the other. Beyond assembly error or technology bias, PAVs can also reveal real genome polymorphism, a consequence of species or individual evolution, or horizontal transfer from viruses and bacteria.

    Results: We present scanPAV, a pipeline for pairwise assembly comparison to identify and extract sequences present in one assembly but not the other. In this note we use the GRCh38 reference assembly to assess the completeness of six human genome assemblies produced with various assembly strategies and sequencing technologies, including Illumina short reads, 10x Genomics linked reads, PacBio and Oxford Nanopore long reads, and Bionano optical maps. We also discuss the PAV polymorphism of seven Tasmanian devil whole genome assemblies of normal animal tissues and devil facial tumour 1 (DFT1) and 2 (DFT2) samples, and the identification of bacterial sequences as contamination in some of the tumour assemblies.

    Availability: The pipeline is available under the MIT License at


    Supplementary information: A supplementary note is available at Bioinformatics online.

    Bioinformatics (Oxford, England) 2018

  • Mutational mechanisms of amplifications revealed by analysis of clustered rearrangements in breast cancers.

    Glodzik D, Purdie C, Rye IH, Simpson P, Staaf J, Span PN, Russnes H and Nik-Zainal S

    Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, Lund, Sweden.

    Background: Complex clusters of rearrangements in cancer genomes are a challenge to interpret. Some are clear amplifications of driver oncogenes but others are less well understood. Detailed analysis of rearrangements within these complex clusters could reveal new insights into selection, and underlying mutational mechanisms.

    Results: Here, we systematically investigate rearrangements that are densely clustered in individual tumours in a cohort of 560 breast cancers. Applying an agnostic approach, we identify 21 hotspots where clustered rearrangements recur across cancers. Some hotspots coincide with known oncogene loci including CCND1, ERBB2, ZNF217, chr8:ZNF703/FGFR1, IGF1R, and MYC. Others contain cancer genes not typically associated with breast cancer: MCL1, PTP4A1 and MYB. Intriguingly, we identify clustered rearrangements that physically connect distant hotspots. In particular, we observe simultaneous amplification of chr8:ZNF703/FGFR1 and chr11:CCND1 where deep analysis reveals that a chr8-chr11 translocation is likely to be an early, critical, initiating event.

    Conclusions: We present an overview of complex rearrangements in breast cancer, highlighting a potential new way for detecting drivers and revealing novel mechanistic insights into the formation of two common amplicons.

    Annals of oncology : official journal of the European Society for Medical Oncology 2018

  • Hydroxycarbamide Plus Aspirin Versus Aspirin Alone in Patients With Essential Thrombocythemia Age 40 to 59 Years Without High-Risk Features.

    Godfrey AL, Campbell PJ, MacLean C, Buck G, Cook J, Temple J, Wilkins BS, Wheatley K, Nangalia J, Grinfeld J, McMullin MF, Forsyth C, Kiladjian JJ, Green AR, Harrison CN, United Kingdom Medical Research Council Primary Thrombocythemia-1 Study, United Kingdom National Cancer Research Institute Myeloproliferative Neoplasms Subgroup, French Intergroup of Myeloproliferative Neoplasms, and the Australasian Leukaemia and Lymphoma Group.

    Anna L. Godfrey, Jacob Grinfeld, and Anthony R. Green, Cambridge University Hospitals National Health Service (NHS) Foundation Trust; Peter J. Campbell and Jyoti Nangalia, Wellcome Trust Sanger Institute, Hinxton; Cathy MacLean, Julia Cook, Julie Temple, and Anthony R. Green, University of Cambridge; Anthony R. Green, Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, Cambridge; Georgina Buck, University of Oxford, Oxford; Bridget S. Wilkins and Claire N. Harrison, Guy's and St Thomas' NHS Foundation Trust, London; Keith Wheatley, University of Birmingham, Birmingham; Mary Frances McMullin, Queen's University Belfast, Belfast, United Kingdom; Cecily Forsyth, Gosford Hospital, Gosford, and Australasian Leukaemia and Lymphoma Group, Australia; and Jean-Jacques Kiladjian, Hôpital Saint-Louis, Paris, France.

    Purpose: Cytoreductive therapy is beneficial in patients with essential thrombocythemia (ET) at high risk of thrombosis. However, its value in those lacking high-risk features remains unknown. This open-label, randomized trial compared hydroxycarbamide plus aspirin with aspirin alone in patients with ET age 40 to 59 years and without high-risk factors or extreme thrombocytosis.

    Patients and Methods: Patients were age 40 to 59 years and lacked a history of ischemia, thrombosis, embolism, hemorrhage, extreme thrombocytosis (platelet count ≥ 1,500 × 10⁹/L), hypertension, or diabetes requiring therapy. In all, 382 patients were randomly assigned 1:1 to hydroxycarbamide plus aspirin or aspirin alone. The composite primary end point was time to arterial or venous thrombosis, serious hemorrhage, or death from vascular causes. Secondary end points were time to first arterial or venous thrombosis, first serious hemorrhage, death, incidence of transformation, and patient-reported quality of life.

    Results: After a median follow-up of 73 months and a total follow-up of 2,373 patient-years, there was no significant difference between the arms in the likelihood of patients reaching the primary end point (hazard ratio, 0.98; 95% CI, 0.42 to 2.25; P = 1.0). The incidence of significant vascular events was low, at 0.93 per 100 patient-years (95% CI, 0.61 to 1.41). There were also no differences in overall survival; in the composite end point of transformation to myelofibrosis, acute myeloid leukemia, or myelodysplasia; in adverse events; or in patient-reported quality of life.

    Conclusion: In patients with ET age 40 to 59 years and lacking high-risk factors for thrombosis or extreme thrombocytosis, preemptive addition of hydroxycarbamide to aspirin did not reduce vascular events, myelofibrotic transformation, or leukemic transformation. Patients age 40 to 59 years without other clinical indications for treatment (such as previous thrombosis or hemorrhage) who have a platelet count < 1,500 × 10⁹/L should not receive cytoreductive therapy.

    Journal of clinical oncology : official journal of the American Society of Clinical Oncology 2018;JCO2018788414

  • Polycythaemia Vera, Essential Thrombocythaemia and Myelofibrosis

    Godfrey AL, Vassiliou GS and Green AR

    ABC of Clinical Haematology 2018;21

  • Antimicrobial resistance prediction and phylogenetic analysis of Neisseria gonorrhoeae isolates using the Oxford Nanopore MinION sequencer.

    Golparian D, Donà V, Sánchez-Busó L, Foerster S, Harris S, Endimiani A, Low N and Unemo M

    WHO Collaborating Centre for Gonorrhoea and other Sexually Transmitted Infections, Department of Laboratory Medicine, Clinical Microbiology, Faculty of Medicine and Health, Örebro University, Örebro, Sweden.

    Antimicrobial resistance (AMR) in Neisseria gonorrhoeae is common, compromising gonorrhoea treatment internationally. Rapid characterisation of AMR strains could ensure appropriate and personalised treatment, and support identification and investigation of gonorrhoea outbreaks in near real time. Whole-genome sequencing is ideal for predicting AMR and for investigating the emergence and dissemination of AMR determinants in the gonococcal population and the spread of AMR strains in the human population. The novel, rapid and revolutionary long-read sequencer MinION is a small hand-held device that generates bacterial genomes within one day. However, accuracy of MinION reads has been suboptimal for many objectives and the MinION has not been evaluated for gonococci. In this first MinION study for gonococci, we show that MinION-derived sequences analysed with existing open-access, web-based sequence analysis tools are not sufficiently accurate to identify key gonococcal AMR determinants. Nevertheless, using an in-house-developed CLC Genomics Workbench including de novo assembly and optimised BLAST algorithms, we show that 2D ONT-derived sequences can be used for accurate prediction of decreased susceptibility or resistance to recommended antimicrobials in gonococcal isolates. We also show that the 2D ONT-derived sequences are useful for rapid phylogenomic-based molecular epidemiological investigations, and, in hybrid assemblies with Illumina sequences, for producing contiguous assemblies and finished reference genomes.

    Funded by: Wellcome Trust: 098051

    Scientific reports 2018;8;1;17596

  • Chromosomal evolution and phylogeny in the Nullicauda group (Chiroptera, Phyllostomidae): evidence from multidirectional chromosome painting.

    Gomes AJB, Nagamachi CY, Rodrigues LRR, Ferguson-Smith MA, Yang F, O'Brien PCM and Pieczarka JC

    Laboratório de Citogenética, CEABIO, ICB, Universidade Federal do Pará, Av. Bernardo Sayão, sn. Guamá, Belém, Pará, 66075-900, Brazil.

    Background: The family Phyllostomidae (Chiroptera) shows wide morphological, molecular and cytogenetic variation; many disagreements regarding its phylogeny and taxonomy remain to be resolved. In this study, we use chromosome painting with whole chromosome probes from the Phyllostomidae Phyllostomus hastatus and Carollia brevicauda to determine the rearrangements among several genera of the Nullicauda group (subfamilies Gliphonycterinae, Carolliinae, Rhinophyllinae and Stenodermatinae).

    Results: These data, when compared with previously published chromosome homology maps, allow the construction of a phylogeny comparable to those previously obtained by morphological and molecular analysis. Our phylogeny is largely in agreement with that proposed with molecular data, both on relationships between the subfamilies and among genera; it confirms, for instance, that Carollia and Rhinophylla, previously considered part of the same subfamily, are in fact distant genera.

    Conclusions: The occurrence of the karyotype considered ancestral for this family in several different branches suggests that the diversification of Phyllostomidae into many subfamilies has occurred in a short period of time. Finally, the comparison with published maps using human whole chromosome probes allows us to track some syntenic associations prior to the emergence of this family.

    Funded by: Banco Nacional de Desenvolvimento Economico e Social: 2.318.697.0001; Conselho Nacional de Desenvolvimento Científico e Tecnológico: 308401/2013-1, 308428/2013-7; Conselho Nacional de Desenvolvimento Científico e Tecnológico (BR): 552032/2010-7; Coordenação de Aperfeiçoamento de Pessoal de Nível Superior: 047/2012; Fundação Amazônia Paraense de Amparo à Pesquisa: 064/2008, 064/2011

    BMC evolutionary biology 2018;18;1;62

  • Factors Associated With Outcomes of Patients With Primary Sclerosing Cholangitis and Development and Validation of a Risk Scoring System.

    Goode EC, Clark AB, Mells GM, Srivastava B, Spiess K, Gelson WTH, Trivedi PJ, Kd L, Castren E, Vesterhus MN, Karlsen TH, Ji SG, Anderson CA, Thorburn D, Hudson M, M H, Aldersley MA, Bathgate A, Sandford RN, Alexander GJ, Chapman RW, Walmsley M, UK-PSC Consortium, Hirschfield GM and Rushbrook SM

    Norfolk and Norwich University Hospital, Norwich, UK.

    Background & aims: We sought to identify factors predictive of liver transplantation or death in patients with Primary Sclerosing Cholangitis (PSC), and to develop and validate a contemporaneous risk score for use in a real-world clinical setting.

    Methods: Analysing data from 1001 patients recruited to the UK-PSC research cohort, we evaluated clinical variables for their association with 2- and 10-year outcome through Cox proportional hazards and C-statistic analyses. We generated risk scores for short- and long-term outcome prediction, validating their use in two independent cohorts totalling 451 patients.

    Results: 36% of the derivation cohort were transplanted or died over a cumulative follow-up of 7,904 years. Serum alkaline phosphatase ≥2.4×ULN at 1 year post diagnosis was predictive of 10-year outcome (HR=3.05, C=0.63, median transplant-free survival 63 versus 108 months, p<0.0001), as was the presence of extra-hepatic biliary disease (HR=1.45, p=0.01). We developed two risk scoring systems based upon age, values of bilirubin, alkaline phosphatase, albumin, platelets, presence of extra-hepatic biliary disease and variceal haemorrhage, which predicted 2- and 10-year outcome with good discrimination (C=0.81 and 0.80 respectively). Both UK-PSC risk scores were well-validated in our external cohort, and out-performed the Mayo and APRI scores (C=0.75 and 0.63 respectively). Whilst heterozygosity for the previously validated HLA-DR*03:01 risk allele predicted increased risk of adverse outcome (HR=1.33, p=0.001), its addition did not improve the predictive accuracy of the UK-PSC risk scores.

    Conclusions: Our analyses, based upon a detailed clinical evaluation of a large representative cohort of participants with PSC, further our understanding of clinical risk markers and report the development and validation of a real-world scoring system to identify those patients most likely to die or require liver transplantation.

    Hepatology (Baltimore, Md.) 2018

  • Antimicrobial resistant Klebsiella pneumoniae carriage and infection in specialized geriatric care wards linked to acquisition in the referring hospital.

    Gorrie CL, Mirceta M, Wick RR, Judd LM, Wyres KL, Thomson NR, Strugnell RA, Pratt NF, Garlick JS, Watson KM, Hunter PC, McGloughlin SA, Spelman DW, Jenney AWJ and Holt KE

    Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne.

    Background: Klebsiella pneumoniae is a leading cause of extended-spectrum beta-lactamase (ESBL) producing hospital-associated infections, for which elderly patients are at increased risk.

    Methods: We conducted a 1-year prospective cohort study, in which a third of patients admitted to two geriatric wards in a specialized hospital were recruited and screened for carriage of K. pneumoniae by microbiological culture. Clinical isolates were monitored via the hospital laboratory. Colonizing and clinical isolates were subjected to whole genome sequencing and antimicrobial susceptibility testing.

    Results: K. pneumoniae throat carriage prevalence was 4.1%, rectal carriage 10.8% and ESBL carriage 1.7%. K. pneumoniae infection incidence was 1.2%. The isolates were diverse, and most patients were colonized or infected with a unique phylogenetic lineage, with no evidence of transmission in the wards. ESBL strains carried blaCTX-M-15 and belonged to clones associated with hospital-acquired ESBL infections in other countries (ST29, ST323, ST340). One also carried the carbapenemase blaIMP-26. Genomic and epidemiological data provided evidence that ESBL strains were acquired in the referring hospital. Nanopore sequencing also identified strain-to-strain transmission of a blaCTX-M-15 FIBK/FIIK plasmid in the referring hospital.

    Conclusions: The data suggest the major source of K. pneumoniae was the patient's own gut microbiome, but ESBL strains were acquired in the referring hospital. This highlights the importance of the wider hospital network to understanding K. pneumoniae risk and infection control. Rectal screening for ESBL organisms upon admission to geriatric wards could help inform patient management and infection control in such facilities.

    Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 2018

  • Genomic Surveillance of Enterococcus faecium Reveals Limited Sharing of Strains and Resistance Genes between Livestock and Humans in the United Kingdom.

    Gouliouris T, Raven KE, Ludden C, Blane B, Corander J, Horner CS, Hernandez-Garcia J, Wood P, Hadjirin NF, Radakovic M, Holmes MA, de Goffau M, Brown NM, Parkhill J and Peacock SJ

    Department of Medicine, University of Cambridge, Cambridge, United Kingdom.

    Vancomycin-resistant Enterococcus faecium (VREfm) is a major cause of nosocomial infection and is categorized as high priority by the World Health Organization global priority list of antibiotic-resistant bacteria. In the past, livestock have been proposed as a putative reservoir for drug-resistant E. faecium strains that infect humans, and isolates of the same lineage have been found in both reservoirs. We undertook cross-sectional surveys to isolate E. faecium (including VREfm) from livestock farms, retail meat, and wastewater treatment plants in the United Kingdom. More than 600 isolates from these sources were sequenced, and their relatedness and antibiotic resistance genes were compared with genomes of almost 800 E. faecium isolates from patients with bloodstream infection in the United Kingdom and Ireland. E. faecium was isolated from 28/29 farms; none of these isolates were VREfm, suggesting a decrease in VREfm prevalence since the last UK livestock survey in 2003. However, VREfm was isolated from 1% to 2% of retail meat products and was ubiquitous in wastewater treatment plants. Phylogenetic comparison demonstrated that the majority of human and livestock-related isolates were genetically distinct, although pig isolates from three farms were more genetically related to human isolates from 2001 to 2004 (minimum of 50 single-nucleotide polymorphisms [SNPs]). Analysis of accessory (variable) genes added further evidence for distinct niche adaptation. An analysis of acquired antibiotic resistance genes and their variants revealed limited sharing between humans and livestock. Our findings indicate that the majority of E. faecium strains infecting patients are largely distinct from those from livestock in this setting, with limited sharing of strains and resistance genes.

    Importance: The rise in rates of human infection caused by vancomycin-resistant Enterococcus faecium (VREfm) strains from 1988 to the 2000s in Europe was suggested to be associated with acquisition from livestock. As a result, the European Union banned the use of the glycopeptide drug avoparcin as a growth promoter in livestock feed. While some studies reported a decrease in VREfm in livestock, others reported no reduction. Here, we report the first livestock VREfm prevalence survey in the UK since 2003 and the first large-scale study using whole-genome sequencing to investigate the relationship between E. faecium strains in livestock and humans. We found a low prevalence of VREfm in retail meat and limited evidence for recent sharing of strains between livestock and humans with bloodstream infection. There was evidence for limited sharing of genes encoding antibiotic resistance between these reservoirs, a finding which requires further research.

    mBio 2018;9;6

  • UTX-mediated enhancer and chromatin remodeling suppresses myeloid leukemogenesis through noncatalytic inverse regulation of ETS and GATA programs.

    Gozdecka M, Meduri E, Mazan M, Tzelepis K, Dudek M, Knights AJ, Pardo M, Yu L, Choudhary JS, Metzakopian E, Iyer V, Yun H, Park N, Varela I, Bautista R, Collord G, Dovey O, Garyfallos DA, De Braekeleer E, Kondo S, Cooper J, Göttgens B, Bullinger L, Northcott PA, Adams D, Vassiliou GS and Huntly BJP

    Haematological Cancer Genetics, Wellcome Trust Sanger Institute, Hinxton, UK.

    The histone H3 Lys27-specific demethylase UTX (or KDM6A) is targeted by loss-of-function mutations in multiple cancers. Here, we demonstrate that UTX suppresses myeloid leukemogenesis through noncatalytic functions, a property shared with its catalytically inactive Y-chromosome paralog, UTY (or KDM6C). In keeping with this, we demonstrate concomitant loss/mutation of KDM6A (UTX) and UTY in multiple human cancers. Mechanistically, global genomic profiling showed only minor changes in H3K27me3 but significant and bidirectional alterations in H3K27ac and chromatin accessibility; a predominant loss of H3K4me1 modifications; alterations in ETS and GATA-factor binding; and altered gene expression after Utx loss. By integrating proteomic and genomic analyses, we link these changes to UTX regulation of ATP-dependent chromatin remodeling, coordination of the COMPASS complex and enhanced pioneering activity of ETS factors during evolution to AML. Collectively, our findings identify a dual role for UTX in suppressing acute myeloid leukemia via repression of oncogenic ETS and upregulation of tumor-suppressive GATA programs.

    Nature genetics 2018;50;6;883-894

  • Identification of rare sequence variation underlying heritable pulmonary arterial hypertension.

    Gräf S, Haimel M, Bleda M, Hadinnapola C, Southgate L, Li W, Hodgson J, Liu B, Salmon RM, Southwood M, Machado RD, Martin JM, Treacy CM, Yates K, Daugherty LC, Shamardina O, Whitehorn D, Holden S, Aldred M, Bogaard HJ, Church C, Coghlan G, Condliffe R, Corris PA, Danesino C, Eyries M, Gall H, Ghio S, Ghofrani HA, Gibbs JSR, Girerd B, Houweling AC, Howard L, Humbert M, Kiely DG, Kovacs G, MacKenzie Ross RV, Moledina S, Montani D, Newnham M, Olschewski A, Olschewski H, Peacock AJ, Pepke-Zaba J, Prokopenko I, Rhodes CJ, Scelsi L, Seeger W, Soubrier F, Stein DF, Suntharalingam J, Swietlik EM, Toshner MR, van Heel DA, Vonk Noordegraaf A, Waisfisz Q, Wharton J, Wort SJ, Ouwehand WH, Soranzo N, Lawrie A, Upton PD, Wilkins MR, Trembath RC and Morrell NW

    Department of Medicine, University of Cambridge, Cambridge, CB2 0QQ, United Kingdom.

    Pulmonary arterial hypertension (PAH) is a rare disorder with a poor prognosis. Deleterious variation within components of the transforming growth factor-β pathway, particularly the bone morphogenetic protein type 2 receptor (BMPR2), underlies most heritable forms of PAH. To identify the missing heritability we perform whole-genome sequencing in 1038 PAH index cases and 6385 PAH-negative control subjects. Case-control analyses reveal significant overrepresentation of rare variants in ATP13A3, AQP1 and SOX17, and provide independent validation of a critical role for GDF2 in PAH. We demonstrate familial segregation of mutations in SOX17 and AQP1 with PAH. Mutations in GDF2, encoding a BMPR2 ligand, lead to reduced secretion from transfected cells. In addition, we identify pathogenic mutations in the majority of previously reported PAH genes, and provide evidence for further putative genes. Taken together these findings contribute new insights into the molecular basis of PAH and indicate unexplored pathways for therapeutic intervention.

    Nature communications 2018;9;1;1416

  • Loss-of-function variants in ADCY3 increase risk of obesity and type 2 diabetes.

    Grarup N, Moltke I, Andersen MK, Dalby M, Vitting-Seerup K, Kern T, Mahendran Y, Jørsboe E, Larsen CVL, Dahl-Petersen IK, Gilly A, Suveges D, Dedoussis G, Zeggini E, Pedersen O, Andersson R, Bjerregaard P, Jørgensen ME, Albrechtsen A and Hansen T

    Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark.

    We have identified a variant in ADCY3 (encoding adenylate cyclase 3) associated with markedly increased risk of obesity and type 2 diabetes in the Greenlandic population. The variant disrupts a splice acceptor site, and carriers have decreased ADCY3 RNA expression. Additionally, we observe an enrichment of rare ADCY3 loss-of-function variants among individuals with type 2 diabetes in trans-ancestry cohorts. These findings provide new information on disease etiology relevant for future treatment strategies.

    Nature genetics 2018

  • Dynamics of Transcription Regulation in Human Bone Marrow Myeloid Differentiation to Mature Blood Neutrophils.

    Grassi L, Pourfarzad F, Ullrich S, Merkel A, Were F, Carrillo-de-Santa-Pau E, Yi G, Hiemstra IH, Tool ATJ, Mul E, Perner J, Janssen-Megens E, Berentsen K, Kerstens H, Habibi E, Gut M, Yaspo ML, Linser M, Lowy E, Datta A, Clarke L, Flicek P, Vingron M, Roos D, van den Berg TK, Heath S, Rico D, Frontini M, Kostadima M, Gut I, Valencia A, Ouwehand WH, Stunnenberg HG, Martens JHA and Kuijpers TW

    Department of Haematology, University of Cambridge, Cambridge CB2 0PT, UK; National Health Service Blood and Transplant, Cambridge Biomedical Campus, Cambridge CB2 0PT, UK.

    Neutrophils are short-lived blood cells that play a critical role in host defense against infections. To better comprehend neutrophil functions and their regulation, we provide a complete epigenetic overview, assessing important functional features of their differentiation stages from bone marrow-residing progenitors to mature circulating cells. Integration of chromatin modifications, methylation, and transcriptome dynamics reveals an enforced regulation of differentiation, for cellular functions such as release of proteases, respiratory burst, cell cycle regulation, and apoptosis. We observe an early establishment of the cytotoxic capability, while the signaling components that activate these antimicrobial mechanisms are transcribed at later stages, outside the bone marrow, thus preventing toxic effects in the bone marrow niche. Altogether, these data reveal how the developmental dynamics of the chromatin landscape orchestrate the daily production of a large number of neutrophils required for innate host defense and provide a comprehensive overview of differentiating human neutrophils.

    Cell reports 2018;24;10;2784-2794

  • De Novo Variants in the F-Box Protein FBXO11 in 20 Individuals with a Variable Neurodevelopmental Disorder.

    Gregor A, Sadleir LG, Asadollahi R, Azzarello-Burri S, Battaglia A, Ousager LB, Boonsawat P, Bruel AL, Buchert R, Calpena E, Cogné B, Dallapiccola B, Distelmaier F, Elmslie F, Faivre L, Haack TB, Harrison V, Henderson A, Hunt D, Isidor B, Joset P, Kumada S, Lachmeijer AMA, Lees M, Lynch SA, Martinez F, Matsumoto N, McDougall C, Mefford HC, Miyake N, Myers CT, Moutton S, Nesbitt A, Novelli A, Orellana C, Rauch A, Rosello M, Saida K, Santani AB, Sarkar A, Scheffer IE, Shinawi M, Steindl K, Symonds JD, Zackai EH, University of Washington Center for Mendelian Genomics, DDD Study, Reis A, Sticht H and Zweier C

    Institute of Human Genetics, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany.

    Next-generation sequencing combined with international data sharing has enormously facilitated identification of new disease-associated genes and mutations. This is particularly true for genetically extremely heterogeneous entities such as neurodevelopmental disorders (NDDs). Through exome sequencing and world-wide collaborations, we identified and assembled 20 individuals with de novo variants in FBXO11. They present with mild to severe developmental delay associated with a range of features including short (4/20) or tall (2/20) stature, obesity (5/20), microcephaly (4/19) or macrocephaly (2/19), behavioral problems (17/20), seizures (5/20), cleft lip or palate or bifid uvula (3/20), and minor skeletal anomalies. FBXO11 encodes a member of the F-Box protein family, constituting a subunit of an E3-ubiquitin ligase complex. This complex is involved in ubiquitination and proteasomal degradation and thus in controlling critical biological processes by regulating protein turnover. The identified de novo aberrations comprise two large deletions, ten likely gene disrupting variants, and eight missense variants distributed throughout FBXO11. Structural modeling for missense variants located in the CASH or the Zinc-finger UBR domains suggests destabilization of the protein. This, in combination with the observed spectrum and localization of identified variants and the lack of apparent genotype-phenotype correlations, is compatible with loss of function or haploinsufficiency as an underlying mechanism. We implicate de novo missense and likely gene disrupting variants in FBXO11 in a neurodevelopmental disorder with variable intellectual disability and various other features.

    Funded by: NHGRI NIH HHS: U54 HG006493, UM1 HG006493; NICHD NIH HHS: U54 HD087011; NINDS NIH HHS: R01 NS069605

    American journal of human genetics 2018;103;2;305-316

  • Detection and removal of barcode swapping in single-cell RNA-seq data.

    Griffiths JA, Richard AC, Bach K, Lun ATL and Marioni JC

    Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, CB2 0RE, United Kingdom.

    Barcode swapping results in the mislabelling of sequencing reads between multiplexed samples on patterned flow-cell Illumina sequencing machines. This may compromise the validity of numerous genomic assays; however, the severity and consequences of barcode swapping remain poorly understood. We have used two statistical approaches to robustly quantify the fraction of swapped reads in two plate-based single-cell RNA-sequencing datasets. We found that approximately 2.5% of reads were mislabelled between samples on the HiSeq 4000, which is lower than previous reports. We observed no correlation between the swapped fraction of reads and the concentration of free barcode across plates. Furthermore, we have demonstrated that barcode swapping may generate complex but artefactual cell libraries in droplet-based single-cell RNA-sequencing studies. To eliminate these artefacts, we have developed an algorithm to exclude individual molecules that have swapped between samples in 10x Genomics experiments, allowing the continued use of cutting-edge sequencing machines for these assays.

    Funded by: Cancer Research UK (CRUK): A17197; Wellcome Trust: 109081

    Nature communications 2018;9;1;2667

  • Using single-cell genomics to understand developmental processes and cell fate decisions.

    Griffiths JA, Scialdone A and Marioni JC

    Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK.

    High-throughput -omics techniques have revolutionised biology, allowing for thorough and unbiased characterisation of the molecular states of biological systems. However, cellular decision-making is inherently a unicellular process to which "bulk" -omics techniques are poorly suited, as they capture ensemble averages of cell states. Recently developed single-cell methods bridge this gap, allowing high-throughput molecular surveys of individual cells. In this review, we cover core concepts of analysis of single-cell gene expression data and highlight areas of developmental biology where single-cell techniques have made important contributions. These include understanding of cell-to-cell heterogeneity, the tracing of differentiation pathways, quantification of gene expression from specific alleles, and the future directions of cell lineage tracing and spatial gene expression analysis.

    Molecular systems biology 2018;14;4;e8046

  • Classification and Personalized Prognosis in Myeloproliferative Neoplasms.

    Grinfeld J, Nangalia J, Baxter EJ, Wedge DC, Angelopoulos N, Cantrill R, Godfrey AL, Papaemmanuil E, Gundem G, MacLean C, Cook J, O'Neil L, O'Meara S, Teague JW, Butler AP, Massie CE, Williams N, Nice FL, Andersen CL, Hasselbalch HC, Guglielmelli P, McMullin MF, Vannucchi AM, Harrison CN, Gerstung M, Green AR and Campbell PJ

    From the Wellcome-MRC Cambridge Stem Cell Institute and Cambridge Institute for Medical Research (J.G., C.E.M., F.L.N., A.R.G., P.J.C.), the Department of Haematology, University of Cambridge (J.G., E.J.B., C.M., J.C., C.E.M., F.L.N., A.R.G.), and the Department of Haematology, Cambridge University Hospitals NHS Foundation Trust (J.G., E.J.B., A.L.G., C.M., J.C., A.R.G.), Cambridge, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus (J.N., D.C.W., N.A., E.P., G.G., L.O., S.O., J.W.T., A.P.B., N.W., P.J.C.), and the European Molecular Biology Laboratory, European Bioinformatics Institute (R.C., M.G.), Hinxton, Big Data Institute, University of Oxford, Oxford (D.C.W.), the Department of Haematology, Queen's University Belfast, Belfast (M.F.M.), and the Department of Haematology, Guy's and St. Thomas' NHS Foundation Trust, London (C.N.H.) - all in the United Kingdom; the Center for Molecular Oncology and the Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York (E.P., G.G.); the Department of Hematology, Zealand University Hospital, Roskilde, and the University of Copenhagen, Copenhagen (C.L.A., H.C.H.); and the Department of Experimental and Clinical Medicine, Center of Research and Innovation of Myeloproliferative Neoplasms, Azienda Ospedaliera Universitaria Careggi, University of Florence, Florence, Italy (P.G., A.M.V.).

    Background: Myeloproliferative neoplasms, such as polycythemia vera, essential thrombocythemia, and myelofibrosis, are chronic hematologic cancers with varied progression rates. The genomic characterization of patients with myeloproliferative neoplasms offers the potential for personalized diagnosis, risk stratification, and treatment.

    Methods: We sequenced coding exons from 69 myeloid cancer genes in patients with myeloproliferative neoplasms, comprehensively annotating driver mutations and copy-number changes. We developed a genomic classification for myeloproliferative neoplasms and multistage prognostic models for predicting outcomes in individual patients. Classification and prognostic models were validated in an external cohort.

    Results: A total of 2035 patients were included in the analysis. A total of 33 genes had driver mutations in at least 5 patients, with mutations in JAK2, CALR, or MPL being the sole abnormality in 45% of the patients. The numbers of driver mutations increased with age and advanced disease. Driver mutations, germline polymorphisms, and demographic variables independently predicted whether patients received a diagnosis of essential thrombocythemia as compared with polycythemia vera or a diagnosis of chronic-phase disease as compared with myelofibrosis. We defined eight genomic subgroups that showed distinct clinical phenotypes, including blood counts, risk of leukemic transformation, and event-free survival. Integrating 63 clinical and genomic variables, we created prognostic models capable of generating personally tailored predictions of clinical outcomes in patients with chronic-phase myeloproliferative neoplasms and myelofibrosis. The predicted and observed outcomes correlated well in internal cross-validation of a training cohort and in an independent external cohort. Even within individual categories of existing prognostic schemas, our models substantially improved predictive accuracy.

    Conclusions: Comprehensive genomic characterization identified distinct genetic subgroups and provided a classification of myeloproliferative neoplasms on the basis of causal biologic mechanisms. Integration of genomic data with clinical variables enabled the personalized predictions of patients' outcomes and may support the treatment of patients with myeloproliferative neoplasms. (Funded by the Wellcome Trust and others.)

    Funded by: Medical Research Council: MC_PC_12009

    The New England journal of medicine 2018;379;15;1416-1430

  • FusC, a member of the M16 protease family acquired by bacteria for iron piracy against plants.

    Grinter R, Hay ID, Song J, Wang J, Teng D, Dhanesakaran V, Wilksch JJ, Davies MR, Littler D, Beckham SA, Henderson IR, Strugnell RA, Dougan G and Lithgow T

    Infection and Immunity Program, Biomedicine Discovery Institute and Department of Microbiology, Monash University, Clayton, Australia.

    Iron is essential for life. Accessing iron from the environment can be a limiting factor that determines success in a given environmental niche. For bacteria, access of chelated iron from the environment is often mediated by TonB-dependent transporters (TBDTs), which are β-barrel proteins that form sophisticated channels in the outer membrane. Reports of iron-bearing proteins being used as a source of iron indicate specific protein import reactions across the bacterial outer membrane. The molecular mechanism by which a folded protein can be imported in this way had remained mysterious, as did the evolutionary process that could lead to such a protein import pathway. How does the bacterium evolve the specificity factors that would be required to select and import a protein encoded on another organism's genome? We describe here a model whereby the plant iron-bearing protein ferredoxin can be imported across the outer membrane of the plant pathogen Pectobacterium by means of a Brownian ratchet mechanism, thereby liberating iron into the bacterium to enable its growth in plant tissues. This import pathway is facilitated by FusC, a member of the same protein family as the mitochondrial processing peptidase (MPP). The Brownian ratchet depends on binding sites discovered in crystal structures of FusC that engage a linear segment of the plant protein ferredoxin. Sequence relationships suggest that the bacterial gene encoding FusC has previously unappreciated homologues in plants and that the protein import mechanism employed by the bacterium is an evolutionary echo of the protein import pathway in plant mitochondria and plastids.

    PLoS biology 2018;16;8;e2006026

  • Cryo-EM structure of an essential Plasmodium vivax invasion complex.

    Gruszczyk J, Huang RK, Chan LJ, Menant S, Hong C, Murphy JM, Mok YF, Griffin MDW, Pearson RD, Wong W, Cowman AF, Yu Z and Tham WH

    The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia.

    Plasmodium vivax is the most widely distributed malaria parasite that infects humans<sup>1</sup>. P. vivax invades reticulocytes exclusively, and successful entry depends on specific interactions between the P. vivax reticulocyte-binding protein 2b (PvRBP2b) and transferrin receptor 1 (TfR1)<sup>2</sup>. TfR1-deficient erythroid cells are refractory to invasion by P. vivax, and anti-PvRBP2b monoclonal antibodies inhibit reticulocyte binding and block P. vivax invasion in field isolates<sup>2</sup>. Here we report a high-resolution cryo-electron microscopy structure of a ternary complex of PvRBP2b bound to human TfR1 and transferrin, at 3.7 Å resolution. Mutational analyses show that PvRBP2b residues involved in complex formation are conserved; this suggests that antigens could be designed that act across P. vivax strains. Functional analyses of TfR1 highlight how P. vivax hijacks TfR1, an essential housekeeping protein, by binding to sites that govern host specificity, without affecting its cellular function of transporting iron. Crystal and solution structures of PvRBP2b in complex with antibody fragments characterize the inhibitory epitopes. Our results establish a structural framework for understanding how P. vivax reticulocyte-binding protein engages its receptor and the molecular mechanism of inhibitory monoclonal antibodies, providing important information for the design of novel vaccine candidates.

    Funded by: Wellcome Trust: 090770, 208693

    Nature 2018;559;7712;135-139

  • Transferrin receptor 1 is a reticulocyte-specific receptor for Plasmodium vivax.

    Gruszczyk J, Kanjee U, Chan LJ, Menant S, Malleret B, Lim NTY, Schmidt CQ, Mok YF, Lin KM, Pearson RD, Rangel G, Smith BJ, Call MJ, Weekes MP, Griffin MDW, Murphy JM, Abraham J, Sriprawat K, Menezes MJ, Ferreira MU, Russell B, Renia L, Duraisingh MT and Tham WH

    The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia.

    Plasmodium vivax shows a strict host tropism for reticulocytes. We identified transferrin receptor 1 (TfR1) as the receptor for P. vivax reticulocyte-binding protein 2b (PvRBP2b). We determined the structure of the N-terminal domain of PvRBP2b involved in red blood cell binding, elucidating the molecular basis for TfR1 recognition. We validated TfR1 as the biological target of PvRBP2b engagement by means of TfR1 expression knockdown analysis. TfR1 mutant cells deficient in PvRBP2b binding were refractory to invasion of P. vivax but not to invasion of P. falciparum. Using Brazilian and Thai clinical isolates, we show that PvRBP2b monoclonal antibodies that inhibit reticulocyte binding also block P. vivax entry into reticulocytes. These data show that the TfR1-PvRBP2b invasion pathway is critical for the recognition of reticulocytes during P. vivax invasion.

    Funded by: CIHR; Howard Hughes Medical Institute; Wellcome Trust

    Science (New York, N.Y.) 2018;359;6371;48-55

  • Mutations in Vps15 perturb neuronal migration in mice and are associated with neurodevelopmental disease in humans.

    Gstrein T, Edwards A, Přistoupilová A, Leca I, Breuss M, Pilat-Carotta S, Hansen AH, Tripathy R, Traunbauer AK, Hochstoeger T, Rosoklija G, Repic M, Landler L, Stránecký V, Dürnberger G, Keane TM, Zuber J, Adams DJ, Flint J, Honzik T, Gut M, Beltran S, Mechtler K, Sherr E, Kmoch S, Gut I and Keays DA

    Institute of Molecular Pathology (IMP), Vienna Biocentre (VBC), Vienna, Austria.

    The formation of the vertebrate brain requires the generation, migration, differentiation and survival of neurons. Genetic mutations that perturb these critical cellular events can result in malformations of the telencephalon, providing a molecular window into brain development. Here we report the identification of an N-ethyl-N-nitrosourea-induced mouse mutant characterized by a fractured hippocampal pyramidal cell layer, attributable to defects in neuronal migration. We show that this is caused by a hypomorphic mutation in Vps15 that perturbs endosomal-lysosomal trafficking and autophagy, resulting in an upregulation of Nischarin, which inhibits Pak1 signaling. The complete ablation of Vps15 results in the accumulation of autophagic substrates, the induction of apoptosis and severe cortical atrophy. Finally, we report that mutations in VPS15 are associated with cortical atrophy and epilepsy in humans. These data highlight the importance of the Vps15-Vps34 complex and the Nischarin-Pak1 signaling hub in the development of the telencephalon.

    Nature neuroscience 2018

  • The opium poppy genome and morphinan production.

    Guo L, Winzer T, Yang X, Li Y, Ning Z, He Z, Teodor R, Lu Y, Bowser TA, Graham IA and Ye K

    MOE Key Lab for Intelligent Networks and Networks Security, School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049 China.

    Morphinan-based painkillers are derived from opium poppy (<i>Papaver somniferum</i> L.). We report a draft of the opium poppy genome, with 2.72 gigabases assembled into 11 chromosomes with contig N50 and scaffold N50 of 1.77 and 204 megabases, respectively. Synteny analysis suggests a whole-genome duplication at ~7.8 million years ago and ancient segmental or whole-genome duplication(s) that occurred before the Papaveraceae-Ranunculaceae divergence 110 million years ago. Syntenic blocks representative of phthalideisoquinoline and morphinan components of a benzylisoquinoline alkaloid cluster of 15 genes provide insight into how this cluster evolved. Paralog analysis identified P450 and oxidoreductase genes that combined to form the <i>STORR</i> gene fusion essential for morphinan biosynthesis in opium poppy. Thus, gene duplication, rearrangement, and fusion events have led to evolution of specialized metabolic products in opium poppy.

    Funded by: Wellcome Trust

    Science (New York, N.Y.) 2018;362;6412;343-347

  • Compensation between CSF1R+ macrophages and Foxp3+ Treg cells drives resistance to tumor immunotherapy.

    Gyori D, Lim EL, Grant FM, Spensberger D, Roychoudhuri R, Shuttleworth SJ, Okkenhaug K, Stephens LR and Hawkins PT

    Signalling ISP, Babraham Institute, Babraham Research Campus, Cambridge, Cambridgeshire, United Kingdom.

    Redundancy and compensation provide robustness to biological systems but may contribute to therapy resistance. Both tumor-associated macrophages (TAMs) and Foxp3+ regulatory T (Treg) cells promote tumor progression by limiting antitumor immunity. Here we show that genetic ablation of CSF1 in colorectal cancer cells reduces the influx of immunosuppressive CSF1R+ TAMs within tumors. This reduction in CSF1-dependent TAMs resulted in increased CD8+ T cell attack on tumors, but its effect on tumor growth was limited by a compensatory increase in Foxp3+ Treg cells. Similarly, disruption of Treg cell activity through their experimental ablation produced moderate effects on tumor growth and was associated with elevated numbers of CSF1R+ TAMs. Importantly, codepletion of CSF1R+ TAMs and Foxp3+ Treg cells resulted in an increased influx of CD8+ T cells, augmentation of their function, and a synergistic reduction in tumor growth. Further, inhibition of Treg cell activity either through systemic pharmacological blockade of PI3Kδ, or its genetic inactivation within Foxp3+ Treg cells, sensitized previously unresponsive solid tumors to CSF1R+ TAM depletion and enhanced the effect of CSF1R blockade. These findings identify CSF1R+ TAMs and PI3Kδ-driven Foxp3+ Treg cells as the dominant compensatory cellular components of the immunosuppressive tumor microenvironment, with implications for the design of combinatorial immunotherapies.

    JCI insight 2018;3;11

  • Genome-wide association study in Finnish twins highlights the connection between nicotine addiction and neurotrophin signaling pathway.

    Hällfors J, Palviainen T, Surakka I, Gupta R, Buchwald J, Raevuori A, Ripatti S, Korhonen T, Jousilahti P, Madden PAF, Kaprio J and Loukola A

    Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Finland.

    The heritability of nicotine dependence based on family studies is substantial. Nevertheless, knowledge of the underlying genetic architecture remains meager. Our aim was to identify novel genetic variants responsible for interindividual differences in smoking behavior. We performed a genome-wide association study on 1715 ever smokers ascertained from the population-based Finnish Twin Cohort enriched for heavy smoking. Data imputation used the 1000 Genomes Phase I reference panel together with a whole genome sequence-based Finnish reference panel. We analyzed three measures of nicotine addiction: smoking quantity, nicotine dependence and nicotine withdrawal. We annotated all genome-wide significant SNPs for their functional potential. First, we detected genome-wide significant association on 16p12 with smoking quantity (P = 8.5 × 10<sup>-9</sup>), near CLEC19A. The lead SNP lies 22 kb from a binding site for NF-κB transcription factors, which play a role in the neurotrophin signaling pathway. However, the signal was not replicated in an independent Finnish population-based sample, FINRISK (n = 6763). Second, nicotine withdrawal showed association on 2q21 in an intron of TMEM163 (P = 2.1 × 10<sup>-9</sup>), and on 11p15 in an intron of AP2A2 (P = 6.6 × 10<sup>-8</sup>) and for a missense variant in MUC6 (P = 4.2 × 10<sup>-7</sup>), both involved in the neurotrophin signaling pathway. Third, association was detected on 3p22.3 for maximum number of cigarettes smoked per day (P = 3.1 × 10<sup>-8</sup>) near STAC. The associated CLEC19A and TMEM163 SNPs were annotated to influence gene expression or methylation. The neurotrophin signaling pathway has previously been associated with smoking behavior. Our findings further support its role in nicotine addiction.

    Addiction biology 2018

  • Response to Giem.

    Haber M, Doumet-Serhal C, Scheib C, Xue Y, Danecek P, Mezzavilla M, Youhanna S, Martiniano R, Prado-Martinez J, Szpak M, Matisoo-Smith E, Schutkowski H, Mikulski R, Zalloua P, Kivisild T and Tyler-Smith C

    The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambs. CB10 1SA, United Kingdom.

    American journal of human genetics 2018;102;2;331

  • Gene expression variability across cells and species shapes innate immunity.

    Hagai T, Chen X, Miragaia RJ, Rostom R, Gomes T, Kunowska N, Henriksson J, Park JE, Proserpio V, Donati G, Bossini-Castillo L, Vieira Braga FA, Naamati G, Fletcher J, Stephenson E, Vegh P, Trynka G, Kondova I, Dennis M, Haniffa M, Nourmohammad A, Lässig M and Teichmann SA

    Wellcome Sanger Institute, Cambridge, UK.

    As the first line of defence against pathogens, cells mount an innate immune response, which varies widely from cell to cell. The response must be potent but carefully controlled to avoid self-damage. How these constraints have shaped the evolution of innate immunity remains poorly understood. Here we characterize the innate immune response's transcriptional divergence between species and variability in expression among cells. Using bulk and single-cell transcriptomics in fibroblasts and mononuclear phagocytes from different species, challenged with immune stimuli, we map the architecture of the innate immune response. Transcriptionally diverging genes, including those that encode cytokines and chemokines, vary across cells and have distinct promoter structures. Conversely, genes that are involved in the regulation of this response, such as those that encode transcription factors and kinases, are conserved between species and display low cell-to-cell variability in expression. We suggest that this expression pattern, which is observed across species and conditions, has evolved as a mechanism for fine-tuned regulation to achieve an effective but balanced response.

    Nature 2018;563;7730;197-202

  • Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors.

    Haghverdi L, Lun ATL, Morgan MD and Marioni JC

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK.

    Large-scale single-cell RNA sequencing (scRNA-seq) data sets that are produced in different laboratories and at different times contain batch effects that may compromise the integration and interpretation of the data. Existing scRNA-seq analysis methods incorrectly assume that the composition of cell populations is either known or identical across batches. We present a strategy for batch correction based on the detection of mutual nearest neighbors (MNNs) in the high-dimensional expression space. Our approach does not rely on predefined or equal population compositions across batches; instead, it requires only that a subset of the population be shared between batches. We demonstrate the superiority of our approach compared with existing methods by using both simulated and real scRNA-seq data sets. Using multiple droplet-based scRNA-seq data sets, we demonstrate that our MNN batch-effect-correction method can be scaled to large numbers of cells.

    Nature biotechnology 2018

  • Hepatic gene body hypermethylation is a shared epigenetic signature of murine longevity.

    Hahn O, Stubbs TM, Reik W, Grönke S, Beyer A and Partridge L

    Max Planck Institute for Biology of Ageing, Cologne, Germany.

    Dietary, pharmacological and genetic interventions can extend health- and lifespan in diverse mammalian species. DNA methylation has been implicated in mediating the beneficial effects of these interventions; methylation patterns deteriorate during ageing, and this is prevented by lifespan-extending interventions. However, whether these interventions also actively shape the epigenome, and whether such epigenetic reprogramming contributes to improved health at old age, remains underexplored. We analysed published, whole-genome, BS-seq data sets from mouse liver to explore DNA methylation patterns in aged mice in response to three lifespan-extending interventions: dietary restriction (DR), reduced TOR signaling (rapamycin), and reduced growth (Ames dwarf mice). Dwarf mice show enhanced DNA hypermethylation in the body of key genes in lipid biosynthesis, cell proliferation and somatotropic signaling, which strongly correlates with the pattern of transcriptional repression. Remarkably, DR causes a similar hypermethylation in lipid biosynthesis genes, while rapamycin treatment increases methylation signatures in genes coding for growth factor and growth hormone receptors. Shared changes of DNA methylation were restricted to hypermethylated regions, and they were not merely a consequence of slowed ageing, thus suggesting an active mechanism driving their formation. By comparing the overlap in ageing-independent hypermethylated patterns between all three interventions, we identified four regions, which, independent of genetic background or gender, may serve as novel biomarkers for longevity-extending interventions. In summary, we identified gene body hypermethylation as a novel and partly conserved signature of lifespan-extending interventions in mouse, highlighting epigenetic reprogramming as a possible intervention to improve health at old age.

    PLoS genetics 2018;14;11;e1007766

  • Tissue-specific transcriptome analyses provide new insights into GPCR signalling in adult Schistosoma mansoni.

    Hahnel S, Wheeler N, Lu Z, Wangwiwatsin A, McVeigh P, Maule A, Berriman M, Day T, Ribeiro P and Grevelding CG

    Institute of Parasitology, BFS, Justus Liebig University, Giessen, Germany.

    Schistosomes are blood-dwelling trematodes with global impact on human and animal health. Because medical treatment is currently based on a single drug, praziquantel, there is urgent need for the development of alternative control strategies. The Schistosoma mansoni genome project provides a platform to study and connect the genetic repertoire of schistosomes to specific biological functions essential for successful parasitism. G protein-coupled receptors (GPCRs) form the largest superfamily of transmembrane receptors throughout the Eumetazoan phyla, including platyhelminths. Due to their involvement in diverse biological processes, their pharmacological importance, and proven druggability, GPCRs are promising targets for new anthelmintics. However, to identify candidate receptors, a more detailed understanding of the roles of GPCR signalling in schistosome biology is essential. An updated phylogenetic analysis of the S. mansoni GPCR genome (GPCRome) is presented, facilitated by updated genome data that allowed a more precise annotation of GPCRs. Additionally, we review the current knowledge on GPCR signalling in this parasite and provide new insights into the potential roles of GPCRs in schistosome reproduction based on the findings of a recent tissue-specific transcriptomic study in paired and unpaired S. mansoni. According to the current analysis, GPCRs contribute to gonad-specific functions but also to nongonad, pairing-dependent processes. The latter may regulate gonad-unrelated functions during the multifaceted male-female interaction. Finally, we compare the schistosome GPCRome to that of another parasitic trematode, Fasciola, and discuss the importance of GPCRs to basic and applied research. Phylogenetic analyses display GPCR diversity in free-living and parasitic platyhelminths and suggest diverse functions in schistosomes. Although their roles need to be substantiated by functional studies in the future, the data support the selection of GPCR candidates for basic and applied studies, invigorating the exploitation of this important receptor class for drug discovery against schistosomes but also other trematodes.

    PLoS pathogens 2018;14;1;e1006718

  • Tissue-Restricted Adaptive Type 2 Immunity Is Orchestrated by Expression of the Costimulatory Molecule OX40L on Group 2 Innate Lymphoid Cells.

    Halim TYF, Rana BMJ, Walker JA, Kerscher B, Knolle MD, Jolin HE, Serrao EM, Haim-Vilmovsky L, Teichmann SA, Rodewald HR, Botto M, Vyse TJ, Fallon PG, Li Z, Withers DR and McKenzie ANJ

    MRC Laboratory of Molecular Biology, Cambridge CB2 0QH, UK; University of Cambridge, CRUK Cambridge Institute, Cambridge CB2 0RE, UK.

    The local regulation of type 2 immunity relies on dialog between the epithelium and the innate and adaptive immune cells. Here we found that alarmin-induced expression of the co-stimulatory molecule OX40L on group 2 innate lymphoid cells (ILC2s) provided tissue-restricted T cell co-stimulation that was indispensable for Th2 and regulatory T (Treg) cell responses in the lung and adipose tissue. Interleukin (IL)-33 administration resulted in organ-specific surface expression of OX40L on ILC2s and the concomitant expansion of Th2 and Treg cells, which was abolished upon deletion of OX40L on ILC2s (Il7ra<sup>Cre/+</sup>Tnfsf4<sup>fl/fl</sup> mice). Moreover, Il7ra<sup>Cre/+</sup>Tnfsf4<sup>fl/fl</sup> mice failed to mount effective Th2 and Treg cell responses and corresponding adaptive type 2 pulmonary inflammation arising from Nippostrongylus brasiliensis infection or allergen exposure. Thus, the increased expression of OX40L in response to IL-33 acts as a licensing signal in the orchestration of tissue-specific adaptive type 2 immunity, without which this response fails to establish.

    Immunity 2018;48;6;1195-1207.e6

  • Heterozygous mutations affecting the protein kinase domain of <i>CDK13</i> cause a syndromic form of developmental delay and intellectual disability.

    Hamilton MJ, Caswell RC, Canham N, Cole T, Firth HV, Foulds N, Heimdal K, Hobson E, Houge G, Joss S, Kumar D, Lampe AK, Maystadt I, McKay V, Metcalfe K, Newbury-Ecob R, Park SM, Robert L, Rustad CF, Wakeling E, Wilkie AOM, The DDD Study, Twigg SRF and Suri M

    West of Scotland Genetics Service, Queen Elizabeth University Hospital, Glasgow, UK.

    Introduction: Recent evidence has emerged linking mutations in <i>CDK13</i> to syndromic congenital heart disease. We present here genetic and phenotypic data pertaining to 16 individuals with <i>CDK13</i> mutations.

    Methods: Patients were investigated by exome sequencing, having presented with developmental delay and additional features suggestive of a syndromic cause.

    Results: Our cohort comprised 16 individuals aged 4-16 years. All had developmental delay, including six with autism spectrum disorder. Common findings included feeding difficulties (15/16), structural cardiac anomalies (9/16), seizures (4/16) and abnormalities of the corpus callosum (4/11 patients who had undergone MRI). All had craniofacial dysmorphism, with common features including short, upslanting palpebral fissures, hypertelorism or telecanthus, medial epicanthic folds, low-set, posteriorly rotated ears and a small mouth with thin upper lip vermilion. Fifteen patients had predicted missense mutations, including five identical p.(Asn842Ser) substitutions and two p.(Gly717Arg) substitutions. One patient had a canonical splice acceptor site variant (c.2898-1G>A). All mutations were located within the protein kinase domain of CDK13. The affected amino acids are highly conserved, and in silico analyses including comparative protein modelling predict that they will interfere with protein function. The location of the missense mutations in a key catalytic domain suggests that they are likely to cause loss of catalytic activity but retention of cyclin K binding, resulting in a dominant negative mode of action. Although the splice-site mutation was predicted to produce a stable internally deleted protein, this was not supported by expression studies in lymphoblastoid cells. A loss of function contribution to the underlying pathological mechanism therefore cannot be excluded, and the clinical significance of this variant remains uncertain.

    Conclusions: These patients demonstrate that heterozygous, likely dominant negative mutations affecting the protein kinase domain of the <i>CDK13</i> gene result in a recognisable, syndromic form of intellectual disability, with or without congenital heart disease.

    Funded by: Wellcome Trust

    Journal of medical genetics 2018;55;1;28-38

  • The widespread use of topical antimicrobials enriches for resistance in Staphylococcus aureus isolated from Atopic Dermatitis patients.

    Harkins CP, McAleer MA, Bennett D, McHugh M, Fleury OM, Pettigrew KA, Oravcová K, Parkhill J, Proby CM, Dawe RS, Geoghegan JA, Irvine AD and Holden MTG

    School of Medicine, University of St Andrews, St Andrews, KY11 9TF, UK.

    Background: Carriage rates of Staphylococcus aureus on affected skin in atopic dermatitis (AD) are approximately 70%. Increasing disease severity during flares and overall disease severity correlate with increased burden of S. aureus. Treatment in AD therefore often targets S. aureus, with topical and systemic antimicrobials.

    Objectives: To determine if antimicrobial sensitivities and genetic determinants of resistance differed in S. aureus isolates from the skin of children with AD compared with healthy child nasal carriers.

    Methods: In this case-control study, we compared S. aureus isolates from children with AD (n=50) attending a hospital dermatology department to nasal carriage isolates from children without skin disease (n=49) attending a hospital emergency department for non-infective conditions. Using whole genome sequencing we generated a phylogenetic framework for the isolates based on variation in the core genome, then compared antimicrobial resistance phenotype and genotypes between disease groups.

    Results and conclusions: S. aureus from cases and controls had on average similar numbers of phenotypic resistances per isolate. Case isolates differed in their resistance patterns, with fusidic acid resistance (Fus<sup>R</sup>) being significantly more frequent in AD (p=0.009). The genetic basis of Fus<sup>R</sup> also differentiated the populations, with chromosomal mutations in fusA predominating in AD (p=0.049). Analysis revealed that Fus<sup>R</sup> evolved multiple times and via multiple mechanisms in the population. Carriage of plasmid-derived qac genes, which have been associated with reduced susceptibility to antiseptics, was 8 times more frequent in AD (p=0.016). The results suggest strong selective pressure drives the emergence and maintenance of specific resistances in AD.

    The British journal of dermatology 2018

  • Public health surveillance of multidrug-resistant clones of Neisseria gonorrhoeae in Europe: a genomic survey.

    Harris SR, Cole MJ, Spiteri G, Sánchez-Busó L, Golparian D, Jacobsson S, Goater R, Abudahab K, Yeats CA, Bercot B, Borrego MJ, Crowley B, Stefanelli P, Tripodo F, Abad R, Aanensen DM, Unemo M and Euro-GASP study group

    Infection Genomics, Wellcome Sanger Institute, Hinxton, UK.

    Background: Traditional methods for molecular epidemiology of Neisseria gonorrhoeae are suboptimal. Whole-genome sequencing (WGS) offers ideal resolution to describe population dynamics and to predict and infer transmission of antimicrobial resistance, and can enhance infection control through linkage with epidemiological data. We used WGS, in conjunction with linked epidemiological and phenotypic data, to describe the gonococcal population in 20 European countries. We aimed to detail changes in phenotypic antimicrobial resistance levels (and the reasons for these changes) and strain distribution (with a focus on antimicrobial resistance strains in risk groups), and to predict antimicrobial resistance from WGS data.

    Methods: We carried out an observational study, in which we sequenced isolates taken from patients with gonorrhoea from the European Gonococcal Antimicrobial Surveillance Programme in 20 countries from September to November, 2013. We also developed a web platform that we used for automated antimicrobial resistance prediction, molecular typing (N gonorrhoeae multi-antigen sequence typing [NG-MAST] and multilocus sequence typing), and phylogenetic clustering in conjunction with epidemiological and phenotypic data.

    Findings: The multidrug-resistant NG-MAST genogroup G1407 was predominant and accounted for the most cephalosporin resistance, but the prevalence of this genogroup decreased from 248 (23%) of 1066 isolates in a previous study from 2009-10 to 174 (17%) of 1054 isolates in this survey in 2013. This genogroup previously showed an association with men who have sex with men, but changed to an association with heterosexual people (odds ratio=4·29). WGS provided substantially improved resolution and accuracy over NG-MAST and multilocus sequence typing, predicted antimicrobial resistance relatively well, and identified discrepant isolates, mixed infections or contaminants, and multidrug-resistant clades linked to risk groups.

    Interpretation: To our knowledge, we provide the first use of joint analysis of WGS and epidemiological data in an international programme for regional surveillance of sexually transmitted infections. WGS provided enhanced understanding of the distribution of antimicrobial resistance clones, including replacement with clones that were more susceptible to antimicrobials, in several risk groups nationally and regionally. We provide a framework for genomic surveillance of gonococci through standardised sampling, use of WGS, and a shared information architecture for interpretation and dissemination by use of open access software.

    Funding: The European Centre for Disease Prevention and Control, The Centre for Genomic Pathogen Surveillance, Örebro University Hospital, and Wellcome.

    Funded by: Wellcome Trust: 098051, 099202

    The Lancet. Infectious diseases 2018;18;7;758-768

  • Genome-wide association study of developmental dysplasia of the hip identifies an association with <i>GDF5</i>.

    Hatzikotoulas K, Roposch A, DDH Case Control Consortium, Shah KM, Clark MJ, Bratherton S, Limbani V, Steinberg J, Zengini E, Warsame K, Ratnayake M, Tselepi M, Schwartzentruber J, Loughlin J, Eastwood DM, Zeggini E and Wilkinson JM

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Morgan Building, Hinxton, Cambridge, CB10 1HH UK.

    Developmental dysplasia of the hip (DDH) is the most common skeletal developmental disease. However, its genetic architecture is poorly understood. We conduct the largest DDH genome-wide association study to date and replicate our findings in independent cohorts. We find the heritable component of DDH attributable to common genetic variants to be 55% and distributed equally across the autosomal and X-chromosomes. We identify replicating evidence for association between <i>GDF5</i> promoter variation and DDH (rs143384, effect allele A, odds ratio 1.44, 95% confidence interval 1.34-1.56, <i>P</i> = 3.55 × 10<sup>-22</sup>). Gene-based analysis implicates <i>GDF5</i> (<i>P</i> = 9.24 × 10<sup>-12</sup>), <i>UQCC1</i> (<i>P</i> = 1.86 × 10<sup>-10</sup>), <i>MMP24</i> (<i>P</i> = 3.18 × 10<sup>-9</sup>), <i>RETSAT</i> (<i>P</i> = 3.70 × 10<sup>-8</sup>) and <i>PDRG1</i> (<i>P</i> = 1.06 × 10<sup>-7</sup>) in DDH susceptibility. We find shared genetic architecture between DDH and hip osteoarthritis, but no predictive power of osteoarthritis polygenic risk score on DDH status, underscoring the complex nature of the two traits. We report a scalable, time-efficient recruitment strategy and establish for the first time to our knowledge a robust DDH genetic association locus at <i>GDF5</i>.

    Funded by: Wellcome Trust

    Communications biology 2018;1;56

  • The return of Pfeiffer's bacillus: Rising incidence of ampicillin resistance in Haemophilus influenzae.

    Heinz E

    Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK.

    Haemophilus influenzae, originally named Pfeiffer's bacillus after its discoverer Richard Pfeiffer in 1892, was a major risk for global health at the beginning of the 20th century, causing childhood pneumonia and invasive disease as well as otitis media and other upper respiratory tract infections. The implementation of the Hib vaccine, targeting the major capsule type of H. influenzae, almost eradicated the disease in countries that adopted the vaccination scheme. However, a rising number of infections are caused by non-typeable H. influenzae (NTHi), which has no capsule and against which the vaccine therefore provides no protection, as well as by other serotypes equally not recognised by the vaccine. The first line of treatment is ampicillin, but there is a steady rise in ampicillin resistance. Resistance arises through both acquired and intrinsic mechanisms, and is cause for serious concern and a reason for more surveillance. There are also increasing reports of new modifications of the intrinsic ampicillin-resistance mechanism leading to resistance against cephalosporins and carbapenems, the last line of well-tolerated drugs, and ampicillin-resistant H. influenzae was included in the recently released priority list of antibiotic-resistant bacteria by the WHO. This review provides an overview of ampicillin resistance prevalence and mechanisms in the context of our current knowledge about population dynamics of H. influenzae.

    Microbial genomics 2018

  • De Novo Pathogenic Variants in CACNA1E Cause Developmental and Epileptic Encephalopathy with Contractures, Macrocephaly, and Dyskinesias.

    Helbig KL, Lauerer RJ, Bahr JC, Souza IA, Myers CT, Uysal B, Schwarz N, Gandini MA, Huang S, Keren B, Mignot C, Afenjar A, Billette de Villemeur T, Héron D, Nava C, Valence S, Buratti J, Fagerberg CR, Soerensen KP, Kibaek M, Kamsteeg EJ, Koolen DA, Gunning B, Schelhaas HJ, Kruer MC, Fox J, Bakhtiari S, Jarrar R, Padilla-Lopez S, Lindstrom K, Jin SC, Zeng X, Bilguvar K, Papavasileiou A, Xing Q, Zhu C, Boysen K, Vairo F, Lanpher BC, Klee EW, Tillema JM, Payne ET, Cousin MA, Kruisselbrink TM, Wick MJ, Baker J, Haan E, Smith N, Sadeghpour A, Davis EE, Katsanis N, Task Force for Neonatal Genomics, Corbett MA, MacLennan AH, Gecz J, Biskup S, Goldmann E, Rodan LH, Kichula E, Segal E, Jackson KE, Asamoah A, Dimmock D, McCarrier J, Botto LD, Filloux F, Tvrdik T, Cascino GD, Klingerman S, Neumann C, Wang R, Jacobsen JC, Nolan MA, Snell RG, Lehnert K, Sadleir LG, Anderlid BM, Kvarnung M, Guerrini R, Friez MJ, Lyons MJ, Leonhard J, Kringlen G, Casas K, El Achkar CM, Smith LA, Rotenberg A, Poduri A, Sanchis-Juan A, Carss KJ, Rankin J, Zeman A, Raymond FL, Blyth M, Kerr B, Ruiz K, Urquhart J, Hughes I, Banka S, Deciphering Developmental Disorders Study, Hedrich UBS, Scheffer IE, Helbig I, Zamponi GW, Lerche H and Mefford HC

    Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.

    Developmental and epileptic encephalopathies (DEEs) are severe neurodevelopmental disorders often beginning in infancy or early childhood that are characterized by intractable seizures, abundant epileptiform activity on EEG, and developmental impairment or regression. CACNA1E is highly expressed in the central nervous system and encodes the α<sub>1</sub>-subunit of the voltage-gated Ca<sub>V</sub>2.3 channel, which conducts high voltage-activated R-type calcium currents that initiate synaptic transmission. Using next-generation sequencing techniques, we identified de novo CACNA1E variants in 30 individuals with DEE, characterized by refractory infantile-onset seizures, severe hypotonia, and profound developmental impairment, often with congenital contractures, macrocephaly, hyperkinetic movement disorders, and early death. Most of the 14, partially recurring, variants cluster within the cytoplasmic ends of all four S6 segments, which form the presumed Ca<sub>V</sub>2.3 channel activation gate. Functional analysis of several S6 variants revealed consistent gain-of-function effects comprising facilitated voltage-dependent activation and slowed inactivation. Another variant located in the domain II S4-S5 linker results in facilitated activation and increased current density. Five participants achieved seizure freedom on the anti-epileptic drug topiramate, which blocks R-type calcium channels. We establish pathogenic variants in CACNA1E as a cause of DEEs and suggest facilitated R-type calcium currents as a disease mechanism for human epilepsy and developmental disorders.

    Funded by: NINDS NIH HHS: R01 NS069605, R56 NS069605; Wellcome Trust

    American journal of human genetics 2018;103;5;666-678

  • Refining the phenotype associated with GNB1 mutations: Clinical data on 18 newly identified patients and review of the literature.

    Hemati P, Revah-Politi A, Bassan H, Petrovski S, Bilancia CG, Ramsey K, Griffin NG, Bier L, Cho MT, Rosello M, Lynch SA, Colombo S, Weber A, Haug M, Heinzen EL, Sands TT, Narayanan V, Primiano M, Aggarwal VS, Millan F, Sattler-Holtrop SG, Caro-Llopis A, Pillar N, Baker J, Freedman R, Kroes HY, Sacharow S, Stong N, Lapunzina P, Schneider MC, Mendelsohn NJ, Singleton A, Loik Ramey V, Wou K, Kuzminsky A, Monfort S, Weiss M, Doyle S, Iglesias A, Martinez F, Mckenzie F, Orellana C, van Gassen KLI, Palomares M, Bazak L, Lee A, Bircher A, Basel-Vanagaite L, Hafström M, Houge G, C4RCD Research Group, DDD study, Goldstein DB and Anyane-Yeboa K

    Institute for Genomic Medicine, Columbia University Medical Center, New York, New York.

    De novo germline mutations in GNB1 have been associated with a neurodevelopmental phenotype. To date, 28 patients with variants classified as pathogenic have been reported. We add 18 patients with de novo mutations to this cohort, including a patient with mosaicism for a GNB1 mutation who presented with a milder phenotype. Consistent with previous reports, developmental delay in these patients was moderate to severe, and more than half of the patients were non-ambulatory and nonverbal. The most observed substitution affects the p.Ile80 residue encoded in exon 6, with 28% of patients carrying a variant at this residue. Dystonia and growth delay were observed more frequently in patients carrying variants in this residue, suggesting a potential genotype-phenotype correlation. In the new cohort of 18 patients, 50% of males had genitourinary anomalies and 61% of patients had gastrointestinal anomalies, suggesting a possible association of these findings with variants in GNB1. In addition, cutaneous mastocytosis, reported once before in a patient with a GNB1 variant, was observed in three additional patients, providing further evidence for an association to GNB1. We will review clinical and molecular data of these new cases and all previously reported cases to further define the phenotype and establish possible genotype-phenotype correlations.

    American journal of medical genetics. Part A 2018;176;11;2259-2275

  • Single-cell genomics.

    Hemberg M

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    Briefings in functional genomics 2018;17;4;207-208

  • ProxECAT: Proxy External Controls Association Test. A new case-control gene region association test using allele frequencies from public controls.

    Hendricks AE, Billups SC, Pike HNC, Farooqi IS, Zeggini E, Santorico SA, Barroso I and Dupuis J

    Mathematical and Statistical Sciences Department, University of Colorado Denver, Denver, CO, United States of America.

    A primary goal of the recent investment in sequencing is to detect novel genetic associations in health and disease improving the development of treatments and playing a critical role in precision medicine. While this investment has resulted in an enormous total number of sequenced genomes, individual studies of complex traits and diseases are often smaller and underpowered to detect rare variant genetic associations. Existing genetic resources such as the Exome Aggregation Consortium (>60,000 exomes) and the Genome Aggregation Database (~140,000 sequenced samples) have the potential to be used as controls in these studies. Fully utilizing these and other existing sequencing resources may increase power and could be especially useful in studies where resources to sequence additional samples are limited. However, to date, these large, publicly available genetic resources remain underutilized, or even misused, in large part due to the lack of statistical methods that can appropriately use this summary level data. Here, we present a new method to incorporate external controls in case-control analysis called ProxECAT (Proxy External Controls Association Test). ProxECAT estimates enrichment of rare variants within a gene region using internally sequenced cases and external controls. We evaluated ProxECAT in simulations and empirical analyses of obesity cases using both low-depth of coverage (7x) whole-genome sequenced controls and ExAC as controls. We find that ProxECAT maintains the expected type I error rate with increased power as the number of external controls increases. With an accompanying R package, ProxECAT enables the use of publicly available allele frequencies as external controls in case-control analysis.

    Funded by: NIDDK NIH HHS: U01 DK078616; Wellcome Trust: WT098051, WT206194

    PLoS genetics 2018;14;10;e1007591

  • Single-cell transcriptional analysis reveals ILC-like cells in zebrafish.

    Hernández PP, Strzelecka PM, Athanasiadis EI, Hall D, Robalo AF, Collins CM, Boudinot P, Levraud JP and Cvejic A

    Macrophages et Développement de l'Immunité, Institut Pasteur, Paris, France.

    Innate lymphoid cells (ILCs) are important mediators of the immune response and homeostasis in barrier tissues of mammals. However, the existence and function of ILCs in other vertebrates are poorly understood. Here, we use single-cell RNA sequencing to generate a comprehensive atlas of zebrafish lymphocytes during tissue homeostasis and after immune challenge. We profiled 14,080 individual cells from the gut of wild-type zebrafish, as well as of <i>rag1</i>-deficient zebrafish that lack T and B cells, and discovered populations of ILC-like cells. We uncovered a <i>rorc</i>-positive subset of ILCs that could express cytokines associated with type 1, 2, and 3 responses upon immune challenge. Specifically, these ILC-like cells expressed <i>il22</i> and <i>tnfa</i> after exposure to inactivated bacteria or <i>il13</i> after exposure to helminth extract. Cytokine-producing ILC-like cells express a specific repertoire of novel immune-type receptors, likely involved in recognition of environmental cues. We identified additional novel markers of zebrafish ILCs and generated a cloud repository for their in-depth exploration.

    Science immunology 2018;3;29

  • Long- and short-term outcomes in renal allografts with deceased donors: A large recipient and donor genome-wide association study.

    Hernandez-Fuentes MP, Franklin C, Rebollo-Mesa I, Mollon J, Delaney F, Perucha E, Stapleton C, Borrows R, Byrne C, Cavalleri G, Clarke B, Clatworthy M, Feehally J, Fuggle S, Gagliano SA, Griffin S, Hammad A, Higgins R, Jardine A, Keogan M, Leach T, MacPhee I, Mark PB, Marsh J, Maxwell P, McKane W, McLean A, Newstead C, Augustine T, Phelan P, Powis S, Rowe P, Sheerin N, Solomon E, Stephens H, Thuraisingham R, Trembath R, Topham P, Vaughan R, Sacks SH, Conlon P, Opelz G, Soranzo N, Weale ME, Lord GM and United Kingdom and Ireland Renal Transplant Consortium (UKIRTC) and the Wellcome Trust Case Control Consortium (WTCCC)-3

    King's College London, MRC Centre for Transplantation, London, UK.

    Improvements in immunosuppression have modified short-term survival of deceased-donor allografts, but not their rate of long-term failure. Mismatches between donor and recipient HLA play an important role in the acute and chronic allogeneic immune response against the graft. Perfect matching at clinically relevant HLA loci does not obviate the need for immunosuppression, suggesting that additional genetic variation plays a critical role in both short- and long-term graft outcomes. By combining patient data and samples from supranational cohorts across the United Kingdom and European Union, we performed the first large-scale genome-wide association study analyzing both donor and recipient DNA in 2094 complete renal transplant-pairs with replication in 5866 complete pairs. We studied deceased-donor grafts allocated on the basis of preferential HLA matching, which provided some control for HLA genetic effects. No strong donor or recipient genetic effects contributing to long- or short-term allograft survival were found outside the HLA region. We discuss the implications for future research and clinical application.

    American journal of transplantation : official journal of the American Society of Transplantation and the American Society of Transplant Surgeons 2018

  • Improving communication for interdisciplinary teams working on storage of digital information in DNA.

    Hesketh EE, Sayir J and Goldman N

    Wellcome Sanger Institute, Hinxton, CB10 1SA, UK.

    Close collaboration between specialists from diverse backgrounds and working in different scientific domains is an effective strategy to overcome challenges in areas that interface between biology, chemistry, physics and engineering. Communication in such collaborations can itself be challenging. Even when projects are successfully concluded, resulting publications - necessarily multi-authored - have the potential to be disjointed. Few, both in the field and outside, may be able to fully understand the work as a whole. This needs to be addressed to facilitate efficient working, peer review, accessibility and impact to larger audiences. We are an interdisciplinary team working in a nascent scientific area, the repurposing of DNA as a storage medium for digital information. In this note, we highlight some of the difficulties that arise from such collaborations and outline our efforts to improve communication through a glossary and a controlled vocabulary and accessibility via short plain-language summaries. We hope to stimulate early discussion within this emerging field of how our community might improve the description and presentation of our work to facilitate clear communication within and between research groups and increase accessibility to those not familiar with our respective fields - be it molecular biology, computer science, information theory or others that might become relevant in future. To enable an open and inclusive discussion we have created a glossary and controlled vocabulary as a cloud-based shared document and we invite other scientists to critique our suggestions and contribute their own ideas.

    F1000Research 2018;7;39

  • The contribution of CACNA1A, ATP1A2 and SCN1A mutations in hemiplegic migraine: A clinical and genetic study in Finnish migraine families.

    Hiekkala ME, Vuola P, Artto V, Häppölä P, Häppölä E, Vepsäläinen S, Cuenca-León E, Lal D, Gormley P, Hämäläinen E, Ilmavirta M, Nissilä M, Säkö E, Sumelahti ML, Harno H, Havanka H, Keski-Säntti P, Färkkilä M, Palotie A, Wessman M, Kaunisto MA and Kallela M

    Institute of Genetics, Folkhälsan Research Center, Helsinki, Finland.

    Objective: To study the position of hemiplegic migraine in the clinical spectrum of migraine with aura and to reveal the importance of CACNA1A, ATP1A2 and SCN1A in the development of hemiplegic migraine in Finnish migraine families.

    Methods: The International Classification of Headache Disorders 3rd edition criteria were used to determine clinical characteristics and occurrence of hemiplegic migraine, based on detailed questionnaires, in a Finnish migraine family collection consisting of 9087 subjects. Involvement of CACNA1A, ATP1A2 and SCN1A was studied using whole exome sequencing data from 293 patients with hemiplegic migraine.

    Results: Overall, hemiplegic migraine patients reported clinically more severe headache and aura episodes than non-hemiplegic migraine with aura patients. We identified two mutations, c.1816G>A (p.Ala606Thr) and c.1148G>A (p.Arg383His), in ATP1A2 and one mutation, c.1994C>T (p.Thr665Met) in CACNA1A.

    Conclusions: The results highlight hemiplegic migraine as a clinically and genetically heterogeneous disease. Hemiplegic migraine patients do not form a clearly separate group with distinct symptoms, but rather have an extreme phenotype in the migraine with aura continuum. We have shown that mutations in CACNA1A, ATP1A2 and SCN1A are not the major cause of the disease in Finnish hemiplegic migraine patients, suggesting that there are additional genetic factors contributing to the phenotype.

    Cephalalgia : an international journal of headache 2018;38;12;1849-1863

  • Dual-stressor selection alters eco-evolutionary dynamics in experimental communities.

    Hiltunen T, Cairns J, Frickel J, Jalasvuori M, Laakso J, Kaitala V, Künzel S, Karakoc E and Becks L

    Department of Microbiology, University of Helsinki, Helsinki, Finland.

    Recognizing when and how rapid evolution drives ecological change is fundamental for our understanding of almost all ecological and evolutionary processes such as community assembly, genetic diversification and the stability of communities and ecosystems. Generally, rapid evolutionary change is driven through selection on genetic variation and is affected by evolutionary constraints, such as tradeoffs and pleiotropic effects, all contributing to the overall rate of evolutionary change. Each of these processes can be influenced by the presence of multiple environmental stressors reducing a population's reproductive output. Potential consequences of multistressor selection for the occurrence and strength of the link from rapid evolution to ecological change are unclear. However, understanding these is necessary for predicting when rapid evolution might drive ecological change. Here we investigate how the presence of two stressors affects this link using experimental evolution with the bacterium Pseudomonas fluorescens and its predator Tetrahymena thermophila. We show that the combination of predation and sublethal antibiotic concentrations delays the evolution of anti-predator defence and antibiotic resistance compared with the presence of only one of the two stressors. Rapid defence evolution drives stabilization of the predator-prey dynamics but this link between evolution and ecology is weaker in the two-stressor environment, where defence evolution is slower, leading to less stable population dynamics. Tracking the molecular evolution of whole populations over time shows further that mutations in different genes are favoured under multistressor selection. Overall, we show that selection by multiple stressors can significantly alter eco-evolutionary dynamics and their predictability.

    Nature ecology & evolution 2018;2;12;1974-1981

  • Integrative Molecular Characterization of Malignant Pleural Mesothelioma.

    Hmeljak J, Sanchez-Vega F, Hoadley KA, Shih J, Stewart C, Heiman DI, Tarpey P, Danilova L, Drill E, Gibb EA, Bowlby R, Kanchi R, Osmanbeyoglu HU, Sekido Y, Takeshita J, Newton Y, Graim K, Gupta M, Gay CM, Diao L, Gibbs DL, Thorsson V, Iype L, Kantheti HS, Severson DT, Ravegnini G, Desmeules P, Jungbluth AA, Travis WD, Dacic S, Chirieac LR, Galateau-Salle F, Fujimoto J, Husain AN, Silveira HC, Rusch VW, Rintoul RC, Pass H, Kindler H, Zauderer MG, Kwiatkowski DJ, Bueno R, Tsao AS, Creaney J, Lichtenberg T, Leraas K, Bowen J, TCGA Research Network, Felau I, Zenklusen JC, Akbani R, Cherniack AD, Byers LA, Noble MS, Fletcher JA, Robertson G, Shen R, Aburatani H, Robinson BW, Campbell P and Ladanyi M


    Malignant pleural mesothelioma (MPM) is a highly lethal cancer of the lining of the chest cavity. To expand our understanding of MPM, we conducted a comprehensive integrated genomic study, including the most detailed analysis of BAP1 alterations to date. We identified histology-independent molecular prognostic subsets, and defined a novel genomic subtype with TP53 and SETDB1 mutations and extensive loss of heterozygosity. We also report strong expression of the immune checkpoint gene VISTA in epithelioid MPM, strikingly higher than in other solid cancers, with implications for the immune response to MPM and for its immunotherapy. Our findings highlight new avenues for further investigation of MPM biology and novel therapeutic options.

    Cancer discovery 2018

  • Congenital macrothrombocytopenia with focal myelofibrosis due to mutations in human G6b-B is rescued in humanized mice.

    Hofmann I, Geer MJ, Vögtle T, Crispin A, Campagna DR, Barr A, Calicchio ML, Heising S, van Geffen JP, Kuijpers MJE, Heemskerk JWM, Eble JA, Schmitz-Abe K, Obeng EA, Douglas M, Freson K, Pondarré C, Favier R, Jarvis GE, Markianos K, Turro E, Ouwehand WH, Mazharian A, Fleming MD and Senis YA

    Division of Hematology, Oncology, Bone Marrow Transplantation, Department of Pediatrics, University of Wisconsin, Madison, WI, United States.

    Unlike primary myelofibrosis (PMF) in adults, myelofibrosis in children is rare. Congenital (inherited) forms of myelofibrosis (cMF) have been described, but the underlying genetic mechanisms remain elusive. Here we describe 4 families with autosomal recessive inherited macrothrombocytopenia with focal myelofibrosis due to germline loss-of-function mutations in the megakaryocyte-specific immunoreceptor tyrosine-based inhibitory motif (ITIM)-containing receptor G6b-B (G6b, C6orf25 or MPIG6B). Patients presented with a mild-to-moderate bleeding diathesis, macrothrombocytopenia, anemia, leukocytosis and atypical megakaryocytes associated with a distinctive, focal, perimegakaryocytic pattern of bone marrow fibrosis. In addition to identifying the responsible gene, the description of G6b-B as the mutated protein potentially implicates aberrant G6b-B megakaryocytic signaling and activation in the pathogenesis of myelofibrosis. Targeted insertion of human G6b in mice rescued the knockout phenotype and a copy number effect of human G6b-B expression was observed. Homozygous knockin mice expressed 25% of human G6b-B and exhibited a marginal reduction in platelet count and mild alterations in platelet function; these phenotypes were more severe in heterozygous mice that expressed only 12% of human G6b-B. This study establishes G6b-B as a critical regulator of platelet homeostasis in humans and mice. In addition, the humanized G6b mouse will provide an invaluable tool for further investigating the physiological functions of human G6b-B as well as testing the efficacy of drugs targeting this receptor.

    Blood 2018

  • SLING: a tool to search for linked genes in bacterial datasets.

    Horesh G, Harms A, Fino C, Parts L, Gerdes K, Heinz E and Thomson NR

    Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, United Kingdom.

    Gene arrays and operons that encode functionally linked proteins form the most basic unit of transcriptional regulation in bacteria. Rules that govern the order and orientation of genes in these systems have been defined; however, these were based on a small set of genomes that may not be representative. The growing availability of large genomic datasets presents an opportunity to test these rules, to define the full range and diversity of these systems, and to understand their evolution. Here we present SLING, a tool to Search for LINked Genes by searching for a single functionally essential gene, along with its neighbours in a rule-defined proximity. Examining this subset of genes enables us to understand the basic diversity of these genetic systems in large datasets. We demonstrate the utility of SLING on a clinical collection of enteropathogenic Escherichia coli for two relevant operons: toxin antitoxin (TA) systems and RND efflux pumps. By examining the diversity of these systems, we gain insight on distinct classes of operons which present variable levels of prevalence and ability to be lost or gained. The importance of this analysis is not limited to TA systems and RND pumps, and can be expanded to understand the diversity of many other relevant gene arrays.

    Nucleic acids research 2018;46;21;e128

  • BACTOME-a reference database to explore the sequence- and gene expression-variation landscape of Pseudomonas aeruginosa clinical isolates.

    Hornischer K, Khaledi A, Pohl S, Schniederjans M, Pezoldt L, Casilag F, Muthukumarasamy U, Bruchmann S, Thöming J, Kordes A and Häussler S

    Institute of Molecular Bacteriology, Helmholtz Centre for Infection Research, D-38124 Braunschweig, Germany.

    Extensive use of next-generation sequencing (NGS) for pathogen profiling has the potential to transform our understanding of how genomic plasticity contributes to phenotypic versatility. However, the storage of large amounts of NGS data and visualization tools need to evolve to offer the scientific community fast and convenient access to these data. We introduce BACTOME as a database system that links aligned DNA- and RNA-sequencing reads of clinical Pseudomonas aeruginosa isolates with clinically relevant pathogen phenotypes. The database allows data extraction for any single isolate, gene or phenotype as well as data filtering and phenotypic grouping for specific research questions. With the integration of statistical tools we illustrate the usefulness of a relational database structure for the identification of phenotype-genotype correlations as an essential part of the discovery pipeline in genomic research. Furthermore, the database provides a compilation of DNA sequences and gene expression values of a plethora of clinical isolates to give a consensus DNA sequence and consensus gene expression signature. Deviations from the consensus thereby describe the genomic landscape and the transcriptional plasticity of the species P. aeruginosa. The database is available at

    Nucleic acids research 2018

  • Identification of novel adenovirus genotype 90 in children from Bangladesh.

    Houldcroft CJ, Beale MA, Sayeed MA, Qadri F, Dougan G and Mutreja A

    Department of Medicine, University of Cambridge, Cambridge, UK.

    Novel adenovirus genotypes are associated with outbreaks of disease, such as acute gastroenteritis, renal disease, upper respiratory tract infection and keratoconjunctivitis. Here, we identify novel and variant adenovirus genotypes in children coinfected with enterotoxigenic Escherichia coli, in Bangladesh. Metagenomic sequencing of stool was performed and whole adenovirus genomes were extracted. A novel species D virus, designated genotype 90 (P33H27F67) was identified, and the partial genome of a putative recombinant species B virus was recovered. Furthermore, the enteric types HAdV-A61 and HAdV-A40 were found in stool specimens. Knowledge of the diversity of adenovirus genomes circulating worldwide, especially in low-income countries where the burden of disease is high, will be required to ensure that future vaccination strategies cover the diversity of adenovirus strains associated with disease.

    Microbial genomics 2018

  • DNA Methylation and Transcription Patterns in Intestinal Epithelial Cells From Pediatric Patients With Inflammatory Bowel Diseases Differentiate Disease Subtypes and Associate With Outcome.

    Howell KJ, Kraiczy J, Nayak KM, Gasparetto M, Ross A, Lee C, Mak TN, Koo BK, Kumar N, Lawley T, Sinha A, Rosenstiel P, Heuschkel R, Stegle O and Zilbauer M

    University Department of Paediatrics, University of Cambridge, UK; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.

    Background & aims: We analyzed DNA methylation patterns and transcriptomes of primary intestinal epithelial cells (IEC) of children newly diagnosed with inflammatory bowel diseases (IBD) to learn more about pathogenesis.

    Methods: We obtained mucosal biopsies (N = 236) collected from terminal ileum and ascending and sigmoid colons of children (median age 13 years) newly diagnosed with IBD (43 with Crohn's disease [CD], 23 with ulcerative colitis [UC]), and 30 children without IBD (controls). Patients were recruited and managed at a hospital in the United Kingdom from 2013 through 2016. We also obtained biopsies collected at later stages from a subset of patients. IECs were purified and analyzed for genome-wide DNA methylation patterns and gene expression profiles. Adjacent microbiota were isolated from biopsies and analyzed by 16S gene sequencing. We generated intestinal organoid cultures from a subset of samples and genome-wide DNA methylation analysis was performed.

    Results: We found gut segment-specific differences in DNA methylation and transcription profiles of IECs from children with IBD vs controls; some were independent of mucosal inflammation. Changes in gut microbiota between IBD and control groups were not as large and were difficult to assess because of large amounts of intra-individual variation. Only IECs from patients with CD had changes in DNA methylation and transcription patterns in terminal ileum epithelium, compared with controls. Colon epithelium from patients with CD and from patients with ulcerative colitis had distinct changes in DNA methylation and transcription patterns, compared with controls. In IECs from patients with IBD, changes in DNA methylation, compared with controls, were stable over time and were partially retained in ex-vivo organoid cultures. Statistical analyses of epithelial cell profiles allowed us to distinguish children with CD or UC from controls; profiles correlated with disease outcome parameters, such as the requirement for treatment with biologic agents.

    Conclusions: We identified specific changes in DNA methylation and transcriptome patterns in IECs from pediatric patients with IBD compared with controls. These data indicate that IECs undergo changes during IBD development and could be involved in pathogenesis. Further analyses of primary IECs from patients with IBD could improve our understanding of the large variations in disease progression and outcomes.

    Funded by: Medical Research Council: MC_PC_12009; Wellcome Trust

    Gastroenterology 2018;154;3;585-598

  • PPM1D Mutations Drive Clonal Hematopoiesis in Response to Cytotoxic Chemotherapy.

    Hsu JI, Dayaram T, Tovy A, De Braekeleer E, Jeong M, Wang F, Zhang J, Heffernan TP, Gera S, Kovacs JJ, Marszalek JR, Bristow C, Yan Y, Garcia-Manero G, Kantarjian H, Vassiliou G, Futreal PA, Donehower LA, Takahashi K and Goodell MA

    Translational Biology and Molecular Medicine Graduate Program and Medical Scientist Training Program, Baylor College of Medicine, Houston, TX 77030, USA; Center for Cell and Gene Therapy, Baylor College of Medicine, Houston, TX 77030, USA.

    Clonal hematopoiesis (CH), in which stem cell clones dominate blood production, becomes increasingly common with age and can presage malignancy development. The conditions that promote ascendancy of particular clones are unclear. We found that mutations in PPM1D (protein phosphatase Mn²⁺/Mg²⁺-dependent 1D), a DNA damage response regulator that is frequently mutated in CH, were present in one-fifth of patients with therapy-related acute myeloid leukemia or myelodysplastic syndrome and strongly correlated with cisplatin exposure. Cell lines with hyperactive PPM1D mutations expand to outcompete normal cells after exposure to cytotoxic DNA damaging agents including cisplatin, and this effect was predominantly mediated by increased resistance to apoptosis. Moreover, heterozygous mutant Ppm1d hematopoietic cells outcompeted their wild-type counterparts in vivo after exposure to cisplatin and doxorubicin, but not during recovery from bone marrow transplantation. These findings establish the clinical relevance of PPM1D mutations in CH and the importance of studying mutation-treatment interactions.

    Cell stem cell 2018;23;5;700-713.e6

  • Transformation of Accessible Chromatin and 3D Nucleome Underlies Lineage Commitment of Early T Cells.

    Hu G, Cui K, Fang D, Hirose S, Wang X, Wangsa D, Jin W, Ried T, Liu P, Zhu J, Rothenberg EV and Zhao K

    Systems Biology Center, National Heart, Lung, and Blood Institute, NIH, Bethesda, MD 20892, USA.

    How chromatin reorganization coordinates differentiation and lineage commitment from hematopoietic stem and progenitor cells (HSPCs) to mature immune cells has not been well understood. Here, we carried out an integrative analysis of chromatin accessibility, topologically associating domains, AB compartments, and gene expression from HSPCs to CD4⁺CD8⁺ T cells. We found that abrupt genome-wide changes at all three levels of chromatin organization occur during the transition from double-negative stage 2 (DN2) to DN3, accompanying the T lineage commitment. The transcription factor BCL11B, a critical regulator of T cell commitment, is associated with increased chromatin interaction, and Bcl11b deletion compromised chromatin interaction at its target genes. We propose that these large-scale and concerted changes in chromatin organization present an energy barrier to prevent the cell from reversing its fate to earlier stages or redirecting to alternatives and thus lock the cell fate into the T lineages.

    Immunity 2018;48;2;227-242.e8

  • Defining murine organogenesis at single-cell resolution reveals a role for the leukotriene pathway in regulating blood progenitor formation.

    Ibarra-Soria X, Jawaid W, Pijuan-Sala B, Ladopoulos V, Scialdone A, Jörg DJ, Tyser RCV, Calero-Nieto FJ, Mulas C, Nichols J, Vallier L, Srinivas S, Simons BD, Göttgens B and Marioni JC

    Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK.

    During gastrulation, cell types from all three germ layers are specified and the basic body plan is established. However, molecular analysis of this key developmental stage has been hampered by limited cell numbers and a paucity of markers. Single-cell RNA sequencing circumvents these problems, but has so far been limited to specific organ systems. Here, we report single-cell transcriptomic characterization of >20,000 cells immediately following gastrulation at E8.25 of mouse development. We identify 20 major cell types, which frequently contain substructure, including three distinct signatures in early foregut cells. Pseudo-space ordering of somitic progenitor cells identifies dynamic waves of transcription and candidate regulators, which are validated by molecular characterization of spatially resolved regions of the embryo. Within the endothelial population, cells that transition from haemogenic endothelial to erythro-myeloid progenitors specifically express Alox5 and its co-factor Alox5ap, which control leukotriene production. Functional assays using mouse embryonic stem cells demonstrate that leukotrienes promote haematopoietic progenitor cell generation. Thus, this comprehensive single-cell map can be exploited to reveal previously unrecognized pathways that contribute to tissue development.

    Nature cell biology 2018

  • Physiological and Genetic Adaptations to Diving in Sea Nomads.

    Ilardo MA, Moltke I, Korneliussen TS, Cheng J, Stern AJ, Racimo F, de Barros Damgaard P, Sikora M, Seguin-Orlando A, Rasmussen S, van den Munckhof ICL, Ter Horst R, Joosten LAB, Netea MG, Salingkat S, Nielsen R and Willerslev E

    Centre for GeoGenetics, University of Copenhagen, Copenhagen 1350, Denmark.

    Understanding the physiology and genetics of human hypoxia tolerance has important medical implications, but this phenomenon has thus far only been investigated in high-altitude human populations. Another system, yet to be explored, is humans who engage in breath-hold diving. The indigenous Bajau people ("Sea Nomads") of Southeast Asia live a subsistence lifestyle based on breath-hold diving and are renowned for their extraordinary breath-holding abilities. However, it is unknown whether this has a genetic basis. Using a comparative genomic study, we show that natural selection on genetic variants in the PDE10A gene has increased spleen size in the Bajau, providing them with a larger reservoir of oxygenated red blood cells. We also find evidence of strong selection specific to the Bajau on BDKRB2, a gene affecting the human diving reflex. Thus, the Bajau, and possibly other diving populations, provide a new opportunity to study human adaptation to hypoxia tolerance.

    Cell 2018;173;3;569-580.e15

  • The genomic landscape of cutaneous SCC reveals drivers and a novel azathioprine associated mutational signature.

    Inman GJ, Wang J, Nagano A, Alexandrov LB, Purdie KJ, Taylor RG, Sherwood V, Thomson J, Hogan S, Spender LC, South AP, Stratton M, Chelala C, Harwood CA, Proby CM and Leigh IM

    Division of Cancer Research, Jacqui Wood Cancer Centre, School of Medicine, University of Dundee, Dundee, DD1 9SY, UK.

    Cutaneous squamous cell carcinoma (cSCC) has a high tumour mutational burden (50 mutations per megabase pair of DNA). Here, we combine whole-exome analyses from 40 primary cSCC tumours, comprising 20 well-differentiated and 20 moderately/poorly differentiated tumours, with accompanying clinical data from a longitudinal study of immunosuppressed and immunocompetent patients and integrate this analysis with independent gene expression studies. We identify commonly mutated genes, copy number changes and altered pathways and processes. Comparisons with tumour differentiation status suggest events which may drive disease progression. Mutational signature analysis reveals the presence of a novel signature (signature 32), whose incidence correlates with chronic exposure to the immunosuppressive drug azathioprine. Characterisation of a panel of 15 cSCC tumour-derived cell lines reveals that they accurately reflect the mutational signatures and genomic alterations of primary tumours and provide a valuable resource for the validation of tumour drivers and therapeutic targets.

    Funded by: Cancer Research UK (CRUK): A13044

    Nature communications 2018;9;1;3667

  • Genomic Risk Prediction of Coronary Artery Disease in 480,000 Adults: Implications for Primary Prevention.

    Inouye M, Abraham G, Nelson CP, Wood AM, Sweeting MJ, Dudbridge F, Lai FY, Kaptoge S, Brozynska M, Wang T, Ye S, Webb TR, Rutter MK, Tzoulaki I, Patel RS, Loos RJF, Keavney B, Hemingway H, Thompson J, Watkins H, Deloukas P, Di Angelantonio E, Butterworth AS, Danesh J, Samani NJ and UK Biobank CardioMetabolic Consortium CHD Working Group

    Cambridge Baker Systems Genomics Initiative, Melbourne, Victoria, Australia, and Cambridge, United Kingdom; Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; MRC/BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom; Department of Clinical Pathology and School of BioSciences, University of Melbourne, Parkville, Victoria, Australia; The Alan Turing Institute, London, United Kingdom.

    Background: Coronary artery disease (CAD) has substantial heritability and a polygenic architecture. However, the potential of genomic risk scores to help predict CAD outcomes has not been evaluated comprehensively, because available studies have involved limited genomic scope and limited sample sizes.

    Objectives: This study sought to construct a genomic risk score for CAD and to estimate its potential as a screening tool for primary prevention.

    Methods: Using a meta-analytic approach to combine large-scale, genome-wide, and targeted genetic association data, we developed a new genomic risk score for CAD (metaGRS) consisting of 1.7 million genetic variants. We externally tested metaGRS, both by itself and in combination with available data on conventional risk factors, in 22,242 CAD cases and 460,387 noncases from the UK Biobank.

    Results: The hazard ratio (HR) for CAD was 1.71 (95% confidence interval [CI]: 1.68 to 1.73) per SD increase in metaGRS, an association larger than any other externally tested genetic risk score previously published. The metaGRS stratified individuals into significantly different life course trajectories of CAD risk, with those in the top 20% of metaGRS distribution having an HR of 4.17 (95% CI: 3.97 to 4.38) compared with those in the bottom 20%. The corresponding HR was 2.83 (95% CI: 2.61 to 3.07) among individuals on lipid-lowering or antihypertensive medications. The metaGRS had a higher C-index (C = 0.623; 95% CI: 0.615 to 0.631) for incident CAD than any of 6 conventional factors (smoking, diabetes, hypertension, body mass index, self-reported high cholesterol, and family history). For men in the top 20% of metaGRS with >2 conventional factors, 10% cumulative risk of CAD was reached by 48 years of age.

    Conclusions: The genomic score developed and evaluated here substantially advances the concept of using genomic information to stratify individuals with different trajectories of CAD risk and highlights the potential for genomic screening in early life to complement conventional risk prediction.

    Journal of the American College of Cardiology 2018;72;16;1883-1893

  • Unsupervised correction of gene-independent cell responses to CRISPR-Cas9 targeting.

    Iorio F, Behan FM, Gonçalves E, Bhosle SG, Chen E, Shepherd R, Beaver C, Ansari R, Pooley R, Wilkinson P, Harper S, Butler AP, Stronach EA, Saez-Rodriguez J, Yusa K and Garnett MJ

    European Molecular Biology Laboratory - European Bioinformatics Institute, Cambridge, UK.

    Background: Genome editing by CRISPR-Cas9 technology allows large-scale screening of gene essentiality in cancer. A confounding factor when interpreting CRISPR-Cas9 screens is the high false-positive rate in detecting essential genes within copy number amplified regions of the genome. We have developed the computational tool CRISPRcleanR which is capable of identifying and correcting gene-independent responses to CRISPR-Cas9 targeting. CRISPRcleanR uses an unsupervised approach based on the segmentation of single-guide RNA fold change values across the genome, without making any assumption about the copy number status of the targeted genes.

    Results: Applying our method to existing and newly generated genome-wide essentiality profiles from 15 cancer cell lines, we demonstrate that CRISPRcleanR reduces false positives when calling essential genes, correcting biases within and outside of amplified regions, while maintaining true positive rates. Established cancer dependencies and essentiality signals of amplified cancer driver genes are detectable post-correction. CRISPRcleanR reports sgRNA fold changes and normalised read counts, is therefore compatible with downstream analysis tools, and works with multiple sgRNA libraries.

    Conclusions: CRISPRcleanR is a versatile open-source tool for the analysis of CRISPR-Cas9 knockout screens to identify essential genes.

    Funded by: Cancer Research UK: C44943/A22536, SU2C-AACR-DT1213; Open Targets: 015; Wellcome Trust (GB): 102696

    BMC genomics 2018;19;1;604

  • Pathway-based dissection of the genomic heterogeneity of cancer hallmarks' acquisition with SLAPenrich.

    Iorio F, Garcia-Alonso L, Brammeld JS, Martincorena I, Wille DR, McDermott U and Saez-Rodriguez J

    European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, CB10 1SD, UK.

    Cancer hallmarks are evolutionary traits required by a tumour to develop. While extensively characterised, the way these traits are achieved through the accumulation of somatic mutations in key biological pathways is not fully understood. To shed light on this subject, we characterised the landscape of pathway alterations associated with somatic mutations observed in 4,415 patients across ten cancer types, using 374 orthogonal pathway gene-sets mapped onto canonical cancer hallmarks. Towards this end, we developed SLAPenrich: a computational method based on population-level statistics, freely available as an open source R package. Assembling the identified pathway alterations into sets of hallmark signatures allowed us to connect somatic mutations to clinically interpretable cancer mechanisms. Further, we explored the heterogeneity of these signatures, in terms of ratio of altered pathways associated with each individual hallmark, assuming that this is reflective of the extent of selective advantage provided to the cancer type under consideration. Our analysis revealed the predominance of certain hallmarks in specific cancer types, thus suggesting different evolutionary trajectories across cancer lineages. Finally, although many pathway alteration enrichments are guided by somatic mutations in frequently altered high-confidence cancer genes, excluding these driver mutations preserves the hallmark heterogeneity signatures, and thus the detected hallmarks' predominance across cancer types. As a consequence, we propose the hallmark signatures as a ground truth to characterise tails of infrequent genomic alterations and identify potential novel cancer driver genes and networks.

    Scientific reports 2018;8;1;6713

  • Cancer-mutation network and the number and specificity of driver mutations.

    Iranzo J, Martincorena I and Koonin EV

    National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894.

    Cancer genomics has produced extensive information on cancer-associated genes, but the number and specificity of cancer-driver mutations remain a matter of debate. We constructed a bipartite network in which 7,665 tumors from 30 cancer types are connected via shared mutations in 198 previously identified cancer genes. We show that about 27% of the tumors can be assigned to statistically supported modules, most of which encompass one or two cancer types. The rest of the tumors belong to a diffuse network component suggesting lower gene specificity of driver mutations. Linear regression of the mutational loads in cancer genes was used to estimate the number of drivers required for the onset of different cancers. The mean number of drivers in known cancer genes is approximately two, with a range of one to five. Cancers that are associated with modules had more drivers than those from the diffuse network component, suggesting that unidentified and/or interchangeable drivers exist in the latter.

    Proceedings of the National Academy of Sciences of the United States of America 2018;115;26;E6010-E6019

  • Surveying what's flushed away.

    Iraola G and Kumar N

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    Nature reviews. Microbiology 2018;16;8;456

  • Genome-wide transcriptional analyses in Anopheles mosquitoes reveal an unexpected association between salivary gland gene expression and insecticide resistance.

    Isaacs AT, Mawejje HD, Tomlinson S, Rigden DJ and Donnelly MJ

    Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool, UK.

    Background: To combat malaria transmission, the Ugandan government has embarked upon an ambitious programme of indoor residual spraying (IRS) with a carbamate class insecticide, bendiocarb. In preparation for this campaign, we characterized bendiocarb resistance and associated transcriptional variation among Anopheles gambiae s.s. mosquitoes from two sites in Uganda.

    Results: Gene expression in two mosquito populations displaying some resistance to bendiocarb (95% and 79% An. gambiae s.l. WHO tube bioassay mortality in Nagongera and Kihihi, respectively) was investigated using whole-genome microarrays. Significant overexpression of several genes encoding salivary gland proteins, including D7r2 and D7r4, was detected in mosquitoes from Nagongera. In Kihihi, D7r4, two detoxification-associated genes (Cyp6m2 and Gstd3) and an epithelial serine protease were among the genes most highly overexpressed in resistant mosquitoes. Following the first round of IRS in Nagongera, bendiocarb-resistant mosquitoes were collected, and real-time quantitative PCR analyses detected significant overexpression of D7r2 and D7r4 in resistant mosquitoes. A single nucleotide polymorphism located in a non-coding transcript downstream of the D7 genes was found at a significantly higher frequency in resistant individuals. In silico modelling of the interaction between D7r4 and bendiocarb demonstrated similarity between the insecticide and serotonin, a known ligand of D7 proteins. A meta-analysis of published microarray studies revealed a recurring association between D7 expression and insecticide resistance across Anopheles species and locations.

    Conclusions: A whole-genome microarray approach identified an association between novel insecticide resistance candidates and bendiocarb resistance in Uganda. In addition, a single nucleotide polymorphism associated with this resistance mechanism was discovered. The use of such impartial screening methods allows for discovery of resistance candidates that have no previously ascribed function in insecticide binding or detoxification. Characterizing these novel candidates will broaden our understanding of resistance mechanisms and yield new strategies for combatting widespread insecticide resistance among malaria vectors.

    Funded by: National Institute of Allergy and Infectious Diseases: U19AI089674

    BMC genomics 2018;19;1;225

  • Meta-analysis of exome array data identifies six novel genetic loci for lung function.

    Jackson VE, Latourelle JC, Wain LV, Smith AV, Grove ML, Bartz TM, Obeidat M, Province MA, Gao W, Qaiser B, Porteous DJ, Cassano PA, Ahluwalia TS, Grarup N, Li J, Altmaier E, Marten J, Harris SE, Manichaikul A, Pottinger TD, Li-Gao R, Lind-Thomsen A, Mahajan A, Lahousse L, Imboden M, Teumer A, Prins B, Lyytikäinen LP, Eiriksdottir G, Franceschini N, Sitlani CM, Brody JA, Bossé Y, Timens W, Kraja A, Loukola A, Tang W, Liu Y, Bork-Jensen J, Justesen JM, Linneberg A, Lange LA, Rawal R, Karrasch S, Huffman JE, Smith BH, Davies G, Burkart KM, Mychaleckyj JC, Bonten TN, Enroth S, Lind L, Brusselle GG, Kumar A, Stubbe B, Understanding Society Scientific Group, Kähönen M, Wyss AB, Psaty BM, Heckbert SR, Hao K, Rantanen T, Kritchevsky SB, Lohman K, Skaaby T, Pisinger C, Hansen T, Schulz H, Polasek O, Campbell A, Starr JM, Rich SS, Mook-Kanamori DO, Johansson Å, Ingelsson E, Uitterlinden AG, Weiss S, Raitakari OT, Gudnason V, North KE, Gharib SA, Sin DD, Taylor KD, O'Connor GT, Kaprio J, Harris TB, Pederson O, Vestergaard H, Wilson JG, Strauch K, Hayward C, Kerr S, Deary IJ, Barr RG, de Mutsert R, Gyllensten U, Morris AP, Ikram MA, Probst-Hensch N, Gläser S, Zeggini E, Lehtimäki T, Strachan DP, Dupuis J, Morrison AC, Hall IP, Tobin MD and London SJ

    Department of Health Sciences, University of Leicester, Leicester, UK.

    Background: Over 90 regions of the genome have been associated with lung function to date, many of which have also been implicated in chronic obstructive pulmonary disease.

    Methods: We carried out meta-analyses of exome array data and three lung function measures: forced expiratory volume in one second (FEV₁), forced vital capacity (FVC) and the ratio of FEV₁ to FVC (FEV₁/FVC). These analyses by the SpiroMeta and CHARGE consortia included 60,749 individuals of European ancestry from 23 studies, and 7,721 individuals of African ancestry from 5 studies in the discovery stage, with follow-up in up to 111,556 independent individuals.

    Results: We identified significant (P < 2.8 × 10⁻⁷) associations with six SNPs: a nonsynonymous variant in RPAP1, which is predicted to be damaging, three intronic SNPs (SEC24C, CASC17 and UQCC1) and two intergenic SNPs near to LY86 and FGF10. Expression quantitative trait loci analyses found evidence for regulation of gene expression at three signals and implicated several genes, including TYRO3 and PLAU.

    Conclusions: Further interrogation of these loci could provide greater understanding of the determinants of lung function and pulmonary disease.

    Wellcome open research 2018;3;4

  • Schistosoma mansoni infection is associated with quantitative and qualitative modifications of the mammalian intestinal microbiota.

    Jenkins TP, Peachey LE, Ajami NJ, MacDonald AS, Hsieh MH, Brindley PJ, Cantacessi C and Rinaldi G

    Department of Veterinary Medicine, University of Cambridge, Cambridge, CB3 0ES, UK.

    In spite of the extensive contribution of intestinal pathology to the pathophysiology of schistosomiasis, little is known of the impact of schistosome infection on the composition of the gut microbiota of its mammalian host. Here, we characterised the fluctuations in the composition of the gut microbial flora of the small and large intestine, as well as the changes in abundance of individual microbial species, of mice experimentally infected with Schistosoma mansoni with the goal of identifying microbial taxa with potential roles in the pathophysiology of infection and disease. Bioinformatic analyses of bacterial 16S rRNA gene data revealed an overall reduction in gut microbial alpha diversity, alongside a significant increase in microbial beta diversity characterised by expanded populations of Akkermansia muciniphila (phylum Verrucomicrobia) and lactobacilli, in the gut microbiota of S. mansoni-infected mice when compared to uninfected control animals. These data support a role of the mammalian gut microbiota in the pathogenesis of hepato-intestinal schistosomiasis and serve as a foundation for the design of mechanistic studies to unravel the complex relationships amongst parasitic helminths, gut microbiota, pathophysiology of infection and host immunity.

    Funded by: Division of Intramural Research, National Institute of Allergy and Infectious Diseases (Division of Intramural Research of the NIAID): R01AI072773, R21AI109532; NIAID NIH HHS: R01 AI072773, R21 AI109532

    Scientific reports 2018;8;1;12072

  • Single cell RNA-seq and ATAC-seq analysis of cardiac progenitor cell transition states and lineage settlement.

    Jia G, Preussner J, Chen X, Guenther S, Yuan X, Yekelchyk M, Kuenne C, Looso M, Zhou Y, Teichmann S and Braun T

    Department of Cardiac Development and Remodeling, Max Planck Institute for Heart and Lung Research, 61231, Bad Nauheim, Germany.

    Formation and segregation of cell lineages forming the heart have been studied extensively but the underlying gene regulatory networks and epigenetic changes driving cell fate transitions during early cardiogenesis are still only partially understood. Here, we comprehensively characterize mouse cardiac progenitor cells (CPCs) marked by Nkx2-5 and Isl1 expression from E7.5 to E9.5 using single-cell RNA sequencing and transposase-accessible chromatin profiling (ATAC-seq). By leveraging cell-to-cell transcriptome and chromatin accessibility heterogeneity, we identify different previously unknown cardiac subpopulations. Reconstruction of developmental trajectories reveals that multipotent Isl1⁺ CPCs pass through an attractor state before separating into different developmental branches, whereas extended expression of Nkx2-5 commits CPCs to a unidirectional cardiomyocyte fate. Furthermore, we show that CPC fate transitions are associated with distinct open chromatin states critically depending on Isl1 and Nkx2-5. Our data provide a model of transcriptional and epigenetic regulations during cardiac progenitor cell fate decisions at single-cell resolution.

    Nature communications 2018;9;1;4877

  • COSMIC-3D provides structural perspectives on cancer genetics for drug discovery.

    Jubb HC, Saini HK, Verdonk ML and Forbes SA

    COSMIC, Wellcome Sanger Institute, Cambridge, UK.

    Funded by: Wellcome Trust

    Nature genetics 2018;50;9;1200-1202

  • Reply to Dookie et al., "Whole-Genome Sequencing To Guide the Selection of Treatment for Drug-Resistant Tuberculosis".

    Köser CU, Heyckendorf J, Andres S, Olaru ID, Schön T, Sturegård E, Beckert P, Schleusener V, Kohl TA, Hillemann D, Moradigaravand D, Parkhill J, Peacock SJ, Niemann S, Lange C and Merker M

    Department of Genetics, University of Cambridge, Cambridge, United Kingdom.

    Antimicrobial agents and chemotherapy 2018;62;8

  • Whole-exome sequencing of a meningeal melanocytic tumour reveals activating CYSLTR2 and EIF1AX hotspot mutations and similarities to uveal melanoma.

    Küsters-Vandevelde HVN, Germans MR, Rabbie R, Rashid M, Ten Broek R, Blokx WAM, Prinsen CFM, Adams DJ and Ter Laan M

    Department of Pathology, Canisius Wilhelmina Hospital, P.O. Box 9015, 6500 GS, Nijmegen, The Netherlands.

    Funded by: Cancer Research UK: C20510/A13031; Wellcome Trust: 077012/Z/05/Z

    Brain tumor pathology 2018

  • KILchip v1.0: A Novel Plasmodium falciparum Merozoite Protein Microarray to Facilitate Malaria Vaccine Candidate Prioritization.

    Kamuyu G, Tuju J, Kimathi R, Mwai K, Mburu J, Kibinge N, Chong Kwan M, Hawkings S, Yaa R, Chepsat E, Njunge JM, Chege T, Guleid F, Rosenkranz M, Kariuki CK, Frank R, Kinyanjui SM, Murungi LM, Bejon P, Färnert A, Tetteh KKA, Beeson JG, Conway DJ, Marsh K, Rayner JC and Osier FHA

    KEMRI-Wellcome Trust Research Programme, Centre for Geographic Medicine Research-Coast, Kilifi, Kenya.

    Passive transfer studies in humans clearly demonstrated the protective role of IgG antibodies against malaria. Identifying the precise parasite antigens that mediate immunity is essential for vaccine design, but has proved difficult. Completion of the Plasmodium falciparum genome revealed thousands of potential vaccine candidates, but a significant bottleneck remains in their validation and prioritization for further evaluation in clinical trials. Focusing initially on the Plasmodium falciparum merozoite proteome, we used peer-reviewed publications and multiple proteomic and bioinformatic approaches to select and prioritize potential immune targets. We expressed 109 P. falciparum recombinant proteins, the majority of which were obtained using a mammalian expression system that has been shown to produce biologically functional extracellular proteins, and used them to create KILchip v1.0: a novel protein microarray to facilitate high-throughput multiplexed antibody detection from individual samples. The microarray assay was highly specific; antibodies against P. falciparum proteins were detected exclusively in sera from malaria-exposed but not malaria-naïve individuals. The intensity of antibody reactivity varied as expected from strong to weak across well-studied antigens such as AMA1 and RH5 (Kruskal-Wallis H test for trend: p < 0.0001). The inter-assay and intra-assay variability was minimal, with reproducible results obtained in re-assays using the same chip over a duration of 3 months. Antibodies quantified using the multiplexed format in KILchip v1.0 were highly correlated with those measured in the gold-standard monoplex ELISA [median (range) Spearman's R of 0.84 (0.65-0.95)]. KILchip v1.0 is a robust, scalable and adaptable protein microarray that has broad applicability to studies of naturally acquired immunity against malaria by providing a standardized tool for the detection of antibody correlates of protection. It will facilitate rapid high-throughput validation and prioritization of potential Plasmodium falciparum merozoite-stage antigens, paving the way for urgently needed clinical trials for the next generation of malaria vaccines.

    Funded by: Medical Research Council: MR/L00450X/1, MR/M003906/1; Wellcome Trust

    Frontiers in immunology 2018;9;2866

  • Biology and genome of a newly discovered sibling species of Caenorhabditis elegans.

    Kanzaki N, Tsai IJ, Tanaka R, Hunt VL, Liu D, Tsuyama K, Maeda Y, Namai S, Kumagai R, Tracey A, Holroyd N, Doyle SR, Woodruff GC, Murase K, Kitazume H, Chai C, Akagi A, Panda O, Ke HM, Schroeder FC, Wang J, Berriman M, Sternberg PW, Sugimoto A and Kikuchi T

    Forestry and Forest Products Research Institute, Tsukuba, 305-8687, Japan.

    A 'sibling' species of the model organism Caenorhabditis elegans has long been sought for use in comparative analyses that would enable deep evolutionary interpretations of biological phenomena. Here, we describe the first sibling species of C. elegans, C. inopinata n. sp., isolated from fig syconia in Okinawa, Japan. We investigate the morphology, developmental processes and behaviour of C. inopinata, which differ significantly from those of C. elegans. The 123-Mb C. inopinata genome was sequenced and assembled into six nuclear chromosomes, allowing delineation of Caenorhabditis genome evolution and revealing unique characteristics, such as highly expanded transposable elements that might have contributed to the genome evolution of C. inopinata. In addition, C. inopinata exhibits massive gene losses in chemoreceptor gene families, which could be correlated with its limited habitat area. We have developed genetic and molecular techniques for C. inopinata; thus C. inopinata provides an exciting new platform for comparative evolutionary studies.

    Funded by: Japan Society for the Promotion of Science (JSPS): 15K14503, 16H04722, 26292178; Wellcome Trust: 206194

    Nature communications 2018;9;1;3216

  • Identification, Characterization, and Heritability of Murine Metastable Epialleles: Implications for Non-genetic Inheritance.

    Kazachenka A, Bertozzi TM, Sjoberg-Herrera MK, Walker N, Gardner J, Gunning R, Pahita E, Adams S, Adams D and Ferguson-Smith AC

    Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK.

    Generally repressed by epigenetic mechanisms, retrotransposons represent around 40% of the murine genome. At the Agouti viable yellow (A<sup>vy</sup>) locus, an endogenous retrovirus (ERV) of the intracisternal A particle (IAP) class retrotransposed upstream of the agouti coat-color locus, providing an alternative promoter that is variably DNA methylated in genetically identical individuals. This results in variable expressivity of coat color that is inherited transgenerationally. Here, a systematic genome-wide screen identifies multiple C57BL/6J murine IAPs with A<sup>vy</sup> epigenetic properties. Each exhibits a stable methylation state within an individual but varies between individuals. Only in rare instances do they act as promoters controlling adjacent gene expression. Their methylation state is locus-specific within an individual, and their flanking regions are enriched for CTCF. Variably methylated IAPs are reprogrammed after fertilization and re-established as variable loci in the next generation, indicating reconstruction of metastable epigenetic states and challenging the generalizability of non-genetic inheritance at these regions.

    Funded by: Wellcome Trust

    Cell 2018;175;5;1259-1271.e13

  • htsget: a protocol for securely streaming genomic data.

    Kelleher J, Lin M, Albach CH, Birney E, Davies R, Gourtovaia M, Glazer D, Gonzalez CY, Jackson DK, Kemp A, Marshall J, Nowak A, Senf A, Tovar-Corona JM, Vikhorev A, Keane TM and GA4GH Streaming Task Team

    Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK.

    Summary: Standardised interfaces for efficiently accessing high-throughput sequencing data are a fundamental requirement for large-scale genomic data sharing. We have developed htsget, a protocol for secure, efficient and reliable access to sequencing read and variation data. We demonstrate four independent client and server implementations, and the results of a comprehensive interoperability demonstration.

    Availability and implementation:

    Supplementary information: Supplementary data are available at Bioinformatics online.

    Bioinformatics (Oxford, England) 2018

  • The mRNA cap methyltransferase gene TbCMT1 is not essential in vitro but is a virulence factor in vivo for bloodstream form Trypanosoma brucei.

    Kelner A, Tinti M, Guther MLS, Foth BJ, Chappell L, Berriman M, Cowling VH and Ferguson MAJ

    Wellcome Centre for Anti-Infectives Research, School of Life Sciences, University of Dundee, Dundee, United Kingdom.

    Messenger RNA is modified by the addition of a 5' methylated cap structure, which protects the transcript and recruits protein complexes that mediate RNA processing and/or the initiation of translation. Two genes encoding mRNA cap methyltransferases have been identified in T. brucei: TbCMT1 and TbCGM1. Here we analysed the impact of TbCMT1 gene deletion on bloodstream form T. brucei cells. TbCMT1 was dispensable for parasite proliferation in in vitro culture. However, significantly decreased parasitemia was observed in mice inoculated with TbCMT1 null and conditional null cell lines. Using RNA-Seq, we observed that several cysteine peptidase mRNAs were downregulated in TbCMT1 null cell lines. The cysteine peptidase Cathepsin-L was also shown to be reduced at the protein level in TbCMT1 null cell lines. Our data suggest that TbCMT1 is not essential to bloodstream form T. brucei growth in vitro or in vivo but that it contributes significantly to parasite virulence in vivo.

    PloS one 2018;13;7;e0201263

  • The Impact of NOD2 Variants on Fecal Microbiota in Crohn's Disease and Controls Without Gastrointestinal Disease.

    Kennedy NA, Lamb CA, Berry SH, Walker AW, Mansfield J, Parkes M, Simpkins R, Tremelling M, Nutland S, UK IBD Genetics Consortium, Parkhill J, Probert C, Hold GL and Lees CW

    GI Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK.

    Background/aims: Current models of Crohn's disease (CD) describe an inappropriate immune response to gut microbiota in genetically susceptible individuals. NOD2 variants are strongly associated with development of CD, and NOD2 is part of the innate immune response to bacteria. This study aimed to identify differences in fecal microbiota in CD patients and non-IBD controls stratified by NOD2 genotype.

    Methods: Patients with CD and non-IBD controls of known NOD2 genotype were identified from patients in previous UK IBD genetics studies and the Cambridge bioresource (genotyped/phenotyped volunteers). Individuals with known CD-associated NOD2 mutations were matched to those with wild-type genotype. We obtained fecal samples from patients in clinical remission with low fecal calprotectin (<250 µg/g) and controls without gastrointestinal disease. After extracting DNA, the V1-2 region of 16S rRNA genes was polymerase chain reaction (PCR)-amplified and sequenced. Analysis was undertaken using the mothur package. Volatile organic compounds (VOC) were also measured.

    Results: Ninety-one individuals were in the primary analysis (37 CD, 30 bioresource controls, and 24 household controls). Comparing CD with non-IBD controls, there were reductions in bacterial diversity, Ruminococcaceae, Rikenellaceae, and Christensenellaceae and an increase in Enterobacteriaceae. No significant differences could be identified in microbiota by NOD2 genotype, but fecal butanoic acid was higher in Crohn's patients carrying NOD2 mutations.

    Conclusions: In this well-controlled study of NOD2 genotype and fecal microbiota, we identified no significant genotype-microbiota associations. This suggests that the changes associated with NOD2 genotype might only be seen at the mucosal level, or that environmental factors and prior inflammation are the predominant determinant of the observed dysbiosis in gut microbiota.

    Funded by: Department of Health: NIHR-RP-R3-12-026; Medical Research Council: MC_UU_12010/7; Wellcome Trust: 093885, 097943, 098051

    Inflammatory bowel diseases 2018;24;3;583-592

  • Inducible developmental reprogramming redefines commitment to sexual development in the malaria parasite Plasmodium berghei.

    Kent RS, Modrzynska KK, Cameron R, Philip N, Billker O and Waters AP

    Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow, UK.

    During malaria infection, Plasmodium spp. parasites cyclically invade red blood cells and can follow two different developmental pathways. They can either replicate asexually to sustain the infection, or differentiate into gametocytes, the sexual stage that can be taken up by mosquitoes, ultimately leading to disease transmission. Despite its importance for malaria control, the process of gametocytogenesis remains poorly understood, partially due to the difficulty of generating high numbers of sexually committed parasites in laboratory conditions<sup>1</sup>. Recently, an apicomplexa-specific transcription factor (AP2-G) was identified as necessary for gametocyte production in multiple Plasmodium species<sup>2,3</sup>, and suggested to be an epigenetically regulated master switch that initiates gametocytogenesis<sup>4,5</sup>. Here we show that in a rodent malaria parasite, Plasmodium berghei, conditional overexpression of AP2-G can be used to synchronously convert the great majority of the population into fertile gametocytes. This discovery allowed us to redefine the time frame of sexual commitment, identify a number of putative AP2-G targets and chart the sequence of transcriptional changes through gametocyte development, including the observation that gender-specific transcription occurred within 6 h of induction. These data provide entry points for further detailed characterization of the key process required for malaria transmission.

    Nature microbiology 2018

  • Functional analysis of Salmonella Typhi adaptation to survival in water.

    Kingsley RA, Langridge G, Smith SE, Makendi C, Fookes M, Wileman TM, El Ghany MA, Keith Turner A, Dyson ZA, Sridhar S, Pickard D, Kay S, Feasey N, Wong V, Barquist L and Dougan G

    Quadram Institute Bioscience, Norwich Research Park, Norwich, UK.

    Contaminated water is a major risk factor associated with the transmission of Salmonella enterica serovar Typhi (S. Typhi), the aetiological agent of human typhoid. However, little is known about how this pathogen adapts to living in the aqueous environment. We used transcriptome analysis (RNA-seq) and transposon mutagenesis (TraDIS) to characterize these adaptive changes and identify multiple genes that contribute to survival. Over half of the genes in the S. Typhi genome altered expression level within the first 24 h following transfer from broth culture to water, although relatively few did so in the first 30 min. Genes linked to central metabolism, stress associated with arrested proton motive force and respiratory chain factors changed expression levels. Additionally, motility and chemotaxis genes increased expression, consistent with a scavenging lifestyle. The viaB-associated gene tviC encoding a glcNAc epimerase that is required for Vi polysaccharide biosynthesis was, along with several other genes, shown to contribute to survival in water. Thus, we define regulatory adaptation operating in S. Typhi that facilitates survival in water.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/R012504/1; Wellcome Trust

    Environmental microbiology 2018;20;11;4079-4090

  • Multiplexing for Oxidative Bisulfite Sequencing (oxBS-seq).

    Kirschner K, Krueger F, Green AR and Chandra T

    Cambridge Institute for Medical Research, University of Cambridge, Cambridge, CB2 0XY, UK.

    DNA modifications, especially methylation, are known to play a crucial part in many regulatory processes in the cell. Recently, 5-hydroxymethylcytosine (5hmC) was discovered, a DNA modification derived as an intermediate of 5-methylcytosine (5mC) oxidation. Efforts to gain insights into function of this DNA modification are underway and several methods were recently described to assess 5hmC levels using sequencing approaches. Here we integrate adaptation based multiplexing and high-efficiency library prep into the oxidative Bisulfite Sequencing (oxBS-seq) workflow reducing the starting amount and cost per sample to identify 5hmC levels genome-wide.

    Methods in molecular biology (Clifton, N.J.) 2018;1708;665-678

  • Quantitative mass spectrometry for human melanocortin peptides in vitro and in vivo suggests prominent roles for β-MSH and desacetyl α-MSH in energy homeostasis.

    Kirwan P, Kay RG, Brouwers B, Herranz-Pérez V, Jura M, Larraufie P, Jerber J, Pembroke J, Bartels T, White A, Gribble FM, Reimann F, Farooqi IS, O'Rahilly S and Merkle FT

    Metabolic Research Laboratories and Medical Research Council Metabolic Diseases Unit, Wellcome Trust-Medical Research Council Institute of Metabolic Science, University of Cambridge, Cambridge, CB2 0QQ, UK; The Anne McLaren Laboratory for Regenerative Medicine, Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, CB2 0SZ, UK.

    Objective: The lack of pro-opiomelanocortin (POMC)-derived melanocortin peptides results in hypoadrenalism and severe obesity in both humans and rodents that is treatable with synthetic melanocortins. However, there are significant differences in POMC processing between humans and rodents, and little is known about the relative physiological importance of POMC products in the human brain. The aim of this study was to determine which POMC-derived peptides are present in the human brain, to establish their relative concentrations, and to test if their production is dynamically regulated.

    Methods: We analysed both fresh post-mortem human hypothalamic tissue and hypothalamic neurons derived from human pluripotent stem cells (hPSCs) using liquid chromatography tandem mass spectrometry (LC-MS/MS) to determine the sequence and quantify the production of hypothalamic neuropeptides, including those derived from POMC.

    Results: In both in vitro and in vivo hypothalamic cells, LC-MS/MS revealed the sequence of hundreds of neuropeptides as a resource for the field. Although the existence of β-melanocyte stimulating hormone (MSH) is controversial, we found that both this peptide and desacetyl α-MSH (d-α-MSH) were produced in considerable excess of acetylated α-MSH. In hPSC-derived hypothalamic neurons, these POMC derivatives were appropriately trafficked, secreted, and their production was significantly (P < 0.0001) increased in response to the hormone leptin.

    Conclusions: Our findings challenge the assumed pre-eminence of α-MSH and suggest that in humans, d-α-MSH and β-MSH are likely to be the predominant physiological products acting on melanocortin receptors.

    Molecular metabolism 2018

  • scmap: projection of single-cell RNA-seq data across data sets.

    Kiselev VY, Yiu A and Hemberg M

    Wellcome Sanger Institute, Hinxton, UK.

    Single-cell RNA-seq (scRNA-seq) allows researchers to define cell types on the basis of unsupervised clustering of the transcriptome. However, differences in experimental methods and computational analyses make it challenging to compare data across experiments. Here we present scmap (a web version is also available), a method for projecting cells from an scRNA-seq data set onto cell types or individual cells from other experiments.

    Nature methods 2018;15;5;359-362

  • Human Coronavirus NL63 Molecular Epidemiology and Evolutionary Patterns in Rural Coastal Kenya.

    Kiyuka PK, Agoti CN, Munywoki PK, Njeru R, Bett A, Otieno JR, Otieno GP, Kamau E, Clark TG, van der Hoek L, Kellam P, Nokes DJ and Cotten M

    Epidemiology and Demography Department, Kenya Medical Research Institute-Wellcome Trust Research Programme.

    Background: Human coronavirus NL63 (HCoV-NL63) is a globally endemic pathogen causing mild and severe respiratory tract infections with reinfections occurring repeatedly throughout a lifetime.

    Methods: Nasal samples were collected in coastal Kenya through community-based and hospital-based surveillance. HCoV-NL63 was detected with multiplex real-time reverse transcription PCR, and positive samples were targeted for nucleotide sequencing of the spike (S) protein. Additionally, paired samples from 25 individuals with evidence of repeat HCoV-NL63 infection were selected for whole-genome virus sequencing.

    Results: HCoV-NL63 was detected in 1.3% (75/5573) of child pneumonia admissions. Two HCoV-NL63 genotypes circulated in Kilifi between 2008 and 2014. Full genome sequences formed a monophyletic clade closely related to contemporary HCoV-NL63 from other global locations. An unexpected pattern of repeat infections was observed with some individuals showing higher viral titers during their second infection. Similar patterns for 2 other endemic coronaviruses, HCoV-229E and HCoV-OC43, were observed. Repeat infections by HCoV-NL63 were not accompanied by detectable genotype switching.

    Conclusions: In this coastal Kenya setting, HCoV-NL63 exhibited low prevalence in hospital pediatric pneumonia admissions. Clade persistence with low genetic diversity suggests limited immune selection, and absence of detectable clade switching in reinfections indicates initial exposure was insufficient to elicit a protective immune response.

    The Journal of infectious diseases 2018;217;11;1728-1739

  • A large impact crater beneath Hiawatha Glacier in northwest Greenland.

    Kjær KH, Larsen NK, Binder T, Bjørk AA, Eisen O, Fahnestock MA, Funder S, Garde AA, Haack H, Helm V, Houmark-Nielsen M, Kjeldsen KK, Khan SA, Machguth H, McDonald I, Morlighem M, Mouginot J, Paden JD, Waight TE, Weikusat C, Willerslev E and MacGregor JA

    Centre for GeoGenetics, Natural History Museum, University of Copenhagen, Copenhagen, Denmark.

    We report the discovery of a large impact crater beneath Hiawatha Glacier in northwest Greenland. From airborne radar surveys, we identify a 31-kilometer-wide, circular bedrock depression beneath up to a kilometer of ice. This depression has an elevated rim that cross-cuts tributary subglacial channels and a subdued central uplift that appears to be actively eroding. From ground investigations of the deglaciated foreland, we identify overprinted structures within Precambrian bedrock along the ice margin that strike tangent to the subglacial rim. Glaciofluvial sediment from the largest river draining the crater contains shocked quartz and other impact-related grains. Geochemical analysis of this sediment indicates that the impactor was a fractionated iron asteroid, which must have been more than a kilometer wide to produce the identified crater. Radiostratigraphy of the ice in the crater shows that the Holocene ice is continuous and conformable, but all deeper and older ice appears to be debris rich or heavily disturbed. The age of this impact crater is presently unknown, but from our geological and geophysical evidence, we conclude that it is unlikely to predate the Pleistocene inception of the Greenland Ice Sheet.

    Science advances 2018;4;11;eaar8173

  • Implications of insecticide resistance for malaria vector control with long-lasting insecticidal nets: a WHO-coordinated, prospective, international, observational cohort study.

    Kleinschmidt I, Bradley J, Knox TB, Mnzava AP, Kafy HT, Mbogo C, Ismail BA, Bigoga JD, Adechoubou A, Raghavendra K, Cook J, Malik EM, Nkuni ZJ, Macdonald M, Bayoh N, Ochomo E, Fondjo E, Awono-Ambene HP, Etang J, Akogbeto M, Bhatt RM, Chourasia MK, Swain DK, Kinyari T, Subramaniam K, Massougbodji A, Okê-Sopoh M, Ogouyemi-Hounto A, Kouambeng C, Abdin MS, West P, Elmardi K, Cornelie S, Corbel V, Valecha N, Mathenge E, Kamau L, Lines J and Donnelly MJ

    MRC Tropical Epidemiology Group, Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK; School of Public Health, University of the Witwatersrand, Johannesburg, South Africa.

    Background: Scale-up of insecticide-based interventions has averted more than 500 million malaria cases since 2000. Increasing insecticide resistance could herald a rebound in disease and mortality. We aimed to investigate whether insecticide resistance was associated with loss of effectiveness of long-lasting insecticidal nets and increased malaria disease burden.

    Methods: This WHO-coordinated, prospective, observational cohort study was done at 279 clusters (villages or groups of villages in which phenotypic resistance was measurable) in Benin, Cameroon, India, Kenya, and Sudan. Pyrethroid long-lasting insecticidal nets were the principal form of malaria vector control in all study areas; in Sudan this approach was supplemented by indoor residual spraying. Cohorts of children from randomly selected households in each cluster were recruited and followed up by community health workers to measure incidence of clinical malaria and prevalence of infection. Mosquitoes were assessed for susceptibility to pyrethroids using the standard WHO bioassay test. Country-specific results were combined using meta-analysis.

    Findings: Between June 2, 2012, and Nov 4, 2016, 40 000 children were enrolled and assessed for clinical incidence during 1·4 million follow-up visits. 80 000 mosquitoes were assessed for insecticide resistance. Long-lasting insecticidal net users had lower infection prevalence (adjusted odds ratio [OR] 0·63, 95% CI 0·51-0·78) and disease incidence (adjusted rate ratio [RR] 0·62, 0·41-0·94) than did non-users across a range of resistance levels. We found no evidence of an association between insecticide resistance and infection prevalence (adjusted OR 0·86, 0·70-1·06) or incidence (adjusted RR 0·89, 0·72-1·10). Users of nets, although significantly better protected than non-users, were nevertheless subject to high malaria infection risk (ranging from an average incidence in net users of 0·023 [95% CI 0·016-0·033] per person-year in India, to 0·80 [0·65-0·97] per person-year in Kenya; and an average infection prevalence in net users of 0·8% [0·5-1·3] in India to an average infection prevalence of 50·8% [43·4-58·2] in Benin).

    Interpretation: Irrespective of resistance, populations in malaria endemic areas should continue to use long-lasting insecticidal nets to reduce their risk of infection. As nets provide only partial protection, the development of additional vector control tools should be prioritised to reduce the unacceptably high malaria burden.

    Funding: Bill & Melinda Gates Foundation, UK Medical Research Council, and UK Department for International Development.

    The Lancet. Infectious diseases 2018

  • Emergence of an Extensively Drug-Resistant <i>Salmonella enterica</i> Serovar Typhi Clone Harboring a Promiscuous Plasmid Encoding Resistance to Fluoroquinolones and Third-Generation Cephalosporins.

    Klemm EJ, Shakoor S, Page AJ, Qamar FN, Judge K, Saeed DK, Wong VK, Dallman TJ, Nair S, Baker S, Shaheen G, Qureshi S, Yousafzai MT, Saleem MK, Hasan Z, Dougan G and Hasan R

    Wellcome Trust Sanger Institute, Hinxton, United Kingdom.

    Antibiotic resistance is a major problem in <i>Salmonella enterica</i> serovar Typhi, the causative agent of typhoid. Multidrug-resistant (MDR) isolates are prevalent in parts of Asia and Africa and are often associated with the dominant H58 haplotype. Reduced susceptibility to fluoroquinolones is also widespread, and sporadic cases of resistance to third-generation cephalosporins or azithromycin have also been reported. Here, we report the first large-scale emergence and spread of a novel <i>S.</i> Typhi clone harboring resistance to three first-line drugs (chloramphenicol, ampicillin, and trimethoprim-sulfamethoxazole) as well as fluoroquinolones and third-generation cephalosporins in Sindh, Pakistan, which we classify as extensively drug resistant (XDR). Over 300 XDR typhoid cases have emerged in Sindh, Pakistan, since November 2016. Additionally, a single case of travel-associated XDR typhoid has recently been identified in the United Kingdom. Whole-genome sequencing of over 80 of the XDR isolates revealed remarkable genetic clonality and sequence conservation, identified a large number of resistance determinants, and showed that these isolates were of haplotype H58. The XDR <i>S.</i> Typhi clone encodes a chromosomally located resistance region and harbors a plasmid encoding additional resistance elements, including the <i>bla</i><sub>CTX-M-15</sub> extended-spectrum β-lactamase, and carrying the <i>qnrS</i> fluoroquinolone resistance gene. This antibiotic resistance-associated IncY plasmid exhibited high sequence identity to plasmids found in other enteric bacteria isolated from widely distributed geographic locations. This study highlights three concerning problems: the receding antibiotic arsenal for typhoid treatment, the ability of <i>S.</i> Typhi to transform from MDR to XDR in a single step by acquisition of a plasmid, and the ability of XDR clones to spread globally. <b>IMPORTANCE</b> Typhoid fever is a severe disease caused by the Gram-negative bacterium <i>Salmonella enterica</i> serovar Typhi. Antibiotic-resistant <i>S.</i> Typhi strains have become increasingly common. Here, we report the first large-scale emergence and spread of a novel extensively drug-resistant (XDR) <i>S.</i> Typhi clone in Sindh, Pakistan. The XDR <i>S.</i> Typhi is resistant to the majority of drugs available for the treatment of typhoid fever. This study highlights the evolving threat of antibiotic resistance in <i>S.</i> Typhi and the value of antibiotic susceptibility testing and whole-genome sequencing in understanding emerging infectious diseases. We genetically characterized the XDR <i>S.</i> Typhi to investigate the phylogenetic relationship between these isolates and a global collection of <i>S.</i> Typhi isolates and to identify multiple genes linked to antibiotic resistance. This <i>S.</i> Typhi clone harbored a promiscuous antibiotic resistance plasmid previously identified in other enteric bacteria. The increasing antibiotic resistance in <i>S.</i> Typhi observed here adds urgency to the need for typhoid prevention measures.

    Funded by: Wellcome Trust

    mBio 2018;9;1

  • Emergence of dominant multidrug-resistant bacterial clades: Lessons from history and whole-genome sequencing.

    Klemm EJ, Wong VK and Dougan G

    Infection Genomics Programme, Wellcome Trust Sanger Institute, CB10 1SA Cambridge, United Kingdom.

    Antibiotic resistance in bacteria has emerged as a global challenge over the past 90 years, compromising our ability to effectively treat infections. There has been a dramatic increase in antibiotic resistance-associated determinants in bacterial populations, driven by the mobility and infectious nature of such determinants. Bacterial genome flexibility and antibiotic-driven selection are at the root of the problem. Genome evolution and the emergence of highly successful multidrug-resistant clades in different pathogens have made this a global challenge. Here, we describe some of the factors driving the origin, evolution, and spread of the antibiotic resistance genotype.

    Proceedings of the National Academy of Sciences of the United States of America 2018;115;51;12872-12877

  • XenofilteR: computational deconvolution of mouse and human reads in tumor xenograft sequence data.

    Kluin RJC, Kemper K, Kuilman T, de Ruiter JR, Iyer V, Forment JV, Cornelissen-Steijger P, de Rink I, Ter Brugge P, Song JY, Klarenbeek S, McDermott U, Jonkers J, Velds A, Adams DJ, Peeper DS and Krijgsman O

    Central Genomic Facility, Netherlands Cancer Institute, Amsterdam, The Netherlands.

    Background: Mouse xenografts from (patient-derived) tumors (PDX) or tumor cell lines are widely used as models to study various biological and preclinical aspects of cancer. However, analyses of their RNA and DNA profiles are challenging, because they comprise reads not only from the grafted human cancer but also from the murine host. The reads of murine origin result in false positives in mutation analysis of DNA samples and obscure gene expression levels when sequencing RNA. However, currently available algorithms are limited and improvements in accuracy and ease of use are necessary.

    Results: We developed the R-package XenofilteR, which separates mouse from human sequence reads based on the edit-distance between a sequence read and reference genome. To assess the accuracy of XenofilteR, we generated sequence data by in silico mixing of mouse and human DNA sequence data. These analyses revealed that XenofilteR removes > 99.9% of sequence reads of mouse origin while retaining human sequences. This allowed for mutation analysis of xenograft samples with accurate variant allele frequencies, and retrieved all non-synonymous somatic tumor mutations.

    Conclusions: XenofilteR accurately dissects RNA and DNA sequences from mouse and human origin, thereby outperforming currently available tools. XenofilteR is open source and publicly available.

    Funded by: FP7 Ideas: European Research Council: 319661; KWF Kankerbestrijding: NKI-2013-5799; Wellcome Trust

    BMC bioinformatics 2018;19;1;366

  • Zoonotic Transfer of Clostridium difficile Harboring Antimicrobial Resistance between Farm Animals and Humans.

    Knetsch CW, Kumar N, Forster SC, Connor TR, Browne HP, Harmanus C, Sanders IM, Harris SR, Turner L, Morris T, Perry M, Miyajima F, Roberts P, Pirmohamed M, Songer JG, Weese JS, Indra A, Corver J, Rupnik M, Wren BW, Riley TV, Kuijper EJ and Lawley TD

    Section Experimental Bacteriology, Department of Medical Microbiology, Leiden University Medical Center, Leiden, Netherlands.

    The emergence of <i>Clostridium difficile</i> as a significant human diarrheal pathogen is associated with the production of highly transmissible spores and the acquisition of antimicrobial resistance genes (ARGs) and virulence factors. Unlike the hospital-associated <i>C. difficile</i> RT027 lineage, the community-associated <i>C. difficile</i> RT078 lineage is isolated from both humans and farm animals; however, the geographical population structure and transmission networks remain unknown. Here, we applied whole-genome phylogenetic analysis of 248 <i>C. difficile</i> RT078 strains from 22 countries. Our results demonstrate limited geographical clustering for <i>C. difficile</i> RT078 and extensive coclustering of human and animal strains, thereby revealing a highly linked intercontinental transmission network between humans and animals. Comparative whole-genome analysis reveals indistinguishable accessory genomes between human and animal strains and a variety of antimicrobial resistance genes in the pangenome of <i>C. difficile</i> RT078. Thus, bidirectional spread of <i>C. difficile</i> RT078 between farm animals and humans may represent an unappreciated route disseminating antimicrobial resistance genes between humans and animals. These results highlight the importance of the "One Health" concept to monitor infectious disease emergence and the dissemination of antimicrobial resistance genes.

    Funded by: Medical Research Council: G0902453, MR/K000551/1, MR/L015080/1, PF451; Wellcome Trust: 098051

    Journal of clinical microbiology 2018;56;3

  • Chromosome assembly of large and complex genomes using multiple references.

    Kolmogorov M, Armstrong J, Raney BJ, Streeter I, Dunn M, Yang F, Odom D, Flicek P, Keane TM, Thybert D, Paten B and Pham S

    Department of Computer Science and Engineering, University of California, San Diego, California 92093, USA.

    Despite the rapid development of sequencing technologies, the assembly of mammalian-scale genomes into complete chromosomes remains one of the most challenging problems in bioinformatics. To help address this difficulty, we developed Ragout 2, a reference-assisted assembly tool that works for large and complex genomes. By taking one or more target assemblies (generated from an NGS assembler) and one or multiple related reference genomes, Ragout 2 infers the evolutionary relationships between the genomes and builds the final assemblies using a genome rearrangement approach. By using Ragout 2, we transformed NGS assemblies of 16 laboratory mouse strains into sets of complete chromosomes, leaving <5% of sequence unlocalized per set. Various benchmarks, including PCR testing and realigning of long Pacific Biosciences (PacBio) reads, suggest only a small number of structural errors in the final assemblies, comparable with direct assembly approaches. We applied Ragout 2 to the <i>Mus caroli</i> and <i>Mus pahari</i> genomes, which exhibit karyotype-scale variations compared with other genomes from the <i>Muridae</i> family. Chromosome painting maps confirmed most large-scale rearrangements that Ragout 2 detected. We applied Ragout 2 to improve draft sequences of three ape genomes that have recently been published. Ragout 2 transformed three sets of contigs (generated using PacBio reads only) into chromosome-scale assemblies with accuracy comparable to chromosome assemblies generated in the original study using BioNano maps, Hi-C, BAC clones, and FISH.

    Genome research 2018

  • Repair of double-strand breaks induced by CRISPR-Cas9 leads to large deletions and complex rearrangements.

    Kosicki M, Tomberg K and Bradley A

    Wellcome Sanger Institute, Hinxton, UK.

    CRISPR-Cas9 is poised to become the gene editing tool of choice in clinical contexts. Thus far, exploration of Cas9-induced genetic alterations has been limited to the immediate vicinity of the target site and distal off-target sequences, leading to the conclusion that CRISPR-Cas9 was reasonably specific. Here we report significant on-target mutagenesis, such as large deletions and more complex genomic rearrangements at the targeted sites in mouse embryonic stem cells, mouse hematopoietic progenitors and a human differentiated cell line. Using long-read sequencing and long-range PCR genotyping, we show that DNA breaks introduced by single-guide RNA/Cas9 frequently resolved into deletions extending over many kilobases. Furthermore, lesions distal to the cut site and crossover events were identified. The observed genomic damage in mitotically active cells caused by CRISPR-Cas9 editing may have pathogenic consequences.

    Nature biotechnology 2018

  • Naturally occurring polymorphisms in the virulence regulator Rsp modulate Staphylococcus aureus survival in blood and antibiotic susceptibility.

    Krishna A, Holden MTG, Peacock SJ, Edwards AM and Wigneshweraraj S

    MRC Centre for Molecular Bacteriology and Infection, Imperial College London, London, UK.

    Nasal colonization by the pathogen Staphylococcus aureus is a risk factor for subsequent infection. Loss of function mutations in the gene encoding the virulence regulator Rsp are associated with the transition of S. aureus from a colonizing isolate to one that causes bacteraemia. Here, we report the identification of several novel activity-altering mutations in rsp detected in clinical isolates, including, for the first time, mutations that enhance agr operon activity. We assessed how these mutations affected infection-relevant phenotypes and found loss and enhancement of function mutations to have contrasting effects on S. aureus survival in blood and antibiotic susceptibility. These findings add to the growing body of evidence that suggests S. aureus 'trades off' virulence for the acquisition of traits that benefit survival in the host, and indicate that infection severity and treatment options can be significantly affected by mutations in the virulence regulator rsp.

    Microbiology (Reading, England) 2018

  • Nasal carriage of Staphylococcus pseudintermedius in patients with granulomatosis with polyangiitis.

    Kronbichler A, Blane B, Holmes MA, Wagner J, Parkhill J, Peacock SJ, Jayne DRW and Harrison EM

    Vasculitis and Lupus Clinic, Addenbrooke's Hospital, Cambridge, UK.

    Rheumatology (Oxford, England) 2018

  • Assessing Rare Variation in Complex Traits.

    Kuchenbaecker K and Appel EVR

    Wellcome Trust Sanger Institute, Cambridge, UK.

    While genome-wide association studies have been very successful in identifying associations of common genetic variants with many different traits, the rarer frequency spectrum of the genome has not yet been comprehensively explored. Technological developments increasingly lift restrictions to access rare genetic variation. Dense reference panels enable improved genotype imputation for rarer variants in studies using DNA microarrays. Moreover, the decreasing cost of next generation sequencing makes whole exome and genome sequencing increasingly affordable for large samples. Large-scale efforts based on sequencing, such as ExAC, 100,000 Genomes, and TopMed, are likely to significantly advance this field. The main challenge in evaluating complex trait associations of rare variants is statistical power. The choice of population should be considered carefully because allele frequencies and linkage disequilibrium structure differ between populations. Genetically isolated populations can have favorable genomic characteristics for the study of rare variants. One strategy to increase power is to assess the combined effect of multiple rare variants within a region, known as aggregate testing. A range of methods have been developed for this. Model performance depends on the genetic architecture of the region of interest.

    Methods in molecular biology (Clifton, N.J.) 2018;1793;51-71

  • High-resolution genetic mapping of putative causal interactions between regions of open chromatin.

    Kumasaka N, Knights AJ and Gaffney DJ

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

    Physical interaction of regulatory elements in three-dimensional space poses a challenge for studies of disease because non-coding risk variants may be great distances from the genes they regulate. Experimental methods to capture these interactions, such as chromosome conformation capture, usually cannot assign causal direction of effect between regulatory elements, an important component of fine-mapping studies. We developed a Bayesian hierarchical approach that uses two-stage least squares and applied it to an ATAC-seq (assay for transposase-accessible chromatin using sequencing) data set from 100 individuals, to identify over 15,000 high-confidence causal interactions. Most (60%) interactions occurred over <20 kb, where chromosome conformation capture-based methods perform poorly. For a fraction of loci, we identified a single variant that alters accessibility across multiple regions, and experimentally validated the BLK locus, which is associated with multiple autoimmune diseases, using CRISPR genome editing. Our study highlights how association genetics of chromatin state is a powerful approach for identifying interactions between regulatory elements.

    Nature genetics 2018

  • Immune Cell Dynamics Unfolded by Single-Cell Technologies.

    Kunz DJ, Gomes T and James KR

    Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge, United Kingdom.

    The single-cell revolution is paving the way towards the molecular characterisation of every cell type in the human body, revealing relationships between cell types and states at high resolution. Changes in cellular phenotypes are particularly prevalent in the immune system and can be observed in its continuous remodelling up to adulthood, response to disease and development of immunological memory. In this review, we delve into the world of cellular dynamics of the immune system. We discuss current single-cell experimental and computational approaches in this area, giving insights into plasticity and commitment of cell fates. Finally, we provide an outlook on upcoming technological developments and predict how these will improve our understanding of the immune system.

    Frontiers in immunology 2018;9;1435

  • A Standard Nomenclature for Referencing and Authentication of Pluripotent Stem Cells.

    Kurtz A, Seltmann S, Bairoch A, Bittner MS, Bruce K, Capes-Davis A, Clarke L, Crook JM, Daheron L, Dewender J, Faulconbridge A, Fujibuchi W, Gutteridge A, Hei DJ, Kim YO, Kim JH, Kokocinski AK, Lekschas F, Lomax GP, Loring JF, Ludwig T, Mah N, Matsui T, Müller R, Parkinson H, Sheldon M, Smith K, Stachelscheid H, Stacey G, Streeter I, Veiga A and Xu RH

    Charité - Universitätsmedizin Berlin, Berlin-Brandenburg Center for Regenerative Therapies, Berlin 13353, Germany.

    Unambiguous cell line authentication is essential to avoid loss of association between data and cells. The risk for loss of references increases with the rapidity with which new human pluripotent stem cell (hPSC) lines are generated, exchanged, and implemented. Ideally, a single name should be used as a generally applied reference for each cell line to access and unify cell-related information across publications, cell banks, cell registries, and databases and to ensure scientific reproducibility. We discuss the needs and requirements for such a unique identifier and implement a standard nomenclature for hPSCs, which can be automatically generated and registered by the human pluripotent stem cell registry (hPSCreg). To avoid ambiguities in PSC-line referencing, we strongly urge publishers to demand registration and use of the standard name when publishing research based on hPSC lines.

    Stem cell reports 2018;10;1;1-6

  • Excision-reintegration at a pneumococcal phase-variable restriction-modification locus drives within- and between-strain epigenetic differentiation and inhibits gene acquisition.

    Kwun MJ, Oggioni MR, De Ste Croix M, Bentley SD and Croucher NJ

    MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London W2 1PG, UK.

    Phase-variation of Type I restriction-modification systems can rapidly alter the sequence motifs they target, diversifying both the epigenetic patterns and endonuclease activity within clonally descended populations. Here, we characterize the Streptococcus pneumoniae SpnIV phase-variable Type I RMS, encoded by the translocating variable restriction (tvr) locus, to identify its target motifs, mechanism and regulation of phase variation, and effects on exchange of sequence through transformation. The specificity-determining hsdS genes were shuffled through a recombinase-mediated excision-reintegration mechanism involving circular intermediate molecules, guided by two types of direct repeat. The rate of rearrangements was limited by an attenuator and toxin-antitoxin system homologs that inhibited recombinase gene transcription. Target motifs for both the SpnIV, and multiple Type II, MTases were identified through methylation-sensitive sequencing of a panel of recombinase-null mutants. This demonstrated the species-wide diversity observed at the tvr locus can likely specify nine different methylation patterns. This will reduce sequence exchange in this diverse species, as the native form of the SpnIV RMS was demonstrated to inhibit the acquisition of genomic islands by transformation. Hence the tvr locus can drive variation in genome methylation both within and between strains, and limits the genomic plasticity of S. pneumoniae.

    Nucleic acids research 2018

  • Detecting eukaryotic microbiota with single-cell sensitivity in human tissue.

    Lager S, de Goffau MC, Sovio U, Peacock SJ, Parkhill J, Charnock-Jones DS and Smith GCS

    Department of Obstetrics and Gynaecology, University of Cambridge, National Institute for Health Research Cambridge Biomedical Research Centre, Cambridge, UK.

    Background: Fetal growth restriction, pre-eclampsia, and pre-term birth are major adverse pregnancy outcomes. These complications are considerable contributors to fetal/maternal morbidity and mortality worldwide. A significant proportion of these cases are thought to be due to dysfunction of the placenta. However, the underlying mechanisms of placental dysfunction are unclear. The aim of the present study was to investigate whether adverse pregnancy outcomes are associated with evidence of placental eukaryotic infection.

    Results: We modified the 18S Illumina Amplicon Protocol of the Earth Microbiome Project and made it capable of detecting just a single spiked-in genome copy of Plasmodium falciparum, Saccharomyces cerevisiae, or Toxoplasma gondii among more than 70,000 human cells. Using this method, we were unable to detect eukaryotic pathogens in placental biopsies in instances of adverse pregnancy outcome (n = 199) or in healthy controls (n = 99).

    Conclusions: Eukaryotic infection of the placenta is not an underlying cause of the aforementioned pregnancy complications. Possible clinical applications for this non-targeted, yet extremely sensitive, eukaryotic screening method are manifest.

    Funded by: Medical Research Council: G1100221

    Microbiome 2018;6;1;151

  • Toll-like receptor 2 costimulation potentiates the antitumor efficacy of CAR T Cells.

    Lai Y, Weng J, Wei X, Qin L, Lai P, Zhao R, Jiang Z, Li B, Lin S, Wang S, Wu Q, Tang Z, Liu P, Pei D, Yao Y, Du X and Li P

    Key Laboratory of Regenerative Biology, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China.

    Chimeric antigen receptor (CAR) T-cell immunotherapies have shown unprecedented success in treating leukemia but limited clinical efficacy in solid tumors. Here, we generated 1928zT2 and m28zT2, targeting CD19 and mesothelin, respectively, by introducing the Toll/interleukin-1 receptor domain of Toll-like receptor 2 (TLR2) to 1928z and m28z. T cells expressing 1928zT2 or m28zT2 showed improved expansion, persistency and effector function against CD19+ leukemia or mesothelin+ solid tumors respectively in vitro and in vivo. In a patient with relapsed B-cell acute lymphoblastic leukemia, a single dose of 5 × 10^4/kg 1928zT2 T cells resulted in robust expansion and leukemia eradication and led to complete remission. Hence, our results demonstrate that TLR2 signaling can contribute to the efficacy of CAR T cells. Further clinical trials are warranted to establish the safety and efficacy of this approach.

    Leukemia 2018;32;3;801-808

  • Loss of genomic diversity in a Neisseria meningitidis clone through a colonization bottleneck.

    Lamelas A, Hamid AM, Dangy JP, Hauser J, Jud M, Röltgen K, Hodgson A, Junghanss T, Harris SR, Parkhill J, Bentley SD and Pluschke G

    Swiss Tropical and Public Health Institute, Socinstr. 57, Basel, Switzerland.

    Neisseria meningitidis is the leading cause of epidemic meningitis in the 'meningitis belt' of Africa, where clonal waves of colonization and disease are observed. Point mutations and horizontal gene exchange lead to constant diversification of meningococcal populations during clonal spread. Maintaining a high genomic diversity may be an evolutionary strategy of meningococci that increases the chances of occasionally fixing new, highly successful 'fit genotypes'. We have performed a longitudinal study of meningococcal carriage and disease in northern Ghana by analysing cerebrospinal fluid samples from all suspected meningitis cases and monitoring carriage of meningococci by twice yearly colonization surveys. In the framework of this study we observed complete replacement of an A:ST-2859 clone by a W:ST-2881 clone. However, after a gap of one year, A:ST-2859 meningococci re-emerged both as colonizer and meningitis causing agent. Our whole genome sequencing analyses compared the A population isolated prior to the W colonization and disease wave with the re-emerging A meningococci. This analysis revealed expansion of one clone differing in only one non-synonymous SNP from several isolates already present in the original A:ST-2859 population. The colonization bottleneck caused by the competing W meningococci thus resulted in a profound reduction in genomic diversity of the A meningococcal population.

    Genome biology and evolution 2018

  • Automated typing of red blood cell and platelet antigens: a whole-genome sequencing study.

    Lane WJ, Westhoff CM, Gleadall NS, Aguad M, Smeland-Wagman R, Vege S, Simmons DP, Mah HH, Lebo MS, Walter K, Soranzo N, Di Angelantonio E, Danesh J, Roberts DJ, Watkins NA, Ouwehand WH, Butterworth AS, Kaufman RM, Rehm HL, Silberstein LE, Green RC and MedSeq Project

    Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA.

    Background: There are more than 300 known red blood cell (RBC) antigens and 33 platelet antigens that differ between individuals. Sensitisation to antigens is a serious complication that can occur in prenatal medicine and after blood transfusion, particularly for patients who require multiple transfusions. Although pre-transfusion compatibility testing largely relies on serological methods, reagents are not available for many antigens. Methods based on single-nucleotide polymorphism (SNP) arrays have been used, but typing for ABO and Rh, the most important blood groups, cannot be done with SNP typing alone. We aimed to develop a novel method based on whole-genome sequencing to identify RBC and platelet antigens.

    Methods: This whole-genome sequencing study is a subanalysis of data from patients in the whole-genome sequencing arm of the MedSeq Project randomised controlled trial (NCT01736566) with no measured patient outcomes. We created a database of molecular changes in RBC and platelet antigens and developed an automated antigen-typing algorithm based on whole-genome sequencing (bloodTyper). This algorithm was iteratively improved to address cis-trans haplotype ambiguities and homologous gene alignments. Whole-genome sequencing data from 110 MedSeq participants (30 × depth) were used to initially validate bloodTyper through comparison with conventional serology and SNP methods for typing of 38 RBC antigens in 12 blood-group systems and 22 human platelet antigens. bloodTyper was further validated with whole-genome sequencing data from 200 INTERVAL trial participants (15 × depth) with serological comparisons.

    Findings: We iteratively improved bloodTyper by comparing its typing results with conventional serological and SNP typing in three rounds of testing. The initial whole-genome sequencing typing algorithm was 99·5% concordant across the first 20 MedSeq genomes. Addressing discordances led to development of an improved algorithm that was 99·8% concordant for the remaining 90 MedSeq genomes. Additional modifications led to the final algorithm, which was 99·2% concordant across 200 INTERVAL genomes (or 99·9% after adjustment for the lower depth of coverage).

    Interpretation: By enabling more precise antigen-matching of patients with blood donors, antigen typing based on whole-genome sequencing provides a novel approach to improve transfusion outcomes with the potential to transform the practice of transfusion medicine.

    Funding: National Human Genome Research Institute, Doris Duke Charitable Foundation, National Health Service Blood and Transplant, National Institute for Health Research, and Wellcome Trust.

    The Lancet. Haematology 2018

  • Population-based analysis of ocular Chlamydia trachomatis in trachoma-endemic West African communities identifies genomic markers of disease severity.

    Last AR, Pickering H, Roberts CH, Coll F, Phelan J, Burr SE, Cassama E, Nabicassa M, Seth-Smith HMB, Hadfield J, Cutcliffe LT, Clarke IN, Mabey DCW, Bailey RL, Clark TG, Thomson NR and Holland MJ

    Clinical Research Department, London School of Hygiene and Tropical Medicine, Keppel Street, London, UK.

    Background: Chlamydia trachomatis (Ct) is the most common infectious cause of blindness and bacterial sexually transmitted infection worldwide. Ct strain-specific differences in clinical trachoma suggest that genetic polymorphisms in Ct may contribute to the observed variability in severity of clinical disease.

    Methods: Using Ct whole genome sequences obtained directly from conjunctival swabs, we studied Ct genomic diversity and associations between Ct genetic polymorphisms with ocular localization and disease severity in a treatment-naïve trachoma-endemic population in Guinea-Bissau, West Africa.

    Results: All Ct sequences fall within the T2 ocular clade phylogenetically. This is consistent with the presence of the characteristic deletion in trpA resulting in a truncated non-functional protein and the ocular tyrosine repeat regions present in tarP associated with ocular tissue localization. We have identified 21 Ct non-synonymous single nucleotide polymorphisms (SNPs) associated with ocular localization, including SNPs within pmpD (odds ratio, OR = 4.07, p* = 0.001) and tarP (OR = 0.34, p* = 0.009). Eight synonymous SNPs associated with disease severity were found in yjfH (rlmB) (OR = 0.13, p* = 0.037), CTA0273 (OR = 0.12, p* = 0.027), trmD (OR = 0.12, p* = 0.032), CTA0744 (OR = 0.12, p* = 0.041), glgA (OR = 0.10, p* = 0.026), alaS (OR = 0.10, p* = 0.032), pmpE (OR = 0.08, p* = 0.001) and the intergenic region CTA0744-CTA0745 (OR = 0.13, p* = 0.043).

    Conclusions: This study demonstrates the extent of genomic diversity within a naturally circulating population of ocular Ct and is the first to describe novel genomic associations with disease severity. These findings direct investigation of host-pathogen interactions that may be important in ocular Ct pathogenesis and disease transmission.

    Funded by: Medical Research Council: MR/K000551/1; Wellcome Trust: 079246/Z/06/Z, 097330/Z/11/Z, 098051, 105609/Z/14/Z

    Genome medicine 2018;10;1;15

  • Support for a clade of Placozoa and Cnidaria in genes with minimal compositional bias.

    Laumer CE, Gruber-Vodicka H, Hadfield MG, Pearse VB, Riesgo A, Marioni JC and Giribet G

    Wellcome Trust Sanger Institute, Hinxton, United Kingdom.

    The phylogenetic placement of the morphologically simple placozoans is crucial to understanding the evolution of complex animal traits. Here, we examine the influence of adding new genomes from placozoans to a large dataset designed to study the deepest splits in the animal phylogeny. Using site-heterogeneous substitution models, we show that it is possible to obtain strong support, in both amino acid and reduced-alphabet matrices, for either a sister-group relationship between Cnidaria and Placozoa, or for Cnidaria and Bilateria as seen in most published work to date, depending on the orthologues selected to construct the matrix. We demonstrate that a majority of genes show evidence of compositional heterogeneity, and that support for the Cnidaria+Bilateria clade can be assigned to this source of systematic error. In interpreting these results, we caution against a peremptory reading of placozoans as secondarily reduced forms of little relevance to broader discussions of early animal evolution.

    eLife 2018;7

  • BCL11A interacts with SOX2 to control the expression of epigenetic regulators in lung squamous carcinoma.

    Lazarus KA, Hadi F, Zambon E, Bach K, Santolla MF, Watson JK, Correia LL, Das M, Ugur R, Pensa S, Becker L, Campos LS, Ladds G, Liu P, Evan GI, McCaughan FM, Le Quesne J, Lee JH, Calado D and Khaled WT

    Department of Pharmacology, University of Cambridge, Cambridge, CB2 1PD, UK.

    Patients diagnosed with lung squamous cell carcinoma (LUSC) have limited targeted therapies. We report here the identification and characterisation of BCL11A as a LUSC oncogene. Analysis of cancer genomics datasets revealed BCL11A to be upregulated in LUSC but not in lung adenocarcinoma (LUAD). Experimentally we demonstrate that non-physiological levels of BCL11A in vitro and in vivo promote squamous-like phenotypes, while its knockdown abolishes xenograft tumour formation. At the molecular level we found that BCL11A is transcriptionally regulated by SOX2 and is required for its oncogenic functions. Furthermore, we show that BCL11A and SOX2 regulate the expression of several transcription factors, including SETD8. We demonstrate that shRNA-mediated or pharmacological inhibition of SETD8 selectively inhibits LUSC growth. Collectively, our study indicates that BCL11A is integral to LUSC pathology and highlights the disruption of the BCL11A-SOX2 transcriptional programme as a novel candidate for drug development.

    Funded by: Biotechnology and Biological Sciences Research Council (BBSRC): BB/M00015X/2; Cancer Research UK (CRUK): C47525/A17348

    Nature communications 2018;9;1;3327

  • Terminal uridylyltransferases target RNA viruses as part of the innate immune system.

    Le Pen J, Jiang H, Di Domenico T, Kneuss E, Kosałka J, Leung C, Morgan M, Much C, Rudolph KLM, Enright AJ, O'Carroll D, Wang D and Miska EA

    Gurdon Institute, University of Cambridge, Cambridge, UK.

    RNA viruses are a major threat to animals and plants. RNA interference (RNAi) and the interferon response provide innate antiviral defense against RNA viruses. Here, we performed a large-scale screen using Caenorhabditis elegans and its natural pathogen the Orsay virus (OrV), and we identified cde-1 as important for antiviral defense. CDE-1 is a homolog of the mammalian TUT4 and TUT7 terminal uridylyltransferases (collectively called TUT4(7)); its catalytic activity is required for its antiviral function. CDE-1 uridylates the 3' end of the OrV RNA genome and promotes its degradation in a manner independent of the RNAi pathway. Likewise, TUT4(7) enzymes uridylate influenza A virus (IAV) mRNAs in mammalian cells. Deletion of TUT4(7) leads to increased IAV mRNA and protein levels. Collectively, these data implicate 3'-terminal uridylation of viral RNAs as a conserved antiviral defense mechanism.

    Nature structural & molecular biology 2018

  • A Distinct Class of Genome Rearrangements Driven by Heterologous Recombination.

    León-Ortiz AM, Panier S, Sarek G, Vannier JB, Patel H, Campbell PJ and Boulton SJ

    DSB Repair Metabolism Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK.

    Erroneous DNA repair by heterologous recombination (Ht-REC) is a potential threat to genome stability, but evidence supporting its prevalence is lacking. Here we demonstrate that recombination is possible between heterologous sequences and that it is a source of chromosomal alterations in mitotic and meiotic cells. Mechanistically, we find that the RTEL1 and HIM-6/BLM helicases and the BRCA1 homolog BRC-1 counteract Ht-REC in Caenorhabditis elegans, whereas mismatch repair does not. Instead, MSH-2/6 drives Ht-REC events in rtel-1 and brc-1 mutants and excessive crossovers in rtel-1 mutant meioses. Loss of vertebrate Rtel1 also causes a variety of unusually large and complex structural variations, including chromothripsis, breakage-fusion-bridge events, and tandem duplications with distant intra-chromosomal insertions, whose structures are consistent with a role for RTEL1 in preventing Ht-REC during break-induced replication. Our data establish Ht-REC as an unappreciated source of genome instability that underpins a novel class of complex genome rearrangements that likely arise during replication stress.

    Molecular cell 2018;69;2;292-305.e6

  • Integrated pathogen load and dual transcriptome analysis of systemic host-pathogen interactions in severe malaria.

    Lee HJ, Georgiadou A, Walther M, Nwakanma D, Stewart LB, Levin M, Otto TD, Conway DJ, Coin LJ and Cunnington AJ

    Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4072, Australia.

    The pathogenesis of infectious diseases depends on the interaction of host and pathogen. In Plasmodium falciparum malaria, host and parasite processes can be assessed by dual RNA sequencing of blood from infected patients. We performed dual transcriptome analyses on samples from 46 malaria-infected Gambian children to reveal mechanisms driving the systemic pathophysiology of severe malaria. Integrating these transcriptomic data with estimates of parasite load and detailed clinical information allowed consideration of potentially confounding effects due to differing leukocyte proportions in blood, parasite developmental stage, and whole-body pathogen load. We report hundreds of human and parasite genes differentially expressed between severe and uncomplicated malaria, with distinct profiles associated with coma, hyperlactatemia, and thrombocytopenia. High expression of neutrophil granule-related genes was consistently associated with all severe malaria phenotypes. We observed severity-associated variation in the expression of parasite genes, which determine cytoadhesion to vascular endothelium, rigidity of infected erythrocytes, and parasite growth rate. Up to 99% of human differential gene expression in severe malaria was driven by differences in parasite load, whereas parasite gene expression showed little association with parasite load. Coexpression analyses revealed interactions between human and P. falciparum, with prominent co-regulation of translation genes in severe malaria between host and parasite. Multivariate analyses suggested that increased expression of granulopoiesis and interferon-γ-related genes, together with inadequate suppression of type 1 interferon signaling, best explained severity of infection. These findings provide a framework for understanding the contributions of host and parasite to the pathogenesis of severe malaria and identifying new treatments.

    Science translational medicine 2018;10;447

  • Population dynamics of normal human blood inferred from somatic mutations.

    Lee-Six H, Øbro NF, Shepherd MS, Grossmann S, Dawson K, Belmonte M, Osborne RJ, Huntly BJP, Martincorena I, Anderson E, O'Neill L, Stratton MR, Laurenti E, Green AR, Kent DG and Campbell PJ

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK.

    Haematopoietic stem cells drive blood production, but their population size and lifetime dynamics have not been quantified directly in humans. Here we identified 129,582 spontaneous, genome-wide somatic mutations in 140 single-cell-derived haematopoietic stem and progenitor colonies from a healthy 59-year-old man and applied population-genetics approaches to reconstruct clonal dynamics. Cell divisions from early embryogenesis were evident in the phylogenetic tree; all blood cells were derived from a common ancestor that preceded gastrulation. The size of the stem cell population grew steadily in early life, reaching a stable plateau by adolescence. We estimate the numbers of haematopoietic stem cells that are actively making white blood cells at any one time to be in the range of 50,000-200,000. We observed adult haematopoietic stem cell clones that generate multilineage outputs, including granulocytes and B lymphocytes. Harnessing naturally occurring mutations to report the clonal architecture of an organ enables the high-resolution reconstruction of somatic cell dynamics in humans.

    Funded by: Medical Research Council: MC_PC_12009, MR/M008975/1, MR/M010392/1; Wellcome Trust

    Nature 2018;561;7724;473-478

  • pyseer: a comprehensive tool for microbial pangenome-wide association studies.

    Lees JA, Galardini M, Bentley SD, Weiser JN and Corander J

    Department of Microbiology, New York University School of Medicine, New York, NY, USA.

    Summary: Genome-wide association studies (GWAS) in microbes have different challenges to GWAS in eukaryotes. These have been addressed by a number of different methods. pyseer brings these techniques together in one package tailored to microbial GWAS, allows greater flexibility of the input data used, and adds new methods to interpret the association results.

    Availability and implementation: pyseer is written in Python, is freely available, and can be installed through pip. Documentation and a tutorial are also available.

    Supplementary information: Supplementary data are available at Bioinformatics online.

    Bioinformatics (Oxford, England) 2018;34;24;4310-4312

  • Evaluation of phylogenetic reconstruction methods using bacterial whole genomes: a simulation based study.

    Lees JA, Kendall M, Parkhill J, Colijn C, Bentley SD and Harris SR

    Infection Genomics, Wellcome Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK.

    Background: Phylogenetic reconstruction is a necessary first step in many analyses which use whole genome sequence data from bacterial populations. There are many available methods to infer phylogenies, and these have various advantages and disadvantages, but few unbiased comparisons of the range of approaches have been made.

    Methods: We simulated data from a defined "true tree" using a realistic evolutionary model. We built phylogenies from this data using a range of methods, and compared reconstructed trees to the true tree using two measures, noting the computational time needed for different phylogenetic reconstructions. We also used real data from Streptococcus pneumoniae alignments to compare individual core gene trees to a core genome tree.

    Results: We found that, as expected, maximum likelihood trees from good quality alignments were the most accurate, but also the most computationally intensive. Using less accurate phylogenetic reconstruction methods, we were able to obtain results of comparable accuracy; we found that approximate results can rapidly be obtained using genetic distance based methods. In real data we found that highly conserved core genes, such as those involved in translation, gave an inaccurate tree topology, whereas genes involved in recombination events gave inaccurate branch lengths. We also show a tree-of-trees, relating the results of different phylogenetic reconstructions to each other.

    Conclusions: We recommend three approaches, depending on requirements for accuracy and computational time. Quicker approaches that do not perform full maximum likelihood optimisation may be useful for many analyses requiring a phylogeny, as generating a high quality input alignment is likely to be the major limiting factor of accurate tree topology. We have publicly released our simulated data and code to enable further comparisons.

    Wellcome open research 2018;3;33

  • Genetics of HbA1c: a case study in clinical translation.

    Leong A and Wheeler E

    Division of General Internal Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Harvard Medical School, Boston, MA 02115, USA.

    Glycated hemoglobin (HbA1c) measures the amount of glucose in the blood in the previous 2-3 months and is used to test whether an individual has diabetes (HbA1c≥6.5%), or how well they are managing their diabetes. Genome-wide association studies have successfully identified multiple genomic loci influencing HbA1c, through both glycemic (factors that affect blood glucose levels) and erythrocytic (factors that affect the red blood cell) pathways. Inaccuracies in HbA1c, due to non-glycemic variants, could lead to suboptimal care or adverse health consequences. A recently published example is the erythrocytic variant (rs1050828) in G6PD, which leads to the artificial lowering of HbA1c and missed diagnosis of diabetes using current thresholds. In this review we will discuss recent insights into the genetic etiology of HbA1c, and how these can translate to the clinic.

    Current opinion in genetics & development 2018;50;79-85

  • BCL11B mutations in patients affected by a neurodevelopmental disorder with reduced type 2 innate lymphoid cells.

    Lessel D, Gehbauer C, Bramswig NC, Schluth-Bolard C, Venkataramanappa S, van Gassen KLI, Hempel M, Haack TB, Baresic A, Genetti CA, Funari MFA, Lessel I, Kuhlmann L, Simon R, Liu P, Denecke J, Kuechler A, de Kruijff I, Shoukier M, Lek M, Mullen T, Lüdecke HJ, Lerario AM, Kobbe R, Krieger T, Demeer B, Lebrun M, Keren B, Nava C, Buratti J, Afenjar A, Shinawi M, Guillen Sacoto MJ, Gauthier J, Hamdan FF, Laberge AM, Campeau PM, Louie RJ, Cathey SS, Prinz I, Jorge AAL, Terhal PA, Lenhard B, Wieczorek D, Strom TM, Agrawal PB, Britsch S, Tolosa E and Kubisch C

    Institute of Human Genetics, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.

    The transcription factor BCL11B is essential for development of the nervous and the immune system, and Bcl11b deficiency results in structural brain defects, reduced learning capacity, and impaired immune cell development in mice. However, the precise role of BCL11B in humans is largely unexplored, except for a single patient with a BCL11B missense mutation, affected by multisystem anomalies and profound immune deficiency. Using massively parallel sequencing we identified 13 patients bearing heterozygous germline alterations in BCL11B. Notably, all of them are affected by global developmental delay with speech impairment and intellectual disability; however, none displayed overt clinical signs of immune deficiency. Six frameshift mutations, two nonsense mutations, one missense mutation, and two chromosomal rearrangements resulting in diminished BCL11B expression, arose de novo. A further frameshift mutation was transmitted from a similarly affected mother. Interestingly, the most severely affected patient harbours a missense mutation within a zinc-finger domain of BCL11B, probably affecting the DNA-binding structural interface, similar to the recently published patient. Furthermore, the most C-terminally located premature termination codon mutation fails to rescue the progenitor cell proliferation defect in hippocampal slice cultures from Bcl11b-deficient mice. Concerning the role of BCL11B in the immune system, extensive immune phenotyping of our patients revealed alterations in the T cell compartment and lack of peripheral type 2 innate lymphoid cells (ILC2s), consistent with the findings described in Bcl11b-deficient mice. Unsupervised analysis of 102 T lymphocyte subpopulations showed that the patients clearly cluster apart from healthy children, further supporting the common aetiology of the disorder. Taken together, we show here that mutations leading either to BCL11B haploinsufficiency or to a truncated BCL11B protein clinically cause a non-syndromic neurodevelopmental delay. In addition, we suggest that missense mutations affecting specific sites within zinc-finger domains might result in distinct and more severe clinical outcomes.

    Brain : a journal of neurology 2018

  • Earth BioGenome Project: Sequencing life for the future of life.

    Lewin HA, Robinson GE, Kress WJ, Baker WJ, Coddington J, Crandall KA, Durbin R, Edwards SV, Forest F, Gilbert MTP, Goldstein MM, Grigoriev IV, Hackett KJ, Haussler D, Jarvis ED, Johnson WE, Patrinos A, Richards S, Castilla-Rubio JC, van Sluys MA, Soltis PS, Xu X, Yang H and Zhang G

    Department of Evolution and Ecology, University of California, Davis, CA 95616.

    Increasing our understanding of Earth's biodiversity and responsibly stewarding its resources are among the most crucial scientific and social challenges of the new millennium. These challenges require fundamental new knowledge of the organization, evolution, functions, and interactions among millions of the planet's organisms. Herein, we present a perspective on the Earth BioGenome Project (EBP), a moonshot for biology that aims to sequence, catalog, and characterize the genomes of all of Earth's eukaryotic biodiversity over a period of 10 years. The outcomes of the EBP will inform a broad range of major issues facing humanity, such as the impact of climate change on biodiversity, the conservation of endangered species and ecosystems, and the preservation and enhancement of ecosystem services. We describe hurdles that the project faces, including data-sharing policies that ensure a permanent, freely available resource for future scientific discovery while respecting access and benefit sharing guidelines of the Nagoya Protocol. We also describe scientific and organizational challenges in executing such an ambitious project, and the structure proposed to achieve the project's goals. The far-reaching potential benefits of creating an open digital repository of genomic information for life on Earth can be realized only by a coordinated international effort.

    Proceedings of the National Academy of Sciences of the United States of America 2018;115;17;4325-4333

  • Whole exome sequencing in adult-onset hearing loss reveals a high load of predicted pathogenic variants in known deafness-associated genes and identifies new candidate genes.

    Lewis MA, Nolan LS, Cadge BA, Matthews LJ, Schulte BA, Dubno JR, Steel KP and Dawson SJ

    Wolfson Centre for Age-Related Diseases, King's College London, WC2R 2LS, London, UK.

    Background: Deafness is a highly heterogeneous disorder with over 100 genes known to underlie human non-syndromic hearing impairment. However, many more remain undiscovered, particularly those involved in the most common form of deafness: adult-onset progressive hearing loss. Despite several genome-wide association studies of adult hearing status, it remains unclear whether the genetic architecture of this common sensory loss consists of multiple rare variants each with large effect size or many common susceptibility variants each with small to medium effects. As next generation sequencing is now being utilised in clinical diagnosis, our aim was to explore the viability of diagnosing the genetic cause of hearing loss using whole exome sequencing in individual subjects as in a clinical setting.

    Methods: We performed exome sequencing of thirty patients selected for distinct phenotypic sub-types from well-characterised cohorts of 1479 people with adult-onset hearing loss.

    Results: Every individual carried predicted pathogenic variants in at least ten deafness-associated genes; similar findings were obtained from an analysis of the 1000 Genomes Project data unselected for hearing status. We have identified putative causal variants in known deafness genes and several novel candidate genes, including NEDD4 and NEFH that were mutated in multiple individuals.

    Conclusions: The high frequency of predicted-pathogenic variants detected in known deafness-associated genes was unexpected and has significant implications for current diagnostic sequencing in deafness. Our findings suggest that in a clinic setting, efforts should be made to a) confirm key sequence results by Sanger sequencing, b) assess segregations of variants and phenotypes within the family if at all possible, and c) use caution in applying current pathogenicity prediction algorithms for diagnostic purposes. We conclude that there may be a high number of pathogenic variants affecting hearing in the ageing population, including many in known deafness-associated genes. Our findings of frequent predicted-pathogenic variants in both our hearing-impaired sample and in the larger 1000 Genomes Project sample unselected for auditory function suggest that the reference population for interpreting variants for this very common disorder should be a population of people with good hearing for their age rather than an unselected population.

    Funded by: Deafness Research UK: 444:UEI:SD; Medical Research Council: 098051; Medical University of South Carolina: NIH/NCATS UL1 TR000062, UL1 TR001450; National Institute on Deafness and Other Communication Disorders: P50 DC000422; Research into Ageing, Help the Aged: 223; Teresa Rosenbaum Golden Charitable Trust, Telethon Foundation: GGP09037; Wellcome Trust: 100669

    BMC medical genomics 2018;11;1;77

  • Genome-wide CRISPR-KO Screen Uncovers mTORC1-Mediated Gsk3 Regulation in Naive Pluripotency Maintenance and Dissolution.

    Li M, Yu JSL, Tilgner K, Ong SH, Koike-Yusa H and Yusa K

    Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

    The genetic basis of naive pluripotency maintenance and loss is a central question in embryonic stem cell biology. Here, we deploy CRISPR-knockout-based screens in mouse embryonic stem cells to interrogate this question through a genome-wide, non-biased approach using the Rex1GFP reporter as a phenotypic readout. This highly sensitive and efficient method identified genes in diverse biological processes and pathways. We uncovered a key role for negative regulators of mTORC1 in maintenance and exit from naive pluripotency and provided an integrated account of how mTORC1 activity influences naive pluripotency through Gsk3. Our study therefore reinforces Gsk3 as the central node and provides a comprehensive, data-rich resource that will improve our understanding of mechanisms regulating pluripotency and stimulate avenues for further mechanistic studies.

    Cell reports 2018;24;2;489-502

  • Organoid cultures recapitulate esophageal adenocarcinoma heterogeneity providing a model for clonality studies and precision therapeutics.

    Li X, Francies HE, Secrier M, Perner J, Miremadi A, Galeano-Dalmau N, Barendt WJ, Letchford L, Leyden GM, Goffin EK, Barthorpe A, Lightfoot H, Chen E, Gilbert J, Noorani A, Devonshire G, Bower L, Grantham A, MacRae S, Grehan N, Wedge DC, Fitzgerald RC and Garnett MJ

    MRC Cancer Unit, University of Cambridge, Cambridge, CB2 0XZ, UK.

    Esophageal adenocarcinoma (EAC) incidence is increasing while 5-year survival rates remain less than 15%. A lack of experimental models has hampered progress. We have generated clinically annotated EAC organoid cultures that recapitulate the morphology, genomic, and transcriptomic landscape of the primary tumor including point mutations, copy number alterations, and mutational signatures. Karyotyping of organoid cultures has confirmed polyclonality reflecting the clonal architecture of the primary tumor. Furthermore, subclones underwent clonal selection associated with driver gene status. Medium throughput drug sensitivity testing demonstrates the potential of targeting receptor tyrosine kinases and downstream mediators. EAC organoid cultures provide a pre-clinical tool for studies of clonal evolution and precision therapeutics.

    Funded by: Cancer Research UK (CRUK): C44943/A22536, RG66287; DH | National Institute for Health Research (NIHR): RG67258; EIF | Stand Up To Cancer (SU2C): SU2C-AACR-DT1213; Medical Research Council: MC_UU_12022/2; Medical Research Council (MRC): RG84369; Wellcome Trust: 102696

    Nature communications 2018;9;1;2983

  • Sixteen diverse laboratory mouse reference genomes define strain-specific haplotypes and novel functional loci.

    Lilue J, Doran AG, Fiddes IT, Abrudan M, Armstrong J, Bennett R, Chow W, Collins J, Collins S, Czechanski A, Danecek P, Diekhans M, Dolle DD, Dunn M, Durbin R, Earl D, Ferguson-Smith A, Flicek P, Flint J, Frankish A, Fu B, Gerstein M, Gilbert J, Goodstadt L, Harrow J, Howe K, Ibarra-Soria X, Kolmogorov M, Lelliott CJ, Logan DW, Loveland J, Mathews CE, Mott R, Muir P, Nachtweide S, Navarro FCP, Odom DT, Park N, Pelan S, Pham SK, Quail M, Reinholdt L, Romoth L, Shirley L, Sisu C, Sjoberg-Herrera M, Stanke M, Steward C, Thomas M, Threadgold G, Thybert D, Torrance J, Wong K, Wood J, Yalcin B, Yang F, Adams DJ, Paten B and Keane TM

    European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.

    We report full-length draft de novo genome assemblies for 16 widely used inbred mouse strains and find extensive strain-specific haplotype variation. We identify and characterize 2,567 regions on the current mouse reference genome exhibiting the greatest sequence diversity. These regions are enriched for genes involved in pathogen defence and immunity and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci, reflecting distinct strain phenotypes. We used these genomes to improve the mouse reference genome, resulting in the completion of 10 new gene structures. Also, 62 new coding loci were added to the reference genome annotation. These genomes identified a large, previously unannotated, gene (Efcab3-like) encoding 5,874 amino acids. Mutant Efcab3-like mice display anomalies in multiple brain regions, suggesting a possible role for this gene in the regulation of brain development.

    Funded by: NHGRI NIH HHS: U41 HG007234; NIH HHS: U42 OD010921

    Nature genetics 2018;50;11;1574-1583

  • Genomic and transcriptomic comparisons of closely related malaria parasites differing in virulence and sequestration pattern.

    Lin JW, Reid AJ, Cunningham D, Böhme U, Tumwine I, Keller-Mclaughlin S, Sanders M, Berriman M and Langhorne J

    Malaria Immunology laboratory, Francis Crick Institute, London, NW1 1AT, UK.

    <b>Background:</b> Malaria parasite species differ greatly in the harm they do to humans. While <i>P. falciparum</i> kills hundreds of thousands per year, <i>P. vivax</i> kills much less often and <i>P. malariae</i> is relatively benign. Strains of the rodent malaria parasite <i>Plasmodium chabaudi</i> show phenotypic variation in virulence during infections of laboratory mice. This makes it an excellent species to study genes which may be responsible for this trait. By understanding the mechanisms which underlie differences in virulence we can learn how parasites adapt to their hosts and how we might prevent disease. <b>Methods:</b> Here we present a complete reference genome sequence for a more virulent <i>P. chabaudi</i> strain, PcCB, and perform a detailed comparison with the genome of the less virulent PcAS strain. <b>Results:</b> We found the greatest variation in the subtelomeric regions, in particular amongst the sequences of the <i>pir</i> gene family, which has been associated with virulence and establishment of chronic infection. Despite substantial variation at the sequence level, the repertoire of these genes has been largely maintained, highlighting the requirement for functional conservation as well as diversification in host-parasite interactions. However, a subset of <i>pir</i> genes, previously associated with increased virulence, were more highly expressed in PcCB, suggesting a role for this gene family in virulence differences between strains. We found that core genes involved in red blood cell invasion have been under positive selection and that the more virulent strain has a greater preference for reticulocytes, which has elsewhere been associated with increased virulence. <b>Conclusions:</b> These results provide the basis for a mechanistic understanding of the phenotypic differences between <i>Plasmodium chabaudi</i> strains, which might ultimately be translated into a better understanding of malaria parasites affecting humans.

    Wellcome open research 2018;3;142

  • BraCeR: B-cell-receptor reconstruction and clonality inference from single-cell RNA-seq.

    Lindeman I, Emerton G, Mamanova L, Snir O, Polanski K, Qiao SW, Sollid LM, Teichmann SA and Stubbington MJT

    Centre for Immune Regulation, University of Oslo and Oslo University Hospital, Oslo, Norway.

    Nature methods 2018;15;8;563-565

  • Investigating the Campylobacter jejuni Transcriptional Response to Host Intestinal Extracts Reveals the Involvement of a Widely Conserved Iron Uptake System.

    Liu MM, Boinett CJ, Chan ACK, Parkhill J, Murphy MEP and Gaynor EC

    Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC, Canada.

    <i>Campylobacter jejuni</i> is a pathogenic bacterium that causes gastroenteritis in humans yet is a widespread commensal in wild and domestic animals, particularly poultry. Using RNA sequencing, we assessed <i>C. jejuni</i> transcriptional responses to medium supplemented with human fecal versus chicken cecal extracts and in extract-supplemented medium versus medium alone. <i>C. jejuni</i> exposed to extracts had altered expression of 40 genes related to iron uptake, metabolism, chemotaxis, energy production, and osmotic stress response. In human fecal versus chicken cecal extracts, <i>C. jejuni</i> displayed higher expression of genes involved in respiration (<i>fdhTU</i>) and in known or putative iron uptake systems (<i>cfbpA</i>, <i>ceuB</i>, <i>chuC</i>, and <i>CJJ81176_1649-1655</i> [here designated <i>1649-1655</i>]). The <i>1649-1655</i> genes and downstream overlapping gene <i>1656</i> were investigated further. Uncharacterized homologues of this system were identified in 33 diverse bacterial species representing 6 different phyla, 21 of which are associated with human disease. The <i>1649</i> and <i>1650</i> (<i>p19</i>) genes encode an iron transporter and a periplasmic iron binding protein, respectively; however, the role of the downstream <i>1651-1656</i> genes was unknown. A Δ<i>1651</i>-<i>1656</i> deletion strain had an iron-sensitive phenotype, consistent with a previously characterized Δ<i>p19</i> mutant, and showed reduced growth in acidic medium, increased sensitivity to streptomycin, and higher resistance to H<sub>2</sub>O<sub>2</sub> stress. In iron-restricted medium, the <i>1651-1656</i> and <i>p19</i> genes were required for optimal growth when using human fecal extracts as an iron source. Collectively, this implicates a function for the <i>1649-1656</i> gene cluster in <i>C. jejuni</i> iron scavenging and stress survival in the human intestinal environment. <b>IMPORTANCE</b> Direct comparative studies of <i>C. jejuni</i> infection of a zoonotic commensal host and a disease-susceptible host are crucial to understanding the causes of infection outcome in humans. These studies are hampered by the lack of a disease-susceptible animal model reliably displaying a similar pathology to human campylobacteriosis. In this work, we compared the phenotypic and transcriptional responses of <i>C. jejuni</i> to intestinal compositions of humans (disease-susceptible host) and chickens (zoonotic host) by using human fecal and chicken cecal extracts. The mammalian gut is a complex and dynamic system containing thousands of metabolites that contribute to host health and modulate pathogen activity. We identified <i>C. jejuni</i> genes more highly expressed during exposure to human fecal extracts in comparison to chicken cecal extracts and differentially expressed in extracts compared with medium alone, and targeted one specific iron uptake system for further molecular, genetic, and phenotypic study.

    mBio 2018;9;4

  • Analysis of novel missense ATR mutations reveals new splicing defects underlying Seckel syndrome.

    Llorens-Agost M, Luessing J, van Beneden A, Eykelenboom J, O'Reilly D, Bicknell LS, Reynolds JJ, van Koegelenberg M, Hurles ME, Brady AF, Jackson AP, Stewart GS and Lowndes NF

    Centre for Chromosome Biology, National University of Ireland in Galway, Galway, Ireland.

    Ataxia Telangiectasia and Rad3 related (ATR) is one of the main regulators of the DNA damage response. It coordinates cell cycle checkpoint activation, replication fork stability, restart and origin firing to maintain genome integrity. Mutations of the ATR gene have been reported in Seckel patients, who suffer from a rare genetic disease characterized by severe microcephaly and growth retardation. Here, we report the case of a Seckel patient with compound heterozygous mutations in ATR. One allele has an intronic mutation affecting splicing of neighboring exons, the other an exonic missense mutation, producing the variant p.Lys1665Asn, of unknown pathogenicity. We have modeled this novel missense mutation, as well as a previously described missense mutation p.Met1159Ile, and assessed their effect on ATR function. Interestingly, our data indicate that both missense mutations have no direct effect on protein function, but rather result in defective ATR splicing. These results emphasize the importance of splicing mutations in Seckel Syndrome.

    Funded by: CRUK: C17183/A23303; European Union FP6 Integrated Project DNA repair: 512113; Health Research Board: PR001/2001; Higher Education Authority of Ireland: RH5402; Science Foundation Ireland: 07/IN1/B958, 13/IA/1954; Worldwide Cancer Research: RIN1007; the Beckman Fund scholarship: RNR921

    Human mutation 2018;39;12;1847-1853

  • Global Distribution of Invasive Serotype 35D Streptococcus pneumoniae Isolates following Introduction of 13-Valent Pneumococcal Conjugate Vaccine.

    Lo SW, Gladstone RA, van Tonder AJ, Hawkins PA, Kwambana-Adams B, Cornick JE, Madhi SA, Nzenze SA, du Plessis M, Kandasamy R, Carter PE, Eser ÖK, Ho PL, Elmdaghri N, Shakoor S, Clarke SC, Antonio M, Everett DB, von Gottberg A, Klugman KP, McGee L, Breiman RF and Bentley SD

    Infection Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.

    A newly recognized pneumococcal serotype, 35D, which differs from the 35B polysaccharide in structure and serology by not binding to factor serum 35a, was recently reported. The genetic basis for this distinctive serology is due to the presence of an inactivating mutation in <i>wciG</i>, which encodes an O-acetyltransferase responsible for O-acetylation of a galactofuranose. Here, we assessed the genomic data of a worldwide pneumococcal collection to identify serotype 35D isolates and understand their geographical distribution, genetic background, and invasiveness potential. Of 21,980 pneumococcal isolates, 444 were originally typed as serotype 35B by PneumoCaT. Analysis of the <i>wciG</i> gene revealed 23 isolates from carriage (<i>n</i> = 4) and disease (<i>n</i> = 19) with partial or complete loss-of-function mutations, including mutations resulting in premature stop codons (<i>n</i> = 22) and an in-frame mutation (<i>n</i> = 1). These were selected for further analysis. The putative 35D isolates were geographically widespread, and 65.2% (15/23) of them were recovered after the introduction of pneumococcal conjugate vaccine 13 (PCV13). Compared with serotype 35B isolates, putative serotype 35D isolates have higher invasive disease potentials based on odds ratios (OR) (11.58; 95% confidence interval [CI], 1.42 to 94.19 versus 0.61; 95% CI, 0.40 to 0.92) and a higher prevalence of macrolide resistance mediated by <i>mefA</i> (26.1% versus 7.6%; <i>P</i> = 0.009). Using the Quellung reaction, 50% (10/20) of viable isolates were identified as serotype 35D, 25% (5/20) as serotype 35B, and 25% (5/20) as a mixture of 35B/35D. The discrepancy between phenotype and genotype requires further investigation. These findings illustrated a global distribution of an invasive serotype, 35D, among young children post-PCV13 introduction and underlined the invasive potential conferred by the loss of O-acetylation in the pneumococcal capsule.

    Journal of clinical microbiology 2018;56;7

  • Breaking the code of antibiotic resistance.

    Lo SW, Kumar N and Wheeler NE

    Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.

    Nature reviews. Microbiology 2018;16;5;262

  • DNA Polymerase Epsilon Deficiency Causes IMAGe Syndrome with Variable Immunodeficiency.

    Logan CV, Murray JE, Parry DA, Robertson A, Bellelli R, Tarnauskaitė Ž, Challis R, Cleal L, Borel V, Fluteau A, Santoyo-Lopez J, SGP Consortium, Aitman T, Barroso I, Basel D, Bicknell LS, Goel H, Hu H, Huff C, Hutchison M, Joyce C, Knox R, Lacroix AE, Langlois S, McCandless S, McCarrier J, Metcalfe KA, Morrissey R, Murphy N, Netchine I, O'Connell SM, Olney AH, Paria N, Rosenfeld JA, Sherlock M, Syverson E, White PC, Wise C, Yu Y, Zacharin M, Banerjee I, Reijns M, Bober MB, Semple RK, Boulton SJ, Rios JJ and Jackson AP

    MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK.

    During genome replication, polymerase epsilon (Pol ε) acts as the major leading-strand DNA polymerase. Here we report the identification of biallelic mutations in POLE, encoding the Pol ε catalytic subunit POLE1, in 15 individuals from 12 families. Phenotypically, these individuals had clinical features closely resembling IMAGe syndrome (intrauterine growth restriction [IUGR], metaphyseal dysplasia, adrenal hypoplasia congenita, and genitourinary anomalies in males), a disorder previously associated with gain-of-function mutations in CDKN1C. POLE1-deficient individuals also exhibited distinctive facial features and variable immune dysfunction with evidence of lymphocyte deficiency. All subjects shared the same intronic variant (c.1686+32C>G) as part of a common haplotype, in combination with different loss-of-function variants in trans. The intronic variant alters splicing, and together the biallelic mutations lead to cellular deficiency of Pol ε and delayed S-phase progression. In summary, we establish POLE as a second gene in which mutations cause IMAGe syndrome. These findings add to a growing list of disorders due to mutations in DNA replication genes that manifest growth restriction alongside adrenal dysfunction and/or immunodeficiency, consolidating these as replisome phenotypes and highlighting a need for future studies to understand the tissue-specific development roles of the encoded proteins.

    Funded by: Chief Scientist Office: SGP/1; Medical Research Council: MC_PC_15080, MC_PC_U127580972, MR/N005902/1; NCATS NIH HHS: UL1 TR001105; Wellcome Trust: 210752/Z/18/Z

    American journal of human genetics 2018;103;6;1038-1044

  • Complement Susceptibility in Relation to Genome Sequence of Recent Klebsiella pneumoniae Isolates from Thai Hospitals.

    Loraine J, Heinz E, De Sousa Almeida J, Milevskyy O, Voravuthikunchai SP, Srimanote P, Kiratisin P, Thomson NR and Taylor PW

    School of Pharmacy, University College London, London, United Kingdom.

    The capacity to resist the bactericidal action of complement (C') is a strong but poorly understood virulence trait in <i>Klebsiella</i> spp. Killing requires activation of one or more C' pathways, assembly of C5b-9 membrane attack complexes (MACs) on the surface of the outer membrane (OM), and penetration of MACs into the target bilayer. We interrogated whole-genome sequences of 164 <i>Klebsiella</i> isolates from three tertiary hospitals in Thailand for genes encoding surface-located macromolecules considered to play a role in determination of C' resistance. Most isolates (154/164) were identified as <i>Klebsiella pneumoniae</i>, and the collection conformed to previously established population structures and antibiotic resistance patterns. The distributions of sequence types (STs) and capsular (K) types were also typical of global populations. The majority (64%) of isolates were resistant to C', and the remainder were either rapidly or slowly killed. All isolates carried genes encoding capsular polysaccharides (K antigens), which have been strongly linked to C' resistance. In contrast to previous reports, there were no differences in the amount of capsule produced by C'-resistant isolates compared to C'-susceptible isolates, nor was there any correlation between serum reactivity and the presence of hypermucoviscous capsules. Similarly, there were no correlations between the presence of genes specifying lipopolysaccharide O-side chains or major OM proteins. Some virulence factors were found more frequently in C'-resistant isolates but were considered to reflect clonal ST expansion. Thus, no single gene accounts for the C' resistance of the isolates sequenced in this study. <b>IMPORTANCE</b> Multidrug-resistant <i>Klebsiella pneumoniae</i> is responsible for an increasing proportion of nosocomial infections, and emerging hypervirulent <i>K. pneumoniae</i> clones now cause severe community-acquired infections in otherwise healthy individuals. These bacteria are adept at circumventing immune defenses, and most survive and grow in serum; their capacity to avoid C'-mediated destruction is correlated with their invasive potential. Killing of Gram-negative bacteria occurs following activation of the C' cascades and stable deposition of C5b-9 MACs onto the OM. For <i>Klebsiella</i>, studies with mutants and conjugants have invoked capsules, lipopolysaccharide O-side chains, and OM proteins as determinants of C' resistance, although the precise roles of the macromolecules are unclear. In this study, we sequenced 164 <i>Klebsiella</i> isolates with different C' susceptibilities to identify genes involved in resistance. We conclude that no single OM constituent can account for resistance, which is likely to depend on biophysical properties of the target bilayer.

    Funded by: Medical Research Council: MR/N012542/1

    mSphere 2018;3;6

  • A novel PIGA variant associated with severe X-linked epilepsy and profound developmental delay.

    Low KJ, James M, Sharples PM, Eaton M, Jenkinson S, Study DDD and Smithson SF

    Department of Clinical Genetics, St. Michaels Hospital, Bristol, UK; School of Clinical Sciences, University of Bristol, Bristol, UK.

    Seizure 2018;56;1-3

  • Phenotype of CNTNAP1: a study of patients demonstrating a specific severe congenital hypomyelinating neuropathy with survival beyond infancy.

    Low KJ, Stals K, Caswell R, Wakeling M, Clayton-Smith J, Donaldson A, Foulds N, Norman A, Splitt M, Urankar K, Vijayakumar K, Majumdar A, Study D, Ellard S and Smithson SF

    Department of Clinical Genetics, St Michaels Hospital, Bristol, UK.

    Congenital hypomyelinating neuropathy (CHN) is genetically heterogeneous and its genetic basis is difficult to determine on features alone. CNTNAP1 encodes CASPR, integral in the paranodal junction high molecular mass complex. Nineteen individuals with biallelic variants have been described in association with severe congenital hypomyelinating neuropathy, respiratory compromise, profound intellectual disability and death within the first year. We report 7 additional patients ascertained through exome sequencing. We identified 9 novel CNTNAP1 variants in 6 families: three missense variants, four nonsense variants, one frameshift variant and one splice site variant. Significant polyhydramnios occurred in 6/7 pregnancies. Severe respiratory compromise was seen in 6/7 (tracheostomy in 5). A complex neurological phenotype was seen in all patients who had marked brain hypomyelination/demyelination and profound developmental delay. Additional neurological findings included cranial nerve compromise: orobulbar dysfunction in 5/7, facial nerve weakness in 4/7 and vocal cord paresis in 5/7. Dystonia occurred in 2/7 patients and limb contractures in 5/7. All had severe gastroesophageal reflux, and a gastrostomy was required in 5/7. In contrast to most previous reports, only one patient died in the first year of life. Protein modelling was performed for all detected CNTNAP1 variants. We propose a genotype-phenotype correlation, whereby hypomorphic missense variants partially ameliorate the phenotype, prolonging survival. This study suggests that biallelic variants in CNTNAP1 cause a distinct recognisable syndrome, which is not caused by other genes associated with CHN. Neonates presenting with this phenotype will benefit from early genetic definition to inform clinical management and enable essential genetic counselling for their families.

    Funded by: Department of Health; Wellcome Trust: 098051

    European journal of human genetics : EJHG 2018;26;6;796-807

  • Evolutionary history of human <i>Plasmodium vivax</i> revealed by genome-wide analyses of related ape parasites.

    Loy DE, Plenderleith LJ, Sundararaman SA, Liu W, Gruszczyk J, Chen YJ, Trimboli S, Learn GH, MacLean OA, Morgan ALK, Li Y, Avitto AN, Giles J, Calvignac-Spencer S, Sachse A, Leendertz FH, Speede S, Ayouba A, Peeters M, Rayner JC, Tham WH, Sharp PM and Hahn BH

    Department of Medicine, University of Pennsylvania, Philadelphia, PA 19104.

    Wild-living African apes are endemically infected with parasites that are closely related to human <i>Plasmodium vivax</i>, a leading cause of malaria outside Africa. This finding suggests that the origin of <i>P. vivax</i> was in Africa, even though the parasite is now rare in humans there. To elucidate the emergence of human <i>P. vivax</i> and its relationship to the ape parasites, we analyzed genome sequence data of <i>P. vivax</i> strains infecting six chimpanzees and one gorilla from Cameroon, Gabon, and Côte d'Ivoire. We found that ape and human parasites share nearly identical core genomes, differing by only 2% of coding sequences. However, compared with the ape parasites, human strains of <i>P. vivax</i> exhibit about 10-fold less diversity and have a relative excess of nonsynonymous nucleotide polymorphisms, with site-frequency spectra suggesting they are subject to greatly relaxed purifying selection. These data suggest that human <i>P. vivax</i> has undergone an extreme bottleneck, followed by rapid population expansion. Investigating potential host-specificity determinants, we found that ape <i>P. vivax</i> parasites encode intact orthologs of three reticulocyte-binding protein genes (<i>rbp2d</i>, <i>rbp2e</i>, and <i>rbp3</i>), which are pseudogenes in all human <i>P. vivax</i> strains. However, binding studies of recombinant RBP2e and RBP3 proteins to human, chimpanzee, and gorilla erythrocytes revealed no evidence of host-specific barriers to red blood cell invasion. These data suggest that, from an ancient stock of <i>P. vivax</i> parasites capable of infecting both humans and apes, a severely bottlenecked lineage emerged out of Africa and underwent rapid population growth as it spread globally.

    Proceedings of the National Academy of Sciences of the United States of America 2018

  • Meta-analysis of Immunochip data of four autoimmune diseases reveals novel single-disease and cross-phenotype associations.

    Márquez A, Kerick M, Zhernakova A, Gutierrez-Achury J, Chen WM, Onengut-Gumuscu S, González-Álvaro I, Rodriguez-Rodriguez L, Rios-Fernández R, González-Gay MA, Coeliac Disease Immunochip Consortium, Rheumatoid Arthritis Consortium International for Immunochip (RACI), International Scleroderma Group, Type 1 Diabetes Genetics Consortium, Mayes MD, Raychaudhuri S, Rich SS, Wijmenga C and Martín J

    Instituto de Parasitología y Biomedicina "López-Neyra", CSIC, PTS Granada, Granada, Spain.

    Background: In recent years, research has consistently proven the occurrence of genetic overlap across autoimmune diseases, which supports the existence of common pathogenic mechanisms in autoimmunity. The objective of this study was to further investigate this shared genetic component.

    Methods: For this purpose, we performed a cross-disease meta-analysis of Immunochip data from 37,159 patients diagnosed with a seropositive autoimmune disease (11,489 celiac disease (CeD), 15,523 rheumatoid arthritis (RA), 3477 systemic sclerosis (SSc), and 6670 type 1 diabetes (T1D)) and 22,308 healthy controls of European origin using the R package ASSET.

    Results: We identified 38 risk variants shared by at least two of the conditions analyzed, five of which represent new pleiotropic loci in autoimmunity. We also identified six novel genome-wide associations for the diseases studied. Cell-specific functional annotations and biological pathway enrichment analyses suggested that pleiotropic variants may act by deregulating gene expression in different subsets of T cells, especially Th17 and regulatory T cells. Finally, drug repositioning analysis identified several drugs that could represent promising candidates for CeD, RA, SSc, and T1D treatment.

    Conclusions: This study advances knowledge of the genetic overlap in autoimmunity, shedding light on common molecular mechanisms of disease and suggesting novel drug targets that could be explored for the treatment of the autoimmune diseases studied.

    Funded by: European Research Council: 322698, FP7/2007-2013/; NIDDK NIH HHS: U01 DK062418

    Genome medicine 2018;10;1;97

  • Disease-associated genotypes of the commensal skin bacterium Staphylococcus epidermidis.

    Méric G, Mageiros L, Pensar J, Laabei M, Yahara K, Pascoe B, Kittiwan N, Tadee P, Post V, Lamble S, Bowden R, Bray JE, Morgenstern M, Jolley KA, Maiden MCJ, Feil EJ, Didelot X, Miragaia M, de Lencastre H, Moriarty TF, Rohde H, Massey R, Mack D, Corander J and Sheppard SK

    The Milner Centre for Evolution, University of Bath, Claverton Down, Bath, BA2 7AY, UK.

    Some of the most common infectious diseases are caused by bacteria that naturally colonise humans asymptomatically. Combating these opportunistic pathogens requires an understanding of the traits that differentiate infecting strains from harmless relatives. Staphylococcus epidermidis is carried asymptomatically on the skin and mucous membranes of virtually all humans but is a major cause of nosocomial infection associated with invasive procedures. Here we address the underlying evolutionary mechanisms of opportunistic pathogenicity by combining pangenome-wide association studies and laboratory microbiology to compare S. epidermidis from bloodstream and wound infections and asymptomatic carriage. We identify 61 genes containing infection-associated genetic elements (k-mers) that correlate with in vitro variation in known pathogenicity traits (biofilm formation, cell toxicity, interleukin-8 production, methicillin resistance). Horizontal gene transfer spreads these elements, allowing divergent clones to cause infection. Finally, Random Forest model prediction of disease status (carriage vs. infection) identifies pathogenicity elements in 415 S. epidermidis isolates with 80% accuracy, demonstrating the potential for identifying risk genotypes pre-operatively.

    Nature communications 2018;9;1;5034

  • Convergent amino acid signatures in polyphyletic Campylobacter jejuni sub-populations suggest human niche tropism.

    Méric G, McNally A, Pessia A, Mourkas E, Pascoe B, Mageiros L, Vehkala M, Corander J and Sheppard SK

    The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, United Kingdom.

    Human infection with the gastrointestinal pathogen C. jejuni is dependent upon the opportunity for zoonotic transmission and the ability of strains to colonize the human host. Certain lineages of this diverse organism are more common in human infection but the factors underlying this overrepresentation are not fully understood. We analysed 601 isolate genomes from agricultural animals and human clinical cases, including isolates from the multi-host (ecological generalist) ST-21 and ST-45 clonal complexes (CCs). Combined nucleotide and amino acid sequence analysis identified 12 human-only amino acid KPAX clusters among polyphyletic lineages within the common disease causing CC21 group isolates, with no such clusters among CC45 isolates. Isolate sequence types within human-only CC21 group KPAX clusters have been sampled from other hosts, including poultry, so rather than representing unsampled reservoir hosts, the increase in relative frequency in human infection potentially reflects a genetic bottleneck at the point of human infection. Consistent with this, sequence enrichment analysis identified nucleotide variation in genes with putative functions related to human colonisation and pathogenesis, in human-only clusters. Furthermore, the tight clustering and polyphyly of human-only lineage clusters within a single clonal complex suggest the repeated evolution of human association through acquisition of genetic elements within this complex. Taken together, combined nucleotide and amino acid analysis of large isolate collections may provide clues about human niche tropism and the nature of the forces that promote the emergence of clinically important C. jejuni lineages.

    Genome biology and evolution 2018

  • Ancient hepatitis B viruses from the Bronze Age to the Medieval period.

    Mühlemann B, Jones TC, Damgaard PB, Allentoft ME, Shevnina I, Logvin A, Usmanova E, Panyushkina IP, Boldgiv B, Bazartseren T, Tashbaeva K, Merz V, Lau N, Smrčka V, Voyakin D, Kitov E, Epimakhov A, Pokutta D, Vicze M, Price TD, Moiseyev V, Hansen AJ, Orlando L, Rasmussen S, Sikora M, Vinner L, Osterhaus ADME, Smith DJ, Glebe D, Fouchier RAM, Drosten C, Sjögren KG, Kristiansen K and Willerslev E

    Center for Pathogen Evolution, Department of Zoology, University of Cambridge, Cambridge, UK.

    Hepatitis B virus (HBV) is a major cause of human hepatitis. There is considerable uncertainty about the timescale of its evolution and its association with humans. Here we present 12 full or partial ancient HBV genomes that are between approximately 0.8 and 4.5 thousand years old. The ancient sequences group either within or in a sister relationship with extant human or other ape HBV clades. Generally, the genome properties follow those of modern HBV. The root of the HBV tree is projected to between 8.6 and 20.9 thousand years ago, and we estimate a substitution rate of 8.04 × 10<sup>-6</sup> to 1.51 × 10<sup>-5</sup> nucleotide substitutions per site per year. In several cases, the geographical locations of the ancient genotypes do not match present-day distributions. Genotypes that today are typical of Africa and Asia, and a subgenotype from India, are shown to have an early Eurasian presence. The geographical and temporal patterns that we observe in ancient and modern HBV genotypes are compatible with well-documented human migrations during the Bronze and Iron Ages<sup>1,2</sup>. We provide evidence for the creation of HBV genotype A via recombination, and for a long-term association of modern HBV genotypes with humans, including the discovery of a human genotype that is now extinct. These data expose a complexity of HBV evolution that is not evident when considering modern sequences alone.

    Nature 2018;557;7705;418-423

  • Ancient human parvovirus B19 in Eurasia reveals its long-term association with humans.

    Mühlemann B, Margaryan A, Damgaard PB, Allentoft ME, Vinner L, Hansen AJ, Weber A, Bazaliiskii VI, Molak M, Arneborg J, Bogdanowicz W, Falys C, Sablin M, Smrčka V, Sten S, Tashbaeva K, Lynnerup N, Sikora M, Smith DJ, Fouchier RAM, Drosten C, Sjögren KG, Kristiansen K, Willerslev E and Jones TC

    Center for Pathogen Evolution, Department of Zoology, University of Cambridge, CB2 3EJ Cambridge, United Kingdom.

    Human parvovirus B19 (B19V) is a ubiquitous human pathogen associated with a number of conditions, such as fifth disease in children and arthritis and arthralgias in adults. B19V is thought to evolve exceptionally rapidly among DNA viruses, with substitution rates previously estimated to be closer to those typical of RNA viruses. On the basis of genetic sequences up to ∼70 years of age, the most recent common ancestor of all B19V has been dated to the early 1800s, and it has been suggested that genotype 1, the most common B19V genotype, only started circulating in the 1960s. Here we present 10 genomes (63.9-99.7% genome coverage) of B19V from dental and skeletal remains of individuals who lived in Eurasia and Greenland from ∼0.5 to ∼6.9 thousand years ago (kya). In a phylogenetic analysis, five of the ancient B19V sequences fall within or basal to the modern genotype 1, and five fall basal to genotype 2, showing a long-term association of B19V with humans. The most recent common ancestor of all B19V is placed ∼12.6 kya, and we find a substitution rate that is an order of magnitude lower than inferred previously. Further, we are able to date the recombination event between genotypes 1 and 3 that formed genotype 2 to ∼5.0-6.8 kya. This study emphasizes the importance of ancient viral sequences for our understanding of virus evolution and phylogenetics.

    Proceedings of the National Academy of Sciences of the United States of America 2018

  • Genome organization and DNA accessibility control antigenic variation in trypanosomes.

    Müller LSM, Cosentino RO, Förstner KU, Guizetti J, Wedel C, Kaplan N, Janzen CJ, Arampatzi P, Vogel J, Steinbiss S, Otto TD, Saliba AE, Sebra RP and Siegel TN

    Department of Veterinary Sciences, Experimental Parasitology, Ludwig-Maximilians-Universität München, Munich, Germany.

    Many evolutionarily distant pathogenic organisms have evolved similar survival strategies to evade the immune responses of their hosts. These include antigenic variation, through which an infecting organism prevents clearance by periodically altering the identity of proteins that are visible to the immune system of the host<sup>1</sup>. Antigenic variation requires large reservoirs of immunologically diverse antigen genes, which are often generated through homologous recombination, as well as mechanisms to ensure the expression of one or very few antigens at any given time. Both homologous recombination and gene expression are affected by three-dimensional genome architecture and local DNA accessibility<sup>2,3</sup>. Factors that link three-dimensional genome architecture, local chromatin conformation and antigenic variation have, to our knowledge, not yet been identified in any organism. One of the major obstacles to studying the role of genome architecture in antigenic variation has been the highly repetitive nature and heterozygosity of antigen-gene arrays, which has precluded complete genome assembly in many pathogens. Here we report the de novo haplotype-specific assembly and scaffolding of the long antigen-gene arrays of the model protozoan parasite Trypanosoma brucei, using long-read sequencing technology and conserved features of chromosome folding<sup>4</sup>. Genome-wide chromosome conformation capture (Hi-C) reveals a distinct partitioning of the genome, with antigen-encoding subtelomeric regions that are folded into distinct, highly compact compartments. 
    In addition, we performed a range of analyses (Hi-C, fluorescence in situ hybridization, assays for transposase-accessible chromatin using sequencing and single-cell RNA sequencing) that showed that deletion of the histone variants H3.V and H4.V increases antigen-gene clustering, DNA accessibility across sites of antigen expression and switching of the expressed antigen isoform, via homologous recombination. Our analyses identify histone variants as a molecular link between global genome architecture, local chromatin conformation and antigenic variation.

    Funded by: Wellcome Trust: 098051

    Nature 2018;563;7729;121-125

  • The murine hepatic sequelae of long-term ethanol consumption are sex-specific and exacerbated by Aldh1b1 loss.

    Müller MF, Kendall TJ, Adams DJ, Zhou Y and Arends MJ

    University of Edinburgh, Division of Pathology, Centre for Comparative Pathology, Cancer Research UK Edinburgh Centre, Institute of Genetics & Molecular Medicine, Western General Hospital, Crewe Road South, Edinburgh EH4 2XR, UK.

    Disease progression in alcoholic and non-alcoholic fatty liver disease shows sex-specific differences and is influenced by mechanisms linked to oxidative stress. Acetaldehyde plays a critical pathogenic role but its effects are mitigated by the activity of aldehyde dehydrogenases. Aldehyde dehydrogenase 1b1 (Aldh1b1) is the aldehyde dehydrogenase isoform with the second highest affinity for acetaldehyde after Aldh2, and is highly expressed in the intestine and liver. We examined sex differences and the effect of Aldh1b1 depletion in a murine model of chronic alcohol-induced liver disease. Male and female wild-type and Aldh1b1-depleted mice received either ethanol (10-20% v/v) in drinking water or water alone for one year, and livers were examined histopathologically, histochemically and by immunohistochemistry. A significant increase in hepatic steatosis was observed in female mice after one year of ethanol consumption, and expression of ethanol-metabolising enzymes and up-regulation by ethanol was also sex-dependent. Ethanol-induced hyperproliferation of hepatocytes was observed in female and male wild-type mice, and Aldh1b1 depletion enhanced this effect in males. Further, one ethanol-treated, Aldh1b1-depleted male developed a steatohepatitic hepatocellular carcinoma. These sex-specific differences in susceptibility to hepatic steatosis and disease progression may be related to differences in expression of ethanol-metabolising enzymes, informing the clinically significant differences. Aldh1b1 plays a role in protection from ethanol-induced hepatocellular hyperproliferation and may protect from tumour development.

    Funded by: Wellcome Trust: 095898/Z/11/Z

    Experimental and molecular pathology 2018;105;1;63-70

  • Staphylococcus caeli sp. nov., isolated from air sampling in an industrial rabbit holding.

    MacFadyen AC, Drigo I, Harrison EM, Parkhill J, Holmes MA and Paterson GK

    Royal (Dick) School of Veterinary Studies and The Roslin Institute, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK.

    Strain 82B<sup>T</sup>, a Gram-stain-positive, coagulase-negative staphylococcus, was isolated from an air sample obtained from an industrial rabbit holding in Italy. It is phylogenetically closely related to the coagulase-negative species Staphylococcus saprophyticus, Staphylococcus xylosus and Staphylococcus edaphicus. However, it could be distinguished from these species by sequence differences between the 16S rRNA, hsp60, rpoB, dnaJ and gap genes. At the whole genome level, the isolate had an average nucleotide identity of <95% and an inferred DNA-DNA hybridization of <70% when compared to these species. Based on the genotypic results, it is proposed that this isolate is a novel species, with the name Staphylococcus caeli sp. nov. The type strain is 82B<sup>T</sup> (=NCTC 14063<sup>T</sup>=CCUG 71912<sup>T</sup>).

    International journal of systematic and evolutionary microbiology 2018

  • A mecC allotype, mecC3, in the CoNS Staphylococcus caeli, encoded within a variant SCCmecC.

    MacFadyen AC, Harrison EM, Drigo I, Parkhill J, Holmes MA and Paterson GK

    Royal (Dick) School of Veterinary Studies and The Roslin Institute, University of Edinburgh, Easter Bush Campus, Midlothian EH25 9RG, UK.

    Background: Methicillin resistance in staphylococci is conferred by an alternative PBP (PBP2a/2') with low affinity for most β-lactam antibiotics. PBP2a is encoded by mecA, which is carried on a mobile genetic element known as SCCmec. A variant of mecA, mecC, was described in 2011 and has been found in Staphylococcus aureus from humans and a wide range of animal species as well as a small number of other staphylococcal species from animals.

    Objectives: We characterized a novel mecC allotype, mecC3, encoded by an environmental isolate of Staphylococcus caeli cultured from air sampling of a commercial rabbit holding.

    Methods: The S. caeli isolate 82BT was collected in Italy in 2013 and genome sequenced using MiSeq technology. This allowed the assembly and comparative genomic study of the novel SCCmec region encoding mecC3.

    Results: The study isolate encodes a novel mecA allotype, mecC3, with 92% nucleotide identity to mecC. mecC3 is encoded within a novel SCCmec element distinct from those previously associated with mecC, including a ccrAB pairing (ccrA5B3) not previously linked to mecC.

    Conclusions: This is the first description of the novel mecC allotype mecC3, the first isolation of a mecC-positive Staphylococcus in Italy and the first report of mecC in S. caeli. Furthermore, the SCCmec element described here is highly dissimilar to the archetypal SCCmec XI encoding mecC in S. aureus and to elements encoding mecC in other staphylococci. Our report highlights the diversity of mecC allotypes and the diverse staphylococcal species, ecological settings and genomic context in which mecC may be found.

    The Journal of antimicrobial chemotherapy 2018

  • A highly conserved mecC-encoding SCCmec type XI in a bovine isolate of methicillin-resistant Staphylococcus xylosus.

    MacFadyen AC, Harrison EM, Ellington MJ, Parkhill J, Holmes MA and Paterson GK

    Royal (Dick) School of Veterinary Studies and Roslin Institute, University of Edinburgh, Easter Bush Campus, Midlothian EH25 9RG, UK.

    The Journal of antimicrobial chemotherapy 2018

  • AP-1 Takes Centre Stage in Enhancer Chromatin Dynamics.

    Madrigal P and Alasoo K

    Wellcome Sanger Institute, Hinxton CB10 1SA, UK; Current address: Centre for Trophoblast Research, Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge CB2 3EG, UK.

    Recent studies involving induced pluripotent stem cell (iPSC) reprogramming in mice and monocyte-to-macrophage differentiation in humans have revealed a role for the transcription factor (TF) activator protein 1 (AP-1) in chromatin accessibility. Enhancer selection may be determined not only by cell type-specific sets of TFs but also by broadly expressed ones like AP-1.

    Trends in cell biology 2018

  • Vitamin D and risk of pregnancy related hypertensive disorders: mendelian randomisation study.

    Magnus MC, Miliku K, Bauer A, Engel SM, Felix JF, Jaddoe VWV, Lawlor DA, London SJ, Magnus P, McGinnis R, Nystad W, Page CM, Rivadeneira F, Stene LC, Tapia G, Williams N, Bonilla C and Fraser A

    Medical Research Council Integrative Epidemiology Unit, University of Bristol, Bristol BS8 2BN, UK.

    Objective: To use mendelian randomisation to investigate whether 25-hydroxyvitamin D concentration has a causal effect on gestational hypertension or pre-eclampsia.

    Design: One and two sample mendelian randomisation analyses.

    Setting: Two European pregnancy cohorts (Avon Longitudinal Study of Parents and Children, and Generation R Study), and two case-control studies (subgroup nested within the Norwegian Mother and Child Cohort Study, and the UK Genetics of Pre-eclampsia Study).

    Participants: 7389 women in a one sample mendelian randomisation analysis (751 with gestational hypertension and 135 with pre-eclampsia), and 3388 pre-eclampsia cases and 6059 controls in a two sample mendelian randomisation analysis.

    Exposures: Single nucleotide polymorphisms in genes associated with vitamin D synthesis (rs10741657 and rs12785878) and metabolism (rs6013897 and rs2282679) were used as instrumental variables.

    Main outcome measures: Gestational hypertension and pre-eclampsia defined according to the International Society for the Study of Hypertension in Pregnancy.

    Results: In the conventional multivariable analysis, the relative risk for pre-eclampsia was 1.03 (95% confidence interval 1.00 to 1.07) per 10% decrease in 25-hydroxyvitamin D level, and 2.04 (1.02 to 4.07) for 25-hydroxyvitamin D levels <25 nmol/L compared with ≥75 nmol/L. No association was found for gestational hypertension. The one sample mendelian randomisation analysis using the total genetic risk score as an instrument did not provide strong evidence of a linear effect of 25-hydroxyvitamin D on the risk of gestational hypertension or pre-eclampsia: odds ratio 0.90 (95% confidence interval 0.78 to 1.03) and 1.19 (0.92 to 1.52) per 10% decrease, respectively. The two sample mendelian randomisation estimate gave an odds ratio for pre-eclampsia of 0.98 (0.89 to 1.07) per 10% decrease in 25-hydroxyvitamin D level, an odds ratio of 0.96 (0.80 to 1.15) per unit increase in the log(odds) of 25-hydroxyvitamin D level <75 nmol/L, and an odds ratio of 0.93 (0.73 to 1.19) per unit increase in the log(odds) of 25-hydroxyvitamin D levels <50 nmol/L.

    Conclusions: No strong evidence was found to support a causal effect of vitamin D status on gestational hypertension or pre-eclampsia. Future mendelian randomisation studies with a larger number of women with pre-eclampsia or more genetic instruments that would increase the proportion of 25-hydroxyvitamin D levels explained by the instrument are needed.

    Funded by: European Research Council: 648916

    BMJ (Clinical research ed.) 2018;361;k2167

  • Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps.

    Mahajan A, Taliun D, Thurner M, Robertson NR, Torres JM, Rayner NW, Payne AJ, Steinthorsdottir V, Scott RA, Grarup N, Cook JP, Schmidt EM, Wuttke M, Sarnowski C, Mägi R, Nano J, Gieger C, Trompet S, Lecoeur C, Preuss MH, Prins BP, Guo X, Bielak LF, Below JE, Bowden DW, Chambers JC, Kim YJ, Ng MCY, Petty LE, Sim X, Zhang W, Bennett AJ, Bork-Jensen J, Brummett CM, Canouil M, Eckardt KU, Fischer K, Kardia SLR, Kronenberg F, Läll K, Liu CT, Locke AE, Luan J, Ntalla I, Nylander V, Schönherr S, Schurmann C, Yengo L, Bottinger EP, Brandslund I, Christensen C, Dedoussis G, Florez JC, Ford I, Franco OH, Frayling TM, Giedraitis V, Hackinger S, Hattersley AT, Herder C, Ikram MA, Ingelsson M, Jørgensen ME, Jørgensen T, Kriebel J, Kuusisto J, Ligthart S, Lindgren CM, Linneberg A, Lyssenko V, Mamakou V, Meitinger T, Mohlke KL, Morris AD, Nadkarni G, Pankow JS, Peters A, Sattar N, Stančáková A, Strauch K, Taylor KD, Thorand B, Thorleifsson G, Thorsteinsdottir U, Tuomilehto J, Witte DR, Dupuis J, Peyser PA, Zeggini E, Loos RJF, Froguel P, Ingelsson E, Lind L, Groop L, Laakso M, Collins FS, Jukema JW, Palmer CNA, Grallert H, Metspalu A, Dehghan A, Köttgen A, Abecasis GR, Meigs JB, Rotter JI, Marchini J, Pedersen O, Hansen T, Langenberg C, Wareham NJ, Stefansson K, Gloyn AL, Morris AP, Boehnke M and McCarthy MI

    Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK.

    We expanded GWAS discovery for type 2 diabetes (T2D) by combining data from 898,130 European-descent individuals (9% cases), after imputation to high-density reference panels. With these data, we (i) extend the inventory of T2D-risk variants (243 loci, 135 newly implicated in T2D predisposition, comprising 403 distinct association signals); (ii) enrich discovery of lower-frequency risk alleles (80 index variants with minor allele frequency <5%, 14 with estimated allelic odds ratio >2); (iii) substantially improve fine-mapping of causal variants (at 51 signals, one variant accounted for >80% posterior probability of association (PPA)); (iv) extend fine-mapping through integration of tissue-specific epigenomic information (islet regulatory annotations extend the number of variants with PPA >80% to 73); (v) highlight validated therapeutic targets (18 genes with associations attributable to coding variants); and (vi) demonstrate enhanced potential for clinical translation (genome-wide chip heritability explains 18% of T2D risk; individuals in the extremes of a T2D polygenic risk score differ more than ninefold in prevalence).

    Nature genetics 2018

  • Sequencing of Supernumerary Chromosomes of Red Fox and Raccoon Dog Confirms a Non-Random Gene Acquisition by B Chromosomes.

    Makunin AI, Romanenko SA, Beklemisheva VR, Perelman PL, Druzhkova AS, Petrova KO, Prokopov DY, Chernyaeva EN, Johnson JL, Kukekova AV, Yang F, Ferguson-Smith MA, Graphodatsky AS and Trifonov VA

    Institute of Molecular and Cellular Biology Siberian Branch of the Russian Academy of Sciences, 630090 Novosibirsk, Russia.

    B chromosomes (Bs) represent a variable addition to the main karyotype in some lineages of animals and plants. Bs accumulate through non-Mendelian inheritance and become widespread in populations. Despite the presence of multiple genes, most Bs lack specific phenotypic effects, although their influence on host genome epigenetic status and gene expression has been recorded. Previously, using sequencing of isolated Bs of ruminants and rodents, we demonstrated that Bs originate as segmental duplications of specific genomic regions, and subsequently experience pseudogenization and repeat accumulation. Here, we used a similar approach to characterize Bs of the red fox (<i>Vulpes vulpes</i> L.) and the Chinese raccoon dog (<i>Nyctereutes procyonoides procyonoides</i> Gray). We confirm the previous findings of the <i>KIT</i> gene on Bs of both species, but demonstrate an independent origin of Bs in these species, with two reused regions. Comparison of gene ensembles in Bs of canids, ruminants, and rodents once again indicates enrichment with cell-cycle genes, development-related genes, and genes functioning in the neuron synapse. The presence of B-chromosomal copies of genes involved in cell-cycle regulation and tissue differentiation may indicate the importance of these genes for B chromosome establishment.

    Genes 2018;9;8

  • Whole-genome sequences of Malawi cichlids reveal multiple radiations interconnected by gene flow.

    Malinsky M, Svardal H, Tyers AM, Miska EA, Genner MJ, Turner GF and Durbin R

    Wellcome Sanger Institute, Cambridge, UK.

    The hundreds of cichlid fish species in Lake Malawi constitute the most extensive recent vertebrate adaptive radiation. Here we characterize its genomic diversity by sequencing 134 individuals covering 73 species across all major lineages. The average sequence divergence between species pairs is only 0.1-0.25%. These divergence values overlap diversity within species, with 82% of heterozygosity shared between species. Phylogenetic analyses suggest that diversification initially proceeded by serial branching from a generalist Astatotilapia-like ancestor. However, no single species tree adequately represents all species relationships, with evidence for substantial gene flow at multiple times. Common signatures of selection on visual and oxygen transport genes shared by distantly related deep-water species point to both adaptive introgression and independent selection. These findings enhance our understanding of genomic processes underlying rapid species diversification, and provide a platform for future genetic analysis of the Malawi radiation.

    Funded by: Wellcome Trust

    Nature ecology & evolution 2018;2;12;1940-1955

  • RADpainter and fineRADstructure: Population Inference from RADseq Data.

    Malinsky M, Trucchi E, Lawson DJ and Falush D

    Zoological Institute, Department of Environmental Sciences, University of Basel, Basel, Switzerland.

    Powerful approaches to inferring recent or current population structure based on nearest neighbor haplotype "coancestry" have so far been inaccessible to users without high quality genome-wide haplotype data. With a boom in nonmodel organism genomics, there is a pressing need to bring these methods to communities without access to such data. Here, we present RADpainter, a new program designed to infer the coancestry matrix from restriction-site-associated DNA sequencing (RADseq) data. We combine this program with a previously published MCMC clustering algorithm into fineRADstructure, a complete, easy to use, and fast population inference package for RADseq data. Finally, with two example data sets, we illustrate its use, benefits, and robustness to missing RAD alleles in double digest RAD sequencing.

    Funded by: Medical Research Council: MR/L015080/1, MR/M501608/1; Wellcome Trust: 097677/Z/11/Z, WT104125MA

    Molecular biology and evolution 2018;35;5;1284-1290

  • Combination therapy as a potential risk factor for the development of type 2 diabetes in patients with schizophrenia: the GOMAP study.

    Mamakou V, Hackinger S, Zengini E, Tsompanaki E, Marouli E, Serafetinidis I, Prins B, Karabela A, Glezou E, Southam L, Rayner NW, Kuchenbaecker K, Lamnissou K, Kontaxakis V, Dedoussis G, Gonidakis F, Thanopoulou A, Tentolouris N and Zeggini E

    Medical School, National and Kapodistrian University Athens, 75 M. Assias Street, 115 27, Athens, Greece.

    Background: Schizophrenia (SCZ) is associated with increased risk of type 2 diabetes (T2D). The potential diabetogenic effect of concomitant application of psychotropic treatment classes in patients with SCZ has not yet been evaluated. The overarching goal of the Genetic Overlap between Metabolic and Psychiatric disease (GOMAP) study is to assess the effect of pharmacological, anthropometric, lifestyle and clinical measurements, helping elucidate the mechanisms underlying the aetiology of T2D.

    Methods: The GOMAP case-control study (Genetic Overlap between Metabolic and Psychiatric disease) includes hospitalized patients with SCZ, some of whom have T2D. We enrolled 1653 patients with SCZ; 611 with T2D and 1042 patients without T2D. This is the first study of SCZ and T2D comorbidity at this scale in the Greek population. We retrieved detailed information on first- and second-generation antipsychotics (FGA, SGA), antidepressants and mood stabilizers, applied as monotherapy, 2-drug combination, or as 3- or more drug combination. We assessed the effects of psychotropic medication, body mass index, duration of schizophrenia, number of hospitalizations and physical activity on risk of T2D. Using logistic regression, we calculated crude and adjusted odds ratios (OR) to identify associations between demographic factors and the psychiatric medications.

    Results: Patients with SCZ on a combination of at least three different classes of psychiatric drugs had a higher risk of T2D [OR 1.81 (95% CI 1.22-2.69); p = 0.003] compared to FGA alone therapy, after adjustment for age, BMI, sex, duration of SCZ and number of hospitalizations. We did not find evidence for an association of SGA use or the combination of drugs belonging to two different classes of psychiatric medications with increased risk of T2D [1.27 (0.84-1.93), p = 0.259 and 0.98 (0.71-1.35), p = 0.885, respectively] compared to FGA use.

    Conclusions: We find an increased risk of T2D in patients with SCZ who take a combination of at least three different psychotropic medication classes compared to patients whose medication consists only of one or two classes of drugs.
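
    As a rough consistency check on the Results above, a reported odds ratio and its 95% confidence interval can be converted back into an approximate standard error, z statistic and two-sided p value. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which the abstract does not state; the function name is illustrative and not taken from the study.

```python
import math

def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, z and two-sided p value from an OR and its 95% CI.

    Assumes the CI is a Wald interval, symmetric on the log scale
    (an assumption; the abstract does not say how the CI was built).
    """
    # Half-width of the interval on the log scale, divided by the critical value.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: P(|Z| > z) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values quoted in the abstract for the combination of three or more drug classes.
se, z, p = wald_p_from_or_ci(1.81, 1.22, 2.69)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

    Under this assumption the recovered p value is close to the reported p = 0.003, which suggests the quoted interval and p value are mutually consistent.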

    Funded by: Wellcome Trust: 098051

    BMC psychiatry 2018;18;1;249

  • Ancient pathogen DNA in human teeth and petrous bones.

    Margaryan A, Hansen HB, Rasmussen S, Sikora M, Moiseyev V, Khoklov A, Epimakhov A, Yepiskoposyan L, Kriiska A, Varul L, Saag L, Lynnerup N, Willerslev E and Allentoft ME

    Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark.

    Recent ancient DNA (aDNA) studies of human pathogens have provided invaluable insights into their evolutionary history and prevalence in space and time. Most of these studies were based on DNA extracted from teeth or postcranial bones. In contrast, no pathogen DNA has been reported from the petrous bone which has become the most desired skeletal element in ancient DNA research due to its high endogenous DNA content. To compare the potential for pathogenic aDNA retrieval from teeth and petrous bones, we sampled these elements from five ancient skeletons, previously shown to be carrying <i>Yersinia pestis</i>. Based on shotgun sequencing data, four of these five plague victims showed clearly detectable levels of <i>Y. pestis</i> DNA in the teeth, whereas all the petrous bones failed to produce <i>Y. pestis</i> DNA above baseline levels. A broader comparative metagenomic analysis of teeth and petrous bones from 10 historical skeletons corroborated these results, showing a much higher microbial diversity in teeth than petrous bones, including pathogenic and oral microbial taxa. Our results imply that although petrous bones are highly valuable for ancient genomic analyses as an excellent source of endogenous DNA, the metagenomic potential of these dense skeletal elements is highly limited. This trade-off must be considered when designing the sampling strategy for an aDNA project.

    Ecology and evolution 2018;8;6;3534-3542

  • Direct Whole-Genome Sequencing of Cutaneous Strains of Haemophilus ducreyi.

    Marks M, Fookes M, Wagner J, Ghinai R, Sokana O, Sarkodie YA, Solomon AW, Mabey DCW and Thomson NR

    Haemophilus ducreyi, which causes chancroid, has emerged as a cause of pediatric skin disease. Isolation of H. ducreyi in low-income settings is challenging, limiting phylogenetic investigation. Next-generation sequencing demonstrates that cutaneous strains arise from class I and II H. ducreyi clades and that class II may represent a distinct subspecies.

    Emerging infectious diseases 2018;24;4;786-789

  • Editorial overview: Sequences and topology: Dynamic sequences and topologies of proteins.

    Marsh JA and Teichmann SA

    MRC Human Genetics Unit, Institute of Genetics & Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK.

    Current opinion in structural biology 2018;50;vii-viii

  • Quantifying the contribution of recessive coding variation to developmental disorders.

    Martin HC, Jones WD, McIntyre R, Sanchez-Andrade G, Sanderson M, Stephenson JD, Jones CP, Handsaker J, Gallone G, Bruntraeger M, McRae JF, Prigmore E, Short P, Niemi M, Kaplanis J, Radford EJ, Akawi N, Balasubramanian M, Dean J, Horton R, Hulbert A, Johnson DS, Johnson K, Kumar D, Lynch SA, Mehta SG, Morton J, Parker MJ, Splitt M, Turnpenny PD, Vasudevan PC, Wright M, Bassett A, Gerety SS, Wright CF, FitzPatrick DR, Firth HV, Hurles ME, Barrett JC and Deciphering Developmental Disorders Study

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK.

    We estimated the genome-wide contribution of recessive coding variation in 6040 families from the Deciphering Developmental Disorders study. The proportion of cases attributable to recessive coding variants was 3.6% in patients of European ancestry, compared with 50% explained by de novo coding mutations. It was higher (31%) in patients with Pakistani ancestry, owing to elevated autozygosity. Half of this recessive burden is attributable to known genes. We identified two genes not previously associated with recessive developmental disorders, <i>KDM5B</i> and <i>EIF3F</i>, and functionally validated them with mouse and cellular models. Our results suggest that recessive coding variants account for a small fraction of currently undiagnosed nonconsanguineous individuals, and that the roles of noncoding variants, incomplete penetrance, and polygenic mechanisms need further exploration.

    Science (New York, N.Y.) 2018;362;6419;1161-1164

  • Somatic mutant clones colonize the human esophagus with age.

    Martincorena I, Fowler JC, Wabik A, Lawson ARJ, Abascal F, Hall MWJ, Cagan A, Murai K, Mahbubani K, Stratton MR, Fitzgerald RC, Handford PA, Campbell PJ, Saeb-Parsy K and Jones PH

    Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK.

    The extent to which cells in normal tissues accumulate mutations throughout life is poorly understood. Some mutant cells expand into clones that can be detected by genome sequencing. We mapped mutant clones in normal esophageal epithelium from nine donors (age range, 20 to 75 years). Somatic mutations accumulated with age and were caused mainly by intrinsic mutational processes. We found strong positive selection of clones carrying mutations in 14 cancer genes, with tens to hundreds of clones per square centimeter. In middle-aged and elderly donors, clones with cancer-associated mutations covered much of the epithelium, with <i>NOTCH1</i> and <i>TP53</i> mutations affecting 12 to 80% and 2 to 37% of cells, respectively. Unexpectedly, the prevalence of <i>NOTCH1</i> mutations in normal esophagus was several times higher than in esophageal cancers. These findings have implications for our understanding of cancer and aging.

    Funded by: Cancer Research UK; Medical Research Council: MC_UU_12022/2, MC_UU_12022/3; Wellcome Trust

    Science (New York, N.Y.) 2018;362;6417;911-917

  • Functional characterization of common BCL11B gene desert variants suggests a lymphocyte-mediated association of BCL11B with aortic stiffness.

    Maskari RA, Hardege I, Cleary S, Figg N, Li Y, Siew K, Khir A, Yu Y, Liu P, Wilkinson I, O'Shaughnessy K and Yasmin

    Division of Experimental Medicine & Immunotherapeutics (EMIT), Department of Medicine, University of Cambridge, Cambridge, UK.

    The recent genome-wide analysis of carotid-femoral pulse wave velocity (PWV) identified a significant locus within the 14q32.2 gene desert. Gene regulatory elements for the transcriptional regulator B-cell CLL/lymphoma 11B (BCL11B) lie within this locus and are an attractive target for the gene association. We investigated the functional impact of these gene desert SNPs on BCL11B transcript in human aorta to characterize further its role in aortic stiffness. To do this, we used a large repository of aortic tissues (n = 185) from an organ transplant program and assessed ex vivo stiffness of the aortic rings. We tested association of three lead SNPs from the GWAS meta-analysis with ex vivo aortic stiffness and BCL11B aortic mRNA expression: rs1381289 and rs10782490 SNPs associated significantly with PWV and showed allele-specific differences in BCL11B mRNA. The risk alleles associated with lower BCL11B expression, suggesting a protective role for BCL11B. Despite strong association, we could not detect BCL11B protein in the human aorta. However, qPCR for CD markers showed that BCL11B transcript correlated strongly with markers for activated lymphocytes. Our data confirm the significance of the 14q32.2 region as a risk locus for aortic stiffness and an upstream regulator of BCL11B. The BCL11B transcript detected in the human aorta may reflect lymphocyte infiltration, suggesting that immune mechanisms contribute to the observed association of BCL11B with aortic stiffness.

    European journal of human genetics : EJHG 2018

  • Single Cell Gene Expression to Understand the Dynamic Architecture of the Heart.

    Massaia A, Chaves P, Samari S, Miragaia RJ, Meyer K, Teichmann SA and Noseda M

    British Heart Foundation Centre of Research Excellence and British Heart Foundation Centre for Regenerative Medicine, National Heart and Lung Institute, Imperial College London, London, United Kingdom.

    The recent development of single cell gene expression technologies, and especially single cell transcriptomics, has revolutionized the way biologists and clinicians investigate organs and organisms, allowing an unprecedented level of resolution in the description of cell demographics in both healthy and diseased states. Single cell transcriptomics provide information on prevalence, heterogeneity, and gene co-expression at the individual cell level. This enables a cell-centric outlook to define intracellular gene regulatory networks and to bridge toward the definition of intercellular pathways otherwise masked in bulk analysis. The technologies have developed at a fast pace producing a multitude of different approaches, with several alternatives to choose from at any step, including single cell isolation and capturing, lysis, RNA reverse transcription and cDNA amplification, library preparation, sequencing, and computational analyses. Here, we provide guidelines for the experimental design of single cell RNA sequencing experiments, exploring the current options for the crucial steps. Furthermore, we provide a complete overview of the typical data analysis workflow, from handling the raw sequencing data to making biological inferences. Significantly, advancements in single cell transcriptomics have already contributed to outstanding exploratory and functional studies of cardiac development and disease models, as summarized in this review. In conclusion, we discuss achievable outcomes of single cell transcriptomics' applications in addressing unanswered questions and influencing future cardiac clinical applications.

    Frontiers in cardiovascular medicine 2018;5;167

  • New Variant of Multidrug-Resistant <i>Salmonella enterica</i> Serovar Typhimurium Associated with Invasive Disease in Immunocompromised Patients in Vietnam.

    Mather AE, Phuong TLT, Gao Y, Clare S, Mukhopadhyay S, Goulding DA, Hoang NTD, Tuyen HT, Lan NPH, Thompson CN, Trang NHT, Carrique-Mas J, Tue NT, Campbell JI, Rabaa MA, Thanh DP, Harcourt K, Hoa NT, Trung NV, Schultsz C, Perron GG, Coia JE, Brown DJ, Okoro C, Parkhill J, Thomson NR, Chau NVV, Thwaites GE, Maskell DJ, Dougan G, Kenney LJ and Baker S

    Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom.

    Nontyphoidal <i>Salmonella</i> (NTS), particularly <i>Salmonella enterica</i> serovar Typhimurium, is among the leading etiologic agents of bacterial enterocolitis globally and a well-characterized cause of invasive disease (iNTS) in sub-Saharan Africa. In contrast, <i>S.</i> Typhimurium is poorly defined in Southeast Asia, a known hot spot for zoonotic disease with a recently described burden of iNTS disease. Here, we aimed to add insight into the epidemiology and potential impact of zoonotic transfer and antimicrobial resistance (AMR) in <i>S.</i> Typhimurium associated with iNTS and enterocolitis in Vietnam. We performed whole-genome sequencing and phylogenetic reconstruction on 85 human (enterocolitis, carriage, and iNTS) and 113 animal <i>S.</i> Typhimurium isolates isolated in Vietnam. We found limited evidence for the zoonotic transmission of <i>S.</i> Typhimurium. However, we describe a chain of events where a pandemic monophasic variant of <i>S.</i> Typhimurium (serovar I:4,[5],12:i:- sequence type 34 [ST34]) has been introduced into Vietnam, reacquired a phase 2 flagellum, and acquired an IncHI2 multidrug-resistant plasmid. Notably, these novel biphasic ST34 <i>S.</i> Typhimurium variants were significantly associated with iNTS in Vietnamese HIV-infected patients. Our study represents the first characterization of novel iNTS organisms isolated outside sub-Saharan Africa and outlines a new pathway for the emergence of alternative <i>Salmonella</i> variants into susceptible human populations.

    Importance: <i>Salmonella</i> Typhimurium is a major diarrheal pathogen and associated with invasive nontyphoid <i>Salmonella</i> (iNTS) disease in vulnerable populations. We present the first characterization of iNTS organisms in Southeast Asia and describe a different evolutionary trajectory from that of organisms causing iNTS in sub-Saharan Africa. In Vietnam, the globally distributed monophasic variant of <i>Salmonella</i> Typhimurium, the serovar I:4,[5],12:i:- ST34 clone, has reacquired a phase 2 flagellum and gained a multidrug-resistant plasmid to become associated with iNTS disease in HIV-infected patients. We document distinct communities of <i>S.</i> Typhimurium and I:4,[5],12:i:- in animals and humans in Vietnam, despite the greater mixing of these host populations here. These data highlight the importance of whole-genome sequencing surveillance in a One Health context in understanding the evolution and spread of resistant bacterial infections.

    mBio 2018;9;5

  • SLAM-ITseq: Sequencing cell type-specific transcriptomes without cell sorting.

    Matsushima W, Herzog VA, Neumann T, Gapp K, Zuber J, Ameres SL and Miska EA

    Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK.

    Cell type-specific transcriptome analysis is an essential tool in understanding biological processes in which diverse types of cells are involved. Although cell isolation methods such as fluorescence-activated cell sorting (FACS) in combination with transcriptome analysis have widely been used so far, their time-consuming and harsh procedures limit their applications. Here, we report a novel <i>in vivo</i> metabolic RNA sequencing method, SLAM-ITseq, which metabolically labels RNA with 4-thiouracil in a specific cell type <i>in vivo</i> followed by detection through an RNA-seq-based method that specifically distinguishes the thiolated uridine by base conversion. This method has successfully identified the cell type-specific transcriptome in three different tissues: endothelial cells in brain, epithelial cells in intestine, and adipocytes in white adipose tissue. Since this method does not require isolation of cells or RNA prior to the transcriptomic analysis, SLAM-ITseq provides an easy yet accurate snapshot of the transcriptional state <i>in vivo</i>.

    Development (Cambridge, England) 2018

  • Biological and prognostic impact of APOBEC-induced mutations in the spectrum of plasma cell dyscrasias and multiple myeloma cell lines.

    Maura F, Petljak M, Lionetti M, Cifola I, Liang W, Pinatel E, Alexandrov LB, Fullam A, Martincorena I, Dawson KJ, Angelopoulos N, Samur MK, Szalat R, Zamora J, Tarpey P, Davies H, Corradini P, Anderson KC, Minvielle S, Neri A, Avet-Loiseau H, Keats J, Campbell PJ, Munshi NC and Bolli N

    Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy.

    Funded by: NCI NIH HHS: P01 CA155258

    Leukemia 2018;32;4;1044-1048

  • The prehistoric peopling of Southeast Asia.

    McColl H, Racimo F, Vinner L, Demeter F, Gakuhari T, Moreno-Mayar JV, van Driem G, Gram Wilken U, Seguin-Orlando A, de la Fuente Castro C, Wasef S, Shoocongdej R, Souksavatdy V, Sayavongkhamdy T, Saidin MM, Allentoft ME, Sato T, Malaspinas AS, Aghakhanian FA, Korneliussen T, Prohaska A, Margaryan A, de Barros Damgaard P, Kaewsutthi S, Lertrit P, Nguyen TMH, Hung HC, Minh Tran T, Nghia Truong H, Nguyen GH, Shahidan S, Wiradnyana K, Matsumae H, Shigehara N, Yoneda M, Ishida H, Masuyama T, Yamada Y, Tajima A, Shibata H, Toyoda A, Hanihara T, Nakagome S, Deviese T, Bacon AM, Duringer P, Ponche JL, Shackelford L, Patole-Edoumba E, Nguyen AT, Bellina-Pryce B, Galipaud JC, Kinaston R, Buckley H, Pottier C, Rasmussen S, Higham T, Foley RA, Lahr MM, Orlando L, Sikora M, Phipps ME, Oota H, Higham C, Lambert DM and Willerslev E

    Centre for GeoGenetics, Natural History Museum of Denmark, Copenhagen, Denmark.

    The human occupation history of Southeast Asia (SEA) remains heavily debated. Current evidence suggests that SEA was occupied by Hòabìnhian hunter-gatherers until ~4000 years ago, when farming economies developed and expanded, restricting foraging groups to remote habitats. Some argue that agricultural development was indigenous; others favor the "two-layer" hypothesis that posits a southward expansion of farmers giving rise to present-day Southeast Asian genetic diversity. By sequencing 26 ancient human genomes (25 from SEA, 1 Japanese Jōmon), we show that neither interpretation fits the complexity of Southeast Asian history: Both Hòabìnhian hunter-gatherers and East Asian farmers contributed to current Southeast Asian diversity, with further migrations affecting island SEA and Vietnam. Our results help resolve one of the long-standing controversies in Southeast Asian prehistory.

    Science (New York, N.Y.) 2018;361;6397;88-92

  • The Developmental Shift of NMDA Receptor Composition Proceeds Independently of GluN2 Subunit-Specific GluN2 C-Terminal Sequences.

    McKay S, Ryan TJ, McQueen J, Indersmitten T, Marwick KFM, Hasel P, Kopanitsa MV, Baxter PS, Martel MA, Kind PC, Wyllie DJA, O'Dell TJ, Grant SGN, Hardingham GE and Komiyama NH

    Centre for Discovery Brain Sciences, University of Edinburgh, Hugh Robson Building, George Square, Edinburgh EH8 9XD, UK; Simons Initiative for the Developing Brain, University of Edinburgh, Hugh Robson Building, George Square, Edinburgh EH8 9XD, UK; UK Dementia Research Institute at the University of Edinburgh, Chancellor's Building, Edinburgh Medical School, Edinburgh EH16 4SB, UK.

    The GluN2 subtype (2A versus 2B) determines biophysical properties and signaling of forebrain NMDA receptors (NMDARs). During development, GluN2A becomes incorporated into previously GluN2B-dominated NMDARs. This "switch" is proposed to be driven by distinct features of GluN2 cytoplasmic C-terminal domains (CTDs), including a unique CaMKII interaction site in GluN2B that drives removal from the synapse. However, these models remain untested in the context of endogenous NMDARs. We show that, although mutating the endogenous GluN2B CaMKII site has secondary effects on GluN2B CTD phosphorylation, the developmental changes in NMDAR composition occur normally and measures of plasticity and synaptogenesis are unaffected. Moreover, the switch proceeds normally in mice that have the GluN2A CTD replaced by that of GluN2B and commences without an observable decline in GluN2B levels but is impaired by GluN2A haploinsufficiency. Thus, GluN2A expression levels, and not GluN2 subtype-specific CTD-driven events, are the overriding factor in the developmental switch in NMDAR composition.

    Cell reports 2018;25;4;841-851.e4

  • A guideline for the diagnosis and management of polycythaemia vera. A British Society for Haematology Guideline.

    McMullin MF, Harrison CN, Ali S, Cargo C, Chen F, Ewing J, Garg M, Godfrey A, S SK, McLornan DP, Nangalia J, Sekhar M, Wadelin F, Mead AJ and BSH Committee

    Centre for Medical Education, Queen's University, Belfast, UK.

    British journal of haematology 2018

  • Mutational signatures of DNA mismatch repair deficiency in <i>C. elegans</i> and human cancers.

    Meier B, Volkova NV, Hong Y, Schofield P, Campbell PJ, Gerstung M and Gartner A

    Centre for Gene Regulation and Expression, University of Dundee, Dundee DD1 5EH, United Kingdom.

    Throughout their lifetime, cells are subject to extrinsic and intrinsic mutational processes leaving behind characteristic signatures in the genome. DNA mismatch repair (MMR) deficiency leads to hypermutation and is found in different cancer types. Although it is possible to associate mutational signatures extracted from human cancers with possible mutational processes, the exact causation is often unknown. Here, we use <i>C. elegans</i> genome sequencing of <i>pms-2</i> and <i>mlh-1</i> knockouts to reveal the mutational patterns linked to <i>C. elegans</i> MMR deficiency and their dependency on endogenous replication errors and errors caused by deletion of the polymerase ε subunit <i>pole-4</i>. Signature extraction from 215 human colorectal and 289 gastric adenocarcinomas revealed three MMR-associated signatures, one of which closely resembles the <i>C. elegans</i> MMR spectrum and strongly discriminates microsatellite stable and unstable tumors (AUC = 98%). A characteristic difference between human and <i>C. elegans</i> MMR deficiency is the lack of elevated levels of N<u>C</u>G > NTG mutations in <i>C. elegans</i>, likely caused by the absence of cytosine (CpG) methylation in worms. The other two human MMR signatures may reflect the interaction between MMR deficiency and other mutagenic processes, but their exact cause remains unknown. In summary, combining information from genetically defined models and cancer samples allows for better aligning mutational signatures to causal mutagenic processes.

    Genome research 2018

  • Dairy Consumption and Body Mass Index Among Adults: Mendelian Randomization Analysis of 184802 Individuals from 25 Studies.

    Mendelian Randomization of Dairy Consumption Working Group

    Background: Associations between dairy intake and body mass index (BMI) have been inconsistently observed in epidemiological studies, and the causal relationship remains ill defined.

    Methods: We performed Mendelian randomization (MR) analysis using an established dairy intake-associated genetic polymorphism located upstream of the lactase gene (<i>LCT</i>-13910 C/T, rs4988235) as an instrumental variable (IV). Linear regression models were fitted to analyze associations between (<i>a</i>) dairy intake and BMI, (<i>b</i>) rs4988235 and dairy intake, and (<i>c</i>) rs4988235 and BMI in each study. The causal effect of dairy intake on BMI was quantified by IV estimators among 184802 participants from 25 studies.

    Results: Higher dairy intake was associated with higher BMI (β = 0.03 kg/m<sup>2</sup> per serving/day; 95% CI, 0.00-0.06; <i>P</i> = 0.04), whereas the <i>LCT</i> genotype with 1 or 2 T alleles was significantly associated with 0.20 (95% CI, 0.14-0.25) serving/day higher dairy intake (<i>P</i> = 3.15 × 10<sup>-12</sup>) and 0.12 (95% CI, 0.06-0.17) kg/m<sup>2</sup> higher BMI (<i>P</i> = 2.11 × 10<sup>-5</sup>). MR analysis showed that the genetically determined higher dairy intake was significantly associated with higher BMI (β = 0.60 kg/m<sup>2</sup> per serving/day; 95% CI, 0.27-0.92; <i>P</i> = 3.0 × 10<sup>-4</sup>).

    Conclusions: The present study provides strong evidence to support a causal effect of higher dairy intake on increased BMI among adults.
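
    The instrumental-variable estimate reported in the Results above can be illustrated with the simplest single-instrument form, the ratio (Wald) estimator: the genotype-outcome association divided by the genotype-exposure association. This is a minimal sketch, assuming the pooled associations quoted in the abstract can be plugged in directly; the abstract does not specify the exact IV estimator or pooling scheme used across the 25 studies, and the function name is hypothetical.

```python
def wald_ratio(beta_instrument_outcome: float, beta_instrument_exposure: float) -> float:
    """Ratio (Wald) IV estimate of the exposure's effect on the outcome.

    beta_instrument_outcome:  association of the genotype with the outcome (BMI).
    beta_instrument_exposure: association of the genotype with the exposure (dairy intake).
    """
    if beta_instrument_exposure == 0:
        raise ValueError("instrument must be associated with the exposure")
    return beta_instrument_outcome / beta_instrument_exposure

# Associations quoted in the abstract for the LCT genotype (rs4988235):
beta_genotype_bmi = 0.12    # kg/m^2 higher BMI
beta_genotype_dairy = 0.20  # servings/day higher dairy intake

estimate = wald_ratio(beta_genotype_bmi, beta_genotype_dairy)
print(f"IV estimate: {estimate:.2f} kg/m^2 per serving/day")
```

    With the quoted associations, the ratio comes to 0.60 kg/m<sup>2</sup> per serving/day, matching the MR estimate in the abstract and well above the observational estimate of 0.03.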

    Funded by: CIHR: MOP-82893; NCATS NIH HHS: UL1 TR000040, UL1 TR001079, UL1 TR001881; NCI NIH HHS: U01 CA137088, U54 CA155626; NCRR NIH HHS: UL1 RR025005; NHGRI NIH HHS: U01 HG004399, U01 HG004402, U01 HG004728; NHLBI NIH HHS: HHSN268200800007C, HHSN268201100005C, HHSN268201100006C, HHSN268201100007C, HHSN268201100008C, HHSN268201100009C, HHSN268201100010C, HHSN268201100011C, HHSN268201100012C, HHSN268201200036C, K08 HL112845, N01HC55222, N01HC85079, N01HC85080, N01HC85081, N01HC85082, N01HC85083, N01HC85086, N01HC95159, N01HC95160, N01HC95161, N01HC95162, N01HC95163, N01HC95164, N01HC95165, N01HC95166, N01HC95167, N01HC95168, N01HC95169, P50 HL105185, R01 HL086694, R01 HL087641, R01 HL087652, R01 HL091357, R01 HL103612, R01 HL105756, R01 HL120393, R21 HL126024, U01 HL072524, U01 HL080295, U01 HL130114; NIA NIH HHS: P01 AG023394, R01 AG023629; NIDDK NIH HHS: R01 DK089256, R01 DK091718, R01 DK100383, R01 DK115679

    Clinical chemistry 2018;64;1;183-191

  • The germline genetic component of drug sensitivity in cancer cell lines.

    Menden MP, Casale FP, Stephan J, Bignell GR, Iorio F, McDermott U, Garnett MJ, Saez-Rodriguez J and Stegle O

    Oncology Innovative Medicines & Early Drug Development, AstraZeneca, Milton Science Park, Cambridge, CB4 0FZ, UK.

    Patients with seemingly the same tumour can respond very differently to treatment. There are strong, well-established effects of somatic mutations on drug efficacy, but there is at most anecdotal evidence of a germline component to drug response. Here, we report a systematic survey of how inherited germline variants affect drug susceptibility in cancer cell lines. We develop a joint analysis approach that leverages both germline and somatic variants, before applying it to screening data from 993 cell lines and 265 drugs. Surprisingly, we find that the germline contribution to variation in drug susceptibility can be as large as or larger than effects due to somatic mutations. Several of the associations identified have a direct relationship to the drug target. Finally, using 17-AAG response as an example, we show how germline effects in combination with transcriptomic data can be leveraged for improved patient stratification and to identify new markers for drug sensitivity.

    Nature communications 2018;9;1;3385

  • Pre-clinical evaluation of a <i>P. berghei</i>-based whole-sporozoite malaria vaccine candidate.

    Mendes AM, Reuling IJ, Andrade CM, Otto TD, Machado M, Teixeira F, Pissarra J, Gonçalves-Rosa N, Bonaparte D, Sinfrónio J, Sanders M, Janse CJ, Khan SM, Newbold CI, Berriman M, Lee CK, Wu Y, Ockenhouse CF, Sauerwein RW and Prudêncio M

    Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Avenida Professor Egas Moniz, 1649-028 Lisboa, Portugal.

    Whole-sporozoite vaccination/immunization induces high levels of protective immunity in both rodent models of malaria and in humans. Recently, we generated a transgenic line of the rodent malaria parasite <i>P. berghei</i> (<i>Pb</i>) that expresses the <i>P. falciparum</i> (<i>Pf</i>) circumsporozoite protein (<i>Pf</i>CS), and showed that this parasite line (<i>Pb</i>Vac) was capable of (1) infecting and developing in human hepatocytes but not in human erythrocytes, and (2) inducing neutralizing antibodies against the human <i>Pf</i> parasite. Here, we analyzed <i>Pb</i>Vac in detail and developed tools necessary for its use in clinical studies. A microbiological contaminant-free Master Cell Bank of <i>Pb</i>Vac parasites was generated through a process of cyclic propagation and clonal expansion in mice and mosquitoes and was genetically characterized. A highly sensitive qRT-PCR-based method was established that enables <i>Pb</i>Vac parasite detection and quantification at low parasite densities in vivo. This method was employed in a biodistribution study in a rabbit model, revealing that the parasite is only present at the site of administration and in the liver up to 48 h post infection and is no longer detectable at any site 10 days after administration. An extensive toxicology investigation carried out in rabbits further showed the absence of <i>Pb</i>Vac-related toxicity. In vivo drug sensitivity assays employing rodent models of infection showed that both the liver and the blood stage forms of <i>Pb</i>Vac were completely eliminated by Malarone<sup>®</sup> treatment. Collectively, our pre-clinical safety assessment demonstrates that <i>Pb</i>Vac possesses all characteristics necessary to advance into clinical evaluation.

    NPJ vaccines 2018;3;54

  • Comparative immunogenicity and efficacy of equivalent outer membrane vesicle and glycoconjugate vaccines against nontyphoidal <i>Salmonella</i>.

    Micoli F, Rondini S, Alfini R, Lanzilao L, Necchi F, Negrea A, Rossi O, Brandt C, Clare S, Mastroeni P, Rappuoli R, Saul A and MacLennan CA

    GSK Vaccines Institute for Global Health S.r.l. (GVGH), 53100 Siena, Italy.

    Nontyphoidal <i>Salmonellae</i> cause a devastating burden of invasive disease in sub-Saharan Africa with high levels of antimicrobial resistance. Vaccination has potential for a major global health impact, but no licensed vaccine is available. The lack of commercial incentive makes simple, affordable technologies the preferred route for vaccine development. Here we compare equivalent Generalized Modules for Membrane Antigens (GMMA) outer membrane vesicles and O-antigen-CRM<sub>197</sub> glycoconjugates to deliver lipopolysaccharide O-antigen in bivalent <i>Salmonella</i> Typhimurium and Enteritidis vaccines. <i>Salmonella</i> strains were chosen and <i>tolR</i> deleted to induce GMMA production. O-antigens were extracted from wild-type bacteria and conjugated to CRM<sub>197</sub>. Purified GMMA and glycoconjugates were characterized and tested in mice for immunogenicity and ability to reduce <i>Salmonella</i> infection. GMMA and glycoconjugate O-antigen had similar structural characteristics, O-acetylation, and glucosylation levels. Immunization with GMMA induced higher anti-O-antigen IgG than glycoconjugate administered without Alhydrogel adjuvant. With Alhydrogel, antibody levels were similar. GMMA induced a diverse antibody isotype profile with greater serum bactericidal activity than glycoconjugate, which induced almost exclusively IgG1. Immunization reduced bacterial colonization of mice subsequently infected with <i>Salmonella</i>. <i>S.</i> Typhimurium numbers were lower in tissues of mice vaccinated with GMMA compared with glycoconjugate. <i>S.</i> Enteritidis burden in the tissues was similar in mice immunized with either vaccine. With favorable immunogenicity, low cost, and ability to induce functional antibodies and reduce bacterial burden, GMMA offer a promising strategy for the development of a nontyphoidal <i>Salmonella</i> vaccine compared with established glycoconjugates. GMMA technology is potentially attractive for development of vaccines against other bacteria of global health significance.

    Proceedings of the National Academy of Sciences of the United States of America 2018;115;41;10428-10433

  • Society and personal genome data.

    Middleton A

    Head of Society and Ethics Research Group, Connecting Science, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Genomic data offers a goldmine of information for understanding the contribution genetic variation makes to health and disease. The potential of genomic medicine, to predict, diagnose, manage and treat genetic disease, is underpinned by accurate variant interpretation. This in itself hinges on the ability to access large and varied genomic databases. There is now recognition that international collaboration between research and healthcare systems is paramount to delivering the scale of genomic data required. No single research group, institute or country will liberate our understanding; it is only through global cooperation, together with supercomputing power, that we will truly make sense of how genotype and phenotype correlate. Whilst it is logistically possible to create computing systems that talk to each other and aggregate datasets ready to reveal novel correlations, the bottom line is that this will only happen if people (whether they be scientists, clinicians, patients, research participants, policy makers, politicians, law makers) support the principle that we should be donating, accessing and sharing our DNA data in this way. And in order to make the most sense of genomics, given the geographical and ancestral variation between us, such people are likely to be the majority of society. Within this review, a perspective is proffered on the human story that underpins genomic 'big data' access and how we are at a tipping point as a society - we need to decide collectively, are we in? and if so, what needs to be in place to protect us? or are we out?

    Human molecular genetics 2018

  • 'Your DNA, Your Say': global survey gathering attitudes toward genomics: design, delivery and methods.

    Middleton A, Niemiec E, Prainsack B, Bobe J, Farley L, Steed C, Smith J, Bevan P, Bonhomme N, Kleiderman E, Thorogood A, Schickhardt C, Garattini C, Vears D, Littler K, Banner N, Scott E, Kovalevskaya NV, Levin E, Morley KI and Howard HC

    Society & Ethics Research, Connecting Science, Wellcome Genome Campus, Cambridge, UK.

    Our international study, 'Your DNA, Your Say', uses film and an online cross-sectional survey to gather public attitudes toward the donation, access and sharing of DNA information. We describe the methodological approach used to create an engaging and bespoke survey, suitable for translation into many different languages. We address some of the particular challenges in designing a survey on the subject of genomics. In order to understand the significance of a genomic result, researchers and clinicians alike use external databases containing DNA and medical information from thousands of people. We ask how publics would like their 'anonymous' data to be used (or not to be used) and whether they are concerned by the potential risks of reidentification; the results will be used to inform policy.

    Personalized medicine 2018;15;4;311-318

  • Eavesdropping and crosstalk between secreted quorum sensing peptide signals that regulate bacteriocin production in Streptococcus pneumoniae.

    Miller EL, Kjos M, Abrudan MI, Roberts IS, Veening JW and Rozen DE

    School of Biological Science, Faculty of Biology, Medicine, and Health, Manchester Academic Health Science Centre, University of Manchester, Manchester, M13 9PL, UK.

    Quorum sensing (QS), where bacteria secrete and respond to chemical signals to coordinate population-wide behaviors, has revealed that bacteria are highly social. Here, we investigate how diversity in QS signals and receptors can modify social interactions controlled by the QS system regulating bacteriocin secretion in Streptococcus pneumoniae, encoded by the blp operon (bacteriocin-like peptide). Analysis of 4096 pneumococcal genomes detected nine blp QS signals (BlpC) and five QS receptor groups (BlpH). Imperfect concordance between signals and receptors suggested widespread social interactions between cells, specifically eavesdropping (where cells respond to signals that they do not produce) and crosstalk (where cells produce signals that non-clones detect). This was confirmed in vitro by measuring the response of reporter strains containing six different blp QS receptors to cognate and non-cognate peptides. Assays between pneumococcal colonies grown adjacent to one another provided further evidence that crosstalk and eavesdropping occur at endogenous levels of signal secretion. Finally, simulations of QS strains producing bacteriocins revealed that eavesdropping can be evolutionarily beneficial even when the affinity for non-cognate signals is very weak. Our results highlight that social interactions can mediate intraspecific competition among bacteria and reveal that competitive interactions can be modified by polymorphic QS systems.

    Funded by: Biotechnology and Biological Sciences Research Council (BBSRC): BB/J006009/1, BB/M000281/1; EC | European Research Council (ERC): 337399-PneumoCell; European Research Council: 337399; Netherlands Organisation for Scientific Research | Stichting voor de Technische Wetenschappen (Technology Foundation STW): 864.12.001; Wellcome Trust

    The ISME journal 2018;12;10;2363-2375

  • Deep phenotyping in zebrafish reveals genetic and diet-induced adiposity changes that may inform disease risk.

    Minchin JEN, Scahill CM, Staudt N, Busch-Nentwich EM and Rawls JF

    University of Edinburgh, United Kingdom;

    The regional distribution of adipose tissues is implicated in a wide range of diseases. For example, proportional increases in visceral adipose tissue increase the risk for insulin resistance, diabetes and cardiovascular disease. Zebrafish offer a tractable model system by which to obtain unbiased and quantitative phenotypic information on regional adiposity, and deep phenotyping can explore complex disease-related adiposity traits. To facilitate deep phenotyping of zebrafish adiposity traits, we used pairwise correlations between 67 adiposity traits to generate stage-specific adiposity profiles that describe changing adiposity patterns and relationships during growth. Linear discriminant analysis classified individual fish according to adiposity profile with 87.5% accuracy. Deep phenotyping of eight previously uncharacterized zebrafish mutants identified neuropilin 2b as a novel gene that alters adipose distribution. When we applied deep phenotyping to identify changes in adiposity during diet manipulations, zebrafish that underwent food restriction and re-feeding had widespread adiposity changes when compared to continuously-fed, equivalently-sized control animals. In particular, internal adipose tissues (e.g., visceral adipose) exhibited a reduced capacity to replenish lipid following food restriction. Together, these results in zebrafish establish a new deep phenotyping technique as an unbiased and quantitative method to help uncover new relationships between genotype, diet and adiposity.

    Journal of lipid research 2018

  • Genomics and clinical correlates of renal cell carcinoma.

    Mitchell TJ, Rossi SH, Klatte T and Stewart GD

    Cancer Genome Project, Wellcome Sanger Institute, Hinxton, CB10 1SA, UK.

    Purpose: Clear cell, papillary cell, and chromophobe renal cell carcinomas (RCCs) have now been well characterised thanks to large collaborative projects such as The Cancer Genome Atlas (TCGA). Not only has knowledge of the genomic landscape helped inform the development of new drugs, it also promises to fine tune prognostication.

    Methods: A literature review was performed summarising the current knowledge on the genetic basis of RCC.

    Results: The Von Hippel-Lindau (VHL) tumour suppressor gene undergoes bi-allelic knockout in the vast majority of clear cell RCCs. The next most prevalent aberrations include a cohort of chromatin-modifying genes with diverse roles including PBRM1, SETD2, BAP1, and KDM5C. The most common non-clear cell renal cancers have also undergone genomic profiling and are characterised by distinct genomic landscapes. Many recurrent mutations have prognostic value and show promise in aiding decisions regarding treatment stratification. Intra-tumour heterogeneity appears to hamper the clinical applicability of sparsely sampled tumours. Ways to abrogate heterogeneity will be required to optimise the genomic classification of tumours.

    Conclusion: The somatic mutational landscape of the more common renal cancers is well known. Correlation with outcome needs to be more comprehensively furnished, particularly for small renal masses, rarer non-clear cell renal cancers, and for all tumours undergoing targeted therapy.

    World journal of urology 2018;36;12;1899-1911

  • Timing the Landmark Events in the Evolution of Clear Cell Renal Cell Cancer: TRACERx Renal.

    Mitchell TJ, Turajlic S, Rowan A, Nicol D, Farmery JHR, O'Brien T, Martincorena I, Tarpey P, Angelopoulos N, Yates LR, Butler AP, Raine K, Stewart GD, Challacombe B, Fernando A, Lopez JI, Hazell S, Chandra A, Chowdhury S, Rudman S, Soultati A, Stamp G, Fotiadis N, Pickering L, Au L, Spain L, Lynch J, Stares M, Teague J, Maura F, Wedge DC, Horswell S, Chambers T, Litchfield K, Xu H, Stewart A, Elaidi R, Oudard S, McGranahan N, Csabai I, Gore M, Futreal PA, Larkin J, Lynch AG, Szallasi Z, Swanton C, Campbell PJ and TRACERx Renal Consortium

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK; Academic Urology Group, Department of Surgery, Addenbrooke's Hospitals NHS Foundation Trust, University of Cambridge, Hills Road, Cambridge CB2 0QQ, UK.

    Clear cell renal cell carcinoma (ccRCC) is characterized by near-universal loss of the short arm of chromosome 3, deleting several tumor suppressor genes. We analyzed whole genomes from 95 biopsies across 33 patients with clear cell renal cell carcinoma. We find hotspots of point mutations in the 5' UTR of TERT, targeting a MYC-MAX-MAD1 repressor associated with telomere lengthening. The most common structural abnormality generates simultaneous 3p loss and 5q gain (36% patients), typically through chromothripsis. This event occurs in childhood or adolescence, generally as the initiating event that precedes emergence of the tumor's most recent common ancestor by years to decades. Similar genomic changes drive inherited ccRCC. Modeling differences in age incidence between inherited and sporadic cancers suggests that the number of cells with 3p loss capable of initiating sporadic tumors is no more than a few hundred. Early development of ccRCC follows well-defined evolutionary trajectories, offering opportunity for early intervention.

    Funded by: Cancer Research UK: C14303/A17197, C50947/A18176; Medical Research Council; Wellcome Trust: WT088340MA

    Cell 2018;173;3;611-623.e17

  • IBD risk loci are enriched in multigenic regulatory modules encompassing putative causative genes.

    Momozawa Y, Dmitrieva J, Théâtre E, Deffontaine V, Rahmouni S, Charloteaux B, Crins F, Docampo E, Elansary M, Gori AS, Lecut C, Mariman R, Mni M, Oury C, Altukhov I, Alexeev D, Aulchenko Y, Amininejad L, Bouma G, Hoentjen F, Löwenberg M, Oldenburg B, Pierik MJ, Vander Meulen-de Jong AE, Janneke van der Woude C, Visschedijk MC, International IBD Genetics Consortium, Lathrop M, Hugot JP, Weersma RK, De Vos M, Franchimont D, Vermeire S, Kubo M, Louis E and Georges M

    Unit of Animal Genomics, WELBIO, GIGA-R & Faculty of Veterinary Medicine, University of Liège (B34), 1 Avenue de l'Hôpital, Liège, 4000, Belgium.

    GWAS have identified >200 risk loci for Inflammatory Bowel Disease (IBD). The majority of disease associations are known to be driven by regulatory variants. To identify the putative causative genes that are perturbed by these variants, we generate a large transcriptome data set (nine disease-relevant cell types) and identify 23,650 cis-eQTL. We show that these are determined by ∼9720 regulatory modules, of which ∼3000 operate in multiple tissues and ∼970 on multiple genes. We identify regulatory modules that drive the disease association for 63 of the 200 risk loci, and show that these are enriched in multigenic modules. Based on these analyses, we resequence 45 of the corresponding 100 candidate genes in 6600 Crohn disease (CD) cases and 5500 controls, and show with burden tests that they include likely causative genes. Our analyses indicate that ≥10-fold larger sample sizes will be required to demonstrate the causality of individual genes using this approach.

    Funded by: NIDDK NIH HHS: K08 DK110415, U01 DK062431

    Nature communications 2018;9;1;2427

  • Novel <i>GPR34</i> and <i>CCR6</i> mutation and distinct genetic profiles in MALT lymphomas of different sites.

    Moody S, Thompson JS, Chuang SS, Liu H, Raderer M, Vassiliou G, Wlodarska I, Wu F, Cogliatti S, Robson A, Ashton-Key M, Bi Y, Goodlad J and Du MQ

    Division of Cellular and Molecular Pathology, Department of Pathology, University of Cambridge, UK.

    Mucosa-associated lymphoid tissue (MALT) lymphoma originates from a background of diverse chronic inflammatory disorders at various anatomic sites. The genetics underlying its development, particularly in those associated with autoimmune disorders, is poorly characterized. By whole exome sequencing of 21 cases of MALT lymphomas of the salivary gland and thyroid, we have identified recurrent somatic mutations in 2 G-protein coupled receptors (<i>GPR34</i> and <i>CCR6</i>) not previously reported in human malignancies, 3 genes (<i>PIK3CD</i>, <i>TET2</i>, <i>TNFRSF14</i>) not previously implicated in MALT lymphoma, and a further 2 genes (<i>TBL1XR1</i>, <i>NOTCH1</i>) recently described in MALT lymphoma. The majority of mutations in <i>GPR34</i> and <i>CCR6</i> were nonsense and frameshift changes clustered in the C-terminal cytoplasmic tail, and would result in truncated proteins that lack the phosphorylation motif important for β-arrestin-mediated receptor desensitization and internalization. Screening of these newly identified mutations, together with previously defined genetic changes, revealed distinct mutation profiles in MALT lymphoma of various sites, with those of salivary gland characterized by frequent <i>TBL1XR1</i> and <i>GPR34</i> mutations, thyroid by frequent <i>TET2</i>, <i>TNFRSF14</i> and <i>PIK3CD</i> mutations, and ocular adnexa by frequent <i>TNFAIP3</i> mutation. Interestingly, in MALT lymphoma of the salivary gland, there was a significant positive association between <i>TBL1XR1</i> mutation and <i>GPR34</i> mutation/translocation (<i>P</i>=0.0002). In those of ocular adnexa, <i>TBL1XR1</i> mutation was mutually exclusive from <i>TNFAIP3</i> mutation (<i>P</i>=0.049), but significantly associated with IGHV3-23 usage (<i>P</i>=0.03) and <i>PIK3CD</i> mutation (<i>P</i>=0.009). These findings unravel novel insights into the molecular mechanisms of MALT lymphoma and provide further evidence for potential oncogenic co-operation between receptor signaling and genetic changes.

    Funded by: Medical Research Council: MC_PC_12009

    Haematologica 2018;103;8;1329-1336

  • Dynamics of cholera epidemics from Benin to Mauritania.

    Moore S, Dongdem AZ, Opare D, Cottavoz P, Fookes M, Sadji AY, Dzotsi E, Dogbe M, Jeddi F, Bidjada B, Piarroux M, Valentin OT, Glèlè CK, Rebaudet S, Sow AG, Constantin de Magny G, Koivogui L, Dunoyer J, Bellet F, Garnotel E, Thomson N and Piarroux R

    Department of Parasitology, Aix-Marseille University/UMR MD3, Marseille, France.

    Background: The countries of West Africa are largely portrayed as cholera endemic, although the dynamics of outbreaks in this region of Africa remain largely unclear.

    Methodology/principal findings: To understand the dynamics of cholera in a major portion of West Africa, we analyzed cholera epidemics from 2009 to 2015 from Benin to Mauritania. We conducted a series of field visits as well as multilocus variable tandem repeat analysis and whole-genome sequencing analysis of V. cholerae isolates throughout the study region. During this period, Ghana accounted for 52% of the reported cases in the entire study region (coastal countries from Benin to Mauritania). From 2009 to 2015, we found that one major wave of cholera outbreaks spread from Accra in 2011 northwestward to Sierra Leone and Guinea in 2012. Molecular epidemiology analysis confirmed that the 2011 Ghanaian isolates were related to those that seeded the 2012 epidemics in Guinea and Sierra Leone. Interestingly, we found that many countries deemed "cholera endemic" actually suffered very few outbreaks, with multi-year lulls.

    Conclusions/significance: This study provides the first cohesive vision of the dynamics of cholera epidemics in a major portion of West Africa. This epidemiological overview shows that from 2009 to 2015, at least 54% of reported cases concerned populations living in the three urban areas of Accra, Freetown, and Conakry. These findings may serve as a guide to better target cholera prevention and control efforts in the identified cholera hotspots in West Africa.

    PLoS neglected tropical diseases 2018;12;4;e0006379

  • HUWE1 variants cause dominant X-linked intellectual disability: a clinical study of 21 patients.

    Moortgat S, Berland S, Aukrust I, Maystadt I, Baker L, Benoit V, Caro-Llopis A, Cooper NS, Debray FG, Faivre L, Gardeitchik T, Haukanes BI, Houge G, Kivuva E, Martinez F, Mehta SG, Nassogne MC, Powell-Hamilton N, Pfundt R, Rosello M, Prescott T, Vasudevan P, van Loon B, Verellen-Dumoulin C, Verloes A, Lippe CV, Wakeling E, Wilkie AOM, Wilson L, Yuen A, Study D, Low KJ and Newbury-Ecob RA

    Centre de Génétique Humaine, Institut de Pathologie et de Génétique, Charleroi (Gosselies), Belgium.

    Whole-gene duplications and missense variants in the HUWE1 gene (NM_031407.6) have been reported in association with intellectual disability (ID). Increased gene dosage has been observed in males with non-syndromic mild to moderate ID with speech delay. Missense variants reported previously appear to be associated with severe ID in males and mild or no ID in obligate carrier females. Here, we report the largest cohort of patients with HUWE1 variants, consisting of 14 females and 7 males, with 15 different missense variants and one splice site variant. Clinical assessment identified common clinical features consisting of moderate to profound ID, delayed or absent speech, short stature with small hands and feet and facial dysmorphism consisting of a broad nasal tip, deep set eyes, epicanthic folds, short palpebral fissures, and a short philtrum. We describe for the first time that females can be severely affected, despite preferential inactivation of the affected X chromosome. Three females with the c.329G>A p.Arg110Gln variant present with a phenotype of mild ID, specific facial features, scoliosis and craniosynostosis, as reported previously in a single patient. In these females, the X inactivation pattern appeared skewed in favour of the affected transcript. In summary, HUWE1 missense variants may cause syndromic ID in both males and females.

    Funded by: Department of Health; Wellcome Trust: WT098051

    European journal of human genetics : EJHG 2018;26;1;64-74

  • Genomic survey of Clostridium difficile reservoirs in the East of England implicates environmental contamination of wastewater treatment plants by clinical lineages.

    Moradigaravand D, Gouliouris T, Ludden C, Reuter S, Jamrozy D, Blane B, Naydenova P, Judge K, Aliyu SH, Hadjirin NF, Holmes MA, Török E, Brown NM, Parkhill J and Peacock S

    Wellcome Trust Sanger Institute, Hinxton, UK.

    There is growing evidence that patients with Clostridium difficile-associated diarrhoea often acquire their infecting strain before hospital admission. Wastewater is known to be a potential source of surface water that is contaminated with C. difficile spores. Here, we describe a study that used genome sequencing to compare C. difficile isolated from multiple wastewater treatment plants across the East of England and from patients with clinical disease at a major hospital in the same region. We confirmed that C. difficile from 65 patients were highly diverse and that most cases were not linked to other active cases in the hospital. In total, 186 C. difficile isolates were isolated from effluent water obtained from 18 municipal treatment plants at the point of release into the environment. Whole genome comparisons of clinical and environmental isolates demonstrated highly related populations, and confirmed extensive release of toxigenic C. difficile into surface waters. An analysis based on multilocus sequence types (STs) identified 19 distinct STs in the clinical collection and 38 STs in the wastewater collection, with 13 of 44 STs common to both clinical and wastewater collections. Furthermore, we identified five pairs of highly similar isolates (≤2 SNPs different in the core genome) in clinical and wastewater collections. Strategies to control community acquisition should consider the need for bacterial control of treated wastewater.

    Microbial genomics 2018

  • Prediction of antibiotic resistance in Escherichia coli from large-scale pan-genome data.

    Moradigaravand D, Palm M, Farewell A, Mustonen V, Warringer J and Parts L

    Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom.

    The emergence of microbial antibiotic resistance is a global health threat. In clinical settings, the key to controlling spread of resistant strains is accurate and rapid detection. As traditional culture-based methods are time consuming, genetic approaches have recently been developed for this task. The detection of antibiotic resistance is typically made by measuring a few known determinants previously identified from genome sequencing, and thus requires the prior knowledge of its biological mechanisms. To overcome this limitation, we employed machine learning models to predict resistance to 11 compounds across four classes of antibiotics from existing and novel whole genome sequences of 1936 E. coli strains. We considered a range of methods, and examined population structure, isolation year, gene content, and polymorphism information as predictors. Gradient boosted decision trees consistently outperformed alternative models with an average accuracy of 0.91 on held-out data (range 0.81-0.97). While the best models most frequently employed gene content, an average accuracy score of 0.79 could be obtained using population structure information alone. Single nucleotide variation data were less useful, and significantly improved prediction only for two antibiotics, including ciprofloxacin. These results demonstrate that antibiotic resistance in E. coli can be accurately predicted from whole genome sequences without a priori knowledge of mechanisms, and that both genomic and epidemiological data can be informative. This paves way to integrating machine learning approaches into diagnostic tools in the clinic.

    PLoS computational biology 2018;14;12;e1006258

  • Terminal Pleistocene Alaskan genome reveals first founding population of Native Americans.

    Moreno-Mayar JV, Potter BA, Vinner L, Steinrücken M, Rasmussen S, Terhorst J, Kamm JA, Albrechtsen A, Malaspinas AS, Sikora M, Reuther JD, Irish JD, Malhi RS, Orlando L, Song YS, Nielsen R, Meltzer DJ and Willerslev E

    Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, 1350 Copenhagen, Denmark.

    Despite broad agreement that the Americas were initially populated via Beringia, the land bridge that connected far northeast Asia with northwestern North America during the Pleistocene epoch, when and how the peopling of the Americas occurred remains unresolved. Analyses of human remains from Late Pleistocene Alaska are important to resolving the timing and dispersal of these populations. The remains of two infants were recovered at Upward Sun River (USR), and have been dated to around 11.5 thousand years ago (ka). Here, by sequencing the USR1 genome to an average coverage of approximately 17 times, we show that USR1 is most closely related to Native Americans, but falls basal to all previously sequenced contemporary and ancient Native Americans. As such, USR1 represents a distinct Ancient Beringian population. Using demographic modelling, we infer that the Ancient Beringian population and ancestors of other Native Americans descended from a single founding population that initially split from East Asians around 36 ± 1.5 ka, with gene flow persisting until around 25 ± 1.1 ka. Gene flow from ancient north Eurasians into all Native Americans took place 25-20 ka, with Ancient Beringians branching off around 22-18.1 ka. Our findings support a long-term genetic structure in ancestral Native Americans, consistent with the Beringian 'standstill model'. We show that the basal northern and southern Native American branches, to which all other Native Americans belong, diverged around 17.5-14.6 ka, and that this probably occurred south of the North American ice sheets. We also show that after 11.5 ka, some of the northern Native American populations received gene flow from a Siberian population most closely related to Koryaks, but not Palaeo-Eskimos, Inuits or Kets, and that Native American gene flow into Inuits was through northern and not southern Native American groups. Our findings further suggest that the far-northern North American presence of northern Native Americans is from a back migration that replaced or absorbed the initial founding population of Ancient Beringians.

    Funded by: NIGMS NIH HHS: R01 GM094402

    Nature 2018;553;7687;203-207

  • Early human dispersals within the Americas.

    Moreno-Mayar JV, Vinner L, de Barros Damgaard P, de la Fuente C, Chan J, Spence JP, Allentoft ME, Vimala T, Racimo F, Pinotti T, Rasmussen S, Margaryan A, Iraeta Orbegozo M, Mylopotamitaki D, Wooller M, Bataille C, Becerra-Valdivia L, Chivall D, Comeskey D, Devièse T, Grayson DK, George L, Harry H, Alexandersen V, Primeau C, Erlandson J, Rodrigues-Carvalho C, Reis S, Bastos MQR, Cybulski J, Vullo C, Morello F, Vilar M, Wells S, Gregersen K, Hansen KL, Lynnerup N, Mirazón Lahr M, Kjær K, Strauss A, Alfonso-Durruty M, Salas A, Schroeder H, Higham T, Malhi RS, Rasic JT, Souza L, Santos FR, Malaspinas AS, Sikora M, Nielsen R, Song YS, Meltzer DJ and Willerslev E

    Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, 1350 Copenhagen, Denmark.

    Studies of the peopling of the Americas have focused on the timing and number of initial migrations. Less attention has been paid to the subsequent spread of people within the Americas. We sequenced 15 ancient human genomes spanning from Alaska to Patagonia; six are ≥10,000 years old (up to ~18× coverage). All are most closely related to Native Americans, including those from an Ancient Beringian individual and two morphologically distinct "Paleoamericans." We found evidence of rapid dispersal and early diversification that included previously unknown groups as people moved south. This resulted in multiple independent, geographically uneven migrations, including one that provides clues of a Late Pleistocene Australasian genetic signal, as well as a later Mesoamerican-related expansion. These led to complex and dynamic population histories from North to South America.

    Funded by: European Research Council; NIGMS NIH HHS: R01 GM094402

    Science (New York, N.Y.) 2018;362;6419

  • CpG island composition differences are a source of gene expression noise indicative of promoter responsiveness.

    Morgan MD and Marioni JC

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Background: Population phenotypic variation can arise from genetic differences between individuals, or from cellular heterogeneity in an isogenic group of cells or organisms. The emergence of gene expression differences between genetically identical cells is referred to as gene expression noise, the sources of which are not well understood.

    Results: In this work, by studying gene expression noise between multiple cell lineages and mammalian species, we find consistent evidence of a role for CpG islands as sources of gene expression noise. Variation in noise among CpG island promoters can be partially attributed to differences in island size, in which short islands have noisier gene expression. Building on these findings, we investigate the potential for short CpG islands to act as fast response elements to environmental stimuli. Specifically, we find that these islands are enriched amongst primary response genes in SWI/SNF-independent stimuli, suggesting that expression noise is an indicator of promoter responsiveness.

    Conclusions: Thus, through the integration of single-cell RNA expression profiling, chromatin landscape and temporal gene expression dynamics, we have uncovered a role for short CpG island promoters as fast response elements.

    Funded by: Wellcome Trust: 105045/Z/14/Z

    Genome biology 2018;19;1;81

  • An atlas of genetic influences on osteoporosis in humans and mice.

    Morris JA, Kemp JP, Youlten SE, Laurent L, Logan JG, Chai RC, Vulpescu NA, Forgetta V, Kleinman A, Mohanty ST, Sergio CM, Quinn J, Nguyen-Yamamoto L, Luco AL, Vijay J, Simon MM, Pramatarova A, Medina-Gomez C, Trajanoska K, Ghirardello EJ, Butterfield NC, Curry KF, Leitch VD, Sparkes PC, Adoum AT, Mannan NS, Komla-Ebri DSK, Pollard AS, Dewhurst HF, Hassall TAD, Beltejar MG, 23andMe Research Team, Adams DJ, Vaillancourt SM, Kaptoge S, Baldock P, Cooper C, Reeve J, Ntzani EE, Evangelou E, Ohlsson C, Karasik D, Rivadeneira F, Kiel DP, Tobias JH, Gregson CL, Harvey NC, Grundberg E, Goltzman D, Adams DJ, Lelliott CJ, Hinds DA, Ackert-Bicknell CL, Hsu YH, Maurano MT, Croucher PI, Williams GR, Bassett JHD, Evans DM and Richards JB

    Department of Human Genetics, McGill University, Montréal, Québec, Canada.

    Osteoporosis is a common aging-related disease diagnosed primarily using bone mineral density (BMD). We assessed genetic determinants of BMD as estimated by heel quantitative ultrasound in 426,824 individuals, identifying 518 genome-wide significant loci (301 novel), explaining 20% of its variance. We identified 13 bone fracture loci, all associated with estimated BMD (eBMD), in ~1.2 million individuals. We then identified target genes enriched for genes known to influence bone density and strength (maximum odds ratio (OR) = 58, P = 1 × 10<sup>-75</sup>) from cell-specific features, including chromatin conformation and accessible chromatin sites. We next performed rapid-throughput skeletal phenotyping of 126 knockout mice with disruptions in predicted target genes and found an increased abnormal skeletal phenotype frequency compared to 526 unselected lines (P < 0.0001). In-depth analysis of one gene, DAAM2, showed a disproportionate decrease in bone strength relative to mineralization. This genetic atlas provides evidence linking associated SNPs to causal genes, offers new insight into osteoporosis pathophysiology, and highlights opportunities for drug development.

    Funded by: NIAMS NIH HHS: R01 AR041398, R01 AR063702, R01 AR072199, R21 AR060981; NIGMS NIH HHS: R35 GM119703

    Nature genetics 2018

  • CRISPR/Cas9 editing in human pluripotent stem cell-cardiomyocytes highlights arrhythmias, hypocontractility, and energy depletion as potential therapeutic targets for hypertrophic cardiomyopathy.

    Mosqueira D, Mannhardt I, Bhagwan JR, Lis-Slimak K, Katili P, Scott E, Hassan M, Prondzynski M, Harmer SC, Tinker A, Smith JGW, Carrier L, Williams PM, Gaffney D, Eschenhagen T, Hansen A and Denning C

    Department of Stem Cell Biology, Centre of Biomolecular Sciences, University of Nottingham, NG7 2RD, UK.

    Aims: Sarcomeric gene mutations frequently underlie hypertrophic cardiomyopathy (HCM), a prevalent and complex condition leading to left ventricle thickening and heart dysfunction. We evaluated isogenic genome-edited human pluripotent stem cell-cardiomyocytes (hPSC-CM) for their validity to model, and add clarity to, HCM.

    Methods and results: CRISPR/Cas9 editing produced 11 variants of the HCM-causing mutation c.C9123T-MYH7 [p.R453C-β-myosin heavy chain (MHC)] in 3 independent hPSC lines. Isogenic sets were differentiated to hPSC-CMs for high-throughput, non-subjective molecular and functional assessment using 12 approaches in 2D monolayers and/or 3D engineered heart tissues. Although immature, edited hPSC-CMs exhibited the main hallmarks of HCM (hypertrophy, multi-nucleation, hypertrophic marker expression, sarcomeric disarray). Functional evaluation supported the energy depletion model due to higher metabolic respiration activity, accompanied by abnormalities in calcium handling, arrhythmias, and contraction force. Partial phenotypic rescue was achieved with ranolazine but not omecamtiv mecarbil, while RNAseq highlighted potentially novel molecular targets.

    Conclusion: Our holistic and comprehensive approach showed that energy depletion affected core cardiomyocyte functionality. The engineered R453C-βMHC-mutation triggered compensatory responses in hPSC-CMs, causing increased ATP production and αMHC to energy-efficient βMHC switching. We showed that pharmacological rescue of arrhythmias was possible, while MYH7:MYH6 and mutant:wild-type MYH7 ratios may be diagnostic, and previously undescribed lncRNAs and gene modifiers are suggestive of new mechanisms.

    European heart journal 2018

  • Transplantation of <i>schistosome</i> sporocysts between host snails: A video guide.

    Mouahid G, Rognon A, de Carvalho Augusto R, Driguez P, Geyer K, Karinshak S, Luviano N, Mann V, Quack T, Rawlinson K, Wendt G, Grunau C and Moné H

    Laboratoire Interactions Hôtes-Pathogènes-Environnements (IHPE), UMR 5244 CNRS/UPVD/IFREMER/UM, University of Perpignan Via Domitia, 58 Avenue Paul Alduy, Bât R, F-66860 Perpignan Cedex, France.

    Schistosomiasis is an important parasitic disease, affecting roughly 200 million people worldwide. The causative agents are different <i>Schistosoma</i> species. Schistosomes have a complex life cycle, with a freshwater snail as intermediate host. After infection, sporocysts develop inside the snail host and give rise to human-dwelling larvae. We present here a detailed step-by-step video instruction in English, French, Spanish and Portuguese that shows how these sporocysts can be manipulated and transferred from one snail to another. This procedure provides a technical basis for different types of <i>ex vivo</i> modifications, such as those used in functional genomics studies.

    Wellcome open research 2018;3;3

  • Proteomic profiling of the brain of mice with experimental cerebral malaria.

    Moussa E, Huang H, Ahras M, Lall A, Thezenas ML, Fischer R, Kessler BM, Pain A, Billker O and Casals-Pascual C

    Wellcome Trust Centre for Human Genetics, Oxford, UK; King Abdullah University of Science and Technology, Saudi Arabia.

    Cerebral malaria (CM) is a severe neurological complication of malaria infection in both adults and children. In pursuit of effective treatment of CM, clinical studies, postmortem analysis and animal models have been employed to understand the pathology and identify effective interventions. In this study, a shotgun proteomics analysis was conducted to profile the proteomic signature of the brain tissue of mice with experimental cerebral malaria (ECM) in order to further understand the underlying pathology. To identify the CM-associated response, proteomic signatures of the brains of C57/Bl6N mice infected with P. berghei ANKA that developed neurological syndrome were compared to those of mice infected with P. berghei NK65 that developed equally high parasite burdens without neurological signs, and to those of non-infected mice. The results show that the CM-associated response in mice that developed neurological signs comprises mainly acute-phase reaction and coagulation cascade activation, and indicate the leakage of plasma proteins into the brain parenchyma.

    Significance: Cerebral malaria (CM) remains a major cause of death in children. The majority of these deaths occur in sub-Saharan Africa. Even with adequate access to treatment, mortality remains high and neurological sequelae can be found in up to 20% of survivors. No adjuvant treatment to date has been shown to reduce mortality and the pathophysiology of CM is largely unknown. Experimental cerebral malaria (ECM) is a well-established model that may help to identify and test druggable targets. In this study we have identified the disruption of the blood-brain barrier following inflammatory and vascular injury as a mechanism of disease. We also report a number of proteins that could be validated as potential biomarkers of ECM. Further studies will be required to validate the clinical relevance of these biomarkers in human CM.

    Funded by: Medical Research Council: G0701885

    Journal of proteomics 2018;180;61-69

  • Genetic Association in the HLA Region.

    Moutsianas L and Gutierrez-Achury J

    The Wellcome Trust Sanger Institute, Cambridgeshire, UK.

    The MHC/HLA region has been consistently associated with a large number of complex traits, including but not limited to, most immune-mediated ones. Efforts to pinpoint drivers of this commonly encountered association peak at the short arm of chromosome 6, however, have been challenging, owing to the high density of genes and the long and extended linkage disequilibrium that are characteristic of this region. The development of methods to impute classical HLA alleles and amino acids from SNP genotyping data has offered an important additional layer of information to the investigators seeking to fine map the signal in the region. As a result, imputation-aided association analyses are now typically employed to shed light on the relationship of this locus with disease susceptibility and response to drugs. In this chapter we discuss how the signal in the HLA region can be interrogated in practice, from performing the imputation to understanding its output and to incorporating it into downstream analysis. In addition, we recount some of the analytical approaches that are commonly used and suggest ways in which the findings from such imputation-aided analyses can be interpreted.

    Methods in molecular biology (Clifton, N.J.) 2018;1793;111-134

  • The International Mouse Phenotyping Consortium (IMPC): a functional catalogue of the mammalian genome that informs conservation.

    Muñoz-Fuentes V, Cacheiro P, Meehan TF, Aguilar-Pimentel JA, Brown SDM, Flenniken AM, Flicek P, Galli A, Mashhadi HH, Hrabě de Angelis M, Kim JK, Lloyd KCK, McKerlie C, Morgan H, Murray SA, Nutter LMJ, Reilly PT, Seavitt JR, Seong JK, Simon M, Wardle-Jones H, Mallon AM, Smedley D, Parkinson HE and IMPC consortium

    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD UK.

    The International Mouse Phenotyping Consortium (IMPC) is building a catalogue of mammalian gene function by producing and phenotyping a knockout mouse line for every protein-coding gene. To date, the IMPC has generated and characterised 5186 mutant lines. One-third of the lines have been found to be non-viable and over 300 new mouse models of human disease have been identified thus far. While current bioinformatics efforts are focused on translating results to better understand human disease processes, IMPC data also aid the understanding of genetic function and processes in other species. Here we show, using gorilla genomic data, how genes essential to development in mice can be used to help assess the potentially deleterious impact of gene variants in other species. This type of analysis could be used to select optimal breeders in endangered species to maintain or increase fitness and avoid variants associated with impaired-health phenotypes or loss-of-function mutations in genes of critical importance. We also show, using selected examples from various mammal species, how IMPC data can aid in the identification of candidate genes for studying a condition of interest, deliver information about the mechanisms involved, or support predictions for the function of genes that may play a role in adaptation. With genotyping costs decreasing and the continued improvements of bioinformatics tools, the analyses we demonstrate can be routinely applied.

    Funded by: NHGRI NIH HHS: U54 HG006332, U54 HG006348, U54 HG006364, U54 HG006370; NIH HHS: U42 OD011175, U42 OD011185, UM1 OD023221

    Conservation genetics (Print) 2018;19;4;995-1005

  • Evolutionary routes and KRAS dosage define pancreatic cancer phenotypes.

    Mueller S, Engleitner T, Maresch R, Zukowska M, Lange S, Kaltenbacher T, Konukiewitz B, Öllinger R, Zwiebel M, Strong A, Yen HY, Banerjee R, Louzada S, Fu B, Seidler B, Götzfried J, Schuck K, Hassan Z, Arbeiter A, Schönhuber N, Klein S, Veltkamp C, Friedrich M, Rad L, Barenboim M, Ziegenhain C, Hess J, Dovey OM, Eser S, Parekh S, Constantino-Casas F, de la Rosa J, Sierra MI, Fraga M, Mayerle J, Klöppel G, Cadiñanos J, Liu P, Vassiliou G, Weichert W, Steiger K, Enard W, Schmid RM, Yang F, Unger K, Schneider G, Varela I, Bradley A, Saur D and Rad R

    Center for Translational Cancer Research (TranslaTUM), Technische Universität München, 81675 Munich, Germany.

    The poor correlation of mutational landscapes with phenotypes limits our understanding of the pathogenesis and metastasis of pancreatic ductal adenocarcinoma (PDAC). Here we show that oncogenic dosage-variation has a critical role in PDAC biology and phenotypic diversification. We find an increase in gene dosage of mutant KRAS in human PDAC precursors, which drives both early tumorigenesis and metastasis and thus rationalizes early PDAC dissemination. To overcome the limitations posed to gene dosage studies by the stromal richness of PDAC, we have developed large cell culture resources of metastatic mouse PDAC. Integration of cell culture genomes, transcriptomes and tumour phenotypes with functional studies and human data reveals additional widespread effects of oncogenic dosage variation on cell morphology and plasticity, histopathology and clinical outcome, with the highest Kras<sup>MUT</sup> levels underlying aggressive undifferentiated phenotypes. We also identify alternative oncogenic gains (Myc, Yap1 or Nfkb2), which collaborate with heterozygous Kras<sup>MUT</sup> in driving tumorigenesis, but have lower metastatic potential. Mechanistically, different oncogenic gains and dosages evolve along distinct evolutionary routes, licensed by defined allelic states and/or combinations of hallmark tumour suppressor alterations (Cdkn2a, Trp53, Tgfβ-pathway). Thus, evolutionary constraints and contingencies direct oncogenic dosage gain and variation along defined routes to drive the early progression of PDAC and shape its downstream biology. Our study uncovers universal principles of Ras-driven oncogenesis that have potential relevance beyond pancreatic cancer.

    Funded by: European Research Council: 648521; Medical Research Council: MC_PC_12009

    Nature 2018;554;7690;62-68

  • H3Africa: current perspectives.

    Mulder N, Abimiku A, Adebamowo SN, de Vries J, Matimba A, Olowoyo P, Ramsay M, Skelton M and Stein DJ

    Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Cape Town, South Africa.

    Precision medicine is being enabled in high-income countries by the growing availability of health data, increasing knowledge of the genetic determinants of disease and variation in response to treatment (pharmacogenomics), and the decreasing costs of data generation, which promote routine application of genomic technologies in the health sector. However, there is uncertainty about the feasibility of applying precision medicine approaches in low- and middle-income countries, due to the lack of population-specific knowledge, skills, and resources. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive new research into the genetic and environmental basis for human diseases of relevance to Africans as well as to build capacity for genomic research on the continent. Precision medicine requires this capacity, in addition to reference data on local populations, and skills to analyze and interpret genomic data from the bedside. The H3Africa consortium is collectively processing samples and data for over 70,000 participants across the continent, accompanied in most cases by rich clinical information on a variety of non-communicable and infectious diseases. These projects are increasingly providing novel insights into the genetic basis of diseases in indigenous populations, insights that have the potential to drive the development of new diagnostics and treatments. The consortium has also invested significant resources into establishing high-quality biorepositories in Africa, a bioinformatic network, and a strong training program that has developed skills in genomic data analysis and interpretation among bioinformaticians, wet-lab researchers, and health-care professionals. Here, we describe the current perspectives of the H3Africa consortium and how it can contribute to making precision medicine in Africa a reality.

    Pharmacogenomics and personalized medicine 2018;11;59-66

  • Epidermal Tissue Adapts to Restrain Progenitors Carrying Clonal p53 Mutations.

    Murai K, Skrupskelyte G, Piedrafita G, Hall M, Kostiou V, Ong SH, Nagy T, Cagan A, Goulding D, Klein AM, Hall BA and Jones PH

    Wellcome Sanger Institute, Hinxton CB10 1SA, UK.

    Aging human tissues, such as sun-exposed epidermis, accumulate a high burden of progenitor cells that carry oncogenic mutations. However, most progenitors carrying such mutations colonize and persist in normal tissue without forming tumors. Here, we investigated tissue-level constraints on clonal progenitor behavior by inducing a single-allele p53 mutation (Trp53<sup>R245W</sup>; p53<sup>∗/wt</sup>), prevalent in normal human epidermis and squamous cell carcinoma, in transgenic mouse epidermis. p53<sup>∗/wt</sup> progenitors initially outcompeted wild-type cells due to enhanced proliferation, but subsequently reverted toward normal dynamics and homeostasis. Physiological doses of UV light accelerated short-term expansion of p53<sup>∗/wt</sup> clones, but their frequency decreased with protracted irradiation, possibly due to displacement by UV-induced mutant clones with higher competitive fitness. These results suggest multiple mechanisms restrain the proliferation of p53<sup>∗/wt</sup> progenitors, thereby maintaining epidermal integrity.

    Funded by: Medical Research Council: MC_UU_12022/3, MR/N501876/1; Wellcome Trust

    Cell stem cell 2018;23;5;687-699.e8

  • Chronic TNFα-driven injury delays cell migration to villi in the intestinal epithelium.

    Muraro D, Parker A, Vaux L, Filippi S, Almet AA, Fletcher AG, Watson AJM, Pin C, Maini PK and Byrne HM

    Wolfson Centre for Mathematical Biology, Mathematical Institute, University of Oxford, Oxford, UK.

    The intestinal epithelium is a single layer of cells which provides the first line of defence of the intestinal mucosa to bacterial infection. Cohesion of this physical barrier is supported by renewal of epithelial stem cells, residing in invaginations called crypts, and by crypt cell migration onto protrusions called villi; dysregulation of such mechanisms may render the gut susceptible to chronic inflammation. The impact that excessive or misplaced epithelial cell death may have on villus cell migration is currently unknown. We integrated cell-tracking methods with computational models to determine how epithelial homeostasis is affected by acute and chronic TNFα-driven epithelial cell death. Parameter inference reveals that acute inflammatory cell death has a transient effect on epithelial cell dynamics, whereas cell death caused by chronic elevated TNFα causes a delay in the accumulation of labelled cells onto the villus compared to the control. Such a delay may be reproduced by using a cell-based model to simulate the dynamics of each cell in a crypt-villus geometry, showing that a prolonged increase in cell death slows the migration of cells from the crypt to the villus. This investigation highlights which injuries (acute or chronic) may be regenerated and which cause disruption of healthy epithelial homeostasis.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/C513350/1

    Journal of the Royal Society, Interface 2018;15;145

  • Staphylococcus cornubiensis sp. nov., a member of the Staphylococcus intermedius Group (SIG).

    Murray AK, Lee J, Bendall R, Zhang L, Sunde M, Schau Slettemeås J, Gaze W, Page AJ and Vos M

    European Centre for Environment and Human Health, University of Exeter Medical School, University of Exeter, Penryn, UK.

    We here describe a novel species in the Staphylococcus intermedius Group (SIG) which is phenotypically similar to Staphylococcus pseudintermedius but is genomically distinct from it and other SIG members, with an average nucleotide identity of 90.2 % with its closest relative S. intermedius. The description of Staphylococcus cornubiensis sp. nov. is based on strain NW1<sup>T</sup> (=NCTC 13950<sup>T</sup>=DSM 105366<sup>T</sup>) isolated from a human skin infection in Cornwall, UK. Although pathogenic, NW1<sup>T</sup> carries no known virulence genes or mobilizable antibiotic resistance genes and further studies are required to assess the prevalence of this species in humans as well as its potential presence in companion animals.

    International journal of systematic and evolutionary microbiology 2018;68;11;3404-3408

  • Multiplex genomewide association analysis of breast milk fatty acid composition extends the phenotypic association and potential selection of <i>FADS1</i> variants to arachidonic acid, a critical infant micronutrient.

    Mychaleckyj JC, Nayak U, Colgate ER, Zhang D, Carstensen T, Ahmed S, Ahmed T, Mentzer AJ, Alam M, Kirkpatrick BD, Haque R, Faruque ASG and Petri WA

    Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, Virginia, USA.

    Background: Breast milk is the sole nutrition source during exclusive breastfeeding, and polyunsaturated fatty acids (FAs) are critical micronutrients in infant physical and cognitive development. There has been no prior genomewide association study of breast milk, hence our objective was to test for genetic association with breast milk FA composition.

    Methods: We measured the fractional composition of 26 individual FAs in breast milk samples from three cohorts totalling 1142 Bangladeshi mothers whose infants were genotyped on the Illumina MEGA chip and replicated on a custom Affymetrix 30K SNP array (n=616). Maternal genotypes were imputed using IMPUTE.

    Results: After running 33 separate FA fraction phenotypes, we found that SNPs known to be associated with serum FAs in the <i>FADS1/2/3</i> region were also associated with breast milk FA composition (experiment-wise significance threshold 4.2×10<sup>-9</sup>). Hypothesis-neutral comparison of the 33 fractions showed that the most significant genetic association at the <i>FADS1/2/3</i> locus was with fraction of arachidonic acid (AA) at SNP rs174556, with a very large per major allele effect size of 17% higher breast milk AA level. There was no evidence of independent association at <i>FADS1/2/3</i> with any other FA or SNP after conditioning on AA and rs174556. We also found novel significant experiment-wise SNP associations with: polyunsaturated fatty acid (PUFA) 6/PUFA3 ratio (sorting nexin <i>29</i>), eicosenoic (intergenic) and capric (component of oligomeric Golgi complex 3) acids; and six additional loci at genomewide significance (<5×10<sup>-8</sup>).

    Conclusions: AA is the primary FA in breast milk influenced by genetic variation at the <i>FADS1/2/3</i> locus, extending the potential phenotypes under genetic selection to include breast milk composition, thereby possibly affecting infant growth or cognition. Breast milk FA composition is influenced by maternal genetics in addition to diet and body composition.

    Journal of medical genetics 2018

  • Circulating tumor DNA in patients with colorectal adenomas: assessment of detectability and genetic heterogeneity.

    Myint NNM, Verma AM, Fernandez-Garcia D, Sarmah P, Tarpey PS, Al-Aqbi SS, Cai H, Trigg R, West K, Howells LM, Thomas A, Brown K, Guttery DS, Singh B, Pringle HJ, McDermott U, Shaw JA and Rufini A

    Leicester Cancer Research Centre, University of Leicester, Leicester, LE2 7LX, UK.

    Improving early detection of colorectal cancer (CRC) is a key public health priority as adenomas and stage I cancer can be treated with minimally invasive procedures. Population screening strategies based on detection of occult blood in the feces have contributed to enhance detection rates of localized disease, but new approaches based on genetic analyses able to increase specificity and sensitivity could provide additional advantages compared to current screening methodologies. Recently, circulating cell-free DNA (cfDNA) has received much attention as a cancer biomarker for its ability to monitor the progression of advanced disease, predict tumor recurrence and reflect the complex genetic heterogeneity of cancers. Here, we tested whether analysis of cfDNA is a viable tool to enhance detection of colon adenomas. To address this, we assessed a cohort of patients with adenomas and healthy controls using droplet digital PCR (ddPCR) and mutation-specific assays targeted to trunk mutations. Additionally, we performed multiregional, targeted next-generation sequencing (NGS) of adenomas and unmasked extensive heterogeneity, affecting known drivers such as APC, KRAS and mismatch repair (MMR) genes. However, tumor-related mutations were undetectable in patients' plasma. Finally, we employed a preclinical mouse model of Apc-driven intestinal adenomas and confirmed the inability to identify tumor-related alterations via cfDNA, despite the enhanced disease burden displayed by this experimental cancer model. Therefore, we conclude that benign colon lesions display extensive genetic heterogeneity, that they are not prone to release DNA into the circulation and are unlikely to be reliably detected with liquid biopsies, at least with the current technologies.

    Cell death & disease 2018;9;9;894

  • Clonal analysis of Salmonella-specific effector T cells reveals serovar-specific and cross-reactive T cell responses.

    Napolitani G, Kurupati P, Teng KWW, Gibani MM, Rei M, Aulicino A, Preciado-Llanes L, Wong MT, Becht E, Howson L, de Haas P, Salio M, Blohmke CJ, Olsen LR, Pinto DMS, Scifo L, Jones C, Dobinson H, Campbell D, Juel HB, Thomaides-Brears H, Pickard D, Bumann D, Baker S, Dougan G, Simmons A, Gordon MA, Newell EW, Pollard AJ and Cerundolo V

    MRC Human Immunology Unit, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK.

    To tackle the complexity of cross-reactive and pathogen-specific T cell responses against related Salmonella serovars, we used mass cytometry, unbiased single-cell cloning, live fluorescence barcoding, and T cell-receptor sequencing to reconstruct the Salmonella-specific repertoire of circulating effector CD4<sup>+</sup> T cells, isolated from volunteers challenged with Salmonella enterica serovar Typhi (S. Typhi) or Salmonella Paratyphi A (S. Paratyphi). We describe the expansion of cross-reactive responses against distantly related Salmonella serovars and of clonotypes recognizing immunodominant antigens uniquely expressed by S. Typhi or S. Paratyphi A. In addition, single-amino acid variations in two immunodominant proteins, CdtB and PhoN, lead to the accumulation of T cells that do not cross-react against the different serovars, thus demonstrating how minor sequence variations in a complex microorganism shape the pathogen-specific T cell repertoire. Our results identify immune-dominant, serovar-specific, and cross-reactive T cell antigens, which should aid in the design of T cell-vaccination strategies against Salmonella.

    Funded by: Department of Health: NIHR-RP-R3-12-026; Medical Research Council: MC_UU_12010/7

    Nature immunology 2018;19;7;742-754

  • Cotranslational protein assembly imposes evolutionary constraints on homomeric proteins.

    Natan E, Endoh T, Haim-Vilmovsky L, Flock T, Chalancon G, Hopper JTS, Kintses B, Horvath P, Daruka L, Fekete G, Pál C, Papp B, Oszi E, Magyar Z, Marsh JA, Elcock AH, Babu MM, Robinson CV, Sugimoto N and Teichmann SA

    The Aleph Lab Ltd, Oxford, UK.

    Cotranslational protein folding can facilitate rapid formation of functional structures. However, it can also cause premature assembly of protein complexes, if two interacting nascent chains are in close proximity. By analyzing known protein structures, we show that homomeric protein contacts are enriched toward the C termini of polypeptide chains across diverse proteomes. We hypothesize that this is the result of evolutionary constraints for folding to occur before assembly. Using high-throughput imaging of protein homomers in Escherichia coli and engineered protein constructs with N- and C-terminal oligomerization domains, we show that, indeed, proteins with C-terminal homomeric interface residues consistently assemble more efficiently than those with N-terminal interface residues. Using in vivo, in vitro and in silico experiments, we identify features that govern successful assembly of homomers, which have implications for protein design and expression optimization.

    Nature structural & molecular biology 2018

  • Novel read density distribution score shows possible aligner artefacts, when mapping a single chromosome.

    Naumenko FM, Abnizova II, Beka N, Genaev MA and Orlov YL

    Novosibirsk State University, Pirogova, 1, Novosibirsk, 630090, Russia.

    Background: The use of artificial data to evaluate the performance of aligners and peak callers not only improves its accuracy and reliability, but also makes it possible to reduce the computational time. One of the natural ways to achieve such time reduction is by mapping a single chromosome.

    Results: We investigated whether single chromosome mapping causes any artefacts in the aligners' performance. In this paper, we compared the accuracy of seven aligners on well-controlled simulated benchmark data which was sampled from a single chromosome and also from a whole genome. We found that commonly used statistical methods are insufficient to evaluate an aligner's performance, and applied a novel measure of read density distribution similarity, which allowed us to reveal artefacts in the aligners' performance. We also calculated some interesting mismatch statistics, and constructed mismatch frequency distributions along the read.

    Conclusions: The generation of artificial data by mapping of reads generated from a single chro