Sanger Institute - Publications 2016
Number of papers published in 2016: 678
Whole-Genome Sequencing for Routine Pathogen Surveillance in Public Health: a Population Snapshot of Invasive Staphylococcus aureus in Europe.
Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, United Kingdom; The Centre for Genomic Pathogen Surveillance, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom.
The implementation of routine whole-genome sequencing (WGS) promises to transform our ability to monitor the emergence and spread of bacterial pathogens. Here we combined WGS data from 308 invasive Staphylococcus aureus isolates corresponding to a pan-European population snapshot, with epidemiological and resistance data. Geospatial visualization of the data is made possible by a generic software tool designed for public health purposes that is available at the project URL (http://www.microreact.org/project/EkUvg9uY?tt=rc). Our analysis demonstrates that high-risk clones can be identified on the basis of population level properties such as clonal relatedness, abundance, and spatial structuring and by inferring virulence and resistance properties on the basis of gene content. We also show that in silico predictions of antibiotic resistance profiles are at least as reliable as phenotypic testing. We argue that this work provides a comprehensive road map illustrating the three vital components for future molecular epidemiological surveillance: (i) large-scale structured surveys, (ii) WGS, and (iii) community-oriented database infrastructure and analysis tools.
Importance: The spread of antibiotic-resistant bacteria is a public health emergency of global concern, threatening medical intervention at every level of health care delivery. Several recent studies have demonstrated the promise of routine whole-genome sequencing (WGS) of bacterial pathogens for epidemiological surveillance, outbreak detection, and infection control. However, as this technology becomes more widely adopted, the key challenges of generating representative national and international data sets and the development of bioinformatic tools to manage and interpret the data become increasingly pertinent. This study provides a road map for the integration of WGS data into routine pathogen surveillance. We emphasize the importance of large-scale routine surveys to provide the population context for more targeted or localized investigation and the development of open-access bioinformatic tools to provide the means to combine and compare independently generated data with publicly available data sets.
Funded by: Medical Research Council: G1000803; Wellcome Trust: 089472, 098051, 099202
Genomic prediction of coronary heart disease.
Centre for Systems Genomics, School of BioSciences, The University of Melbourne, Parkville, Victoria 3010, Australia; Department of Pathology, The University of Melbourne, Parkville, Victoria 3010, Australia.
Aims: Genetics plays an important role in coronary heart disease (CHD) but the clinical utility of genomic risk scores (GRSs) relative to clinical risk scores, such as the Framingham Risk Score (FRS), is unclear. Our aim was to construct and externally validate a CHD GRS, in terms of lifetime CHD risk and relative to traditional clinical risk scores.
Methods and results: We generated a GRS of 49 310 SNPs based on a CARDIoGRAMplusC4D Consortium meta-analysis of CHD, then independently tested it using five prospective population cohorts (three FINRISK cohorts, combined n = 12 676, 757 incident CHD events; two Framingham Heart Study cohorts (FHS), combined n = 3406, 587 incident CHD events). The GRS was associated with incident CHD (FINRISK HR = 1.74, 95% confidence interval (CI) 1.61-1.86 per S.D. of GRS; Framingham HR = 1.28, 95% CI 1.18-1.38), and was largely unchanged by adjustment for known risk factors, including family history. Integration of the GRS with the FRS or ACC/AHA13 scores improved the 10-year risk prediction (meta-analysis C-index: +1.5-1.6%, P < 0.001), particularly for individuals ≥60 years old (meta-analysis C-index: +4.6-5.1%, P < 0.001). Importantly, the GRS captured substantially different trajectories of absolute risk, with men in the top 20% of the GRS attaining 10% cumulative CHD risk 12-18 y earlier than those in the bottom 20%. High genomic risk was partially compensated for by low systolic blood pressure, low cholesterol level, and non-smoking.
Conclusions: A GRS based on a large number of SNPs improves CHD risk prediction and encodes different trajectories of lifetime risk not captured by traditional clinical risk scores.
European heart journal 2016
αv Integrins combine with LC3 and atg5 to regulate Toll-like receptor signalling in B cells.
Immunology Program, Benaroya Research Institute, 1201 Ninth Avenue, Seattle, Washington 98101, USA.
Integrin signalling triggers cytoskeletal rearrangements, including endocytosis and exocytosis of integrins and other membrane proteins. In addition to recycling integrins, this trafficking can also regulate intracellular signalling pathways. Here we describe a role for αv integrins in regulating Toll-like receptor (TLR) signalling by modulating intracellular trafficking. We show that deletion of αv or β3 causes increased B-cell responses to TLR stimulation in vitro, and αv-conditional knockout mice have elevated antibody responses to TLR-ligand-associated antigens. αv regulates TLR signalling by promoting recruitment of the autophagy component LC3 (microtubule-associated proteins 1 light chain 3) to TLR-containing endosomes, which is essential for progression from NF-κB to IRF signalling, and ultimately for traffic to lysosomes where signalling is terminated. Disruption of LC3 recruitment leads to prolonged NF-κB signalling and increased B-cell proliferation and antibody production. This work identifies a previously unrecognized role for αv and the autophagy components LC3 and atg5 in regulating TLR signalling and B-cell immunity.
Funded by: NIDDK NIH HHS: R01 DK093695
Nature communications 2016;7;10917
G9a inhibition potentiates the anti-tumour activity of DNA double-strand break inducing agents by impairing DNA repair independent of p53 status.
The Wellcome Trust/Cancer Research UK Gurdon Institute and Department of Biochemistry, University of Cambridge, Cambridge CB2 1QN, UK.
Cancer cells often exhibit altered epigenetic signatures that can misregulate genes involved in processes such as transcription, proliferation, apoptosis and DNA repair. As regulation of chromatin structure is crucial for DNA repair processes, and both DNA repair and epigenetic controls are deregulated in many cancers, we speculated that simultaneously targeting both might provide new opportunities for cancer therapy. Here, we describe a focused screen that profiled small-molecule inhibitors targeting epigenetic regulators in combination with DNA double-strand break (DSB) inducing agents. We identify UNC0638, a catalytic inhibitor of histone lysine N-methyl-transferase G9a, as hypersensitising tumour cells to low doses of DSB-inducing agents without affecting the growth of the non-tumorigenic cells tested. Similar effects are also observed with another, structurally distinct, G9a inhibitor A-366. We also show that small-molecule inhibition of G9a or siRNA-mediated G9a depletion induces tumour cell death under low DNA damage conditions by impairing DSB repair in a p53 independent manner. Furthermore, we establish that G9a promotes DNA non-homologous end-joining in response to DSB-inducing genotoxic stress. This study thus highlights the potential for using G9a inhibitors as anti-cancer therapeutic agents in combination with DSB-inducing chemotherapeutic drugs such as etoposide.
Cancer letters 2016;380;2;467-475
Human Rhinovirus B and C Genomes from Rural Coastal Kenya.
Epidemiology and Demography Department, KEMRI-Wellcome Trust Research Programme, Kilifi, Kenya; School of Health and Human Sciences, Pwani University, Kilifi, Kenya.
Primer-independent agnostic deep sequencing was used to generate three human rhinovirus (HRV) B genomes and one HRV C genome from samples collected in a household respiratory survey in rural coastal Kenya. The study provides the first rhinovirus genomes from Kenya and will help improve the sensitivity of local molecular diagnostics.
Genome announcements 2016;4;4
Sleeping Beauty screen reveals Pparg activation in metastatic prostate cancer.
Cancer Research UK Beatson Institute, Bearsden, Glasgow G61 1BD, United Kingdom; Institute of Cancer Sciences, University of Glasgow, Glasgow G61 1QH, United Kingdom.
Prostate cancer (CaP) is the most common adult male cancer in the developed world. The paucity of biomarkers to predict prostate tumor biology makes it important to identify key pathways that confer poor prognosis and guide potential targeted therapy. Using a murine forward mutagenesis screen in a Pten-null background, we identified peroxisome proliferator-activated receptor gamma (Pparg), encoding a ligand-activated transcription factor, as a promoter of metastatic CaP through activation of lipid signaling pathways, including up-regulation of lipid synthesis enzymes [fatty acid synthase (FASN), acetyl-CoA carboxylase (ACC), ATP citrate lyase (ACLY)]. Importantly, inhibition of PPARG suppressed tumor growth in vivo, with down-regulation of the lipid synthesis program. We show that elevated levels of PPARG strongly correlate with elevation of FASN in human CaP and that high levels of PPARG/FASN and PI3K/pAKT pathway activation confer a poor prognosis. These data suggest that CaP patients could be stratified in terms of PPARG/FASN and PTEN levels to identify patients with aggressive CaP who may respond favorably to PPARG/FASN inhibition.
Funded by: Cancer Research UK: 13031
Proceedings of the National Academy of Sciences of the United States of America 2016;113;29;8290-5
Established BMI-associated genetic variants and their prospective associations with BMI and other cardiometabolic traits: the GLACIER Study.
Department of Clinical Sciences, Genetic and Molecular Epidemiology Unit, Lund University Diabetes Center, Lund University, Malmö, Sweden.
Background: Recent cross-sectional genome-wide scans have reported associations of 97 independent loci with body mass index (BMI). In 3541 middle-aged adult participants from the GLACIER Study, we tested whether these loci are associated with 10-year changes in BMI and other cardiometabolic traits (fasting and 2-h glucose, triglycerides, total cholesterol, and systolic and diastolic blood pressures).
Methods: A BMI-specific genetic risk score (GRS) was calculated by summing the BMI-associated effect alleles at each locus. Trait-specific cardiometabolic GRSs comprised only the loci that show nominal association (P≤0.10) with the respective trait in the original cross-sectional study. In longitudinal genetic association analyses, the second visit trait measure (assessed ~10 years after baseline) was used as the dependent variable and the models were adjusted for the baseline measure of the outcome trait, age, age², fasting time (for glucose and lipid traits), sex, follow-up time and population substructure.
Results: The BMI-specific GRS was associated with increased BMI at follow-up (β=0.014 kg/m² per allele per 10-year follow-up, s.e.=0.006, P=0.019) as were three loci (PARK2 rs13191362, P=0.005; C6orf106 rs205262, P=0.043; and C9orf93 rs4740619, P=0.01). Although not withstanding Bonferroni correction, a handful of single-nucleotide polymorphisms were nominally associated with changes in blood pressure, glucose and lipid levels.
Conclusions: Collectively, established BMI-associated loci convey modest but statistically significant time-dependent associations with long-term changes in BMI, suggesting a role for effect modification by factors that change with time in this population.
International journal of obesity (2005) 2016;40;9;1346-52
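The unweighted score described in the Methods of the GLACIER abstract above (summing the BMI-associated effect alleles at each locus) can be sketched as follows. This is a minimal illustration, not the study's code: the function name and the example genotype values are hypothetical, and each locus is assumed to be coded as 0, 1 or 2 copies of the effect allele.

```python
from typing import Sequence


def unweighted_grs(effect_allele_counts: Sequence[int]) -> int:
    """Return an unweighted genetic risk score for one participant.

    Each entry is the number of effect alleles (0, 1 or 2) carried at one
    locus; the score is simply their sum across loci.
    """
    for count in effect_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError(f"allele count must be 0, 1 or 2, got {count!r}")
    return sum(effect_allele_counts)


# Hypothetical participant genotyped at four loci (illustrative values only).
example_counts = [0, 1, 2, 1]
example_score = unweighted_grs(example_counts)
```

With 97 loci, a score built this way ranges from 0 to 194. Imputed genotypes would need fractional dosages, which this sketch deliberately rejects.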
Quantitation of next generation sequencing library preparation protocol efficiencies using droplet digital PCR assays - a systematic comparison of DNA library preparation kits for Illumina sequencing.
Wellcome Trust Sanger Institute, Wellcome Trust Campus, Hinxton, Cambs, CB10 1SA, UK.
Background: The emergence of next-generation sequencing (NGS) technologies in the past decade has democratized DNA sequencing, both in price per sequenced base and in the ease of producing DNA libraries. For Illumina, the current market leader, a plethora of library preparation kits is available, and it can be difficult for users to determine which kit is the most appropriate and efficient for their application; the main concerns are not only cost but also minimal bias, yield and time efficiency.
Results: We systematically compared 9 commercially available library preparation kits using the same DNA sample, probing the amount of DNA remaining after each protocol step with a new droplet digital PCR (ddPCR) assay. This method allows the precise quantification of fragments bearing either adaptors or P5/P7 sequences on both ends just after ligation or PCR enrichment. We also investigated the potential influence of DNA input and DNA fragment size on the final library preparation efficiency. Overall library preparation efficiencies vary substantially between kits, with those combining several steps into a single one exhibiting, in some cases, final yields 4 to 7 times higher than the other kits. Detailed ddPCR data also reveal that the adaptor ligation yield itself varies by more than a factor of 10 between kits, with certain ligation efficiencies so low that they could impair the original library complexity and impoverish the sequencing results. When a PCR enrichment step is necessary, lower adaptor-ligated DNA inputs lead to greater amplification yields, hiding the latent disparity between kits.
Conclusion: We describe a ddPCR assay that allows us to probe the efficiency of the most critical step in library preparation, ligation, and to draw conclusions on which kits are more likely to preserve sample heterogeneity and reduce the need for amplification.
Funded by: Wellcome Trust: 098051
BMC genomics 2016;17;458
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
Ensembl (www.ensembl.org) is a database and genome browser for enabling research on vertebrate genomes. We import, analyse, curate and integrate a diverse collection of large-scale reference data to create a more comprehensive view of genome biology than would be possible from any individual dataset. Our extensive data resources include evidence-based gene and regulatory region annotation, genome variation and gene trees. An accompanying suite of tools, infrastructure and programmatic access methods ensures uniform data analysis and distribution for all supported species. Together, these provide a comprehensive solution for large-scale and targeted genomics applications alike. Among many other developments over the past year, we have improved our resources for gene regulation and comparative genomics, and added CRISPR/Cas9 target sites. We released new browser functionality and tools, including improved filtering and prioritization of genome variation, Manhattan plot visualization for linkage disequilibrium and eQTL data, and an ontology search for phenotypes, traits and disease. We have also enhanced data discovery and access with a track hub registry and a selection of new REST end points. All Ensembl data are freely released to the scientific community and our source code is available via the open source Apache 2.0 license.
Nucleic acids research 2016
The Ensembl gene annotation system.
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
The Ensembl gene annotation system has been used to annotate over 70 different vertebrate species across a wide range of genome projects. Furthermore, it generates the automatic alignment-based annotation for the human and mouse GENCODE gene sets. The system is based on the alignment of biological sequences, including cDNAs, proteins and RNA-seq reads, to the target genome in order to construct candidate transcript models. Careful assessment and filtering of these candidate transcripts ultimately leads to the final gene set, which is made available on the Ensembl website. Here, we describe the annotation process in detail. Database URL: http://www.ensembl.org/index.html.
Funded by: Biotechnology and Biological Sciences Research Council: BB/E011640/1, BB/I025360/1, BB/I025360/2, BB/I025506/1, BB/K009524/1, BB/M011461/1, BB/M011615/1, BB/M018458/1, BBS/B/13446, BBS/B/13470; NHGRI NIH HHS: U41 HG007234, U54 HG004555; NICHD NIH HHS: R01 HD074078; Wellcome Trust: WT095908, WT098051
Database : the journal of biological databases and curation 2016;2016
FHF1 (FGF12) epileptic encephalopathy.
Program in Genetics and Genome Biology and Division of Neurology (S.A.-M., B.A.M.), Department of Paediatrics, The Hospital for Sick Children, and University of Toronto, Ontario, Canada; Institute of Genetic Medicine (M.S.), International Centre for Life, Pediatric Neurology (V.R.), Newcastle General Hospital, UK; Center for Human Genetics (S.D., K.D.), UH Case Medical Center, Cleveland, OH; Department of Molecular and Human Genetics (F.X., Y.Y., J.A.R.), Baylor College of Medicine, Houston, TX; Baylor Miraca Genetics Laboratories (F.X., Y.Y.), Houston, TX; The Deciphering Developmental Disorders (DDD) Study, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK; Division of Neurology (P.C.), CHUM Notre-Dame, Hospital University of Montreal, Quebec, Canada; Department of Pediatrics (J.L.M., P.M.C.), Department of Neurosciences (J.L.M., P.M.C.), Université de Montréal, Québec, Canada; and CHU Sainte-Justine Research Center (J.L.M., F.A.H., P.M.C.), Montreal, Quebec, Canada.
Voltage-gated sodium channels (Navs) are mainstays of neuronal function, and mutations in the genes encoding CNS Navs (Nav1.1 [SCN1A], Nav1.2 [SCN2A], Nav1.3 [SCN3A], and Nav1.6 [SCN8A]) are causes of some of the most common and severe genetic epilepsies and epileptic encephalopathies (EE).¹ Fibroblast-growth-factor homologous factors (FHFs) compose a family of 4 proteins that interact with the C-terminal tails of Navs to modulate the channels' fast, and long-term, inactivations.² FHF2 mutation is a rare cause of generalized epilepsy with febrile seizures plus (GEFS+).³ Recently, a de novo FHF1 mutation (p.R52H) was reported in early-onset EE in 2 siblings.⁴ We report 3 patients from unrelated families with the same FHF1 p.R52H mutation. The 5 cases together frame the FHF1 R52H EE from infancy to adulthood. As discussed below, this gain-of-function disease may be amenable to personalized therapy.
Funded by: NINDS NIH HHS: U54 NS078059
Neurology. Genetics 2016;2;6;e115
Complete Genome Sequence of Neisseria weaveri Strain NCTC13585.
Culture Collections, Public Health England, London, United Kingdom.
Neisseria weaveri is a commensal organism of the canine oral cavity and an occasional opportunistic human pathogen which is associated with dog bite wounds. Here we report the first complete genomic sequence of the N. weaveri NCTC13585 (CCUG30381) strain, which was originally isolated from a patient with a canine bite wound.
Genome announcements 2016;4;4
Complete Genome Sequence of Plesiomonas shigelloides Type Strain NCTC10360.
Culture Collections, Public Health England, London, United Kingdom.
Plesiomonas shigelloides is a Gram-negative rod within the Enterobacteriaceae family. It is a gastrointestinal pathogen of increasing notoriety, often associated with diarrheal disease. P. shigelloides is waterborne, and infection is often linked to the consumption of seafood. Here, we describe the first complete genome for P. shigelloides type strain NCTC10360.
Genome announcements 2016;4;5
Mutational signatures associated with tobacco smoking in human cancer.
Tobacco smoking increases the risk of at least 17 classes of human cancer. We analyzed somatic mutations and DNA methylation in 5243 cancers of types for which tobacco smoking confers an elevated risk. Smoking is associated with increased mutation burdens of multiple distinct mutational signatures, which contribute to different extents in different cancers. One of these signatures, mainly found in cancers derived from tissues directly exposed to tobacco smoke, is attributable to misreplication of DNA damage caused by tobacco carcinogens. Others likely reflect indirect activation of DNA editing by APOBEC cytidine deaminases and of an endogenous clocklike mutational process. Smoking is associated with limited differences in methylation. The results are consistent with the proposition that smoking increases cancer risk by increasing the somatic mutation load, although direct evidence for this mechanism is lacking in some smoking-related cancer types.
Funded by: Cancer Research UK; Department of Health; Wellcome Trust
Science (New York, N.Y.) 2016;354;6312;618-622
Do Genetic Factors Modify the Relationship Between Obesity and Hypertriglyceridemia? Findings From the GLACIER and the MDC Studies.
From the Department of Clinical Sciences, Genetic & Molecular Epidemiology Unit (A.A., T.V.V., A.P., F.R., P.W.F.) and Department of Clinical Sciences, Diabetes & Cardiovascular Disease-Genetic Epidemiology (I.A.S., C.-A.S., M.O.-M.), Lund University, Malmö, Sweden; Department of Systems Medicine, Steno Diabetes Center, Gentofte, Denmark (A.A.); Department of Biobank Research (G.H., F.R.) and Department of Public Health & Clinical Medicine (P.W.F.), Umeå University, Umeå, Sweden; Human Genetics Programme, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton (I.B.); NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science (I.B.) and University of Cambridge, Metabolic Research Laboratories Institute of Metabolic Science (I.B.), Addenbrooke's Hospital, Cambridge, United Kingdom; Department of Genetics, Physical Anthropology & Animal Physiology, University of the Basque Country (UPV/EHU), Bilbao, Spain (A.P.); and Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA (P.W.F.). Paul.Franks@med.lu.se.
Background: Obesity is a major risk factor for dyslipidemia, but this relationship is highly variable. Recently published data from 2 Danish cohorts suggest that genetic factors may underlie some of this variability.
Methods and results: We tested whether established triglyceride-associated loci modify the relationship of body mass index (BMI) and triglyceride concentrations in 2 Swedish cohorts (the Gene-Lifestyle Interactions and Complex Traits Involved in Elevated Disease Risk [GLACIER Study; N=4312] and the Malmö Diet and Cancer Study [N=5352]). The genetic loci were amalgamated into a weighted genetic risk score (WGRSTG) by summing the triglyceride-elevating alleles (weighted by their established marginal effects) for all loci. Both BMI and the WGRSTG were strongly associated with triglyceride concentrations in GLACIER, with each additional BMI unit (kg/m²) associated with 2.8% (P=8.4×10⁻⁸⁴) higher triglyceride concentration and each additional WGRSTG unit with 2% (P=7.6×10⁻⁴⁸) higher triglyceride concentration. Each unit of the WGRSTG was associated with 1.5% higher triglyceride concentrations in normal weight and 2.4% higher concentrations in overweight/obese participants (Pinteraction=0.056). Meta-analyses of results from the Swedish cohorts yielded a statistically significant WGRSTG×BMI interaction effect (Pinteraction=6.0×10⁻⁴), which was strengthened by including data from the Danish cohorts (Pinteraction=6.5×10⁻⁷). In the meta-analysis of the Swedish cohorts, nominal evidence of a 3-way interaction (WGRSTG×BMI×sex) was observed (Pinteraction=0.03), where the WGRSTG×BMI interaction was only statistically significant in females. Using protein-protein interaction network analyses, we identified molecular interactions and pathways elucidating the metabolic relationships between BMI and triglyceride-associated loci.
Conclusions: Our findings provide evidence that body fatness accentuates the effects of genetic susceptibility variants in hypertriglyceridemia, effects that are most evident in females.
Funded by: Medical Research Council; Wellcome Trust: 098051
Circulation. Cardiovascular genetics 2016;9;2;162-71
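The weighted genetic risk score in the abstract above (triglyceride-elevating alleles summed after weighting by their established marginal effects) can be sketched as a dot product of allele counts and per-locus weights. This is a hedged illustration only: the function name, the allele counts and the weights below are hypothetical, and the abstract does not state how the study scaled its weights.

```python
from typing import Sequence


def weighted_grs(allele_counts: Sequence[float], weights: Sequence[float]) -> float:
    """Return a weighted genetic risk score for one participant.

    allele_counts: risk-raising allele count (0 to 2) at each locus.
    weights: per-locus marginal effect sizes, in the same locus order.
    """
    if len(allele_counts) != len(weights):
        raise ValueError("allele_counts and weights must cover the same loci")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


# Hypothetical three-locus example; the weights are NOT taken from the study.
example_counts = [2, 1, 0]
example_weights = [0.5, 0.25, 1.0]
example_score = weighted_grs(example_counts, example_weights)
```

Setting every weight to 1 reduces this to a plain allele count, which is why weighted and unweighted scores are usually highly correlated when effect sizes are similar across loci.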
Decreased Rate of Plasma Arginine Appearance in Murine Malaria May Explain Hypoargininemia in Children With Cerebral Malaria.
Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville.
Background: Plasmodium infection depletes arginine, the substrate for nitric oxide synthesis, and impairs endothelium-dependent vasodilation. Increased conversion of arginine to ornithine by parasites or host arginase is a proposed mechanism of arginine depletion.
Methods: We used high-performance liquid chromatography to measure plasma arginine, ornithine, and citrulline levels in Malawian children with cerebral malaria and in mice infected with Plasmodium berghei ANKA with or without the arginase gene. Heavy isotope-labeled tracers measured by quadrupole time-of-flight liquid chromatography-mass spectrometry were used to quantify the in vivo rate of appearance and interconversion of plasma arginine, ornithine, and citrulline in infected mice.
Results: Children with cerebral malaria and P. berghei-infected mice demonstrated depletion of plasma arginine, ornithine, and citrulline. Knock out of Plasmodium arginase did not alter arginine depletion in infected mice. Metabolic tracer analysis demonstrated that plasma arginase flux was unchanged by P. berghei infection. Instead, infected mice exhibited decreased rates of plasma arginine, ornithine, and citrulline appearance and decreased conversion of plasma citrulline to arginine. Notably, plasma arginine use by nitric oxide synthase was decreased in infected mice.
Conclusions: Simultaneous arginine and ornithine depletion in malaria parasite-infected children cannot be fully explained by plasma arginase activity. Our mouse model studies suggest that plasma arginine depletion is driven primarily by a decreased rate of appearance.
The Journal of infectious diseases 2016;214;12;1840-1849
Ebola virus disease cluster — Northern Sierra Leone, January 2016
Morbidity and Mortality Weekly Report 2016;65;26;681-3
Dihydroartemisinin-piperaquine resistance in Plasmodium falciparum malaria in Cambodia: a multisite prospective cohort study.
Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville, MD, USA.
Background: Artemisinin resistance in Plasmodium falciparum threatens to reduce the efficacy of artemisinin combination therapies (ACTs), thus compromising global efforts to eliminate malaria. Recent treatment failures with dihydroartemisinin-piperaquine, the current first-line ACT in Cambodia, suggest that piperaquine resistance may be emerging in this country. We explored the relation between artemisinin resistance and dihydroartemisinin-piperaquine failures, and sought to confirm the presence of piperaquine-resistant P falciparum infections in Cambodia.
Methods: In this prospective cohort study, we enrolled patients aged 2-65 years with uncomplicated P falciparum malaria in three Cambodian provinces: Pursat, Preah Vihear, and Ratanakiri. Participants were given standard 3-day courses of dihydroartemisinin-piperaquine. Peripheral blood parasite densities were measured until parasites cleared and then weekly to 63 days. The primary outcome was recrudescent P falciparum parasitaemia within 63 days. We measured piperaquine plasma concentrations at baseline, 7 days, and day of recrudescence. We assessed phenotypic and genotypic markers of drug resistance in parasite isolates. The study is registered with ClinicalTrials.gov, number NCT01736319.
Findings: Between Sept 4, 2012, and Dec 31, 2013, we enrolled 241 participants. In Pursat, where artemisinin resistance is entrenched, 37 (46%) of 81 patients had parasite recrudescence. In Preah Vihear, where artemisinin resistance is emerging, ten (16%) of 63 patients had recrudescence and in Ratanakiri, where artemisinin resistance is rare, one (2%) of 60 patients did. Patients with recrudescent P falciparum infections were more likely to have detectable piperaquine plasma concentrations at baseline compared with non-recrudescent patients, but did not differ significantly in age, initial parasite density, or piperaquine plasma concentrations at 7 days. Recrudescent parasites had a higher prevalence of kelch13 mutations, higher piperaquine 50% inhibitory concentration (IC50) values, and lower mefloquine IC50 values; none had multiple pfmdr1 copies, a genetic marker of mefloquine resistance.
Interpretation: Dihydroartemisinin-piperaquine failures are caused by both artemisinin and piperaquine resistance, and commonly occur in places where dihydroartemisinin-piperaquine has been used in the private sector. In Cambodia, artesunate plus mefloquine may be a viable option to treat dihydroartemisinin-piperaquine failures, and a more effective first-line ACT in areas where dihydroartemisinin-piperaquine failures are common. The use of single low-dose primaquine to eliminate circulating gametocytes is needed in areas where artemisinin and ACT resistance is prevalent.
Funding: National Institute of Allergy and Infectious Diseases.
Funded by: Intramural NIH HHS: Z01 AI001000-01, Z01 AI001000-02; Wellcome Trust: 089275/Z/09/2
The Lancet. Infectious diseases 2016;16;3;357-65
Voices of biotech.
Weizmann Institute of Science, Rehovot, Israel.
Nature biotechnology 2016;34;3;270-5
Chlamydia trachomatis from Australian Aboriginal people with trachoma are polyphyletic composed of multiple distinctive lineages.
Global and Tropical Health Division, Menzies School of Health Research, Charles Darwin University, Darwin, Casuarina, Northern Territory 0811, Australia.
Chlamydia trachomatis causes sexually transmitted infections and the blinding disease trachoma. Current data on C. trachomatis phylogeny show that there is only a single trachoma-causing clade, which is distinct from the lineages causing urogenital tract (UGT) and lymphogranuloma venereum diseases. Here we report the whole-genome sequences of ocular C. trachomatis isolates obtained from young children with clinical signs of trachoma in a trachoma endemic region of northern Australia. The isolates form two lineages that fall outside the classical trachoma lineage, instead being placed within UGT clades of the C. trachomatis phylogenetic tree. The Australian trachoma isolates appear to be recombinants with UGT C. trachomatis genome backbones, in which loci that encode immunodominant surface proteins (ompA and pmpEFGH) have been replaced by those characteristic of classical ocular isolates. This suggests that ocular tropism and association with trachoma are functionally associated with some sequence variants of ompA and pmpEFGH.
Funded by: Wellcome Trust: 098051
Nature communications 2016;7;10688
Notes on the implementation of FAM
CEUR Workshop Proceedings 2016;1661;46-58
Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity.
European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK.
We report scM&T-seq, a method for parallel single-cell genome-wide methylome and transcriptome sequencing that allows for the discovery of associations between transcriptional and epigenetic variation. Profiling of 61 mouse embryonic stem cells confirmed known links between DNA methylation and transcription. Notably, the method revealed previously unrecognized associations between heterogeneously methylated distal regulatory elements and transcription of key pluripotency genes.
Funded by: Biotechnology and Biological Sciences Research Council; Medical Research Council: MC_U137761446, MR/K011332/1; Wellcome Trust: 095645, 105031REIK, 105045
Nature methods 2016;13;3;229-232
Deep learning for computational biology.
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
Technological advances in genomics and imaging have led to an explosion of molecular and cellular profiling data from large numbers of samples. This rapid increase in biological data dimension and acquisition rate is challenging conventional analysis strategies. Modern machine learning methods, such as deep learning, promise to leverage very large data sets for finding hidden structure within them, and for making accurate predictions. In this review, we discuss applications of this new breed of analysis approaches in regulatory genomics and cellular imaging. We provide background on what deep learning is, and the settings in which it can be successfully applied to derive biological insights. In addition to presenting specific applications and providing tips for practical use, we also highlight possible pitfalls and limitations to guide computational biologists on when and how to make the most of this new technology.
Funded by: Wellcome Trust
Molecular systems biology 2016;12;7;878
Phase variation of a Type IIG restriction-modification enzyme alters site-specific methylation patterns and gene expression in Campylobacter jejuni strain NCTC11168.
Department of Genetics, University of Leicester, Leicester LE1 7RH, UK.
Phase-variable restriction-modification systems are a feature of a diverse range of bacterial species. Stochastic, reversible switches in expression of the methyltransferase produce variation in methylation of specific sequences. Phase-variable methylation by both Type I and Type III methyltransferases is associated with altered gene expression and phenotypic variation. One phase-variable gene of Campylobacter jejuni encodes a homologue of an unusual Type IIG restriction-modification system in which the endonuclease and methyltransferase are encoded by a single gene. Using both inhibition of restriction and PacBio-derived methylome analyses of mutants and phase-variants, the cj0031c allele in C. jejuni strain NCTC11168 was demonstrated to specifically methylate adenine in 5'CCCGA and 5'CCTGA sequences. Alterations in the levels of specific transcripts were detected using RNA-Seq in phase-variants and mutants of cj0031c but these changes did not correlate with observed differences in phenotypic behaviour. Alterations in restriction of phage growth were also associated with phase variation (PV) of cj0031c and correlated with presence of sites in the genomes of these phages. We conclude that PV of a Type IIG restriction-modification system causes changes in site-specific methylation patterns and gene expression patterns that may indirectly change adaptive traits.
Nucleic acids research 2016;44;10;4581-94
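Illustrative note (not part of the publication): the two target sequences named in the abstract above, 5'CCCGA and 5'CCTGA, can be located in a genome sequence with a simple strand-aware scan. The sketch below is a minimal example under stated assumptions: the function names and the toy sequence are hypothetical, the methylated adenine is taken to be the final base of each motif, and this is not the authors' PacBio methylome pipeline.

```python
import re

# Motifs reported in the abstract; the adenine is the last base of each.
MOTIFS = ("CCCGA", "CCTGA")

# Lookahead pattern so that every start position is tested independently.
_PATTERN = re.compile("(?=(" + "|".join(MOTIFS) + "))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif_adenines(seq: str):
    """List (position, strand, motif) for the adenine of every motif hit.

    Positions are 0-based coordinates on the forward strand; strand is
    "+" for hits on the given sequence and "-" for hits on its reverse
    complement.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for match in _PATTERN.finditer(seq):
        hits.append((match.start() + 4, "+", match.group(1)))
    for match in _PATTERN.finditer(reverse_complement(seq)):
        # Index j on the reverse complement maps to n - 1 - j on the forward strand.
        hits.append((n - 1 - (match.start() + 4), "-", match.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical toy sequence with one site on each strand.
    for hit in find_motif_adenines("TTCCCGAAATCAGGTT"):
        print(hit)
```

Counting such sites per phage genome is one simple way to relate site presence to restriction, as the abstract describes, although the study's own analysis may have differed.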
Centre for Genomic Pathogen Surveillance, The Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Nature reviews. Microbiology 2016;14;12;730
Rapid outbreak sequencing of Ebola virus in Sierra Leone identifies transmission chains linked to sporadic cases.
Division of Virology, Department of Pathology, University of Cambridge, Cambridge, United Kingdom.
To end the largest known outbreak of Ebola virus disease (EVD) in West Africa and to prevent new transmissions, rapid epidemiological tracing of cases and contacts was required. The ability to quickly identify unknown sources and chains of transmission is key to ending the EVD epidemic and of even greater importance in the context of recent reports of Ebola virus (EBOV) persistence in survivors. Phylogenetic analysis of complete EBOV genomes can provide important information on the source of any new infection. A local deep sequencing facility was established at the Mateneh Ebola Treatment Centre in central Sierra Leone. The facility included all wetlab and computational resources to rapidly process EBOV diagnostic samples into full genome sequences. We produced 554 EBOV genomes from EVD cases across Sierra Leone. These genomes provided a detailed description of EBOV evolution and facilitated phylogenetic tracking of new EVD cases. Importantly, we show that linked genomic and epidemiological data can not only support contact tracing but also identify unconventional transmission chains involving body fluids, including semen. Rapid EBOV genome sequencing, when linked to epidemiological information and a comprehensive database of virus sequences across the outbreak, provided a powerful tool for public health epidemic control efforts.
Funded by: Wellcome Trust; World Health Organization: 001
Virus evolution 2016;2;1;vew016
Origin of modern syphilis and emergence of a pandemic Treponema pallidum cluster.
Institute for Evolutionary Biology and Environmental Studies, University of Zurich, 8057 Zurich, Switzerland.
The abrupt onslaught of the syphilis pandemic that started in the late fifteenth century established this devastating infectious disease as one of the most feared in human history<sup>1</sup>. Surprisingly, despite the availability of effective antibiotic treatment since the mid-twentieth century, this bacterial infection, which is caused by Treponema pallidum subsp. pallidum (TPA), has been re-emerging globally in the last few decades with an estimated 10.6 million cases in 2008 (ref. 2). Although resistance to penicillin has not yet been identified, an increasing number of strains fail to respond to the second-line antibiotic azithromycin<sup>3</sup>. Little is known about the genetic patterns in current infections or the evolutionary origins of the disease due to the low quantities of treponemal DNA in clinical samples and difficulties in cultivating the pathogen<sup>4</sup>. Here, we used DNA capture and whole-genome sequencing to successfully interrogate genome-wide variation from syphilis patient specimens, combined with laboratory samples of TPA and two other subspecies. Phylogenetic comparisons based on the sequenced genomes indicate that the TPA strains examined share a common ancestor after the fifteenth century, within the early modern era. Moreover, most contemporary strains are azithromycin-resistant and are members of a globally dominant cluster, named here as SS14-Ω. The cluster diversified from a common ancestor in the mid-twentieth century subsequent to the discovery of antibiotics. Its recent phylogenetic divergence and global presence point to the emergence of a pandemic strain cluster.
Nature microbiology 2016;2;16245
Trans-ethnic study design approaches for fine-mapping.
Wellcome Trust Sanger Institute, Cambridge, UK.
Studies that traverse ancestrally diverse populations may increase power to detect novel loci and improve fine-mapping resolution of causal variants by leveraging linkage disequilibrium differences between ethnic groups. The inclusion of African ancestry samples may yield further improvements because of low linkage disequilibrium and high genetic heterogeneity. We investigate the fine-mapping resolution of trans-ethnic fixed-effects meta-analysis for five type II diabetes loci, under various settings of ancestral composition (European, East Asian, African), allelic heterogeneity, and causal variant minor allele frequency. In particular, three settings of ancestral composition were compared: (1) single ancestry (European), (2) moderate ancestral diversity (European and East Asian), and (3) high ancestral diversity (European, East Asian, and African). Our simulations suggest that the European/Asian and European ancestry-only meta-analyses consistently attain similar fine-mapping resolution. The inclusion of African ancestry samples in the meta-analysis leads to a marked improvement in fine-mapping resolution.
Funded by: Medical Research Council: MR/K021486/1; NIDDK NIH HHS: U01 DK085545; Wellcome Trust: 098017, 098051
European journal of human genetics : EJHG 2016;24;9;1330-6
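Illustrative note (not part of the publication): the abstract above refers to a fixed-effects meta-analysis across ancestry groups. A common form of fixed-effects meta-analysis is inverse-variance weighting, sketched below. The abstract does not state which weighting scheme was used, so this is an assumption; the function name and the example effect sizes are hypothetical.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effects meta-analysis for one variant.

    betas: per-study effect estimates (for example, one per ancestry group).
    standard_errors: matching per-study standard errors (all must be > 0).
    Returns (pooled_beta, pooled_se, z_score).
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need equally sized, non-empty inputs")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


if __name__ == "__main__":
    # Hypothetical effect estimates from two equally precise studies.
    beta, se, z = fixed_effects_meta([0.10, 0.20], [0.1, 0.1])
    print(f"beta={beta:.3f} se={se:.4f} z={z:.3f}")
```

Under this scheme, pooling never changes the linkage disequilibrium pattern within a study. Any gain in fine-mapping resolution comes from combining studies whose patterns differ, which is consistent with the abstract's emphasis on ancestral diversity.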
The Allelic Landscape of Human Blood Cell Trait Variation and Links to Common Complex Disease.
Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK; National Health Service (NHS) Blood and Transplant, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK; Medical Research Council Biostatistics Unit, Cambridge Institute of Public Health, Cambridge Biomedical Campus, Forvie Site, Robinson Way, Cambridge CB2 0SR, UK; MRC/BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Strangeways Research Laboratory, Wort's Causeway, Cambridge CB1 8RN, UK.
Many common variants have been associated with hematological traits, but identification of causal genes and pathways has proven challenging. We performed a genome-wide association analysis in the UK Biobank and INTERVAL studies, testing 29.5 million genetic variants for association with 36 red cell, white cell, and platelet properties in 173,480 European-ancestry participants. This effort yielded hundreds of low frequency (<5%) and rare (<1%) variants with a strong impact on blood cell phenotypes. Our data highlight general properties of the allelic architecture of complex traits, including the proportion of the heritable component of each blood trait explained by the polygenic signal across different genome regulatory domains. Finally, through Mendelian randomization, we provide evidence of shared genetic pathways linking blood cell indices with complex pathologies, including autoimmune diseases, schizophrenia, and coronary heart disease and evidence suggesting previously reported population associations between blood cell indices and cardiovascular disease may be non-causal.
Funded by: British Heart Foundation: RG/09/012/28096; Department of Health: RP-PG-0310-1002, RP-PG-0310-1004; European Research Council: 268834; Medical Research Council: MC_QA137853, MR/L003120/1
A new <i>Plasmodium vivax</i> reference sequence with improved assembly of the subtelomeres reveals an abundance of <i>pir</i> genes.
Global and Tropical Health Division, Menzies School of Health Research and Charles Darwin University, Darwin, Australia.
<i>Plasmodium vivax</i> is now the predominant cause of malaria in the Asia-Pacific, South America and Horn of Africa. Laboratory studies of this species are constrained by the inability to maintain the parasite in continuous <i>ex vivo</i> culture, but genomic approaches provide an alternative and complementary avenue to investigate the parasite's biology and epidemiology. To date, molecular studies of <i>P. vivax</i> have relied on the Salvador-I reference genome sequence, derived from a monkey-adapted strain from South America. However, the Salvador-I reference remains highly fragmented with over 2500 unassembled scaffolds. Using high-depth Illumina sequence data, we assembled and annotated a new reference sequence, PvP01, sourced directly from a patient from Papua Indonesia. Draft assemblies of isolates from China (PvC01) and Thailand (PvT01) were also prepared for comparative purposes. The quality of the PvP01 assembly is improved greatly over Salvador-I, with fragmentation reduced to 226 scaffolds. Detailed manual curation has ensured highly comprehensive annotation, with functions attributed to 58% of core genes in PvP01 versus 38% in Salvador-I. The assemblies of PvP01, PvC01 and PvT01 are larger than that of Salvador-I (28-30 versus 27 Mb), owing to improved assembly of the subtelomeres. An extensive repertoire of over 1200 <i>Plasmodium</i> interspersed repeat (<i>pir</i>) genes was identified in PvP01 compared to 346 in Salvador-I, suggesting a vital role in parasite survival or development. The manually curated PvP01 reference and PvC01 and PvT01 draft assemblies are important new resources to study vivax malaria. PvP01 is maintained at GeneDB and ongoing curation will ensure continual improvements in assembly and annotation quality.
Funded by: Wellcome Trust: 091625, 098051, 099198
Wellcome open research 2016;1;4
Genomic Analysis Reveals a Common Breakpoint in Amplifications of the Plasmodium vivax Multidrug Resistance 1 Locus in Thailand.
Global and Tropical Health Division, Menzies School of Health Research, Charles Darwin University, Australia.
In regions of coendemicity for Plasmodium falciparum and Plasmodium vivax where mefloquine is used to treat P. falciparum infection, drug pressure mediated by increased copy numbers of the multidrug resistance 1 gene (pvmdr1) may select for mefloquine-resistant P. vivax. Surveillance is not undertaken routinely owing in part to methodological challenges in detection of gene amplification. Using genomic data on 88 P. vivax samples from western Thailand, we identified pvmdr1 amplification in 17 isolates, all exhibiting tandem copies of a 37.6-kilobase pair region with identical breakpoints. A novel breakpoint-specific polymerase chain reaction assay was designed to detect the amplification. The assay demonstrated high sensitivity, identifying amplifications in 13 additional, polyclonal infections. Application to 132 further samples identified the common breakpoint in all years tested (2003-2015), with a decline in prevalence after 2012 corresponding to local discontinuation of mefloquine regimens. Assessment of the structure of pvmdr1 amplification in other geographic regions will yield information about the population-specificity of the breakpoints and underlying amplification mechanisms.
Funded by: NIAID NIH HHS: R01 AI103228; Wellcome Trust: 091625
The Journal of infectious diseases 2016;214;8;1235-42
Making sense of big data in health research: Towards an EU action plan.
European Institute for Systems Biology and Medicine, 1 avenue Claude Vellefaux, 75010, Paris, France.
Medicine and healthcare are undergoing profound changes. Whole-genome sequencing and high-resolution imaging technologies are key drivers of this rapid and crucial transformation. Technological innovation combined with automation and miniaturization has triggered an explosion in data production that will soon reach exabyte proportions. How are we going to deal with this exponential increase in data production? The potential of "big data" for improving health is enormous but, at the same time, we face a wide range of challenges to overcome urgently. Europe is very proud of its cultural diversity; however, exploitation of the data made available through advances in genomic medicine, imaging, and a wide range of mobile health applications or connected devices is hampered by numerous historical, technical, legal, and political barriers. European health systems and databases are diverse and fragmented. There is a lack of harmonization of data formats, processing, analysis, and data transfer, which leads to incompatibilities and lost opportunities. Legal frameworks for data sharing are evolving. Clinicians, researchers, and citizens need improved methods, tools, and training to generate, analyze, and query data effectively. Addressing these barriers will contribute to creating the European Single Market for health, which will improve health and healthcare for all Europeans.
Genome medicine 2016;8;1;71
Whole-genome sequencing of multidrug-resistant Mycobacterium tuberculosis isolates from Myanmar.
Department of Microbiology and Immunology, Otago School of Medical Sciences, University of Otago, Dunedin, New Zealand; Maurice Wilkins Centre for Molecular Biodiscovery, University of Auckland, Auckland, New Zealand.
Drug-resistant tuberculosis (TB) is a major health threat in Myanmar. An initial study was conducted to explore the potential utility of whole-genome sequencing (WGS) for the diagnosis and management of drug-resistant TB in Myanmar. Fourteen multidrug-resistant Mycobacterium tuberculosis isolates were sequenced. Known resistance genes for a total of nine antibiotics commonly used in the treatment of drug-susceptible and multidrug-resistant TB (MDR-TB) in Myanmar were interrogated through WGS. All 14 isolates were MDR-TB, consistent with the results of phenotypic drug susceptibility testing (DST), and the Beijing lineage predominated. Based on the results of WGS, 9 of the 14 isolates were potentially resistant to at least one of the drugs used in the standard MDR-TB regimen but for which phenotypic DST is not conducted in Myanmar. This study highlights a need for the introduction of second-line DST as part of routine TB diagnosis in Myanmar as well as new classes of TB drugs to construct effective regimens.
Funded by: Wellcome Trust: 098600
Journal of global antimicrobial resistance 2016;6;113-117
EuPathDB: the eukaryotic pathogen genomics database resource.
Center for Tropical & Emerging Global Diseases, University of Georgia, Athens, GA 30602, USA.
The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host-pathogen interactions.
Nucleic acids research 2016
An Integrated Genome-Wide Systems Genetics Screen for Breast Cancer Metastasis Susceptibility Genes.
Laboratory of Cancer Biology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America.
Metastasis remains the primary cause of patient morbidity and mortality in solid tumors and is due to the action of a large number of tumor-autonomous and non-autonomous factors. Here we report the results of a genome-wide integrated strategy to identify novel metastasis susceptibility candidate genes and molecular pathways in breast cancer metastasis. This analysis implicates a number of transcriptional regulators and suggests cell-mediated immunity is an important determinant. Moreover, the analysis identified novel or FDA-approved drugs as potentially useful for anti-metastatic therapy. Further explorations implementing this strategy may therefore provide a variety of information for clinical applications in the control and treatment of advanced neoplastic disease.
Funded by: Intramural NIH HHS
PLoS genetics 2016;12;4;e1005989
Travel- and Community-Based Transmission of Multidrug-Resistant Shigella sonnei Lineage among International Orthodox Jewish Communities.
Shigellae are sensitive indicator species for studying trends in the international transmission of antimicrobial-resistant Enterobacteriaceae. Orthodox Jewish communities (OJCs) are a known risk group for shigellosis; Shigella sonnei is cyclically epidemic in OJCs in Israel, and sporadic outbreaks occur in OJCs elsewhere. We generated whole-genome sequences for 437 isolates of S. sonnei from OJCs and non-OJCs collected over 22 years in Europe (the United Kingdom, France, and Belgium), the United States, Canada, and Israel and analyzed these within a known global genomic context. Through phylogenetic and genomic analysis, we showed that strains from outbreaks in OJCs outside of Israel are distinct from strains in the general population and relate to a single multidrug-resistant sublineage of S. sonnei that prevails in Israel. Further Bayesian phylogenetic analysis showed that this strain emerged approximately 30 years ago, demonstrating the speed at which antimicrobial drug-resistant pathogens can spread widely through geographically dispersed, but internationally connected, communities.
Emerging infectious diseases 2016;22;9;1545-53
Single-cell sequencing reveals karyotype heterogeneity in murine and human malignancies.
European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, A. Deusinglaan 1, Groningen, 9713 AV, The Netherlands.
Background: Chromosome instability (CIN) leads to aneuploidy, a state in which cells have abnormal numbers of chromosomes, and is found in two out of three cancers. In a chromosomally unstable, p53-deficient mouse model with accelerated lymphomagenesis, we previously observed whole-chromosome copy number changes affecting all lymphoma cells. This suggests that chromosome instability is somehow suppressed in the aneuploid lymphomas or that selection for frequently lost/gained chromosomes out-competes the CIN-imposed mis-segregation.
Results: To distinguish between these explanations and to examine karyotype dynamics in chromosomally unstable lymphoma, we use a newly developed single-cell whole genome sequencing (scWGS) platform that provides a complete and unbiased overview of copy number variations (CNV) in individual cells. To analyse these scWGS data, we develop AneuFinder, which allows annotation of copy number changes in a fully automated fashion and quantification of CNV heterogeneity between cells. Single-cell sequencing and AneuFinder analysis reveal high levels of copy number heterogeneity in chromosome instability-driven murine T-cell lymphoma samples, indicating ongoing chromosome instability. Application of this technology to human B cell leukaemias reveals different levels of karyotype heterogeneity in these cancers.
Conclusion: Our data show that even though aneuploid tumours select for particular and recurring chromosome combinations, single-cell analysis using AneuFinder reveals copy number heterogeneity. This suggests ongoing chromosome instability that other platforms fail to detect. As chromosome instability might drive tumour evolution, karyotype analysis using single-cell sequencing technology could become an essential tool for cancer treatment stratification.
Genome biology 2016;17;1;115
Synthetic lethality between PAXX and XLF in mammalian development.
Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge CB2 1QN, United Kingdom.
PAXX was identified recently as a novel nonhomologous end-joining DNA repair factor in human cells. To characterize its physiological roles, we generated Paxx-deficient mice. Like Xlf<sup>-/-</sup> mice, Paxx<sup>-/-</sup> mice are viable, grow normally, and are fertile but show mild radiosensitivity. Strikingly, while Paxx loss is epistatic with Ku80, Lig4, and Atm deficiency, Paxx/Xlf double-knockout mice display embryonic lethality associated with genomic instability, cell death in the central nervous system, and an almost complete block in lymphogenesis, phenotypes that closely resemble those of Xrcc4<sup>-/-</sup> and Lig4<sup>-/-</sup> mice. Thus, combined loss of Paxx and Xlf is synthetic-lethal in mammals.
Funded by: Cancer Research UK: 13031; European Research Council: 310917
Genes & development 2016;30;19;2152-2157
Dawning of the age of genomics for platelet granule disorders: improving insight, diagnosis and management.
Katharine Dormandy Haemophilia Centre and Thrombosis Unit, Royal Free London NHS Foundation Trust, London, UK.
Inherited disorders of platelet granules are clinically heterogeneous and their prevalence is underestimated because most patients do not undergo a complete diagnostic work-up. The lack of a genetic diagnosis limits the ability to tailor management, screen family members, aid with family planning, predict clinical progression and detect serious consequences, such as myelofibrosis, lung fibrosis and malignancy, in a timely manner. This is set to change with the introduction of high throughput sequencing (HTS) as a routine clinical diagnostic test. HTS diagnostic tests are now available, affordable and allow parallel screening of DNA samples for variants in all of the 80 known bleeding, thrombotic and platelet genes. Increased genetic diagnosis and curation of variants is, in turn, improving our understanding of the pathobiology and clinical course of inherited platelet disorders. Our understanding of the genetic causes of platelet granule disorders and the regulation of granule biogenesis is a work in progress and has been significantly enhanced by recent genomic discoveries from high-powered genome-wide association studies and genome sequencing projects. In the era of whole genome and epigenome sequencing, new strategies are required to integrate multiple sources of big data in the search for elusive, novel genes underlying granule disorders.
British journal of haematology 2016
Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families.
Institute for Molecular Infection Biology, University of Würzburg, Würzburg, Germany.
Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult. This unit introduces methods developed by the Rfam database for identifying "families" of homologous ncRNAs starting from single "seed" sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs and then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. © 2016 by John Wiley & Sons, Inc.
Funded by: Wellcome Trust: 098051
Current protocols in bioinformatics 2016;54;12.13.1-12.13.25
The TraDIS toolkit: sequencing and analysis for dense transposon mutant libraries.
Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK and Institute for Molecular Infection Biology, University of Würzburg, Würzburg D-97080, Germany.
Unlabelled: Transposon insertion sequencing is a high-throughput technique for assaying large libraries of otherwise isogenic transposon mutants providing insight into gene essentiality, gene function and genetic interactions. We previously developed the Transposon Directed Insertion Sequencing (TraDIS) protocol for this purpose, which utilizes shearing of genomic DNA followed by specific PCR amplification of transposon-containing fragments and Illumina sequencing. Here we describe an optimized high-yield library preparation and sequencing protocol for TraDIS experiments and a novel software pipeline for analysis of the resulting data. The Bio-Tradis analysis pipeline is implemented as an extensible Perl library which can either be used as is, or as a basis for the development of more advanced analysis tools. This article can serve as a general reference for the application of the TraDIS methodology.
Availability and implementation: The optimized sequencing protocol is included as supplementary information. The Bio-Tradis analysis pipeline is available under a GPL license at https://github.com/sanger-pathogens/Bio-Tradis
Supplementary information: Supplementary data are available at Bioinformatics online.
Funded by: Medical Research Council: G1100100; Wellcome Trust: WT098051
Bioinformatics (Oxford, England) 2016;32;7;1109-11
The need for an integrated approach for chronic disease research and care in Africa
Global Health, Epidemiology and Genomics 2016;1;e19
Genome-wide association studies of quantitative glycaemic traits
The Genetics of Type 2 Diabetes and Related Traits: Biology, Physiology and Translation 2016;63-89
SeqTools: visual tools for manual analysis of sequence alignments.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
Background: Manual annotation is essential to create high-quality reference alignments and annotation. Annotators need to be able to view sequence alignments in detail. The SeqTools package provides three tools for viewing different types of sequence alignment: Blixem is a many-to-one browser of pairwise alignments, displaying multiple match sequences aligned against a single reference sequence; Dotter provides a graphical dot-plot view of a single pairwise alignment; and Belvu is a multiple sequence alignment viewer, editor, and phylogenetic tool. These tools were originally part of the AceDB genome database system but have been completely rewritten to make them generally available as a standalone package of greatly improved function.
Findings: Blixem is used by annotators to give a detailed view of the evidence for particular gene models. Blixem displays the gene model positions and the match sequences aligned against the genomic reference sequence. Annotators use this for many reasons, including to check the quality of an alignment, to find missing/misaligned sequence and to identify splice sites and polyA sites and signals. Dotter is used to give a dot-plot representation of a particular pairwise alignment. This is used to identify sequence that is not represented (or is misrepresented) and to quickly compare annotated gene models with transcriptional and protein evidence that putatively supports them. Belvu is used to analyse conservation patterns in multiple sequence alignments and to perform a combination of manual and automatic processing of the alignment. High-quality reference alignments are essential if they are to be used as a starting point for further automatic alignment generation.
Conclusions: While there are many different alignment tools available, the SeqTools package provides unique functionality that annotators have found to be essential for analysing sequence alignments as part of the manual annotation process.
Funded by: NHGRI NIH HHS: 5U54HG00455-04; Wellcome Trust: 098051
BMC research notes 2016;9;39
The accessory genome of Shiga toxin-producing Escherichia coli (STEC) defines a persistent colonization type in cattle.
Friedrich-Loeffler-Institut/Federal Research Institute for Animal Health, Institute of Molecular Pathogenesis, Naumburger Str. 96a, 07743 Jena, Germany.
Shiga toxin-producing Escherichia coli (STEC) strains can colonize cattle for several months and may, thus, serve as a gene reservoir for the genesis of highly virulent zoonotic enterohemorrhagic E. coli (EHEC). Attempts to reduce the human risk for acquiring EHEC infections should include strategies to control such STEC strains persisting in cattle. We therefore aimed to identify genetic patterns associated with the STEC colonization type in the bovine host. We included 88 persistent (STEC(per), shedding ≥ 4 months) and 74 sporadically colonizing STEC (STEC(spo), shedding ≤ 2 months) isolates from cattle and 16 bovine STEC isolates with unknown colonization type. Genoserotype and MLST were determined and the isolates probed with a DNA microarray for virulence-associated genes (VAGs). All STEC(per) belonged to only four genoserotypes (O26:H11, O156:H25, O165:H25, O182:H25) which formed three genetic clusters (ST21/396/1705, ST300/688, ST119). In contrast, STEC(spo) were scattered among 28 genoserotypes and 30 MLST types with O157:H7 (ST11) and O6:H49 (ST1079) being the most prevalent. The microarray analysis identified 139 unique gene patterns that clustered with the genoserotypes and MLST types of the strains. While the STEC(per) isolates possessed heterogeneous phylogenetic backgrounds, the accessory genome clustered these isolates together, separating them from STEC(spo). Given the vast genetic heterogeneity of bovine STEC strains, defining genetic patterns distinguishing STEC(per) from STEC(spo) will facilitate the targeted design of new intervention strategies counteracting these zoonotic pathogens at farm level.
Importance: Ruminants, especially cattle, are sources of food-borne infections in humans by Shiga toxin-producing Escherichia coli (STEC). Some STEC persist in cattle for longer periods of time, while others are detected only sporadically. Persisting strains can serve as gene reservoirs that supply E. coli with virulence factors, thereby generating new outbreak strains. Attempts to reduce the human risk for acquiring STEC infections should therefore include strategies to control such persisting STEC strains. By analyzing representative genes of their core and accessory genomes, we show that bovine STEC with a persistent colonization type emerged independently from sporadically colonizing isolates and evolved in parallel evolutionary branches. However, persistently colonizing strains share similar sets of accessory genes. Defining genetic patterns that distinguish persistent from sporadically colonizing STEC isolates will facilitate the targeted design of new intervention strategies to counteract these zoonotic pathogens at farm level.
Applied and environmental microbiology 2016
Eye on the B-ALL: B-cell receptor repertoires reveal persistence of numerous B-lymphoblastic leukemia subclones from diagnosis to relapse.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
The strongest predictor of relapse in B-cell acute lymphoblastic leukemia (B-ALL) is the level of persistence of tumor cells after initial therapy. The high mutation rate of the B-cell receptor (BCR) locus allows high-resolution tracking of the architecture, evolution and clonal dynamics of B-ALL. Using longitudinal BCR repertoire sequencing, we find that the BCR undergoes an unexpectedly high level of clonal diversification in B-ALL cells through both somatic hypermutation and secondary rearrangements, which can be used for tracking the subclonal composition of the disease and detecting minimal residual disease with unprecedented sensitivity. We go on to investigate clonal dynamics of B-ALL using BCR phylogenetic analyses of paired diagnosis-relapse samples and find that large numbers of small leukemic subclones present at diagnosis re-emerge at relapse alongside a dominant clone. Our findings suggest that in all informative relapsed patients, the survival of large numbers of clonogenic cells beyond initial chemotherapy is a surrogate for inherent partial chemoresistance or inadequate therapy, providing an increased opportunity for subsequent emergence of fully resistant clones. These results frame early cytoreduction as an important determinant of long-term outcome.
Funded by: Medical Research Council: MC_PC_12009; Wellcome Trust: WT095663MA, WT098051, WT106068AIA
Dynamic variation of CD5 surface expression levels within individual chronic lymphocytic leukaemia clones.
Department of Medicine, University of Cambridge, Cambridge Biomedical Campus, Hills Road, Cambridge, CB2 0SP, UK.
Chronic lymphocytic leukaemia (CLL) is characterised by the accumulation of clonally-derived mature CD5(high) B-cells; however, the cellular origin of CLL is still unknown. Patients with CLL also harbour variable numbers of CD5(low) B-cells, but the clonal relationship of these cells to the bulk disease is unknown and can have important implications for monitoring, treating and understanding the biology of CLL. Here we use B-cell receptors (BCRs) as molecular barcodes to first show, by single-cell BCR sequencing, that the great majority of CD5(low) B-cells in the blood of CLL patients are clonally related to CD5(high) CLL B-cells. We investigate whether CD5 state-switching was likely to occur continuously (common event) or as a rare event in CLL, by tracking somatic BCR mutations in bulk CLL B-cells and using them to reconstruct the phylogenetic relationships and evolutionary history of the CLL in each of four patients. Using statistical methods we show that there is no parsimonious route from a single or low number of CD5(low) switch events to the CD5(high) population, but rather large-scale and/or dynamic switching between these CD5 states is the most likely explanation. The overlapping BCR repertoires between CD5(high) and CD5(low) cells from CLL patient peripheral blood reveal that CLLs exist in a continuum of CD5 expression. The major proportion of CD5(low) B-cells in patients are leukemic, thus identifying CD5(low) B-cells as an important component of CLL, with implications for CLL pathogenesis, clinical monitoring and the development of anti-CD5-directed therapies.
Experimental hematology 2016
Six-Year Incidence of Blindness and Visual Impairment in Kenya: The Nakuru Eye Disease Cohort Study.
International Centre for Eye Health, Clinical Research Department, London School of Hygiene and Tropical Medicine, London, United Kingdom.
Purpose: To describe the cumulative 6-year incidence of visual impairment (VI) and blindness in an adult Kenyan population. The Nakuru Posterior Segment Eye Disease Study is a population-based sample of 4414 participants aged ≥50 years, enrolled in 2007-2008. Of these, 2170 (50%) were reexamined in 2013-2014.
Methods: The World Health Organization (WHO) and US definitions were used to calculate presenting visual acuity classifications based on logMAR visual acuity tests at baseline and follow-up. Detailed ophthalmic and anthropometric examinations as well as a questionnaire, which included past medical and ophthalmic history, were used to assess risk factors for study participation and vision loss. Cumulative incidence of VI and blindness, and factors associated with these outcomes, were estimated. Inverse probability weighting was used to adjust for nonparticipation.
Results: Visual acuity measurements were available for 2164 (99.7%) participants. Using WHO definitions, the 6-year cumulative incidence of VI was 11.9% (95%CI [confidence interval]: 10.3-13.8%) and blindness was 1.51% (95%CI: 1.0-2.2%); using the US classification, the cumulative incidence of blindness was 2.70% (95%CI: 1.8-3.2%). Incidence of VI increased strongly with older age, and independently with being diabetic. There are an estimated 21 new cases of VI per year in people aged ≥50 years per 1000 people, of whom 3 are blind. Therefore in Kenya we estimate that there are 92,000 new cases of VI in people aged ≥50 years per year, of whom 11,600 are blind, out of a total population of approximately 4.3 million people aged 50 and above.
Conclusions: The incidence of VI and blindness in this older Kenyan population was considerably higher than in comparable studies worldwide. A continued effort to strengthen the eye health system is necessary to support the growing unmet need in an aging and growing population.
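The inverse probability weighting step mentioned in the Methods can be sketched as follows. The abstract does not say how participation probabilities were modelled, so this sketch assumes each re-examined participant already has an estimated probability of taking part in follow-up (for example from a model fitted on baseline covariates); the function name and the input layout are hypothetical. Each participant is weighted by the inverse of that probability, so people who resemble those lost to follow-up count for more.

```python
def ipw_cumulative_incidence(records):
    """Weighted cumulative incidence among re-examined participants.

    records: iterable of (participation_probability, had_event) pairs.
    The probabilities are assumed to be estimated elsewhere; this
    function only applies the weights.
    """
    weighted_events = 0.0
    total_weight = 0.0
    for probability, had_event in records:
        if not 0.0 < probability <= 1.0:
            raise ValueError("participation probability must be in (0, 1]")
        weight = 1.0 / probability  # inverse probability weight
        weighted_events += weight * (1.0 if had_event else 0.0)
        total_weight += weight
    if total_weight == 0.0:
        raise ValueError("no records supplied")
    return weighted_events / total_weight


if __name__ == "__main__":
    # Toy data: (estimated probability of follow-up, developed VI?)
    toy = [(0.5, True), (1.0, False), (0.25, False)]
    print(ipw_cumulative_incidence(toy))
```

When every participant has the same participation probability, the weighted estimate reduces to the crude proportion, which is a quick sanity check on any implementation.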
Investigative ophthalmology & visual science 2016;57;14;5974-5983
Circulation of multiple genotypes of H1N2 viruses in a swine farm in Italy over a two-month period.
Istituto Zooprofilattico Sperimentale delle Venezie, Legnaro, PD, Italy.
In August 2012 repeated respiratory outbreaks caused by swine influenza A virus (swIAV) were registered for a whole year in a breeding farm in northeast Italy that supplied piglets for fattening. The virus, initially characterized in the farm, was a reassortant Eurasian avian-like H1N1 (H1avN1) genotype, containing a haemagglutinin segment derived from the pandemic H1N1 (A(H1N1)pdm09) lineage. To control infection, a vaccination program using vaccines against the A(H1N1)pdm09, human-like H1N2 (H1huN2), human-like H3N2 (H3N2), and H1avN1 viruses was implemented in sows in November 2013. Vaccine efficacy was assessed by sampling nasal swabs for two months in 35-75 day-old piglets born from vaccinated sows. Complete genome sequencing of eight swIAV-positive nasal swabs collected longitudinally from piglets after the implementation of the vaccination program was conducted to investigate the virus characteristics. Over the two-month period, two different genotypes involving multiple reassortment events were detected. The unexpected circulation of multiple reassortant genotypes in such a short time highlights the complexity of the genetic diversity of swIAV and the need for a better surveillance plan, based on the combination of clinical signs, epidemiological data and whole genome characterization.
Veterinary microbiology 2016;195;25-29
Retracing embryological fate.
Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.
Science (New York, N.Y.) 2016;354;6316;1109
Mutational signatures of ionizing radiation in second malignancies.
Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA UK.
Ionizing radiation is a potent carcinogen, inducing cancer through DNA damage. The signatures of mutations arising in human tissues following in vivo exposure to ionizing radiation have not been documented. Here, we searched for signatures of ionizing radiation in 12 radiation-associated second malignancies of different tumour types. Two signatures of somatic mutation characterize ionizing radiation exposure irrespective of tumour type. Compared with 319 radiation-naive tumours, radiation-associated tumours carry a median extra 201 deletions genome-wide, sized 1-100 base pairs often with microhomology at the junction. Unlike deletions of radiation-naive tumours, these show no variation in density across the genome or correlation with sequence context, replication timing or chromatin structure. Furthermore, we observe a significant increase in balanced inversions in radiation-associated tumours. Both small deletions and inversions generate driver mutations. Thus, ionizing radiation generates distinctive mutational signatures that explain its carcinogenic potential.
Funded by: Cancer Research UK: 14835; Wellcome Trust
Nature communications 2016;7;12605
FINEMAP: efficient variable selection using summary data from genome-wide association studies.
Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland, Department of Public Health, University of Helsinki, Helsinki, Finland.
Motivation: The goal of fine-mapping in genomic regions associated with complex diseases and traits is to identify causal variants that point to molecular mechanisms behind the associations. Recent fine-mapping methods using summary data from genome-wide association studies rely on exhaustive search through all possible causal configurations, which is computationally expensive.
Results: We introduce FINEMAP, a software package to efficiently explore a set of the most important causal configurations of the region via a shotgun stochastic search algorithm. We show that FINEMAP produces accurate results in a fraction of processing time of existing approaches and is therefore a promising tool for analyzing growing amounts of data produced in genome-wide association studies and emerging sequencing projects.
Availability and implementation: FINEMAP v1.0 is freely available for Mac OS X and Linux at http://www.christianbenner.com
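The general shape of a shotgun stochastic search over causal configurations can be sketched as below. This is a toy illustration under stated assumptions, not FINEMAP's implementation: the scoring function `toy_score`, the parameter names and the neighbourhood definition (delete, add or swap one variant) are all hypothetical choices made for the example. At each iteration the sketch scores every neighbour of the current configuration, then samples the next configuration with probability proportional to those scores, so it concentrates on high-scoring configurations instead of enumerating all of them.

```python
import random


def neighbours(config, n_variants, max_causal):
    """Configurations reachable by deleting, adding or swapping one variant."""
    config = frozenset(config)
    outside = [v for v in range(n_variants) if v not in config]
    result = set()
    for v in config:                      # deletions
        result.add(config - {v})
    if len(config) < max_causal:          # additions
        for u in outside:
            result.add(config | {u})
    for v in config:                      # swaps
        for u in outside:
            result.add((config - {v}) | {u})
    return sorted(result, key=sorted)     # fixed order for reproducibility


def shotgun_stochastic_search(score, n_variants, max_causal=3, n_iter=50, seed=0):
    """Return a dict mapping every evaluated configuration to its score.

    score must return a positive, unnormalised weight for a configuration.
    """
    rng = random.Random(seed)
    current = frozenset()
    visited = {current: score(current)}
    for _ in range(n_iter):
        candidates = neighbours(current, n_variants, max_causal)
        weights = []
        for cand in candidates:
            if cand not in visited:
                visited[cand] = score(cand)
            weights.append(visited[cand])
        # Sample the next configuration in proportion to its score.
        current = rng.choices(candidates, weights=weights, k=1)[0]
    return visited


def toy_score(config):
    """Hypothetical score: highest when config equals the 'true' set {2, 5}."""
    truth = {2, 5}
    return 0.1 ** len(set(config) ^ truth)


if __name__ == "__main__":
    visited = shotgun_stochastic_search(toy_score, n_variants=8)
    best = max(visited, key=visited.get)
    print(sorted(best), len(visited))
```

With 8 variants and at most 3 causal variants there are 93 possible configurations; on realistic regions with thousands of variants the full space is far too large to enumerate, which is the motivation the abstract gives for a stochastic search.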
Bioinformatics (Oxford, England) 2016;32;10;1493-501
Stage-Specific Transcriptome and Proteome Analyses of the Filarial Parasite Onchocerca volvulus and Its Wolbachia Endosymbiont.
Laboratory of Parasitic Diseases, NIAID, NIH, Bethesda, Maryland, USA.
Onchocerciasis (river blindness) is a neglected tropical disease that has been successfully targeted by mass drug treatment programs in the Americas and small parts of Africa. Achieving the long-term goal of elimination of onchocerciasis, however, requires additional tools, including drugs, vaccines, and biomarkers of infection. Here, we describe the transcriptome and proteome profiles of the major vector and the human host stages (L1, L2, L3, molting L3, L4, adult male, and adult female) of Onchocerca volvulus along with the proteome of each parasitic stage and of its Wolbachia endosymbiont (wOv). In so doing, we have identified stage-specific pathways important to the parasite's adaptation to its human host during its early development. Further, we generated a protein array that, when screened with well-characterized human samples, identified novel diagnostic biomarkers of O. volvulus infection and new potential vaccine candidates. This immunomic approach not only demonstrates the power of this postgenomic discovery platform but also provides additional tools for onchocerciasis control programs.
Importance: The global onchocerciasis (river blindness) elimination program will have to rely on the development of new tools (drugs, vaccines, biomarkers) to achieve its goals by 2025. As an adjunct to the completed genomic sequencing of O. volvulus, we used a comprehensive proteomic and transcriptomic profiling strategy to gain a comprehensive understanding of both the vector-derived and human host-derived parasite stages. In so doing, we have identified proteins and pathways that enable novel drug targeting studies and the discovery of novel vaccine candidates, as well as useful biomarkers of active infection.
Funded by: NIA NIH HHS: R24 AG042328; NIAID NIH HHS: R21 AI126466, T32 AI007180, ZIA AI000512; Wellcome Trust: 098051
Deep Roots for Aboriginal Australian Y Chromosomes.
The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.
Australia was one of the earliest regions outside Africa to be colonized by fully modern humans, with archaeological evidence for human presence by 47,000 years ago (47 kya) widely accepted [1, 2]. However, the extent of subsequent human entry before the European colonial age is less clear. The dingo reached Australia about 4 kya, indirectly implying human contact, which some have linked to changes in language and stone tool technology to suggest substantial cultural changes at the same time. Genetic data of two kinds have been proposed to support gene flow from the Indian subcontinent to Australia at this time, as well: first, signs of South Asian admixture in Aboriginal Australian genomes have been reported on the basis of genome-wide SNP data; and second, a Y chromosome lineage designated haplogroup C(∗), present in both India and Australia, was estimated to have a most recent common ancestor around 5 kya and to have entered Australia from India. Here, we sequence 13 Aboriginal Australian Y chromosomes to re-investigate their divergence times from Y chromosomes in other continents, including a comparison of Aboriginal Australian and South Asian haplogroup C chromosomes. We find divergence times dating back to ∼50 kya, thus excluding the Y chromosome as providing evidence for recent gene flow from India into Australia.
Funded by: Wellcome Trust: 098051
Current biology : CB 2016;26;6;809-13
Chemokine (C-C Motif) Receptor 2 Mediates Dendritic Cell Recruitment to the Human Colon but Is Not Responsible for Differences Observed in Dendritic Cell Subsets, Phenotype, and Function Between the Proximal and Distal Colon.
Antigen Presentation Research Group, Imperial College London, Harrow, United Kingdom.
Background & aims: Most knowledge about gastrointestinal (GI)-tract dendritic cells (DC) relies on murine studies where CD103+ DC specialize in generating immune tolerance with the functionality of CD11b+/- subsets being unclear. Information about human GI-DC is scarce, especially regarding regional specifications. Here, we characterized human DC properties throughout the human colon.
Methods: Paired proximal (right/ascending) and distal (left/descending) human colonic biopsies from 95 healthy subjects were taken; DC were assessed by flow cytometry and microbiota composition assessed by 16S rRNA gene sequencing.
Results: Colonic DC identified were myeloid (mDC, CD11c+CD123-) and further divided based on CD103 and SIRPα (human analog of murine CD11b) expression. CD103-SIRPα+ DC were the major population and with CD103+SIRPα+ DC were CD1c+ILT3+CCR2+ (although CCR2 was not expressed on all CD103+SIRPα+ DC). CD103+SIRPα- DC constituted a minor subset that were CD141+ILT3-CCR2-. Proximal colon samples had higher total DC counts and fewer CD103+SIRPα+ cells. Proximal colon DC were more mature than distal DC with higher stimulatory capacity for CD4+CD45RA+ T-cells. However, DC and DC-invoked T-cell expression of mucosal homing markers (β7, CCR9) was lower for proximal DC. CCR2 was expressed on circulating CD1c+, but not CD141+ mDC, and mediated DC recruitment by colonic culture supernatants in transwell assays. Proximal colon DC produced higher levels of cytokines. Mucosal microbiota profiling showed a lower microbiota load in the proximal colon, but with no differences in microbiota composition between compartments.
Conclusions: Proximal colonic DC subsets differ from those in distal colon and are more mature. Targeted immunotherapy using DC in T-cell mediated GI tract inflammation may therefore need to reflect this immune compartmentalization.
Funded by: NIDDK NIH HHS: T32 DK007632; Worldwide Cancer Research: 12-0234
Cellular and molecular gastroenterology and hepatology 2016;2;1;22-39.e5
Optimized inducible shRNA and CRISPR/Cas9 platforms for in vitro studies of human development using hPSCs.
Inducible loss of gene function experiments are necessary to uncover mechanisms underlying development, physiology and disease. However, current methods are complex, lack robustness and do not work in multiple cell types. Here we address these limitations by developing single-step optimized inducible gene knockdown or knockout (sOPTiKD or sOPTiKO) platforms. These are based on genetic engineering of human genomic safe harbors combined with an improved tetracycline-inducible system and CRISPR/Cas9 technology. We exemplify the efficacy of these methods in human pluripotent stem cells (hPSCs), and show that generation of sOPTiKD/KO hPSCs is simple, rapid and allows tightly controlled individual or multiplexed gene knockdown or knockout in hPSCs and in a wide variety of differentiated cells. Finally, we illustrate the general applicability of this approach by investigating the function of transcription factors (OCT4 and T), cell cycle regulators (cyclin D family members) and epigenetic modifiers (DPY30). Overall, sOPTiKD and sOPTiKO provide a unique opportunity for functional analyses in multiple cell types relevant for the study of human development.
Funded by: British Heart Foundation: FS/13/29/30024; European Research Council: 281335; Medical Research Council: MC_PC_12009, MR/L016761/1
Development (Cambridge, England) 2016;143;23;4405-4418
A detailed clinical analysis of 13 patients with AUTS2 syndrome further delineates the phenotypic spectrum and underscores the behavioural phenotype.
Department of Clinical Genetics, VU University Medical Center Amsterdam, The Netherlands.
Background: AUTS2 syndrome is an 'intellectual disability (ID) syndrome' caused by genomic rearrangements, deletions, intragenic duplications or mutations disrupting AUTS2. So far, 50 patients with AUTS2 syndrome have been described, but clinical data are limited and almost all cases involved young children.
Methods: We present a detailed clinical description of 13 patients (including six adults) with AUTS2 syndrome who have a pathogenic mutation or deletion in AUTS2. All patients were systematically evaluated by the same clinical geneticist.
Results: All patients have borderline to severe ID/developmental delay, 83-100% have microcephaly and feeding difficulties. Congenital malformations are rare, but mild heart defects, contractures and genital malformations do occur. There are no major health issues in the adults, the oldest of whom is now 59 years of age. Behaviour is marked by a friendly, outgoing social interaction. Specific features of autism (like obsessive behaviour) are seen frequently (83%), but classical autism was not diagnosed in any. A mild clinical phenotype is associated with small in-frame 5' deletions, which are often inherited. Deletions and other mutations causing haploinsufficiency of the full-length AUTS2 transcript give a more severe phenotype and occur de novo.
Conclusions: The 13 patients with AUTS2 syndrome with unique pathogenic deletions scattered around the AUTS2 locus confirm a phenotype-genotype correlation. Despite individual variations, AUTS2 syndrome emerges as a specific ID syndrome with microcephaly, feeding difficulties, dysmorphic features and a specific behavioural phenotype.
Journal of medical genetics 2016;53;8;523-32
Sperm Meets Egg: The Genetics of Mammalian Fertilization.
Cell Surface Signalling Laboratory, Wellcome Trust Sanger Institute, Cambridge, CB10 1SA, United Kingdom.
Fertilization is the culminating event of sexual reproduction, which involves the union of the sperm and egg to form a single, genetically distinct organism. Despite the fundamental role of fertilization, the basic mechanisms involved have remained poorly understood. However, these mechanisms must involve an ordered schedule of cellular recognition events between the sperm and egg to ensure successful fusion. In this article, we review recent progress in our molecular understanding of mammalian fertilization, highlighting the areas in which genetic approaches have been particularly informative and focusing especially on the roles of secreted and cell surface proteins, expressed in a sex-specific manner, that mediate sperm-egg interactions. We discuss how the sperm interacts with the female reproductive tract, zona pellucida, and the oolemma. Finally, we review recent progress made in elucidating the mechanisms that reduce polyspermy and ensure that eggs normally fuse with only a single sperm. Expected final online publication date for the Annual Review of Genetics Volume 50 is November 23, 2016. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Annual review of genetics 2016
Interferon-driven alterations of the host's amino acid metabolism in the pathogenesis of typhoid fever.
Oxford Vaccine Group, Department of Paediatrics, University of Oxford and the NIHR Oxford Biomedical Research Centre, Oxford OX3 7LE, England, UK.
Enteric fever, caused by Salmonella enterica serovar Typhi, is an important public health problem in resource-limited settings and, despite decades of research, human responses to the infection are poorly understood. In 41 healthy adults experimentally infected with wild-type S. Typhi, we detected significant cytokine responses within 12 h of bacterial ingestion. These early responses did not correlate with subsequent clinical disease outcomes and likely indicate initial host-pathogen interactions in the gut mucosa. In participants developing enteric fever after oral infection, marked transcriptional and cytokine responses during acute disease reflected dominant type I/II interferon signatures, which were significantly associated with bacteremia. Using a murine and macrophage infection model, we validated the pivotal role of this response in the expression of proteins of the host tryptophan metabolism during Salmonella infection. Corresponding alterations in tryptophan catabolites with immunomodulatory properties in serum of participants with typhoid fever confirmed the activity of this pathway, and implicate a central role of host tryptophan metabolism in the pathogenesis of typhoid fever.
Funded by: Medical Research Council: MR/M02637X/1; NIAID NIH HHS: R01 AI036525, U01 AI082210, U19 AI057234, U19 AI082655, U19 AI089987, U19 AI109776; Wellcome Trust: 092661
The Journal of experimental medicine 2016;213;6;1061-77
Tissue-specific mutation accumulation in human adult stem cells during life.
Center for Molecular Medicine, Cancer Genomics Netherlands, Department of Genetics, University Medical Center Utrecht, Heidelberglaan 100, 3584CX Utrecht, The Netherlands.
The gradual accumulation of genetic mutations in human adult stem cells (ASCs) during life is associated with various age-related diseases, including cancer. Extreme variation in cancer risk across tissues was recently proposed to depend on the lifetime number of ASC divisions, owing to unavoidable random mutations that arise during DNA replication. However, the rates and patterns of mutations in normal ASCs remain unknown. Here we determine genome-wide mutation patterns in ASCs of the small intestine, colon and liver of human donors with ages ranging from 3 to 87 years by sequencing clonal organoid cultures derived from primary multipotent cells. Our results show that mutations accumulate steadily over time in all of the assessed tissue types, at a rate of approximately 40 novel mutations per year, despite the large variation in cancer incidence among these tissues. Liver ASCs, however, have different mutation spectra compared to those of the colon and small intestine. Mutational signature analysis reveals that this difference can be attributed to spontaneous deamination of methylated cytosine residues in the colon and small intestine, probably reflecting their high ASC division rate. In liver, a signature with an as-yet-unknown underlying mechanism is predominant. Mutation spectra of driver genes in cancer show high similarity to the tissue-specific ASC mutation spectra, suggesting that intrinsic mutational processes in ASCs can initiate tumorigenesis. Notably, the inter-individual variation in mutation rate and spectra is low, suggesting tissue-specific activity of common mutational processes throughout life.
Funded by: Worldwide Cancer Research: 16-0193
Mutation Rates and Discriminating Power for 13 Rapidly-Mutating Y-STRs between Related and Unrelated Individuals.
Dipartimento di Scienze Biologiche, Geologiche e Ambientali (BiGeA), Università di Bologna, Bologna, Italy.
Rapidly Mutating Y-STRs (RM Y-STRs) were recently introduced in forensics in order to increase the differentiation of Y-chromosomal profiles even in case of close relatives. We estimate RM Y-STRs mutation rates and their power to discriminate between related individuals by using samples extracted from a wide set of paternal pedigrees and by comparing RM Y-STRs results with those obtained from the Y-filer set. In addition, we tested the ability of RM Y-STRs to discriminate between unrelated individuals carrying the same Y-filer haplotype, using the haplogroup R-M269 (reportedly characterised by a strong resemblance in Y-STR profiles) as a case study. Our results, despite confirming the high mutability of RM Y-STRs, show significantly lower mutation rates than reference germline ones. Consequently, their power to discriminate between related individuals, despite being higher than the one of Y-filer, does not seem to improve significantly the performance of the latter. On the contrary, when considering R-M269 unrelated individuals, RM Y-STRs reveal significant discriminatory power and retain some phylogenetic signal, allowing the correct classification of individuals for some R-M269-derived sub-lineages. These results have important implications not only for forensics, but also for molecular anthropology, suggesting that RM Y-STRs are useful tools for exploring subtle genetic variability within Y-chromosomal haplogroups.
Funded by: European Research Council: 295733
PloS one 2016;11;11;e0165678
Characterization of Two Distinct Nucleosome Remodeling and Deacetylase (NuRD) Complex Assemblies in Embryonic Stem Cells.
From the ‡Proteomic Mass Spectrometry, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK;
Pluripotency and self-renewal, the defining properties of embryonic stem cells, are brought about by transcriptional programs involving an intricate network of transcription factors and chromatin remodeling complexes. The Nucleosome Remodeling and Deacetylase (NuRD) complex plays a crucial and dynamic role in the regulation of stemness and differentiation. Several NuRD-associated factors have been reported but how they are organized has not been investigated in detail. Here, we have combined affinity purification and blue native polyacrylamide gel electrophoresis followed by protein identification by mass spectrometry and protein correlation profiling to characterize the topology of the NuRD complex. Our data show that in mouse embryonic stem cells the NuRD complex is present as two distinct assemblies of differing topology with different binding partners. Cell cycle regulator Cdk2ap1 and transcription factor Sall4 associate only with the higher mass NuRD assembly. We further establish that only isoform Sall4a, and not Sall4b, associates with NuRD. By contrast, Suz12, a component of the PRC2 Polycomb repressor complex, associates with the lower mass entity. In addition, we identify and validate a novel NuRD-associated protein, Wdr5, a regulatory subunit of the MLL histone methyltransferase complex, which associates with both NuRD entities. Bioinformatic analyses of published target gene sets of these chromatin binding proteins are in agreement with these structural observations. In summary, this study provides an interesting insight into mechanistic aspects of NuRD function in stem cell biology. The relevance of our work has broader implications because of the ubiquitous nature of the NuRD complex. The strategy described here can be more broadly applicable to investigate the topology of the multiple complexes an individual protein can participate in.
Funded by: Wellcome Trust: WT098051
Molecular & cellular proteomics : MCP 2016;15;3;878-91
An organelle-specific protein landscape identifies novel diseases and molecular mechanisms.
Medical Proteome Center, Institute for Ophthalmic Research, University of Tuebingen, 72074 Tuebingen, Germany.
Cellular organelles provide opportunities to relate biological mechanisms to disease. Here we use affinity proteomics, genetics and cell biology to interrogate cilia: poorly understood organelles, where defects cause genetic diseases. Two hundred and seventeen tagged human ciliary proteins create a final landscape of 1,319 proteins, 4,905 interactions and 52 complexes. Reverse tagging, repetition of purifications and statistical analyses, produce a high-resolution network that reveals organelle-specific interactions and complexes not apparent in larger studies, and links vesicle transport, the cytoskeleton, signalling and ubiquitination to ciliary signalling and proteostasis. We observe sub-complexes in exocyst and intraflagellar transport complexes, which we validate biochemically, and by probing structurally predicted, disruptive, genetic variants from ciliary disease patients. The landscape suggests other genetic diseases could be ciliary including 3M syndrome. We show that 3M genes are involved in ciliogenesis, and that patient fibroblasts lack cilia. Overall, this organelle-specific targeting strategy shows considerable promise for Systems Medicine.
Funded by: NEI NIH HHS: R01 EY021872; NICHD NIH HHS: R01 HD042601; NIDDK NIH HHS: R01 DK072301, R01 DK075972; NIGMS NIH HHS: R01 GM121317
Nature communications 2016;7;11491
A DNA target-enrichment approach to detect mutations, copy number changes and immunoglobulin translocations in multiple myeloma.
Cancer Genome Project, Wellcome Trust Sanger Institute, Cambridge, UK.
Genomic lesions are not investigated during routine diagnostic workup for multiple myeloma (MM). Cytogenetic studies are performed to assess prognosis but with limited impact on therapeutic decisions. Recently, several recurrently mutated genes have been described, but their clinical value remains to be defined. Therefore, clinical-grade strategies to investigate the genomic landscape of myeloma samples are needed to integrate new and old prognostic markers. We developed a target-enrichment strategy followed by next-generation sequencing (NGS) to streamline simultaneous analysis of gene mutations, copy number changes and immunoglobulin heavy chain (IGH) translocations in MM in a high-throughput manner, and validated it in a panel of cell lines. We identified 548 likely oncogenic mutations in 182 genes. By integrating published data sets of NGS in MM, we retrieved a list of genes with significant relevance to myeloma and found that the mutational spectrum of primary samples and MM cell lines is partially overlapping. Gains and losses of chromosomes, chromosomal segments and gene loci were identified with accuracy comparable to conventional arrays, allowing identification of lesions with known prognostic significance. Furthermore, we identified IGH translocations with high positive and negative predictive value. Our approach could allow the identification of novel biomarkers with clinical relevance in myeloma.
Funded by: NCI NIH HHS: P01 CA155258, P50 CA100707; Wellcome Trust: 077012/Z/05/Z
Blood cancer journal 2016;6;9;e467
Mouse model of chromosome mosaicism reveals lineage-specific depletion of aneuploid cells and normal developmental potential.
Department of Physiology, Development and Neuroscience and Gurdon Institute, University of Cambridge, Downing Street, Cambridge CB2 3EG, UK.
Most human pre-implantation embryos are mosaics of euploid and aneuploid cells. To determine the fate of aneuploid cells and the developmental potential of mosaic embryos, here we generate a mouse model of chromosome mosaicism. By treating embryos with a spindle assembly checkpoint inhibitor during the four- to eight-cell division, we efficiently generate aneuploid cells, resulting in embryo death during peri-implantation development. Live-embryo imaging and single-cell tracking in chimeric embryos, containing aneuploid and euploid cells, reveal that the fate of aneuploid cells depends on lineage: aneuploid cells in the fetal lineage are eliminated by apoptosis, whereas those in the placental lineage show severe proliferative defects. Overall, the proportion of aneuploid cells is progressively depleted from the blastocyst stage onwards. Finally, we show that mosaic embryos have full developmental potential, provided they contain sufficient euploid cells, a finding of significance for the assessment of embryo vitality in the clinic.
Funded by: Wellcome Trust
Nature communications 2016;7;11165
The influence of a short-term gluten-free diet on the human gut microbiome.
Department of Genetics, University of Groningen, University Medical Centre Groningen, Groningen, The Netherlands.
Background: A gluten-free diet (GFD) is the most commonly adopted special diet worldwide. It is an effective treatment for coeliac disease and is also often followed by individuals to alleviate gastrointestinal complaints. It is known there is an important link between diet and the gut microbiome, but it is largely unknown how a switch to a GFD affects the human gut microbiome.
Methods: We studied changes in the gut microbiomes of 21 healthy volunteers who followed a GFD for four weeks. We collected nine stool samples from each participant: one at baseline, four during the GFD period, and four when they returned to their habitual diet (HD), making a total of 189 samples. We determined microbiome profiles using 16S rRNA sequencing and then processed the samples for taxonomic and imputed functional composition. Additionally, in all 189 samples, six gut health-related biomarkers were measured.
Results: Inter-individual variation in the gut microbiota remained stable during this short-term GFD intervention. A number of taxon-specific differences were seen during the GFD: the most striking shift was seen for the family Veillonellaceae (class Clostridia), which was significantly reduced during the intervention (p = 2.81 × 10(-05)). Seven other taxa also showed significant changes; the majority of them are known to play a role in starch metabolism. We saw stronger differences in pathway activities: 21 predicted pathway activity scores showed significant association to the change in diet. We observed strong relations between the predicted activity of pathways and biomarker measurements.
Conclusions: A GFD changes the gut microbiome composition and alters the activity of microbial pathways.
Funded by: European Research Council: 322698; Wellcome Trust: WT098051
Genome medicine 2016;8;1;45
Computational evaluation of exome sequence data using human and model organism phenotypes improves diagnostic efficiency.
Undiagnosed Diseases Program, Common Fund, Office of the Director, National Institutes of Health, Bethesda, Maryland, USA.
Purpose: Medical diagnosis and molecular or biochemical confirmation typically rely on the knowledge of the clinician. Although this is very difficult in extremely rare diseases, we hypothesized that the recording of patient phenotypes in Human Phenotype Ontology (HPO) terms and computationally ranking putative disease-associated sequence variants improves diagnosis, particularly for patients with atypical clinical profiles.
Methods: Using simulated exomes and the National Institutes of Health Undiagnosed Diseases Program (UDP) patient cohort and associated exome sequence, we tested our hypothesis using Exomiser. Exomiser ranks candidate variants based on patient phenotype similarity to (i) known disease-gene phenotypes, (ii) model organism phenotypes of candidate orthologs, and (iii) phenotypes of protein-protein association neighbors.
Results: Benchmarking showed Exomiser ranked the causal variant as the top hit in 97% of known disease-gene associations and ranked the correct seeded variant in up to 87% when detectable disease-gene associations were unavailable. Using UDP data, Exomiser ranked the causative variant(s) within the top 10 variants for 11 previously diagnosed variants and achieved a diagnosis for 4 of 23 cases undiagnosed by clinical evaluation.
Conclusion: Structured phenotyping of patients and computational analysis are effective adjuncts for diagnosing patients with genetic disorders.
Funded by: NHGRI NIH HHS: HHSN268201300036C, U54 HG006370; NIH HHS: R24 OD011883; Wellcome Trust: 098051
Genetics in medicine : official journal of the American College of Medical Genetics 2016;18;6;608-17
Chromosome engineering in zygotes with CRISPR/Cas9.
Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, Cambridge, United Kingdom.
Deletions, duplications, and inversions of large genomic regions covering several genes are an important class of disease-causing variants in humans. Modeling these structural variants in mice requires multistep processes in ES cells, which has limited their availability. Mutant mice containing small insertions, deletions, and single nucleotide polymorphisms can be reliably generated using CRISPR/Cas9 directly in mouse zygotes. Large structural variants can be generated using CRISPR/Cas9 in ES cells, but it has not been possible to generate these directly in zygotes. We now demonstrate the direct generation of deletions, duplications and inversions of up to one million base pairs by zygote injection.
Funded by: NIH HHS: U42OD011174; Wellcome Trust: WT098051
Genesis (New York, N.Y. : 2000) 2016;54;2;78-85
Complete Genome Sequence of MIDG2331, a Genetically Tractable Serovar 8 Clinical Isolate of Actinobacillus pleuropneumoniae.
Department of Medicine, Section of Paediatrics, Imperial College London, London, United Kingdom.
We report here the complete annotated genome sequence of a clinical serovar 8 isolate Actinobacillus pleuropneumoniae MIDG2331. Unlike the serovar 8 reference strain 405, MIDG2331 is amenable to genetic manipulation via natural transformation as well as conjugation, making it ideal for studies of gene function.
Funded by: Biotechnology and Biological Sciences Research Council: BB/G018553/1, BB/G019177/1, BB/G019274/1, BB/G020744/1, EGA16167
Genome announcements 2016;4;1
ICEApl1, an Integrative Conjugative Element Related to ICEHin1056, Identified in the Pig Pathogen Actinobacillus pleuropneumoniae.
Section of Paediatrics, Department of Medicine, Imperial College London, London, UK.
ICEApl1 was identified in the whole genome sequence of MIDG2331, a tetracycline-resistant (MIC = 8 mg/L) serovar 8 clinical isolate of Actinobacillus pleuropneumoniae, the causative agent of porcine pleuropneumonia. PCR amplification of virB4, one of the core genes involved in conjugation, was used to identify other A. pleuropneumoniae isolates potentially carrying ICEApl1. MICs for tetracycline were determined for virB4 positive isolates, and shotgun whole genome sequence analysis was used to confirm presence of the complete ICEApl1. The sequence of ICEApl1 is 56083 bp long and contains 67 genes including a Tn10 element encoding tetracycline resistance. Comparative sequence analysis was performed with similar integrative conjugative elements (ICEs) found in other members of the Pasteurellaceae. ICEApl1 is most similar to the 59393 bp ICEHin1056, from Haemophilus influenzae strain 1056. Although initially identified only in serovar 8 isolates of A. pleuropneumoniae (31 from the UK and 1 from Cyprus), conjugal transfer of ICEApl1 to representative isolates of other serovars was confirmed. All isolates carrying ICEApl1 had a MIC for tetracycline of 8 mg/L. This is, to our knowledge, the first description of an ICE in A. pleuropneumoniae, and the first report of a member of the ICEHin1056 subfamily in a non-human pathogen. ICEApl1 confers resistance to tetracycline, currently one of the more commonly used antibiotics for treatment and control of porcine pleuropneumonia.
Funded by: Biotechnology and Biological Sciences Research Council: BB/G018553/1, BB/G019177/1, BB/G019274/1, BB/G020744/1
Frontiers in microbiology 2016;7;810
GFI1(36N) as a therapeutic and prognostic marker for myelodysplastic syndrome.
Department of Hematology, West German Cancer Center, University Hospital Essen, University Duisburg-Essen, Essen, Germany.
Inherited gene variants play an important role in malignant diseases. The transcriptional repressor growth factor independence 1 (GFI1) regulates hematopoietic stem cell (HSC) self-renewal and differentiation. A single-nucleotide polymorphism of GFI1 (rs34631763) generates a protein with an asparagine (N) instead of a serine (S) at position 36 (GFI1(36N)) and has a prevalence of 3%-5% among Caucasians. Because GFI1 regulates myeloid development, we examined the role of GFI1(36N) on the course of MDS disease. To this end, we determined allele frequencies of GFI1(36N) in four independent MDS cohorts from the Netherlands and Belgium, Germany, the ICGC consortium, and the United States. The GFI1(36N) allele frequency in the 723 MDS patients genotyped ranged between 9% and 12%. GFI1(36N) was an independent adverse prognostic factor for overall survival, acute myeloid leukemia-free survival, and event-free survival in a univariate analysis. After adjustment for age, bone marrow blast percentage, IPSS score, mutational status, and cytogenetic findings, GFI1(36N) remained an independent adverse prognostic marker. GFI1(36S) homozygous patients exhibited a sustained response to treatment with hypomethylating agents, whereas GFI1(36N) patients had a poor sustained response to this therapy. Because allele status of GFI1(36N) is readily determined using basic molecular techniques, we propose inclusion of GFI1(36N) status in future prospective studies for MDS patients to better predict prognosis and guide therapeutic decisions.
Experimental hematology 2016;44;7;590-595.e1
A multi-factorial analysis of response to warfarin in a UK prospective cohort.
Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.
Background: Warfarin is the most widely used oral anticoagulant worldwide, but it has a narrow therapeutic index which necessitates constant monitoring of anticoagulation response. Previous genome-wide studies have focused on identifying factors explaining variance in stable dose, but have not explored the initial patient response to warfarin or the wider range of clinical and biochemical factors affecting both initial and stable dosing with warfarin.
Methods: A prospective cohort of 711 patients starting warfarin was followed up for 6 months with analyses focusing on both non-genetic and genetic factors. The outcome measures used were mean weekly warfarin dose (MWD), stable mean weekly dose (SMWD) and international normalised ratio (INR) > 4 during the first week. Samples were genotyped on the Illumina Human610-Quad chip. Statistical analyses were performed using Plink and R.
Results: VKORC1 and CYP2C9 were the major genetic determinants of warfarin MWD and SMWD, with CYP4F2 having a smaller effect. Age, height, weight, cigarette smoking and interacting medications accounted for less than 20 % of the variance. Our multifactorial analysis explained 57.89 % and 56.97 % of the variation for MWD and SMWD, respectively. Genotypes for VKORC1 and CYP2C9*3, age, height and weight, as well as other clinical factors such as alcohol consumption, loading dose and concomitant drugs were important for the initial INR response to warfarin. In a small subset of patients for whom data were available, levels of the coagulation factors VII and IX (highly correlated) also played a role.
Conclusion: Our multifactorial analysis in a prospectively recruited cohort has shown that multiple factors, genetic and clinical, are important in determining the response to warfarin. VKORC1 and CYP2C9 genetic polymorphisms are the most important determinants of warfarin dosing, and it is highly unlikely that other common variants of clinical importance influencing warfarin dosage will be found. Both VKORC1 and CYP2C9*3 are important determinants of the initial INR response to warfarin. Other novel variants, which did not reach genome-wide significance, were identified for the different outcome measures, but need replication.
Funded by: British Heart Foundation: RG/14/5/30893; Department of Health; Medical Research Council: G0700654; Wellcome Trust
Genome medicine 2016;8;1;2
The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons.
Institute of Neuroscience, University of Oregon, Eugene, Oregon, USA.
To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.
Nature genetics 2016
Maternal DNA Methylation Regulates Early Trophoblast Development.
Blizard Institute, Barts and The London School of Medicine and Dentistry, QMUL, London E1 2AT, UK.
Critical roles for DNA methylation in embryonic development are well established, but less is known about its roles during trophoblast development, the extraembryonic lineage that gives rise to the placenta. We dissected the role of DNA methylation in trophoblast development by performing mRNA and DNA methylation profiling of Dnmt3a/3b mutants. We find that oocyte-derived methylation plays a major role in regulating trophoblast development but that imprinting of the key placental regulator Ascl2 is only partially responsible for these effects. We have identified several methylation-regulated genes associated with trophoblast differentiation that are involved in cell adhesion and migration, potentially affecting trophoblast invasion. Specifically, trophoblast-specific DNA methylation is linked to the silencing of Scml2, a Polycomb Repressive Complex 1 protein that drives loss of cell adhesion in methylation-deficient trophoblast. Our results reveal that maternal DNA methylation controls multiple differentiation-related and physiological processes in trophoblast via both imprinting-dependent and -independent mechanisms.
Funded by: Biotechnology and Biological Sciences Research Council: BBS/E/B/000C0417; Canadian Institutes of Health Research: MOP-119357; Medical Research Council: MR/L00027X/1; Wellcome Trust: 095645, 101225, 101225/Z/13/Z
Developmental cell 2016;36;2;152-63
eFORGE: A Tool for Identifying Cell Type-Specific Signal in Epigenomic Data.
UCL Cancer Institute, University College London, London WC1E 6BT, UK.
Epigenome-wide association studies (EWAS) provide an alternative approach for studying human disease through consideration of non-genetic variants such as altered DNA methylation. To advance the complex interpretation of EWAS, we developed eFORGE (http://eforge.cs.ucl.ac.uk/), a new standalone and web-based tool for the analysis and interpretation of EWAS data. eFORGE determines the cell type-specific regulatory component of a set of EWAS-identified differentially methylated positions. This is achieved by detecting enrichment of overlap with DNase I hypersensitive sites across 454 samples (tissues, primary cell types, and cell lines) from the ENCODE, Roadmap Epigenomics, and BLUEPRINT projects. Application of eFORGE to 20 publicly available EWAS datasets identified disease-relevant cell types for several common diseases, a stem cell-like signature in cancer, and demonstrated the ability to detect cell-composition effects for EWAS performed on heterogeneous tissues. Our approach bridges the gap between large-scale epigenomics data and EWAS-derived target selection to yield insight into disease etiology.
Funded by: British Heart Foundation: RG/09/012/28096; Department of Health: RP-PG-0310-1002; Medical Research Council: G0800270; Wellcome Trust: 99148
Cell reports 2016;17;8;2137-2150
Efficient identification of CRISPR/Cas9-induced insertions/deletions by direct germline screening in zebrafish.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
Background: The CRISPR/Cas9 system is a prokaryotic immune system that confers resistance to foreign genetic material and acts as a form of 'adaptive immunity'. It has been adapted to enable high-throughput genome editing and has revolutionised the generation of targeted mutations.
Results: We have developed a scalable analysis pipeline to identify CRISPR/Cas9-induced mutations in hundreds of samples using next-generation sequencing (NGS) of amplicons. We have used this system to investigate the best way to screen mosaic zebrafish founder individuals for germline transmission of induced mutations. Screening sperm samples from potential founders provides much better information on germline transmission rates and, crucially, the sequence of the particular insertions/deletions (indels) that will be transmitted. This enables us to combine screening with archiving to create a library of cryopreserved samples carrying known mutations. It also allows us to design efficient genotyping assays, making identifying F1 carriers straightforward.
Conclusions: The methods described will streamline the production of large numbers of knockout alleles in selected genes for phenotypic analysis, complementing existing efforts using random mutagenesis.
Funded by: Wellcome Trust: WT098051
BMC genomics 2016;17;259
Calcium signalling in malaria parasites.
Faculty of Medicine, University of Geneva, 1 Rue Michel-Servet, CH-1211 Geneva 4, Switzerland.
Ca(2+) is a ubiquitous intracellular messenger in malaria parasites with important functions in asexual blood stages responsible for malaria symptoms, the preceding liver-stage infection and transmission through the mosquito. Intracellular messengers amplify signals by binding to effector molecules that trigger physiological changes. The characterisation of some Ca(2+) effector proteins has begun to provide insights into the vast range of biological processes controlled by Ca(2+) signalling in malaria parasites, including host cell egress and invasion, protein secretion, motility, and cell cycle regulation. Despite the importance of Ca(2+) signalling during the life cycle of malaria parasites, little is known about Ca(2+) homeostasis. Recent findings highlighted that upstream of stage-specific Ca(2+) effectors is a conserved interplay between second messengers to control critical intracellular Ca(2+) signals throughout the life cycle. The identification of the molecular mechanisms integrating stage-transcending mechanisms of Ca(2+) homeostasis in a network of stage-specific regulator and effector pathways now represents a major challenge for a meaningful understanding of Ca(2+) signalling in malaria parasites.
Molecular microbiology 2016
Whole-genome sequencing reveals transmission of vancomycin-resistant Enterococcus faecium in a healthcare network.
Department of Medicine, University of Cambridge, Addenbrooke's Hospital, Box 157, Hills Road, Cambridge, CB2 0QQ, UK.
Background: Bacterial whole-genome sequencing (WGS) has the potential to identify reservoirs of multidrug-resistant organisms and transmission of these pathogens across healthcare networks. We used WGS to define transmission of vancomycin-resistant enterococci (VRE) within a long-term care facility (LTCF), and between this and an acute hospital in the United Kingdom (UK).
Methods: A longitudinal prospective observational study of faecal VRE carriage was conducted in a LTCF in Cambridge, UK. Stool samples were collected at recruitment, and then repeatedly until the end of the study period, discharge or death. Selective culture media were used to isolate VRE, which were subsequently sequenced and analysed. We also analysed the genomes of 45 Enterococcus faecium bloodstream isolates collected at Cambridge University Hospitals NHS Foundation Trust (CUH).
Results: Forty-five residents were recruited during a 6-month period in 2014, and 693 stools were collected at a frequency of at least 1 week apart. Fifty-one stool samples from 3/45 participants (7 %) were positive for vancomycin-resistant E. faecium. Two residents carried multiple VRE lineages, and one carried a single VRE lineage. Genome analyses based on single nucleotide polymorphisms (SNPs) in the core genome indicated that VRE carried by each of the three residents were unrelated. Participants had extensive contact with the local healthcare network. We found that VRE genomes from LTCF residents and hospital-associated bloodstream infection were interspersed throughout the phylogenetic tree, with several instances of closely related VRE strains from the two settings.
Conclusions: A proportion of LTCF residents are long-term carriers of VRE. Evidence for genetic relatedness between these and VRE associated with bloodstream infection in a nearby acute NHS Trust indicates a shared bacterial population.
Funded by: Department of Health: HICF-T5-342; Wellcome Trust: 098600
Genome medicine 2016;8;1;4
Quantitative insertion-site sequencing (QIseq) for high throughput phenotyping of transposon mutants.
Malaria Programme, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridgeshire CB10 1SA, United Kingdom;
Genetic screening using random transposon insertions has been a powerful tool for uncovering biology in prokaryotes, where whole-genome saturating screens have been performed in multiple organisms. In eukaryotes, such screens have proven more problematic, in part because of the lack of a sensitive and robust system for identifying transposon insertion sites. We here describe quantitative insertion-site sequencing, or QIseq, which uses custom library preparation and Illumina sequencing technology and is able to identify insertion sites from both the 5' and 3' ends of the transposon, providing an inbuilt level of validation. The approach was developed using piggyBac mutants in the human malaria parasite Plasmodium falciparum but should be applicable to many other eukaryotic genomes. QIseq proved accurate, confirming known sites in >100 mutants, and sensitive, identifying and monitoring sites over a >10,000-fold dynamic range of sequence counts. Applying QIseq to uncloned parasites shortly after transfections revealed multiple insertions in mixed populations and suggests that >4000 independent mutants could be generated from relatively modest scales of transfection, providing a clear pathway to genome-scale screens in P. falciparum. QIseq was also used to monitor the growth of pools of previously cloned mutants and reproducibly differentiated between deleterious and neutral mutations in competitive growth. Among the mutants with fitness defects was a mutant with a piggyBac insertion immediately upstream of the kelch protein K13 gene associated with artemisinin resistance, implying mutants in this gene may have competitive fitness costs. QIseq has the potential to enable the scale-up of piggyBac-mediated genetics across multiple eukaryotic systems.
Genome research 2016;26;7;980-9
Antibiotics, gut bugs and the young.
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Two recent studies have investigated the effects of antibiotic use on the intestinal microbiota of preterm infants and young children.
Nature reviews. Microbiology 2016;14;6;336
Culturing of 'unculturable' human microbiota reveals novel taxa and extensive sporulation.
Host-Microbiota Interactions Laboratory, Wellcome Trust Sanger Institute, Hinxton, UK.
Our intestinal microbiota harbours a diverse bacterial community required for our health, sustenance and wellbeing. Intestinal colonization begins at birth and climaxes with the acquisition of two dominant groups of strict anaerobic bacteria belonging to the Firmicutes and Bacteroidetes phyla. Culture-independent, genomic approaches have transformed our understanding of the role of the human microbiome in health and many diseases. However, owing to the prevailing perception that our indigenous bacteria are largely recalcitrant to culture, many of their functions and phenotypes remain unknown. Here we describe a novel workflow based on targeted phenotypic culturing linked to large-scale whole-genome sequencing, phylogenetic analysis and computational modelling that demonstrates that a substantial proportion of the intestinal bacteria are culturable. Applying this approach to healthy individuals, we isolated 137 bacterial species from characterized and candidate novel families, genera and species that were archived as pure cultures. Whole-genome and metagenomic sequencing, combined with computational and phenotypic analysis, suggests that at least 50-60% of the bacterial genera from the intestinal microbiota of a healthy individual produce resilient spores, specialized for host-to-host transmission. Our approach unlocks the human intestinal microbiota for phenotypic analysis and reveals how a marked proportion of oxygen-sensitive intestinal bacteria can be transmitted between individuals, affecting microbiota heritability.
Funded by: Medical Research Council: G1000214, MR/K000551/1, PF451; Wellcome Trust: 098051
A Biobank of Breast Cancer Explants with Preserved Intra-tumor Heterogeneity to Screen Anticancer Compounds.
Department of Oncology and Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge CB2 0RE, UK.
The inter- and intra-tumor heterogeneity of breast cancer needs to be adequately captured in pre-clinical models. We have created a large collection of breast cancer patient-derived tumor xenografts (PDTXs), in which the morphological and molecular characteristics of the originating tumor are preserved through passaging in the mouse. An integrated platform combining in vivo maintenance of these PDTXs along with short-term cultures of PDTX-derived tumor cells (PDTCs) was optimized. Remarkably, the intra-tumor genomic clonal architecture present in the originating breast cancers was mostly preserved upon serial passaging in xenografts and in short-term cultured PDTCs. We assessed drug responses in PDTCs on a high-throughput platform and validated several ex vivo responses in vivo. The biobank represents a powerful resource for pre-clinical breast cancer pharmacogenomic studies (http://caldaslab.cruk.cam.ac.uk/bcape), including identification of biomarkers of response or resistance.
Funded by: Cancer Research UK: 9675; Marie Skłodowska-Curie Individual Fellowships: 660060; Medical Research Council: MR/M008975/1; NCI NIH HHS: P30 CA008748, R01 CA166422; Wellcome Trust
Sugar-sweetened beverage consumption and genetic predisposition to obesity in 2 Swedish cohorts.
Diabetes and Cardiovascular Disease-Genetic Epidemiology and.
Background: The consumption of sugar-sweetened beverages (SSBs), which has increased substantially during the last decades, has been associated with obesity and weight gain.
Objective: Common genetic susceptibility to obesity has been shown to modify the association between SSB intake and obesity risk in 3 prospective cohorts from the United States. We aimed to replicate these findings in 2 large Swedish cohorts.
Design: Data were available for 21,824 healthy participants from the Malmö Diet and Cancer study and 4902 healthy participants from the Gene-Lifestyle Interactions and Complex Traits Involved in Elevated Disease Risk Study. Self-reported SSB intake was categorized into 4 levels (seldom, low, medium, and high). Unweighted and weighted genetic risk scores (GRSs) were constructed based on 30 body mass index [(BMI) in kg/m(2)]-associated loci, and effect modification was assessed in linear regression equations by modeling the product and marginal effects of the GRS and SSB intake adjusted for age-, sex-, and cohort-specific covariates, with BMI as the outcome. In a secondary analysis, models were additionally adjusted for putative confounders (total energy intake, alcohol consumption, smoking status, and physical activity).
Results: In an inverse variance-weighted fixed-effects meta-analysis, each SSB intake category increment was associated with a 0.18 higher BMI (SE = 0.02; P = 1.7 × 10(-20); n = 26,726). In the fully adjusted model, a nominally significant interaction between SSB intake category and the unweighted GRS was observed (P-interaction = 0.03). Among participants within the top and bottom quartiles of the GRS, each increment in SSB intake was associated with 0.24 (SE = 0.04; P = 2.9 × 10(-8); n = 6766) and 0.15 (SE = 0.04; P = 1.3 × 10(-4); n = 6835) higher BMIs, respectively.
Conclusions: The interaction observed in the Swedish cohorts is similar in magnitude to the previous analysis in US cohorts and indicates that the relation of SSB intake and BMI is stronger in people genetically predisposed to obesity.
Funded by: Wellcome Trust
The American journal of clinical nutrition 2016;104;3;809-15
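The two calculations named in the sugar-sweetened beverage abstract above, a genetic risk score and an inverse variance-weighted fixed-effects meta-analysis, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names, the three-locus genotype, the per-locus weights, and the two cohort estimates are all hypothetical, and covariate adjustment is omitted.

```python
import math


def genetic_risk_score(allele_counts, weights=None):
    """Sum risk-allele counts (0, 1, or 2 per locus).

    With no weights this is an unweighted score; with per-locus weights
    (for example, published effect sizes) it is a weighted score.
    """
    if weights is None:
        return float(sum(allele_counts))
    if len(weights) != len(allele_counts):
        raise ValueError("one weight is required per locus")
    return sum(w * c for w, c in zip(weights, allele_counts))


def fixed_effects_meta(estimates, standard_errors):
    """Pool per-cohort estimates, weighting each by 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


# Hypothetical inputs, for illustration only.
unweighted = genetic_risk_score([2, 1, 0])
weighted = genetic_risk_score([2, 1, 0], [0.1, 0.2, 0.3])
pooled, pooled_se = fixed_effects_meta([0.20, 0.10], [0.10, 0.10])
print(unweighted, weighted, pooled, pooled_se)
```

Because the weights are inverse variances, a cohort with a smaller standard error pulls the pooled estimate towards its own value; with equal standard errors the pooled estimate is the plain average.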
Emergence and spread of a human-transmissible multidrug-resistant nontuberculous mycobacterium.
Wellcome Trust Sanger Institute, Hinxton, UK.
Lung infections with Mycobacterium abscessus, a species of multidrug-resistant nontuberculous mycobacteria, are emerging as an important global threat to individuals with cystic fibrosis (CF), in whom M. abscessus accelerates inflammatory lung damage, leading to increased morbidity and mortality. Previously, M. abscessus was thought to be independently acquired by susceptible individuals from the environment. However, using whole-genome analysis of a global collection of clinical isolates, we show that the majority of M. abscessus infections are acquired through transmission, potentially via fomites and aerosols, of recently emerged dominant circulating clones that have spread globally. We demonstrate that these clones are associated with worse clinical outcomes, show increased virulence in cell-based and mouse infection models, and thus represent an urgent international infection challenge.
Funded by: Medical Research Council: G1001712; Wellcome Trust
Science (New York, N.Y.) 2016;354;6313;751-757
Phylogenomic exploration of the relationships between strains of Mycobacterium avium subspecies paratuberculosis.
Wellcome Trust Sanger Institute, Genome Campus, Cambridge, UK. firstname.lastname@example.org.
Background: Mycobacterium avium subspecies paratuberculosis (Map) is an infectious enteric pathogen that causes Johne's disease in livestock. Determining genetic diversity is prerequisite to understanding the epidemiology and biology of Map. We performed the first whole genome sequencing (WGS) of 141 global Map isolates that encompass the main molecular strain types currently reported. We investigated the phylogeny of the Map strains, the diversity of the genome and the limitations of commonly used genotyping methods.
Results: Single nucleotide polymorphism (SNP) and phylogenetic analyses confirmed two major lineages concordant with the former Type S and Type C designations. The Type I and Type III strain groups are subtypes of Type S, and Type B strains are a subtype of Type C and not restricted to Bison species. We found that the genome-wide SNPs detected provided greater resolution between isolates than currently employed genotyping methods. Furthermore, the SNP used for IS1311 typing is not informative, as it is likely to have occurred after Type S and C strains diverged and does not assign all strains to the correct lineage. Mycobacterial Interspersed Repetitive Unit-Variable Number Tandem Repeat (MIRU-VNTR) differentiates Type S from Type C but provides limited resolution between isolates within these lineages and the polymorphisms detected do not necessarily accurately reflect the phylogenetic relationships between strains. WGS of passaged strains and coalescent analysis of the collection revealed a very high level of genetic stability, with the substitution rate estimated to be less than 0.5 SNPs per genome per year.
Conclusions: This study clarifies the phylogenetic relationships between the previously described Map strain groups, and highlights the limitations of current genotyping techniques. Map isolates exhibit restricted genetic diversity and a substitution rate consistent with a monomorphic pathogen. WGS provides the ultimate level of resolution for differentiation between strains. However, WGS alone will not be sufficient for tracing and tracking Map infections, yet importantly it can provide a phylogenetic context for affirming epidemiological connections.
Funded by: Wellcome Trust: 098051
BMC genomics 2016;17;79
Wbp2 is required for normal glutamatergic synapses in the cochlea and is crucial for hearing.
Wolfson Centre For Age-Related Diseases, King's College London, London, UK Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK email@example.com firstname.lastname@example.org.
WBP2 encodes the WW domain-binding protein 2 that acts as a transcriptional coactivator for estrogen receptor α (ESR1) and progesterone receptor (PGR). We report that the loss of Wbp2 expression leads to progressive high-frequency hearing loss in mouse, as well as in two deaf children, each carrying two different variants in the WBP2 gene. The earliest abnormality we detect in Wbp2-deficient mice is a primary defect at inner hair cell afferent synapses. This study defines a new gene involved in the molecular pathway linking hearing impairment to hormonal signalling and provides new therapeutic targets.
Funded by: Medical Research Council: G0300212, MC_QA137918; Wellcome Trust: 098051, 100669, 102892
EMBO molecular medicine 2016;8;3;191-207
Mitochondrial Protein Lipoylation and the 2-Oxoglutarate Dehydrogenase Complex Controls HIF1α Stability in Aerobic Conditions.
Department of Medicine, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, CB2 0XY, UK.
Hypoxia-inducible transcription factors (HIFs) control adaptation to low oxygen environments by activating genes involved in metabolism, angiogenesis, and redox homeostasis. The finding that HIFs are also regulated by small molecule metabolites highlights the need to understand the complexity of their cellular regulation. Here we use a forward genetic screen in near-haploid human cells to identify genes that stabilize HIFs under aerobic conditions. We identify two mitochondrial genes, oxoglutarate dehydrogenase (OGDH) and lipoic acid synthase (LIAS), which when mutated stabilize HIF1α in a non-hydroxylated form. Disruption of OGDH complex activity in OGDH or LIAS mutants promotes L-2-hydroxyglutarate formation, which inhibits the activity of the HIFα prolyl hydroxylases (PHDs) and TET 2-oxoglutarate dependent dioxygenases. We also find that PHD activity is decreased in patients with homozygous germline mutations in lipoic acid synthesis, leading to HIF1 activation. Thus, mutations affecting OGDHC activity may have broad implications for epigenetic regulation and tumorigenesis.
Cell metabolism 2016;24;5;740-752
Admixture into and within sub-Saharan Africa.
Wellcome Trust Centre for Human Genetics, Oxford, United Kingdom.
Similarity between two individuals in the combination of genetic markers along their chromosomes indicates shared ancestry and can be used to identify historical connections between different population groups due to admixture. We use a genome-wide, haplotype-based analysis to characterise the structure of genetic diversity and gene-flow in a collection of 48 sub-Saharan African groups. We show that coastal populations experienced an influx of Eurasian haplotypes over the last 7000 years, and that Eastern and Southern Niger-Congo speaking groups share ancestry with Central West Africans as a result of recent population expansions. In fact, most sub-Saharan populations share ancestry with groups from outside of their current geographic region as a result of gene-flow within the last 4000 years. Our in-depth analysis provides insight into haplotype sharing across different ethno-linguistic groups and the recent movement of alleles into new environments, both of which are relevant to studies of genetic epidemiology.
Funded by: Medical Research Council: G0600718, MR/M006212/1; Wellcome Trust
Non-CG DNA methylation is a biomarker for assessing endodermal differentiation capacity in pluripotent stem cells.
UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, UK.
Non-CG methylation is an unexplored epigenetic hallmark of pluripotent stem cells. Here we report that a reduction in non-CG methylation is associated with impaired differentiation capacity into endodermal lineages. Genome-wide analysis of 2,670 non-CG sites in a discovery cohort of 25 phenotyped human induced pluripotent stem cell (hiPSC) lines revealed unidirectional loss (Δβ=13%, P<7.4 × 10(-4)) of non-CG methylation that correctly identifies endodermal differentiation capacity in 23 out of 25 (92%) hiPSC lines. Translation into a simplified assay of only nine non-CG sites maintains predictive power in the discovery cohort (Δβ=23%, P<9.1 × 10(-6)) and correctly identifies endodermal differentiation capacity in nine out of ten pluripotent stem cell lines in an independent replication cohort consisting of hiPSCs reprogrammed from different cell types and different delivery systems, as well as human embryonic stem cell (hESC) lines. This finding identifies non-CG methylation at these sites as a biomarker for assessing endodermal differentiation capacity.
Funded by: Medical Research Council: G0800784, G1000847, MC_PC_12009, MR/J001597/1; Wellcome Trust: 084071, 095606, WT098503
Nature communications 2016;7;10458
C13orf31 (FAMIN) is a central regulator of immunometabolic function.
Division of Gastroenterology and Hepatology, Department of Medicine, Addenbrooke's Hospital, University of Cambridge, Cambridge, UK.
Single-nucleotide variations in C13orf31 (LACC1) that encode p.C284R and p.I254V in a protein of unknown function (called 'FAMIN' here) are associated with increased risk for systemic juvenile idiopathic arthritis, leprosy and Crohn's disease. Here we set out to identify the biological mechanism affected by these coding variations. FAMIN formed a complex with fatty acid synthase (FASN) on peroxisomes and promoted flux through de novo lipogenesis to concomitantly drive high levels of fatty-acid oxidation (FAO) and glycolysis and, consequently, ATP regeneration. FAMIN-dependent FAO controlled inflammasome activation, mitochondrial and NADPH-oxidase-dependent production of reactive oxygen species (ROS), and the bactericidal activity of macrophages. As p.I254V and p.C284R resulted in diminished function and loss of function, respectively, FAMIN determined resilience to endotoxin shock. Thus, we have identified a central regulator of the metabolic function and bioenergetic state of macrophages that is under evolutionary selection and determines the risk of inflammatory and infectious disease.
Funded by: Biotechnology and Biological Sciences Research Council: BBS/E/B/000C0411, BBS/E/B/000C0413, BBS/E/B/000C0415, BBS/E/B/000C0417; European Research Council: 260961; Medical Research Council: MC_UP_A090_1006; NHGRI NIH HHS: U54 HG006348; NIH HHS: U42 OD011174; Wellcome Trust: 079643, 100675, 100891, 103077, 106260, 106260/Z/14/Z
Nature immunology 2016;17;9;1046-56
Comparative genome analysis and global phylogeny of the toxin variant Clostridium difficile PCR Ribotype 017 reveals the evolution of two independent sub-lineages.
Department of Pathogen Molecular Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT. UK.
The diarrhoeal pathogen Clostridium difficile consists of at least six distinct evolutionary lineages. The RT017 lineage is anomalous as strains only express toxin B, compared to strains from other lineages that produce toxins A and B and occasionally binary toxin. RT017 strains were initially reported in Asia but have now been reported worldwide. We used whole genome sequencing and phylogenetic analysis to investigate the patterns of global spread and population structure of 277 RT017 isolates from animal and human origins from six continents, isolated between 1990 and 2013. We reveal two distinct evenly split sub-lineages (SL1 and SL2) of C. difficile RT017 that contain multiple independent clonal expansions. All 24 animal isolates were contained within SL1 along with human isolates, suggesting potential transmission between animals and humans. Genetic analyses revealed an overrepresentation of antibiotic resistance genes. Phylogeographic analyses show a North American origin for RT017, as has been found for the recently emerged epidemic RT027 lineage. Despite only having one toxin, RT017 strains have evolved in parallel from at least two independent sources and can readily transmit between continents.
Journal of clinical microbiology 2016
Genomic variation in two gametocyte non-producing Plasmodium falciparum clonal lines.
Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, London, UK. email@example.com.
Background: Transmission of the malaria parasite Plasmodium falciparum from humans to the mosquito vector requires differentiation of a sub-population of asexual forms replicating within red blood cells into non-dividing male and female gametocytes. The nature of the molecular mechanism underlying this key differentiation event required for malaria transmission is not fully understood.
Methods: Whole genome sequencing was used to examine the genomic diversity of the gametocyte non-producing 3D7-derived lines F12 and A4. These lines were used in the recent detection of the PF3D7_1222600 locus (encoding PfAP2-G), which acts as a genetic master switch that triggers gametocyte development.
Results: The evolutionary changes from the 3D7 parental strain through its derivatives F12 (culture-passage derived cloned line) and A4 (transgenic cloned line) were identified. The genetic differences including the formation of chimeric var genes are presented.
Conclusion: A genomics resource is provided for the further study of gametocytogenesis or other phenotypes using these parasite lines.
Funded by: Medical Research Council: MR/K000551/1, MR/M006212/1, MR/M01360X/1, MR/N010469/1; Wellcome Trust: 090770, 098051, 106240, WT094752
Malaria journal 2016;15;229
A CRISPR outlook for apicomplexans.
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Nature reviews. Microbiology 2016;14;11;668
Whole Genome Sequence Analysis of a Large Isoniazid-Resistant Tuberculosis Outbreak in London: A Retrospective Observational Study.
Department of Infectious Diseases and Immunity, Imperial College London, London, United Kingdom.
Background: A large isoniazid-resistant tuberculosis outbreak centred on London, United Kingdom, has been ongoing since 1995. The aim of this study was to investigate the power and value of whole genome sequencing (WGS) to resolve the transmission network compared to current molecular strain typing approaches, including analysis of intra-host diversity within a specimen, across body sites, and over time, with identification of genetic factors underlying the epidemiological success of this cluster.
Methods and findings: We sequenced 344 outbreak isolates from individual patients collected over 14 y (2 February 1998-22 June 2012). This demonstrated that 96 (27.9%) were indistinguishable, and only one differed from this major clone by more than five single nucleotide polymorphisms (SNPs). The maximum number of SNPs between any pair of isolates was nine SNPs, and the modal distance between isolates was two SNPs. WGS was able to reveal the direction of transmission of tuberculosis in 16 cases within the outbreak (4.7%), including within a multidrug-resistant cluster that carried a rare rpoB mutation associated with rifampicin resistance. Eleven longitudinal pairs of patient pulmonary isolates collected up to 48 mo apart differed from each other by between zero and four SNPs. Extrapulmonary dissemination resulted in acquisition of a SNP in two of five cases. WGS analysis of 27 individual colonies cultured from a single patient specimen revealed ten loci differed amongst them, with a maximum distance between any pair of six SNPs. A limitation of this study, as in previous studies, is that indels and SNPs in repetitive regions were not assessed due to the difficulty in reliably determining this variation.
Conclusions: Our study suggests that (1) certain paradigms need to be revised, such as the 12 SNP distance as the gold standard upper threshold to identify plausible transmissions; (2) WGS technology is helpful to rule out the possibility of direct transmission when isolates are separated by a substantial number of SNPs; (3) the concept of a transmission chain or network may not be useful in institutional or household settings; (4) the practice of isolating single colonies prior to sequencing is likely to lead to an overestimation of the number of SNPs between cases resulting from direct transmission; and (5) despite appreciable genomic diversity within a host, transmission of tuberculosis rarely results in minority variants becoming dominant. Thus, whilst WGS provided some increased resolution over variable number tandem repeat (VNTR)-based clustering, it was insufficient for inferring transmission in the majority of cases.
PLoS medicine 2016;13;10;e1002137
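The tuberculosis outbreak abstract above reports the maximum and modal pairwise SNP distances between isolates. A minimal sketch of how such summaries can be computed is given below. It assumes equal-length, already aligned sequences with no gaps or ambiguous bases; the isolate names and sequences are hypothetical, and the study's actual variant-calling pipeline is not described in the abstract.

```python
from collections import Counter
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def pairwise_summary(isolates):
    """Return (maximum, modal) SNP distance over all isolate pairs.

    If several distances are equally common, the one encountered first
    is reported as the mode.
    """
    names = sorted(isolates)
    if len(names) < 2:
        raise ValueError("at least two isolates are required")
    distances = [
        snp_distance(isolates[x], isolates[y])
        for x, y in combinations(names, 2)
    ]
    modal = Counter(distances).most_common(1)[0][0]
    return max(distances), modal


# Hypothetical toy alignment, for illustration only.
toy = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACGAACGA",
    "D": "ACGTACGT",
}
print(pairwise_summary(toy))
```

The pairwise comparison grows quadratically with the number of isolates, which is manageable for a few hundred genomes but would call for a more efficient representation (for example, variant sites only) in larger collections.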
Novel Genetic Variants for Cartilage Thickness and Hip Osteoarthritis.
Department of Internal Medicine, Erasmus Medical Center, Rotterdam, The Netherlands.
Osteoarthritis is one of the most frequent and disabling diseases of the elderly. Only a few genetic variants have been identified for osteoarthritis, which is partly due to large phenotype heterogeneity. To reduce heterogeneity, we here examined cartilage thickness, one of the structural components of joint health. We conducted a genome-wide association study of minimal joint space width (mJSW), a proxy for cartilage thickness, in a discovery set of 13,013 participants from five different cohorts and replication in 8,227 individuals from seven independent cohorts. We identified five genome-wide significant (GWS, P ≤ 5.0 × 10(-8)) SNPs annotated to four distinct loci. In addition, we found two additional loci that were significantly replicated, but results of combined meta-analysis fell just below the genome-wide significance threshold. The four novel associated genetic loci were located in/near TGFA (rs2862851), PIK3R1 (rs10471753), SLBP/FGFR3 (rs2236995), and TREH/DDX6 (rs496547), while the other two (DOT1L and SUPT3H/RUNX2) were previously identified. A systematic prioritization for underlying causal genes was performed using diverse lines of evidence. Exome sequencing data (n = 2,050 individuals) indicated that there were no rare exonic variants that could explain the identified associations. In addition, TGFA, FGFR3 and PIK3R1 were differentially expressed in OA cartilage lesions versus non-lesioned cartilage in the same individuals. In conclusion, we identified four novel loci (TGFA, PIK3R1, FGFR3 and TREH) and confirmed two loci known to be associated with cartilage thickness. The identified associations were not caused by rare exonic variants. This is the first report linking TGFA to human OA, which may serve as a new target for future therapies.
PLoS genetics 2016;12;10;e1006260
EphrinB1/EphB3b Coordinate Bidirectional Epithelial-Mesenchymal Interactions Controlling Liver Morphogenesis and Laterality.
Division of Developmental Biology, Mill Hill Laboratories, The Francis Crick Institute, London NW7 1AA, UK.
Positioning organs in the body often requires the movement of multiple tissues, yet the molecular and cellular mechanisms coordinating such movements are largely unknown. Here, we show that bidirectional signaling between EphrinB1 and EphB3b coordinates the movements of the hepatic endoderm and adjacent lateral plate mesoderm (LPM), resulting in asymmetric positioning of the zebrafish liver. EphrinB1 in hepatoblasts regulates directional migration and mediates interactions with the LPM, where EphB3b controls polarity and movement of the LPM. EphB3b in the LPM concomitantly repels hepatoblasts to move leftward into the liver bud. Cellular protrusions controlled by Eph/Ephrin signaling mediate hepatoblast motility and long-distance cell-cell contacts with the LPM beyond immediate tissue interfaces. Mechanistically, intracellular EphrinB1 domains mediate EphB3b-independent hepatoblast extension formation, while EphB3b interactions cause their destabilization. We propose that bidirectional short- and long-distance cell interactions between epithelial and mesenchyme-like tissues coordinate liver bud formation and laterality via cell repulsion.
Developmental cell 2016;39;3;316-328
Recombination in Streptococcus pneumoniae Lineages Increase with Carriage Duration and Size of the Polysaccharide Capsule.
Department of Clinical Infection, Microbiology and Immunology, Institute of Infection and Global Health, University of Liverpool, Liverpool, United Kingdom Microbial Ecology, Malawi-Liverpool-Wellcome Trust Clinical Research Programme, University of Malawi, College of Medicine, Blantyre, Malawi Chrispin.Chaguza@liverpool.ac.uk firstname.lastname@example.org.
Streptococcus pneumoniae causes a high burden of invasive pneumococcal disease (IPD) globally, especially in children from resource-poor settings. Like many bacteria, the pneumococcus can import DNA from other strains or even species by transformation and homologous recombination, which has allowed the pneumococcus to evade clinical interventions such as antibiotics and pneumococcal conjugate vaccines (PCVs). Pneumococci are enclosed in a complex polysaccharide capsule that determines the serotype; the capsule varies in size and is associated with properties including carriage prevalence and virulence. We determined and quantified the association between capsule and recombination events using genomic data from a diverse collection of serotypes sampled in Malawi. We determined both the amount of variation introduced by recombination relative to mutation (the relative rate) and how many individual recombination events occur per isolate (the frequency). Using univariate analyses, we found an association between both recombination measures and multiple factors associated with the capsule, including duration and prevalence of carriage. Because many capsular factors are correlated, we used multivariate analysis to correct for collinearity. Capsule size and carriage duration remained positively associated with recombination, although with a reduced P value, and this effect may be mediated through some unassayed additional property associated with larger capsules. This work describes an important impact of serotype on recombination that has been previously overlooked. While the details of how this effect is achieved remain to be determined, it may have important consequences for the serotype-specific response to vaccines and other interventions.
Importance: The capsule determines >90 different pneumococcal serotypes, which vary in capsule size, virulence, duration, and prevalence of carriage. Current serotype-specific vaccines elicit anticapsule antibodies. Pneumococcus can take up exogenous DNA by transformation and insert it into its chromosome by homologous recombination. This mechanism has disseminated drug resistance and generated vaccine escape variants. It is hence crucial to pneumococcal evolutionary response to interventions, but there has been no systematic study quantifying whether serotypes vary in recombination and whether this is associated with serotype-specific properties such as capsule size or carriage duration. Larger capsules could physically inhibit DNA uptake, or given the longer carriage duration for larger capsules, this may promote recombination. We find that recombination varies among capsules and is associated with capsule size, carriage duration, and carriage prevalence and negatively associated with invasiveness. The consequence of this work is that serotypes with different capsules may respond differently to selective pressures like vaccines.
Funded by: NIAID NIH HHS: R01 AI106786; Wellcome Trust
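The pneumococcal recombination abstract above defines two measures: the variation introduced by recombination relative to mutation (the relative rate) and the number of recombination events per isolate (the frequency). The arithmetic can be sketched as below. This is only an illustration of the two definitions under assumed inputs; the function name and all counts are hypothetical, and in practice these quantities are estimated from genome alignments with dedicated software rather than supplied directly.

```python
def recombination_measures(snps_from_recombination, snps_from_mutation,
                           recombination_events, n_isolates):
    """Return (relative rate, frequency) for one lineage.

    relative rate: SNPs introduced by recombination per SNP introduced
                   by mutation.
    frequency:     recombination events per isolate.
    """
    if snps_from_mutation <= 0:
        raise ValueError("snps_from_mutation must be positive")
    if n_isolates <= 0:
        raise ValueError("n_isolates must be positive")
    relative_rate = snps_from_recombination / snps_from_mutation
    frequency = recombination_events / n_isolates
    return relative_rate, frequency


# Hypothetical counts for a single lineage, for illustration only.
relative_rate, frequency = recombination_measures(300, 100, 40, 20)
print(relative_rate, frequency)
```

Keeping the two measures separate matters because a lineage can have few recombination events that each import many SNPs, or many events that each import few.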
Understanding pneumococcal serotype 1 biology through population genomic analysis.
Department of Clinical Infection, Microbiology and Immunology, Institute of Infection and Global Health, University of Liverpool, Liverpool, L69 7BE, UK. Chrispin.Chaguza@liverpool.ac.uk.
Background: Pneumococcus kills over one million children annually, and over 90% of these deaths occur in low-income countries, especially in Sub-Saharan Africa (SSA), where HIV exacerbates the disease burden. In SSA, serotype 1 pneumococci, particularly the endemic ST217 clone, cause the majority of the pneumococcal disease burden. To understand the evolution of the virulent ST217 clone, we analysed ST217 whole genomes from isolates sampled from African and Asian countries.
Methods: We analysed 226 whole genome sequences from the ST217 lineage sampled from 9 African and 4 Asian countries. We constructed a whole genome alignment and used it for phylogenetic and coalescent analyses. We also screened the genomes to determine the presence of antibiotic resistance-conferring genes.
Results: Population structure analysis grouped the ST217 isolates into five sequence clusters (SCs), which were highly associated with different geographical regions and showed limited intracontinental and intercontinental spread. The SCs showed lower than expected genomic sequence diversity, which suggested strong purifying selection and small population sizes caused by bottlenecks. Recombination rates varied between the SCs but were lower than in other successful clones such as PMEN1. African isolates showed higher prevalence of antibiotic resistance genes than Asian isolates. Interestingly, certain West African isolates harbored a defective chloramphenicol and tetracycline resistance-conferring element (Tn5253) with a deletion in the loci encoding the chloramphenicol resistance gene (cat(pC194)), which caused lower chloramphenicol than tetracycline resistance. Furthermore, certain genes that promote colonisation were absent in the isolates, which may contribute to serotype 1's rarity in carriage and consequently its lower recombination rates.
Conclusions: The high phylogeographic diversity of the ST217 clone shows that this clone has been in circulation globally for a long time, which allowed its diversification and adaptation in different geographical regions. Such geographic adaptation reflects local variations in selection pressures in different locales. Further studies will be required to fully understand the biological mechanisms that make the ST217 clone highly invasive but unable to successfully colonise the human nasopharynx for long durations, which results in lower recombination rates.
Funded by: Medical Research Council: MC_U190074190, MC_U190081991, MC_UP_A900_1122; NIAID NIH HHS: R01 AI106786; Wellcome Trust: 084679/Z/08/Z, 100891
BMC infectious diseases 2016;16;1;649
Dataset for a Dugesia japonica de novo transcriptome assembly, utilized for defining the voltage-gated like ion channel superfamily.
Department of Pharmacology, University of Minnesota Medical School, MN 55455, USA.
This data article provides a transcriptomic resource for the free-living planarian flatworm Dugesia japonica related to the research article entitled 'Utilizing the planarian voltage-gated ion channel transcriptome to resolve a role for a Ca(2+) channel in neuromuscular function and regeneration' (J.D. Chan, D. Zhang, X. Liu, M. Zarowiecki, M. Berriman, J.S. Marchant, 2016). Data provided in this submission comprise sequence information for the unfiltered de novo assembly, the filtered assembly and a curated analysis of voltage-gated like (VGL) ion channel sequences mined from this resource. Availability of these data should facilitate further adoption of this model by laboratories interested in studying the role of individual genes of interest in planarian physiology and regenerative biology.
Data in brief 2016;9;1044-1047
Utilizing the planarian voltage-gated ion channel transcriptome to resolve a role for a Ca(2+) channel in neuromuscular function and regeneration.
Department of Pharmacology, United Kingdom.
The robust regenerative capacity of planarian flatworms depends on the orchestration of signaling events from early wounding responses through the stem cell enacted differentiative outcomes that restore appropriate tissue types. Acute signaling events in excitable cells play an important role in determining regenerative polarity, rationalized by the discovery that sub-epidermal muscle cells express critical patterning genes known to control regenerative outcomes. These data imply a dual conductive (neuromuscular signaling) and instructive (anterior-posterior patterning) role for Ca(2+) signaling in planarian regeneration. Here, to facilitate study of acute signaling events in the excitable cell niche, we provide a de novo transcriptome assembly from the planarian Dugesia japonica allowing characterization of the diverse ionotropic portfolio of this model organism. We demonstrate the utility of this resource by proceeding to characterize the individual role of each of the planarian voltage-operated Ca(2+) channels during regeneration, and demonstrate that knockdown of a specific voltage operated Ca(2+) channel (Cav1B) that impairs muscle function uniquely creates an environment permissive for anteriorization. Provision of the full transcriptomic dataset should facilitate further investigations of molecules within the planarian voltage-gated channel portfolio to explore the role of excitable cell physiology on regenerative outcomes. This article is part of a Special Issue entitled: ECS Meeting edited by Claus Heizmann, Joachim Krebs and Jacques Haiech.
Biochimica et biophysica acta 2016
Chromosome organisation during ageing and senescence.
Epigenetics Programme, The Babraham Institute, Cambridge CB22 3AT, UK; The Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK. Electronic address: email@example.com.
Acute cellular stress caused by oncogene activation or high levels of DNA damage can engage a tumour suppressive response, which can lead to cellular senescence. Chronic cellular stress evoked by low levels of DNA damage or telomere erosion is involved in the ageing process. In oncogene induced senescence in fibroblasts, a dramatic rearrangement of heterochromatin into foci and accumulation of constitutive heterochromatin is well documented. In contrast, a loss of heterochromatin has been described in replicative senescence and premature ageing syndromes. The distinct nuclear phenotypes that accompany the stress response highlight the differences between acute and chronic stress models, and this review will address the differences and similarities between these models with a focus on chromosome organisation and heterochromatin.
Current opinion in cell biology 2016;40;161-167
Phenotypic insights into ADCY5-associated disease.
Movement Disorders Unit, Department of Neurology, Westmead Hospital, Sydney, Australia.
Background: Adenylyl cyclase 5 (ADCY5) mutations are associated with heterogeneous syndromes: familial dyskinesia and facial myokymia; paroxysmal chorea and dystonia; autosomal-dominant chorea and dystonia; and benign hereditary chorea. We provide detailed clinical data on 7 patients from six new kindreds with mutations in the ADCY5 gene, in order to expand and define the phenotypic spectrum of ADCY5 mutations.
Methods: In 5 of the 7 patients, followed over a period of 9 to 32 years, ADCY5 was sequenced by Sanger sequencing. The other 2 unrelated patients participated in studies for undiagnosed pediatric hyperkinetic movement disorders and underwent whole-exome sequencing.
Results: Five patients had the previously reported p.R418W ADCY5 mutation; we also identified two novel mutations at p.R418G and p.R418Q. All patients presented with motor milestone delay, infantile-onset action-induced generalized choreoathetosis, dystonia, or myoclonus, with episodic exacerbations during drowsiness being a characteristic feature. Axial hypotonia, impaired upward saccades, and intellectual disability were variable features. The p.R418G and p.R418Q mutation patients had a milder phenotype. Six of seven patients had mild functional gain with clonazepam or clobazam. One patient had bilateral globus pallidal DBS at the age of 33 with marked reduction in dyskinesia, which resulted in mild functional improvement.
Conclusion: We further delineate the clinical features of ADCY5 gene mutations and illustrate its wide phenotypic expression. We describe mild improvement after treatment with clonazepam, clobazam, and bilateral pallidal DBS. ADCY5-associated dyskinesia may be under-recognized, and its diagnosis has important prognostic, genetic, and therapeutic implications. © 2016 The Authors. Movement Disorders published by Wiley Periodicals, Inc. on behalf of International Parkinson and Movement Disorder Society.
Movement disorders : official journal of the Movement Disorder Society 2016
Identifying the effect of patient sharing on between-hospital genetic differentiation of methicillin-resistant Staphylococcus aureus.
Department of Epidemiology, Center for Communicable Disease Dynamics, Harvard T.H. Chan School of Public Health, Boston, MA, USA. firstname.lastname@example.org.
Background: Methicillin-resistant Staphylococcus aureus (MRSA) is one of the most common healthcare-associated pathogens. To examine the role of inter-hospital patient sharing on MRSA transmission, a previous study collected 2,214 samples from 30 hospitals in Orange County, California and showed by spa typing that genetic differentiation decreased significantly with increased patient sharing. In the current study, we focused on the 986 samples with spa type t008 from the same population.
Methods: We used genome sequencing to determine the effect of patient sharing on genetic differentiation between hospitals. Genetic differentiation was measured by between-hospital genetic diversity, FST, and the proportion of nearly identical isolates between hospitals.
Results: Surprisingly, we found very similar genetic diversity within and between hospitals, and no significant association between patient sharing and genetic differentiation measured by FST. However, in contrast to FST, there was a significant association between patient sharing and the proportion of nearly identical isolates between hospitals. We propose that the proportion of nearly identical isolates is more powerful at determining transmission dynamics than traditional estimators of genetic differentiation (FST) when gene flow between populations is high, since it is more responsive to recent transmission events. Our hypothesis was supported by the results from coalescent simulations.
Conclusions: Our results suggested that there was a high level of gene flow between hospitals facilitated by patient sharing, and that the proportion of nearly identical isolates is more sensitive to population structure than FST when gene flow is high.
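Illustrative sketch (not the authors' code): the two statistics contrasted in this abstract can be computed from a pairwise SNP-distance matrix roughly as follows. The abstract does not give the exact estimators or the cutoff for "nearly identical", so the diversity-based form of FST, the function names, the toy distance matrix, and the threshold of 5 SNPs are all assumptions made for illustration.

```python
from itertools import combinations


def mean_within(dist, idx):
    """Mean pairwise distance among isolates from one hospital."""
    pairs = list(combinations(idx, 2))
    return sum(dist[i][j] for i, j in pairs) / len(pairs)


def mean_between(dist, a, b):
    """Mean pairwise distance between isolates of two hospitals."""
    return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))


def fst(dist, a, b):
    """Diversity-based FST: (pi_between - pi_within) / pi_between.

    Assumed estimator; within-hospital diversity is the unweighted
    mean of the two hospitals. Undefined if pi_between is zero.
    """
    pi_within = 0.5 * (mean_within(dist, a) + mean_within(dist, b))
    pi_between = mean_between(dist, a, b)
    return (pi_between - pi_within) / pi_between


def prop_near_identical(dist, a, b, threshold):
    """Share of between-hospital pairs at or below the SNP threshold."""
    close = sum(1 for i in a for j in b if dist[i][j] <= threshold)
    return close / (len(a) * len(b))


# Hypothetical symmetric SNP-distance matrix for four isolates.
dist = [
    [0, 10, 20, 20],
    [10, 0, 2, 20],
    [20, 2, 0, 10],
    [20, 20, 10, 0],
]
hospital_a = [0, 1]  # row indices of isolates from hospital A
hospital_b = [2, 3]  # row indices of isolates from hospital B

fst_ab = fst(dist, hospital_a, hospital_b)
near_ab = prop_near_identical(dist, hospital_a, hospital_b, threshold=5)
```

In this toy matrix a single closely related pair (distance 2) accounts for a quarter of the between-hospital pairs, while FST averages over all pairs; this is the kind of contrast the abstract describes when gene flow is high.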
Funded by: NIGMS NIH HHS: U54 GM088558
Genome medicine 2016;8;1;18
Coordinated nuclease activities counteract Ku at single-ended DNA double-strand breaks.
Institut de Pharmacologie et de Biologie Structurale, Université de Toulouse, CNRS, UPS, 31077 Toulouse, France.
Repair of single-ended DNA double-strand breaks (seDSBs) by homologous recombination (HR) requires the generation of a 3' single-strand DNA overhang by exonuclease activities in a process called DNA resection. However, it is anticipated that the highly abundant DNA end-binding protein Ku sequesters seDSBs and shields them from exonuclease activities. Despite pioneering works in yeast, it is unclear how mammalian cells counteract Ku at seDSBs to allow HR to proceed. Here we show that in human cells, ATM-dependent phosphorylation of CtIP and the epistatic and coordinated actions of MRE11 and CtIP nuclease activities are required to limit the stable loading of Ku on seDSBs. We also provide evidence for a hitherto unsuspected additional mechanism that contributes to prevent Ku accumulation at seDSBs, acting downstream of MRE11 endonuclease activity and in parallel with MRE11 exonuclease activity. Finally, we show that Ku persistence at seDSBs compromises Rad51 focus assembly but not DNA resection.
Funded by: Cancer Research UK: 11224; Wellcome Trust
Nature communications 2016;7;12889
Extensive Proliferation of a Subset of Differentiated, yet Plastic, Medial Vascular Smooth Muscle Cells Contributes to Neointimal Formation in Mouse Injury and Atherosclerosis Models.
From the Cardiovascular Medicine Division, Department of Medicine (J.C., J.L.H., H.Y., K.F., M.R.B., H.F.J.), Cavendish Laboratory, Department of Physics (B.D.S.), The Wellcome Trust/Cancer Research UK Gurdon Institute (B.D.S.), and Wellcome Trust-Medical Research Council Stem Cell Institute (B.D.S.), University of Cambridge, United Kingdom; and The Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom (V.M.N.).
Rationale: Vascular smooth muscle cell (VSMC) accumulation is a hallmark of atherosclerosis and vascular injury. However, fundamental aspects of proliferation and the phenotypic changes within individual VSMCs, which underlie vascular disease, remain unresolved. In particular, it is not known whether all VSMCs proliferate and display plasticity or whether individual cells can switch to multiple phenotypes.
Objective: To assess whether proliferation and plasticity in disease is a general characteristic of VSMCs or a feature of a subset of cells.
Methods and results: Using multicolor lineage labeling, we demonstrate that VSMCs in injury-induced neointimal lesions and in atherosclerotic plaques are oligoclonal, derived from few expanding cells. Lineage tracing also revealed that the progeny of individual VSMCs contributes to both alpha smooth muscle actin (aSma)-positive fibrous cap and Mac3-expressing macrophage-like plaque core cells. Costaining for phenotypic markers further identified a double-positive aSma+ Mac3+ cell population, which is specific to VSMC-derived plaque cells. In contrast, VSMC-derived cells generating the neointima after vascular injury generally retained the expression of VSMC markers and the upregulation of Mac3 was less pronounced. Monochromatic regions in atherosclerotic plaques and injury-induced neointima did not contain VSMC-derived cells expressing a different fluorescent reporter protein, suggesting that proliferation-independent VSMC migration does not make a major contribution to VSMC accumulation in vascular disease.
Conclusions: We demonstrate that extensive proliferation of a low proportion of highly plastic VSMCs results in the observed VSMC accumulation after injury and in atherosclerotic plaques. Therapeutic targeting of these hyperproliferating VSMCs might effectively reduce vascular disease without affecting vascular integrity.
Funded by: British Heart Foundation: FS/15/38/31516, PG/12/86/29930, PG/13/25/30014, RG/13/14/30314
Circulation research 2016;119;12;1313-1323
Whole-genome sequencing of a quarter-century melioidosis outbreak in temperate Australia uncovers a region of low-prevalence endemicity.
Melbourne Medical School, University of Melbourne, Melbourne, Victoria, Australia; Global and Tropical Health Division, Menzies School of Health Research, Darwin, Northern Territory, Australia.
Melioidosis, caused by the highly recombinogenic bacterium Burkholderia pseudomallei, is a disease with high mortality. Tracing the origin of melioidosis outbreaks and understanding how the bacterium spreads and persists in the environment are essential to protecting public and veterinary health and reducing mortality associated with outbreaks. We used whole-genome sequencing to compare isolates from a historical quarter-century outbreak that occurred between 1966 and 1991 in the Avon Valley, Western Australia, a region far outside the known range of B. pseudomallei endemicity. All Avon Valley outbreak isolates shared the same multilocus sequence type (ST-284), which has not been identified outside this region. We found substantial genetic diversity among isolates based on a comparison of genome-wide variants, with no clear correlation between genotypes and temporal, geographical or source data. We observed little evidence of recombination in the outbreak strains, indicating that genetic diversity among these isolates has primarily accrued by mutation. Phylogenomic analysis demonstrated that the isolates confidently grouped within the Australian B. pseudomallei clade, thereby ruling out introduction from a melioidosis-endemic region outside Australia. Collectively, our results point to B. pseudomallei ST-284 being present in the Avon Valley for longer than previously recognized, with its persistence and genomic diversity suggesting long-term, low-prevalence endemicity in this temperate region. Our findings provide a concerning demonstration of the potential for environmental persistence of B. pseudomallei far outside the conventional endemic regions. An expected increase in extreme weather events may reactivate latent B. pseudomallei populations in this region.
Microbial genomics 2016;2;7;e000067
Canalization of genetic and pharmacological perturbations in developing primary neuronal activity patterns.
Genes to Cognition Programme, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK. Electronic address: email@example.com.
The function of the nervous system depends on the integrity of synapses and the patterning of electrical activity in brain circuits. The rapid advances in genome sequencing reveal a large number of mutations disrupting synaptic proteins, which potentially result in diseases known as synaptopathies. However, it is also evident that every normal individual carries hundreds of potentially damaging mutations. Although genetic studies in several organisms show that mutations can be masked during development by a process known as canalization, it is unknown if this occurs in the development of the electrical activity in the brain. Using longitudinal recordings of primary cultured neurons on multi-electrode arrays from mice carrying knockout mutations we report evidence of canalization in development of spontaneous activity patterns. Phenotypes in the activity patterns in young cultures from mice lacking the Gria1 subunit of the AMPA receptor were ameliorated as cultures matured. Similarly, the effects of chronic pharmacological NMDA receptor blockade diminished as cultures matured. Moreover, disturbances in activity patterns by simultaneous disruption of Gria1 and NMDA receptors were also canalized by three weeks in culture. Additional mutations and genetic variations also appeared to be canalized to varying degrees. These findings indicate that neuronal network canalization is a form of nervous system plasticity that provides resilience to developmental disruption. This article is part of the Special Issue entitled 'Synaptopathy--from Biology to Therapy'.
Genetic Drivers of Epigenetic and Transcriptional Variation in Human Immune Cells.
Department of Human Genetics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1HH, UK; Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK.
Characterizing the multifaceted contribution of genetic and epigenetic factors to disease phenotypes is a major challenge in human genetics and medicine. We carried out high-resolution genetic, epigenetic, and transcriptomic profiling in three major human immune cell types (CD14+ monocytes, CD16+ neutrophils, and naive CD4+ T cells) from up to 197 individuals. We assess, quantitatively, the relative contribution of cis-genetic and epigenetic factors to transcription and evaluate their impact as potential sources of confounding in epigenome-wide association studies. Further, we characterize highly coordinated genetic effects on gene expression, methylation, and histone variation through quantitative trait locus (QTL) mapping and allele-specific (AS) analyses. Finally, we demonstrate colocalization of molecular trait QTLs at 345 unique immune disease loci. This expansive, high-resolution atlas of multi-omics changes yields insights into cell-type-specific correlation between diverse genomic inputs, more generalizable correlations between these inputs, and defines molecular events that may underpin complex disease risk.
Funded by: British Heart Foundation: RG/09/012/28096; Department of Health: RP-PG-0310-1002; Medical Research Council: G0800270; Wellcome Trust
Single-cell analysis at the threshold.
Wellcome Trust Sanger Institute, Cambridge, UK.
Nature biotechnology 2016;34;11;1111-1118
Genome-Wide Association Analysis of Young-Onset Stroke Identifies a Locus on Chromosome 10q25 Near HABP2.
From the Veterans Affairs Maryland Health Care System, Baltimore, MD (Y.-C.C., S.J.K., J.W.C., B.D.M.); University of Maryland School of Medicine, Baltimore (Y.-C.C., H.X., S.J.K., J.W.C., J.R.O., B.D.M.); The University of Gothenburg, Gothenburg, Sweden (T.M.S., C.J.); University of Rostock, Rostock, Germany (A.-K.G., A. Rolfs); University of Nottingham Malaysia Campus, Selangor Darul Ehsa, Malaysia (W.K.H.); University of Cambridge, Cambridge, UK (M.T., J.D., S.B., H.S.M., S.D., D.S.); Institut Pasteur de Lille, F-59000 Lille, France (P.A.); University of Newcastle, Australia (E.G.H.); Ludwig-Maximilians-Universität, Munich, Germany (R.M., K.S., M.D.); Wellcome Trust Sanger Institute, Cambridge, UK (J.D.); Center for Non-Communicable Diseases, Karachi, Pakistan (A. Rasheed, D.S.); University of Pennsylvania (W.Z., D.S.); Basel University Hospital, Switzerland (S.E.); Heidelberg University Hospital, Germany (C.G.-G.); Centre d'Étude du Polymorphisme Humain, Paris, France (Y.K.); RIKEN Center for Integrative Medical Sciences, Yokohama, Japan (Y.K.); National Genotyping Center, Evry, France (M.L.); Genome Quebec, McGill University, Montreal, Canada (M.L.); Lille University Hospital, France (D.L., S.D.); KU Leuven - University of Leuven, Leuven, Belgium (V.T.); Vesalius Research Center, VIB, Leuven, Belgium (V.T.); University Hospitals Leuven, Leuven, Belgium (V.T.); Helsinki University Central Hospital, Helsinki, Finland (T.M.M., T.T.); Università degli Studi di Brescia, Brescia, Italy (A. Pezzini); Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy (E.A.P., G.B.B.); University of Lund, Sweden (B.N.); University of Oxford, John Radcliffe Hospital (P.M.R.); University of Edinburgh, Edinburgh, UK (C.S.); Jagiellonian University Medical College, Krakow, Poland (A.S.); Lund University, Lund, Sweden (A.L.); Skåne University Hospital, Lund, Sweden (A.L.); University of Glasgow, Glasgow, UK (M.R.W.); University of Adelaide, Australia (J.J.); Mount Sinai Hos
Background and purpose: Although a genetic contribution to ischemic stroke is well recognized, only a handful of stroke loci have been identified by large-scale genetic association studies to date. Hypothesizing that genetic effects might be stronger for early- versus late-onset stroke, we conducted a 2-stage meta-analysis of genome-wide association studies, focusing on stroke cases with an age of onset <60 years.
Methods: The discovery stage of our genome-wide association studies included 4505 cases and 21 968 controls of European, South-Asian, and African ancestry, drawn from 6 studies. In Stage 2, we selected the lead genetic variants at loci with association P<5×10(-6) and performed in silico association analyses in an independent sample of ≤1003 cases and 7745 controls.
Results: One stroke susceptibility locus at 10q25 reached genome-wide significance in the combined analysis of all samples from the discovery and follow-up stages (rs11196288; odds ratio =1.41; P=9.5×10(-9)). The associated locus is in an intergenic region between TCF7L2 and HABP2. In a further analysis in an independent sample, we found that 2 single nucleotide polymorphisms in high linkage disequilibrium with rs11196288 were significantly associated with total plasma factor VII-activating protease levels, a product of HABP2.
Conclusions: HABP2, which encodes an extracellular serine protease involved in coagulation, fibrinolysis, and inflammatory pathways, may be a genetic susceptibility locus for early-onset stroke.
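Illustrative sketch (not the authors' pipeline): combining per-study association results, as in the meta-analysis described above, is often done with an inverse-variance-weighted fixed-effect model on the log odds ratio scale. The abstract does not state which model was used, so the choice of model, the function name, and the example effect sizes below are assumptions for illustration only.

```python
import math


def fixed_effect_meta(log_ors, std_errs):
    """Inverse-variance-weighted fixed-effect meta-analysis.

    log_ors:  per-study log odds ratios for one variant
    std_errs: matching standard errors
    Returns (pooled log OR, pooled SE, z score, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    # Two-sided p-value under a standard normal null.
    p = math.erfc(abs(z) / math.sqrt(2))
    return pooled, pooled_se, z, p


# Hypothetical per-study estimates for a single variant.
pooled, pooled_se, z, p = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
pooled_or = math.exp(pooled)  # back-transform to the odds ratio scale
```

With equal standard errors the pooled estimate is simply the mean of the study estimates, and the pooled standard error shrinks by a factor of the square root of the number of studies.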
Stroke; a journal of cerebral circulation 2016;47;2;307-16
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Nature reviews. Microbiology 2016;14;5;271
CRISPR-Cas9(D10A) nickase-based genotypic and phenotypic screening to enhance genome editing.
Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge CB2 1QN, UK.
The RNA-guided Cas9 nuclease is being widely employed to engineer the genomes of various cells and organisms. Despite the efficient mutagenesis induced by Cas9, off-target effects have raised concerns over the system's specificity. Recently a "double-nicking" strategy using catalytic mutant Cas9(D10A) nickase has been developed to minimise off-target effects. Here, we describe a Cas9(D10A)-based screening approach that combines an All-in-One Cas9(D10A) nickase vector with fluorescence-activated cell sorting enrichment followed by high-throughput genotypic and phenotypic clonal screening strategies to generate isogenic knockouts and knock-ins highly efficiently, with minimal off-target effects. We validated this approach by targeting genes for the DNA-damage response (DDR) proteins MDC1, 53BP1, RIF1 and P53, plus the nuclear architecture proteins Lamin A/C, in three different human cell lines. We also efficiently obtained biallelic knock-in clones, using single-stranded oligodeoxynucleotides as homologous templates, for insertion of an EcoRI recognition site at the RIF1 locus and introduction of a point mutation at the histone H2AFX locus to abolish assembly of DDR factors at sites of DNA double-strand breaks. This versatile screening approach should facilitate research aimed at defining gene functions, modelling of cancers and other diseases underpinned by genetic factors, and exploring new therapeutic opportunities.
Scientific reports 2016;6;24356
gEVAL - a web-based browser for evaluating genome assemblies.
Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.
Motivation: For most research approaches, genome analyses are dependent on the existence of a high quality genome reference assembly. However, the local accuracy of an assembly remains difficult to assess and improve. The gEVAL browser allows the user to interrogate an assembly in any region of the genome by comparing it to different datasets and evaluating the concordance. These analyses include: a wide variety of sequence alignments, comparative analyses of multiple genome assemblies, and consistency with optical and other physical maps. gEVAL highlights allelic variations, regions of low complexity, abnormal coverage, and potential sequence and assembly errors, and offers strategies for improvement. Although gEVAL focuses primarily on sequence integrity, it can also display arbitrary annotation including from Ensembl or TrackHub sources. We provide gEVAL web sites for many human, mouse, zebrafish and chicken assemblies to support the Genome Reference Consortium, and gEVAL is also downloadable to enable its use for any organism and assembly.
Availability and implementation: Web Browser: http://geval.sanger.ac.uk, Plugin: http://wchow.github.io/wtsi-geval-plugin
Supplementary information: Supplementary data are available at Bioinformatics online.
Funded by: Wellcome Trust: 098051
Bioinformatics (Oxford, England) 2016;32;16;2508-10
South Asia as a Reservoir for the Global Spread of Ciprofloxacin-Resistant Shigella sonnei: A Cross-Sectional Study.
The Hospital for Tropical Diseases, Wellcome Trust Major Overseas Programme, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam.
Background: Antimicrobial resistance is a major issue in the Shigellae, particularly as a specific multidrug-resistant (MDR) lineage of Shigella sonnei (lineage III) is becoming globally dominant. Ciprofloxacin is a recommended treatment for Shigella infections. However, ciprofloxacin-resistant S. sonnei are being increasingly isolated in Asia and sporadically reported on other continents. We hypothesized that Asia is a primary hub for the recent international spread of ciprofloxacin-resistant S. sonnei.
Methods and findings: We performed whole-genome sequencing on a collection of 60 contemporaneous ciprofloxacin-resistant S. sonnei isolated in four countries within Asia (Vietnam, n = 11; Bhutan, n = 12; Thailand, n = 1; Cambodia, n = 1) and two outside of Asia (Australia, n = 19; Ireland, n = 16). We reconstructed the recent evolutionary history of these organisms and combined these data with their geographical location of isolation. Placing these sequences into a global phylogeny, we found that all ciprofloxacin-resistant S. sonnei formed a single clade within a Central Asian expansion of lineage III. Furthermore, our data show that resistance to ciprofloxacin within S. sonnei may be globally attributed to a single clonal emergence event, encompassing sequential gyrA-S83L, parC-S80I, and gyrA-D87G mutations. Geographical data predict that South Asia is the likely primary source of these organisms, which are being regularly exported across Asia and intercontinentally into Australia, the United States and Europe. Our analysis was limited by the number of S. sonnei sequences available from diverse geographical areas and time periods, and we cannot discount the potential existence of other unsampled reservoir populations of antimicrobial-resistant S. sonnei.
Conclusions: This study suggests that a single clone, which is widespread in South Asia, is likely driving the current intercontinental surge of ciprofloxacin-resistant S. sonnei and is capable of establishing endemic transmission in new locations. Despite being limited in geographical scope, our work has major implications for understanding the international transfer of antimicrobial-resistant pathogens, with S. sonnei acting as a tractable model for studying how antimicrobial-resistant Gram-negative bacteria spread globally.
PLoS medicine 2016;13;8;e1002055
Modulation of the human gut microbiota by dietary fibres occurs at the species level.
Microbiology Group, Rowett Institute of Nutrition and Health, University of Aberdeen, Greenburn Road, Bucksburn, Aberdeen, Scotland, AB21 9SB, UK.
Background: Dietary intake of specific non-digestible carbohydrates (including prebiotics) is increasingly seen as a highly effective approach for manipulating the composition and activities of the human gut microbiota to benefit health. Nevertheless, surprisingly little is known about the global response of the microbial community to particular carbohydrates. Recent in vivo dietary studies have demonstrated that the species composition of the human faecal microbiota is influenced by dietary intake. There is now potential to gain insights into the mechanisms involved by using in vitro systems that produce highly controlled conditions of pH and substrate supply.
Results: We supplied two alternative non-digestible polysaccharides as energy sources to three different human gut microbial communities in anaerobic, pH-controlled continuous-flow fermentors. Community analysis showed that supply of apple pectin or inulin resulted in the highly specific enrichment of particular bacterial operational taxonomic units (OTUs; based on 16S rRNA gene sequences). Of the eight most abundant Bacteroides OTUs detected, two were promoted specifically by inulin and six by pectin. Among the Firmicutes, Eubacterium eligens in particular was strongly promoted by pectin, while several species were stimulated by inulin. Responses were influenced by pH, which was stepped up, and down, between 5.5, 6.0, 6.4 and 6.9 in parallel vessels within each experiment. In particular, several experiments involving downshifts to pH 5.5 resulted in Faecalibacterium prausnitzii replacing Bacteroides spp. as the dominant sequences observed. Community diversity was greater in the pectin-fed than in the inulin-fed fermentors, presumably reflecting the differing complexity of the two substrates.
Conclusions: We have shown that particular non-digestible dietary carbohydrates have enormous potential for modifying the gut microbiota, but these modifications occur at the level of individual strains and species and are not easily predicted a priori. Furthermore, the gut environment, especially pH, plays a key role in determining the outcome of interspecies competition. This makes it crucial to put greater effort into identifying the range of bacteria that may be stimulated by a given prebiotic approach. Both for reasons of efficacy and of safety, the development of prebiotics intended to benefit human health has to take account of the highly individual species profiles that may result.
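Illustrative sketch: the abstract compares community diversity between fermentors but does not name the diversity measure. The Shannon index, computed from per-OTU read counts, is one common choice and is assumed here; the function name and the counts are hypothetical.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical OTU read counts for two fermentor communities.
even_community = [25, 25, 25, 25]    # four equally abundant OTUs
skewed_community = [97, 1, 1, 1]     # one dominant OTU

h_even = shannon_diversity(even_community)
h_skewed = shannon_diversity(skewed_community)
```

An even community reaches the maximum value ln(number of OTUs), while a community dominated by one OTU scores much lower, so the index reflects both richness and evenness.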
Funded by: Biotechnology and Biological Sciences Research Council; Wellcome Trust: 098051
BMC biology 2016;14;3
metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis.
Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland, Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland.
Motivation: A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests.
Results: We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies.
Availability and implementation: Code is available at https://github.com/aalto-ics-kepaco
Contacts: firstname.lastname@example.org or email@example.com
Supplementary information: Supplementary data are available at Bioinformatics online.
Bioinformatics (Oxford, England) 2016
Single-cell epigenomics: powerful new methods for understanding gene regulation and cell identity.
Epigenetics Programme, Babraham Institute, Cambridge, CB22 3AT, UK.
Emerging single-cell epigenomic methods are being developed with the exciting potential to transform our knowledge of gene regulation. Here we review available techniques and future possibilities, arguing that the full potential of single-cell epigenetic studies will be realized through parallel profiling of genomic, transcriptional, and epigenetic information.
Genome biology 2016;17;1;72
Comparative genomics of carriage and disease isolates of Streptococcus pneumoniae serotype 22F reveals lineage specific divergence and niche adaptation.
1. Academic Unit of Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, UK 2. Institute for Life Sciences, University of Southampton, Southampton, UK.
Streptococcus pneumoniae is a major cause of meningitis, sepsis and pneumonia worldwide. Pneumococcal conjugate vaccines (PCV) have been part of the UK's childhood immunisation programme since 2006 and have significantly reduced the incidence of disease due to vaccine efficacy in reducing carriage in the population. Here we isolated two clones of 22F (an emerging serotype of clinical concern, multilocus sequence types (MLST) 433 and 698) and conducted comparative genomic analysis on four isolates, paired by ST with one of each pair being derived from carriage and the other disease (sepsis). The most compelling observation was of non-synonymous mutations in pgdA, encoding peptidoglycan N-acetylglucosamine deacetylase A, which were found in the carriage isolates of both ST433 and 698. Deacetylation of pneumococcal peptidoglycan is known to enable resistance to lysozyme upon invasion. Whilst no other clear genotypic signatures related to disease or carriage could be determined, additional intriguing comparisons between the two STs were possible. These include the presence of an intact prophage, in addition to numerous additional phage insertions, within the carriage isolate of ST433. Contrasting gene repertoires related to virulence and colonisation, including: bacteriocins, lantibiotics, and toxin-antitoxin systems, were also observed.
Genome biology and evolution 2016
Cytomegalovirus-Specific IL-10-Producing CD4+ T Cells Are Governed by Type-I IFN-Induced IL-27 and Promote Virus Persistence.
Division of Infection & Immunity, Cardiff University, Cardiff, United Kingdom.
CD4+ T cells support host defence against herpesviruses and other viral pathogens. We identified that CD4+ T cells from systemic and mucosal tissues of hosts infected with the β-herpesviridae human cytomegalovirus (HCMV) or murine cytomegalovirus (MCMV) express the regulatory cytokine interleukin (IL)-10. IL-10+CD4+ T cells co-expressed TH1-associated transcription factors and chemokine receptors. Mice lacking T cell-derived IL-10 elicited enhanced antiviral T cell responses and restricted MCMV persistence in salivary glands and secretion in saliva. Thus, IL-10+CD4+ T cells suppress antiviral immune responses against CMV. Expansion of this T-cell population in the periphery was promoted by IL-27 whereas mucosal IL-10+ T cell responses were ICOS-dependent. Infected Il27rα-deficient mice with reduced peripheral IL-10+CD4+ T cell accumulation displayed robust T cell responses and restricted MCMV persistence and shedding. Temporal inhibition experiments revealed that IL-27R signaling during initial infection was required for the suppression of T cell immunity and control of virus shedding during MCMV persistence. IL-27 production was promoted by type-I IFN, suggesting that β-herpesviridae exploit the immune-regulatory properties of this antiviral pathway to establish chronicity. Further, our data reveal that cytokine signaling events during initial infection profoundly influence virus chronicity.
PLoS pathogens 2016;12;12;e1006050
Inherited determinants of Crohn's disease and ulcerative colitis phenotypes: a genetic association study.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK; Department of Clinical and Experimental Medicine, TARGID, KU Leuven, Leuven, Belgium.
Background: Crohn's disease and ulcerative colitis are the two major forms of inflammatory bowel disease; treatment strategies have historically been determined by this binary categorisation. Genetic studies have identified 163 susceptibility loci for inflammatory bowel disease, mostly shared between Crohn's disease and ulcerative colitis. We undertook the largest genotype association study, to date, in widely used clinical subphenotypes of inflammatory bowel disease with the goal of further understanding the biological relations between diseases.
Methods: This study included patients from 49 centres in 16 countries in Europe, North America, and Australasia. We applied the Montreal classification system of inflammatory bowel disease subphenotypes to 34,819 patients (19,713 with Crohn's disease, 14,683 with ulcerative colitis) genotyped on the Immunochip array. We tested for genotype-phenotype associations across 156,154 genetic variants. We generated genetic risk scores by combining information from all known inflammatory bowel disease associations to summarise the total load of genetic risk for a particular phenotype. We used these risk scores to test the hypothesis that colonic Crohn's disease, ileal Crohn's disease, and ulcerative colitis are all genetically distinct from each other, and to attempt to identify patients with a mismatch between clinical diagnosis and genetic risk profile.
Findings: After quality control, the primary analysis included 29,838 patients (16,902 with Crohn's disease, 12,597 with ulcerative colitis). Three loci (NOD2, MHC, and MST1 3p21) were associated with subphenotypes of inflammatory bowel disease, mainly disease location (essentially fixed over time; median follow-up of 10·5 years). Little or no genetic association with disease behaviour (which changed dramatically over time) remained after conditioning on disease location and age at onset. The genetic risk score representing all known risk alleles for inflammatory bowel disease showed strong association with disease subphenotype (p=1·65 × 10(-78)), even after exclusion of NOD2, MHC, and 3p21 (p=9·23 × 10(-18)). Predictive models based on the genetic risk score strongly distinguished colonic from ileal Crohn's disease. Our genetic risk score could also identify a small number of patients with discrepant genetic risk profiles who were significantly more likely to have a revised diagnosis after follow-up (p=6·8 × 10(-4)).
Interpretation: Our data support a continuum of disorders within inflammatory bowel disease, much better explained by three groups (ileal Crohn's disease, colonic Crohn's disease, and ulcerative colitis) than by Crohn's disease and ulcerative colitis as currently defined. Disease location is an intrinsic aspect of a patient's disease, in part genetically determined, and the major driver to changes in disease behaviour over time.
Funding: International Inflammatory Bowel Disease Genetics Consortium members funding sources (see Acknowledgments for full list).
Funded by: AHRQ HHS: HS021747, R01 HS021747; Chief Scientist Office: ETM/75; Medical Research Council: G0600329, G0800675; NCI NIH HHS: P30 CA016359, R01 CA141743; NIAID NIH HHS: AI067068, U01 AI067068; NIDCR NIH HHS: U54 DE023789, U54DE023789-01; NIDDK NIH HHS: DK062413, DK062420, DK062422, DK062423, DK062429, DK062429-S1, DK062431, DK062432, DK076984, DK084554, P01 DK046763, P01DK046763, P30 DK043351, P30 DK089502, R03 DK076984, R21 DK084554, U01 DK062413, U01 DK062418, U01 DK062420, U01 DK062422, U01 DK062423, U01 DK062429, U01 DK062431, U01 DK062432; Wellcome Trust: 083948/Z/07/Z, 085475/B/08/Z, 085475/Z/08/Z, 098051, 098759
Lancet (London, England) 2016;387;10014;156-67
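The genetic risk score described in the Methods above can be illustrated with a minimal sketch. It assumes the common construction in which each risk-allele count is weighted by the log odds ratio of its locus; the abstract does not state the exact weighting used, and the odds ratios and genotypes below are hypothetical.

```python
import math

def genetic_risk_score(allele_counts, odds_ratios):
    """Sum risk-allele counts (0, 1 or 2 per locus), each weighted by log(odds ratio).

    Assumption: log-odds weighting; the study's own scoring may differ.
    """
    if len(allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is needed per locus")
    return sum(count * math.log(odds) for count, odds in zip(allele_counts, odds_ratios))

# Hypothetical per-locus odds ratios and one hypothetical patient genotype.
odds_ratios = [1.2, 1.5, 1.1]
patient = [2, 1, 0]
score = genetic_risk_score(patient, odds_ratios)
print(f"risk score: {score:.3f}")
```

A score of this kind can then be compared between subphenotype groups, for example ileal versus colonic Crohn's disease.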
Common polygenic variation in coeliac disease and confirmation of ZNF335 and NFIA as disease susceptibility loci.
Department of Medicine, Institute of Molecular Medicine, Trinity College Dublin, St. James's Hospital, Dublin, Ireland.
Coeliac disease (CD) is a chronic immune-mediated disease triggered by the ingestion of gluten. It has an estimated prevalence of approximately 1% in European populations. Specific HLA-DQA1 and HLA-DQB1 alleles are established coeliac susceptibility genes and are required for the presentation of gliadin to the immune system resulting in damage to the intestinal mucosa. In the largest association analysis of CD to date, 39 non-HLA risk loci were identified, 13 of which were new, in a sample of 12 014 individuals with CD and 12 228 controls using the Immunochip genotyping platform. Including the HLA, this brings the total number of known CD loci to 40. We have replicated this study in an independent Irish CD case-control population of 425 CD cases and 453 controls using the Immunochip platform. Using a binomial sign test, we show that the direction of the effects of previously described risk alleles was highly correlated with that observed in the Irish population (P=2.2 × 10(-16)). Using the Polygenic Risk Score (PRS) approach, we estimated that up to 35% of the genetic variance could be explained by loci present on the Immunochip (P=9 × 10(-75)). When this is limited to non-HLA loci, we explain a maximum of 4.5% of the genetic variance (P=3.6 × 10(-18)). Finally, we performed a meta-analysis of our data with the previous reports, identifying two further loci harbouring the ZNF335 and NFIA genes which now exceed genome-wide significance, taking the total number of CD susceptibility loci to 42.
European journal of human genetics : EJHG 2016;24;2;291-7
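The binomial sign test mentioned in the abstract above can be sketched as follows. This is an illustration under stated assumptions: the null hypothesis is that each risk allele is equally likely to show the same or the opposite direction of effect in the replication sample, and the counts used here are hypothetical, not the study's.

```python
from math import comb

def sign_test_p_value(concordant, total):
    """Two-sided binomial sign test against a 50:50 null."""
    extreme = max(concordant, total - concordant)
    # Probability of a result at least this lopsided in one direction.
    tail = sum(comb(total, i) for i in range(extreme, total + 1)) / 2**total
    return min(1.0, 2 * tail)

# Hypothetical example: 9 of 10 alleles share the direction of effect.
p_value = sign_test_p_value(9, 10)
print(p_value)
```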
Clonal analysis of stem cells in differentiation and disease.
Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.
Tracking the fate of individual cells and their progeny by clonal analysis has redefined the concept of stem cells and their role in health and disease. The maintenance of cell turnover in adult tissues is achieved by the collective action of populations of stem cells with an equal likelihood of self-renewal or differentiation. Following injury stem cells exhibit striking plasticity, switching from homeostatic behavior in order to repair damaged tissues. The effects of disease states on stem cells are also being uncovered, with new insights into how somatic mutations trigger clonal expansion in early neoplasia.
Funded by: Cancer Research UK: C609/A17257; Wellcome Trust
Current opinion in cell biology 2016;43;14-21
A survey of best practices for RNA-seq data analysis.
Institute for Food and Agricultural Sciences, Department of Microbiology and Cell Science, University of Florida, Gainesville, FL, 32603, USA.
RNA-sequencing (RNA-seq) has a wide variety of applications, but no single analysis pipeline can be used in all cases. We review all of the major steps in RNA-seq data analysis, including experimental design, quality control, read alignment, quantification of gene and transcript levels, visualization, differential gene expression, alternative splicing, functional analysis, gene fusion detection and eQTL mapping. We highlight the challenges associated with each step. We discuss the analysis of small RNAs and the integration of RNA-seq with other functional genomics techniques. Finally, we discuss the outlook for novel technologies that are changing the state of the art in transcriptomics.
Funded by: Medical Research Council: MC_PC_12009; NIGMS NIH HHS: DP2 GM111100; Wellcome Trust
Genome biology 2016;17;13
What's in a Name? Species-Wide Whole-Genome Sequencing Resolves Invasive and Noninvasive Lineages of Salmonella enterica Serotype Paratyphi B.
Cardiff University School of Biosciences, Cardiff University, Cardiff, United Kingdom; Wellcome Trust Sanger Institute, Hinxton, United Kingdom.
Unlabelled: For 100 years, it has been obvious that Salmonella enterica strains sharing the serotype with the formula 1,4,[5],12:b:1,2 (now known as Paratyphi B) can cause diseases ranging from serious systemic infections to self-limiting gastroenteritis. Despite considerable predicted diversity between strains carrying the common Paratyphi B serotype, there remain few methods that subdivide the serotype into groups that are congruent with their disease phenotypes. Paratyphi B therefore represents one of the canonical examples in Salmonella where serotyping combined with classical microbiological tests fails to provide clinically useful information. Here, we use genomics to provide the first high-resolution view of this serotype, placing it into a wider genomic context of the Salmonella enterica species. These analyses reveal why it has been impossible to subdivide this serotype based upon phenotypic and limited molecular approaches. By examining the genomic data in detail, we are able to identify common features that correlate with strains of clinical importance. The results presented here provide new diagnostic targets, as well as posing important new questions about the basis for the invasive disease phenotype observed in a subset of strains.
Importance: Salmonella enterica strains carrying the serotype Paratyphi B have long been known to possess Jekyll and Hyde characteristics; some cause gastroenteritis, while others cause serious invasive disease. Understanding what makes up the population of strains carrying this serotype, as well as the source of their invasive disease, is a 100-year-old puzzle that we address here using genomics. Our analysis provides the first high-resolution view of this serotype, placing strains carrying serotype Paratyphi B into the wider genomic context of the Salmonella enterica species. This work reveals a history of disease dating back to the Middle Ages, caused by a group of distinct lineages with various abilities to cause invasive disease. By quantifying the key genomic differences between the invasive and noninvasive populations, we are able to identify key virulence-related targets that can form the basis of simple, rapid, point-of-care tests.
Funded by: Medical Research Council: MR/L015080/1; Wellcome Trust: 098051
The genome of Onchocerca volvulus, agent of river blindness.
Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.
Human onchocerciasis is a serious neglected tropical disease caused by the filarial nematode Onchocerca volvulus that can lead to blindness and chronic disability. Control of the disease relies largely on mass administration of a single drug, and the development of new drugs and vaccines depends on a better knowledge of parasite biology. Here, we describe the chromosomes of O. volvulus and its Wolbachia endosymbiont. We provide the highest-quality sequence assembly for any parasitic nematode to date, giving a glimpse into the evolution of filarial parasite chromosomes and proteomes. This resource was used to investigate gene families with key functions that could potentially be exploited as targets for future drugs. Using metabolic reconstruction of the nematode and its endosymbiont, we identified enzymes that are likely to be essential for O. volvulus viability. In addition, we have generated a list of proteins that could be targeted by drugs that are approved by the US Food and Drug Administration and could be repurposed, providing starting points for anti-onchocerciasis drug development.
Funded by: NIAID NIH HHS: R01 AI078314, T32 AI007180, U19 AI110820; NIH HHS: DP2 OD007372
Nature microbiology 2016;2;16216
An expressed, endogenous Nodavirus-like element captured by a retrotransposon in the genome of the plant parasitic nematode Bursaphelenchus xylophilus.
Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
Recently, nematode viruses infecting Caenorhabditis elegans have been reported from the family Nodaviridae, the first nematode viruses described. Here, we report the observation of a novel endogenous viral element (EVE) in the genome of Bursaphelenchus xylophilus, a plant parasitic nematode unrelated to other nematodes from which viruses have been characterised. This element derives from a different clade of nodaviruses to the previously reported nematode viruses. This represents the first endogenous nodavirus sequence, the first nematode endogenous viral element, and significantly extends our knowledge of the potential diversity of the Nodaviridae. A search for endogenous elements related to the Nodaviridae did not reveal any elements in other available nematode genomes. Further surveillance for endogenous viral elements is warranted as our knowledge of nematode genome diversity, and in particular of free-living nematodes, expands.
Funded by: Wellcome Trust: WT098051, WT099198MA
Scientific reports 2016;6;39749
RLZAP: Relative Lempel-Ziv with adaptive pointers
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2016;9954 LNCS;1-14
Whole genome resequencing of the human parasite Schistosoma mansoni reveals population history and effects of selection.
Department of Infectious Disease Epidemiology, Imperial College London, St Mary's Campus, Norfolk Place, London W2 1PG, United Kingdom.
Schistosoma mansoni is a parasitic fluke that infects millions of people in the developing world. This study presents the first application of population genomics to S. mansoni based on high-coverage resequencing data from 10 global isolates and an isolate of the closely-related Schistosoma rodhaini, which infects rodents. Using population genetic tests, we document genes under directional and balancing selection in S. mansoni that may facilitate adaptation to the human host. Coalescence modeling reveals the speciation of S. mansoni and S. rodhaini as 107.5-147.6KYA, a period which overlaps with the earliest archaeological evidence for fishing in Africa. Our results indicate that S. mansoni originated in East Africa and experienced a decline in effective population size 20-90KYA, before dispersing across the continent during the Holocene. In addition, we find strong evidence that S. mansoni migrated to the New World with the 16-19th Century Atlantic Slave Trade.
Funded by: Medical Research Council; Wellcome Trust: 098051
Scientific reports 2016;6;20954
Reduced efficacy of praziquantel against Schistosoma mansoni is associated with multiple-rounds of mass drug administration.
Department of Infectious Disease Epidemiology and the London Centre for Neglected Tropical Disease Research, Imperial College London, St Mary's Campus, Norfolk Place, London W2 1PG, United Kingdom; Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, United Kingdom; Department of Pathology and Pathogen Biology, Royal Veterinary College, University of London, Hertfordshire, AL9 7TA, United Kingdom.
Background: Mass drug administration (MDA) with praziquantel is the cornerstone of schistosomiasis control in sub-Saharan Africa. The effectiveness of this strategy is dependent on the continued high efficacy of praziquantel; however, drug efficacy is rarely monitored using appropriate statistical approaches that can detect early signs of waning.
Methods: We conducted a repeated cross-sectional study, examining children infected with Schistosoma mansoni from 6 schools in Uganda that had previously received between 1 and 9 rounds of MDA with praziquantel. We collected up to 12 S. mansoni egg counts from 414 children aged 6-12 before and 25-27 days after treatment with praziquantel. We estimated individual patient egg reduction rates (ERRs) using a statistical model to explore the influence of covariates, including the number of prior MDA rounds.
Results: The average ERR among children within schools that had received 8 or 9 previous rounds of MDA (95% Bayesian credible interval (BCI) 88.23%, 93.64%) was statistically significantly lower than the average in schools that had received 5 (95% BCI 96.13%, 99.08%) or 1 (95% BCI 95.51%, 98.96%) round of MDA. We estimate that 5.11%, 4.55% and 16.42% of children from schools that had received 1, 5, and 8/9 rounds of MDA respectively had ERRs below the 90% threshold of optimal praziquantel efficacy set by the World Health Organization.
Conclusions: The reduced efficacy of praziquantel in schools with a higher exposure to MDA may pose a threat to the effectiveness of schistosomiasis control programs. We call for the efficacy of anthelmintic drugs used in MDA to be closely monitored.
Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 2016
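The egg reduction rate (ERR) used in the abstract above can be illustrated with a minimal sketch. It assumes the simple group-level definition based on arithmetic mean egg counts before and after treatment; the study itself estimated individual ERRs with a statistical model, which this sketch does not reproduce. The egg counts are hypothetical.

```python
WHO_THRESHOLD = 90.0  # percent; the efficacy threshold cited in the abstract

def egg_reduction_rate(pre_counts, post_counts):
    """ERR in percent: 100 * (1 - mean post-treatment count / mean pre-treatment count)."""
    pre_mean = sum(pre_counts) / len(pre_counts)
    post_mean = sum(post_counts) / len(post_counts)
    if pre_mean == 0:
        raise ValueError("ERR is undefined when no eggs were counted before treatment")
    return 100.0 * (1.0 - post_mean / pre_mean)

# Hypothetical repeated egg counts for one child, before and after praziquantel.
pre = [120, 96, 144]
post = [0, 12, 12]
err = egg_reduction_rate(pre, post)
print(f"ERR = {err:.1f}%, below threshold: {err < WHO_THRESHOLD}")
```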
Binding of Plasmodium falciparum Merozoite Surface Proteins DBLMSP and DBLMSP2 to Human Immunoglobulin M Is Conserved among Broadly Diverged Sequence Variants.
From the Cell Surface Signalling Laboratory, the Malaria Programme, and.
Diversity at pathogen genetic loci can be driven by host adaptive immune selection pressure and may reveal proteins important for parasite biology. Population-based genome sequencing of Plasmodium falciparum, the parasite responsible for the most severe form of malaria, has highlighted two related polymorphic genes called dblmsp and dblmsp2, which encode Duffy binding-like (DBL) domain-containing proteins located on the merozoite surface but whose function remains unknown. Using recombinant proteins and transgenic parasites, we show that DBLMSP and DBLMSP2 directly and avidly bind human IgM via their DBL domains. We used whole genome sequence data from over 400 African and Asian P. falciparum isolates to show that dblmsp and dblmsp2 exhibit extreme protein polymorphism in their DBL domain, with multiple variants of two major allelic classes present in every population tested. Despite this variability, the IgM binding function was retained across diverse sequence representatives. Although this interaction did not seem to have an effect on the ability of the parasite to invade red blood cells, binding of DBLMSP and DBLMSP2 to IgM inhibited the overall immunoreactivity of these proteins to IgG from patients who had been exposed to the parasite. This suggests that IgM binding might mask these proteins from the host humoral immune system.
The Journal of biological chemistry 2016;291;27;14285-99
Horizontal DNA Transfer Mechanisms of Bacteria as Weapons of Intragenomic Conflict.
Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom.
Horizontal DNA transfer (HDT) is a pervasive mechanism of diversification in many microbial species, but its primary evolutionary role remains controversial. Much recent research has emphasised the adaptive benefit of acquiring novel DNA, but here we argue instead that intragenomic conflict provides a coherent framework for understanding the evolutionary origins of HDT. To test this hypothesis, we developed a mathematical model of a clonally descended bacterial population undergoing HDT through transmission of mobile genetic elements (MGEs) and genetic transformation. Including the known bias of transformation toward the acquisition of shorter alleles into the model suggested it could be an effective means of counteracting the spread of MGEs. Both constitutive and transient competence for transformation were found to provide an effective defence against parasitic MGEs; transient competence could also be effective at permitting the selective spread of MGEs conferring a benefit on their host bacterium. The coordination of transient competence with cell-cell killing, observed in multiple species, was found to result in synergistic blocking of MGE transmission through releasing genomic DNA for homologous recombination while simultaneously reducing horizontal MGE spread by lowering the local cell density. To evaluate the feasibility of the functions suggested by the modelling analysis, we analysed genomic data from longitudinal sampling of individuals carrying Streptococcus pneumoniae. This revealed the frequent within-host coexistence of clonally descended cells that differed in their MGE infection status, a necessary condition for the proposed mechanism to operate. Additionally, we found multiple examples of MGEs inhibiting transformation through integrative disruption of genes encoding the competence machinery across many species, providing evidence of an ongoing "arms race." 
Reduced rates of transformation have also been observed in cells infected by MGEs that reduce the concentration of extracellular DNA through secretion of DNases. Simulations predicted that either mechanism of limiting transformation would benefit individual MGEs, but also that this tactic's effectiveness was limited by competition with other MGEs coinfecting the same cell. A further observed behaviour we hypothesised to reduce elimination by transformation was MGE activation when cells become competent. Our model predicted that this response was effective at counteracting transformation independently of competing MGEs. Therefore, this framework is able to explain both common properties of MGEs, and the seemingly paradoxical bacterial behaviours of transformation and cell-cell killing within clonally related populations, as the consequences of intragenomic conflict between self-replicating chromosomes and parasitic MGEs. The antagonistic nature of the different mechanisms of HDT over short timescales means their contribution to bacterial evolution is likely to be substantially greater than previously appreciated.
PLoS biology 2016;14;3;e1002394
Practical Experience of the Application of a Weighted Burden Test to Whole Exome Sequence Data for Obesity and Schizophrenia.
UCL Genetics Institute, UCL, Darwin Building, Gower Street, London, WC1E 6BT, UK.
For biological and statistical reasons it makes sense to combine information from variants at the level of the gene. One may wish to give more weight to variants which are rare and those that are more likely to affect function. A combined weighting scheme, implemented in the SCOREASSOC program, was applied to whole exome sequence data for 1392 subjects with schizophrenia and 982 with obesity from the UK10K project. Results conformed fairly well with null hypothesis expectations and no individual gene was strongly implicated. However, a number of the higher ranked genes appear plausible candidates as being involved in one or other phenotype and may warrant further investigation. These include MC4R, NLGN2, CRP, DONSON, GTF3A, IL36B, ADCYAP1R1, ARSA, DLG1, SIK2, SLAIN1, UBE2Q2, ZNF507, CRHR1, MUSK, NSF, SNORD115, GDF3 and HIBADH. Some individual variants in these genes have different frequencies between cohorts and could be genotyped in additional subjects. For other genes, there is a general excess of variants at many different sites so attempts at replication would be more difficult. Overall, the weighted burden test provides a convenient method for using sequence data to highlight genes of interest.
Funded by: Wellcome Trust: WT091310
Annals of human genetics 2016;80;1;38-49
Respiratory microbiota resistance and resilience to pulmonary exacerbation and subsequent antimicrobial intervention.
NERC Centre for Ecology & Hydrology, Wallingford, UK.
Pulmonary symptoms in cystic fibrosis (CF) begin in early life with chronic lung infections and concomitant airway inflammation leading to progressive loss of lung function. Gradual pulmonary function decline is interspersed with periods of acute worsening of respiratory symptoms known as CF pulmonary exacerbations (CFPEs). Cumulatively, CFPEs are associated with more rapid disease progression. In this study multiple sputum samples were collected from adult CF patients over the course of CFPEs to better understand how changes in microbiota are associated with CFPE onset and management. Data were divided into five clinical periods: pre-CFPE baseline, CFPE, antibiotic treatment, recovery, and post-CFPE baseline. Samples were treated with propidium monoazide prior to DNA extraction, to remove the impact of bacterial cell death artefacts following antibiotic treatment, and then characterised by 16S rRNA gene-targeted high-throughput sequencing. Partitioning CF microbiota into core and rare groups revealed compositional resistance to CFPE and resilience to antibiotic interventions. Mixed effects modelling of core microbiota members revealed no significant negative impact on the relative abundance of Pseudomonas aeruginosa across the exacerbation cycle. Our findings have implications for current CFPE management strategies, supporting reassessment of existing antimicrobial treatment regimens, as antimicrobial resistance by pathogens and other members of the microbiota may be significant contributing factors.
The ISME journal 2016;10;5;1081-91
Mechanisms of fate decision and lineage commitment during haematopoiesis.
Department of Haematology, University of Cambridge, Cambridge, UK.
Blood stem cells need to both perpetuate themselves (self-renew) and differentiate into all mature blood cells to maintain blood formation throughout life. However, it is unclear how the underlying gene regulatory network maintains this population of self-renewing and differentiating stem cells and how it accommodates the transition from a stem cell to a mature blood cell. Our current knowledge of transcriptomes of various blood cell types has mainly been advanced by population-level analysis. However, a population of seemingly homogenous blood cells may include many distinct cell types with substantially different transcriptomes and abilities to make diverse fate decisions. Therefore, understanding the cell-intrinsic differences between individual cells is necessary for a deeper understanding of the molecular basis of their behaviour. Here we review recent single-cell studies in the haematopoietic system and their contribution to our understanding of the mechanisms governing cell fate choices and lineage commitment.
Immunology and cell biology 2016;94;3;230-5
Exome sequencing identifies rare variants in multiple genes in atrioventricular septal defect.
Division of Cardiology, Department of Pediatrics, Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada.
Purpose: The genetic etiology of atrioventricular septal defect (AVSD) is unknown in 40% of cases. Conventional sequencing and arrays have identified the etiology in only a minority of nonsyndromic individuals with AVSD.
Methods: Whole-exome sequencing was performed in 81 unrelated probands with AVSD to identify potentially causal variants in a comprehensive set of 112 genes with strong biological relevance to AVSD.
Results: A significant enrichment of rare and rare damaging variants was identified in the gene set, compared with controls (odds ratio (OR): 1.52; 95% confidence interval (CI): 1.35-1.71; P = 4.8 × 10(-11)). The enrichment was specific to AVSD probands, compared with a cohort without AVSD with tetralogy of Fallot (OR: 2.25; 95% CI: 1.84-2.76; P = 2.2 × 10(-16)). Six genes (NIPBL, CHD7, CEP152, BMPR1a, ZFPM2, and MDM4) were enriched for rare variants in AVSD compared with controls, including three syndrome-associated genes (NIPBL, CHD7, and CEP152). The findings were confirmed in a replication cohort of 81 AVSD probands.
Conclusion: Mutations in genes with strong biological relevance to AVSD, including syndrome-associated genes, can contribute to AVSD, even in those with isolated heart disease. The identification of a gene set associated with AVSD will facilitate targeted genetic screening in this cohort.
Funded by: British Heart Foundation: CH/09/003/26631, PG/07/045/22690, RG/10/17/28553; Wellcome Trust: 090532, 098051, WT098051
Genetics in medicine : official journal of the American College of Medical Genetics 2016;18;2;189-98
A multiple-phenotype imputation method for genetic studies.
Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK.
Genetic association studies have yielded a wealth of biological discoveries. However, these studies have mostly analyzed one trait and one SNP at a time, thus failing to capture the underlying complexity of the data sets. Joint genotype-phenotype analyses of complex, high-dimensional data sets represent an important way to move beyond simple genome-wide association studies (GWAS) with great potential. The move to high-dimensional phenotypes will raise many new statistical problems. Here we address the central issue of missing phenotypes in studies with any level of relatedness between samples. We propose a multiple-phenotype mixed model and use a computationally efficient variational Bayesian algorithm to fit the model. On a variety of simulated and real data sets from a range of organisms and trait types, we show that our method outperforms existing state-of-the-art methods from the statistics and machine learning literature and can boost signals of association.
Nature genetics 2016;48;4;466-72
A Method for Checking Genomic Integrity in Cultured Cell Lines from SNP Genotyping Data.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, United Kingdom.
Genomic screening for chromosomal abnormalities is an important part of quality control when establishing and maintaining stem cell lines. We present a new method for sensitive detection of copy number alterations, aneuploidy, and contamination in cell lines using genome-wide SNP genotyping data. In contrast to other methods designed for identifying copy number variations in a single sample or in a sample composed of a mixture of normal and tumor cells, this new method is tailored for determining differences between cell lines and the starting material from which they were derived, which allows us to distinguish between normal and novel copy number variation. We implemented the method in the freely available BCFtools package and present results based on induced pluripotent stem cell lines obtained in the HipSci project.
Funded by: Wellcome Trust: WT098051, WT098503
PloS one 2016;11;5;e0155014
Using a Human Challenge Model of Infection to Measure Vaccine Efficacy: A Randomised, Controlled Trial Comparing the Typhoid Vaccines M01ZH09 with Placebo and Ty21a.
Oxford Vaccine Group, Department of Paediatrics, and the NIHR Oxford Biomedical Research Centre, University of Oxford, Oxford, United Kingdom.
Background: Typhoid persists as a major cause of global morbidity. While several licensed vaccines to prevent typhoid are available, they are of only moderate efficacy and unsuitable for use in children less than two years of age. Development of new efficacious vaccines is complicated by the human host-restriction of Salmonella enterica serovar Typhi (S. Typhi) and lack of clear correlates of protection. In this study, we aimed to evaluate the protective efficacy of a single dose of the oral vaccine candidate, M01ZH09, in susceptible volunteers by direct typhoid challenge.
Methods and findings: We performed a randomised, double-blind, placebo-controlled trial in healthy adult participants at a single centre in Oxford (UK). Participants were allocated to receive one dose of double-blinded M01ZH09 or placebo or 3 doses of open-label Ty21a. Twenty-eight days after vaccination, participants were challenged with 10(4) CFU S. Typhi Quailes strain. The efficacy of M01ZH09 compared with placebo (primary outcome) was assessed as the percentage of participants reaching pre-defined endpoints constituting typhoid diagnosis (fever and/or bacteraemia) during the 14 days after challenge. Ninety-nine participants were randomised to receive M01ZH09 (n = 33), placebo (n = 33) or 3 doses of Ty21a (n = 33). After challenge, typhoid was diagnosed in 18/31 (58.1% [95% CI 39.1 to 75.5]) M01ZH09, 20/30 (66.7% [47.2 to 82.7]) placebo, and 13/30 (43.3% [25.5 to 62.6]) Ty21a vaccine recipients. Vaccine efficacy (VE) was 13% [95% CI -29 to 41] for one dose of M01ZH09 and 35% [-5 to 60] for 3 doses of Ty21a. Retrospective multivariable analyses demonstrated that pre-existing anti-Vi antibody significantly reduced susceptibility to infection after challenge; a 1 log increase in anti-Vi IgG resulted in a 71% decrease in the hazard ratio of typhoid diagnosis ([95% CI 30 to 88%], p = 0.006) during the 14 day challenge period. Limitations to the study included the requirement to limit the challenge period prior to treatment to 2 weeks, the intensity of the study procedures and the high challenge dose used, resulting in a stringent model.
Conclusions: Despite successfully demonstrating the use of a human challenge study to directly evaluate vaccine efficacy, a single-dose M01ZH09 failed to demonstrate significant protection after challenge with virulent Salmonella Typhi in this model. Anti-Vi antibody detected prior to vaccination played a major role in outcome after challenge.
Trial registration: ClinicalTrials.gov (NCT01405521) and EudraCT (number 2011-000381-35).
PLoS neglected tropical diseases 2016;10;8;e0004926
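The attack rates and vaccine efficacy figures in the abstract above can be checked with a short sketch. It assumes that VE is computed as one minus the ratio of attack rates, and that the intervals on the attack rates are exact (Clopper-Pearson) binomial intervals; the abstract does not name the interval method, so that part is an assumption. The interval on VE itself is not reproduced here.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial confidence interval, found by bisection."""
    def bisect(goes_right):
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if goes_right(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) rises with p and equals alpha/2 at the bound.
    lower = 0.0 if k == 0 else bisect(lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    # Upper bound: P(X <= k | p) falls with p and equals alpha/2 at the bound.
    upper = 1.0 if k == n else bisect(lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper

def vaccine_efficacy(cases_vax, n_vax, cases_ctrl, n_ctrl):
    """VE = 1 - (attack rate in vaccinees / attack rate in controls)."""
    return 1 - (cases_vax / n_vax) / (cases_ctrl / n_ctrl)

# Counts taken from the abstract: typhoid diagnoses / participants challenged.
ve_m01zh09 = vaccine_efficacy(18, 31, 20, 30)
ve_ty21a = vaccine_efficacy(13, 30, 20, 30)
ci_low, ci_high = clopper_pearson(18, 31)
print(f"VE M01ZH09: {ve_m01zh09:.0%}, VE Ty21a: {ve_ty21a:.0%}")
print(f"M01ZH09 attack rate 95% CI: {ci_low:.1%} to {ci_high:.1%}")
```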
Evaluation of an Optimal Epidemiological Typing Scheme for Legionella pneumophila with Whole-Genome Sequence Data Using Validation Guidelines.
Sequence-based typing (SBT), analogous to multilocus sequence typing (MLST), is the current "gold standard" typing method for investigation of legionellosis outbreaks caused by Legionella pneumophila. However, as common sequence types (STs) cause many infections, some investigations remain unresolved. In this study, various whole-genome sequencing (WGS)-based methods were evaluated according to published guidelines, including (i) a single nucleotide polymorphism (SNP)-based method, (ii) extended MLST using different numbers of genes, (iii) determination of gene presence or absence, and (iv) a kmer-based method. L. pneumophila serogroup 1 isolates (n = 106) from the standard "typing panel," previously used by the European Society for Clinical Microbiology Study Group on Legionella Infections (ESGLI), were tested together with another 229 isolates. Over 98% of isolates were considered typeable using the SNP- and kmer-based methods. Percentages of isolates with complete extended MLST profiles ranged from 99.1% (50 genes) to 86.8% (1,455 genes), while only 41.5% produced a full profile with the gene presence/absence scheme. Replicates demonstrated that all methods offer 100% reproducibility. Indices of discrimination ranged from 0.972 (ribosomal MLST) to 0.999 (SNP based), and all values were higher than that achieved with SBT (0.940). Epidemiological concordance is generally inversely related to discriminatory power. We propose that an extended MLST scheme with ∼50 genes provides optimal epidemiological concordance while substantially improving the discrimination offered by SBT and can be used as part of a hierarchical typing scheme that should maintain backwards compatibility and increase discrimination where necessary. This analysis will be useful for the ESGLI to design a scheme that has the potential to become the new gold standard typing method for L. pneumophila.
Funded by: Wellcome Trust
Journal of clinical microbiology 2016;54;8;2135-48
Multiple major disease-associated clones of Legionella pneumophila have emerged recently and independently.
Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA Cambridge, United Kingdom.
Legionella pneumophila is an environmental bacterium and the leading cause of Legionnaires' disease. Just five sequence types (ST), from more than 2000 currently described, cause nearly half of disease cases in northwest Europe. Here, we report the sequence and analyses of 364 L. pneumophila genomes, including 337 from the five disease-associated STs and 27 representative of the species diversity. Phylogenetic analyses revealed that the five STs have independent origins within a highly diverse species. The number of de novo mutations is extremely low with maximum pairwise single-nucleotide polymorphisms (SNPs) ranging from 19 (ST47) to 127 (ST1), which suggests emergences within the last century. Isolates sampled geographically far apart differ by only a few SNPs, demonstrating rapid dissemination. These five STs have been recombining recently, leading to a shared pool of allelic variants potentially contributing to their increased disease propensity. The oldest clone, ST1, has spread globally; between 1940 and 2000, four new clones have emerged in Europe, which show long-distance, rapid dispersal. That a large proportion of clinical cases is caused by recently emerged and internationally dispersed clones, linked by convergent evolution, is surprising for an environmental bacterium traditionally considered to be an opportunistic pathogen. To simultaneously explain recent emergence, rapid spread and increased disease association, we hypothesize that these STs have adapted to new man-made environmental niches, which may be linked by human infection and transmission.
Funded by: Wellcome Trust: 098051
Genome research 2016;26;11;1555-1564
Formin Is Associated with Left-Right Asymmetry in the Pond Snail and the Frog.
School of Life Sciences, University of Nottingham, Nottingham NG7 2RD, UK. Electronic address: email@example.com.
While components of the pathway that establishes left-right asymmetry have been identified in diverse animals, from vertebrates to flies, it is striking that the genes involved in the first symmetry-breaking step remain wholly unknown in the most obviously chiral animals, the gastropod snails. Previously, research on snails was used to show that left-right signaling of Nodal, downstream of symmetry breaking, may be an ancestral feature of the Bilateria [1 and 2]. Here, we report that a disabling mutation in one copy of a tandemly duplicated, diaphanous-related formin is perfectly associated with symmetry breaking in the pond snail. This is supported by the observation that an anti-formin drug treatment converts dextral snail embryos to a sinistral phenocopy, and in frogs, drug inhibition or overexpression by microinjection of formin has a chirality-randomizing effect in early (pre-cilia) embryos. Contrary to expectations based on existing models [3, 4 and 5], we discovered asymmetric gene expression in 2- and 4-cell snail embryos, preceding morphological asymmetry. As the formin-actin filament has been shown to be part of an asymmetry-breaking switch in vitro [6 and 7], together these results are consistent with the view that animals with diverse body plans may derive their asymmetries from the same intracellular chiral elements.
Funded by: Biotechnology and Biological Sciences Research Council: BB/F018940/1, BB/F021135/1, BB/G00661X/1, F021135, G00661X; Medical Research Council: G0900740, MR/K001744/1; NCI NIH HHS: U54 CA143876, U54CA143876; Wellcome Trust: WT098051
Current biology : CB 2016;26;5;654-60
Prognostic impact of p15 gene aberrations in acute leukemia.
Laboratoire d'Histologie, Embryologie et Cytogénétique, Faculté de Médecine et des Sciences de la Santé, Université de Brest, Brest, France.
The p15 gene (also known as CDKN2B, INK4B, p15(INK4B)), located in band 9p21, encodes a protein that induces a G1-phase cell cycle arrest through inhibition of CDK4/6 (cyclin-dependent kinase 4/6). It also plays an important role in the regulation of cellular commitment of hematopoietic progenitor cells and myeloid cell differentiation. p15 can be silenced by several mechanisms, including deletion and hypermethylation of its promoter. Homozygous p15 deletion is rare in acute myeloblastic leukemia (AML) and myelodysplastic syndromes (MDS) but frequent in acute lymphoblastic leukemia (ALL). On the contrary, methylation of the p15 promoter is identified in some 50% of the patients with AML and MDS, but is less frequent in ALL. The analysis of the 28 studies available in the literature revealed conflicting results (unfavorable, favorable or no impact) that can be due, at least in part, to methodological and/or biological pitfalls. Among those, are the heterogeneity of the methylation patterns of the p15 gene and the lack of a comprehensive analysis including transcriptional and translational inactivation that have major impact on its expression. Therefore, detection of the p15 mRNA expression (quantitative or not) may represent a more appropriate method to determine the prognostic impact of the p15 gene.
Leukemia & lymphoma 2016;1-9
Chimpanzee genomic diversity reveals ancient admixture with bonobos.
Institut de Biologia Evolutiva (Consejo Superior de Investigaciones Científicas-Universitat Pompeu Fabra), Barcelona Biomedical Research Park, Doctor Aiguader 88, Barcelona, Catalonia 08003, Spain.
Our closest living relatives, chimpanzees and bonobos, have a complex demographic history. We analyzed the high-coverage whole genomes of 75 wild-born chimpanzees and bonobos from 10 countries in Africa. We found that chimpanzee population substructure makes genetic information a good predictor of geographic origin at country and regional scales. Multiple lines of evidence suggest that gene flow occurred from bonobos into the ancestors of central and eastern chimpanzees between 200,000 and 550,000 years ago, probably with subsequent spread into Nigeria-Cameroon chimpanzees. Together with another, possibly more recent contact (after 200,000 years ago), bonobos contributed less than 1% to the central chimpanzee genomes. Admixture thus appears to have been widespread during hominid evolution.
Science (New York, N.Y.) 2016;354;6311;477-481
A meta-analysis of 120 246 individuals identifies 18 new loci for fibrinogen concentration.
Department of Epidemiology.
Genome-wide association studies have previously identified 23 genetic loci associated with circulating fibrinogen concentration. These studies used HapMap imputation and did not examine the X-chromosome. 1000 Genomes imputation provides better coverage of uncommon variants, and includes indels. We conducted a genome-wide association analysis of 34 studies imputed to the 1000 Genomes Project reference panel and including ∼120 000 participants of European ancestry (95 806 participants with data on the X-chromosome). Approximately 10.7 million single-nucleotide polymorphisms and 1.2 million indels were examined. We identified 41 genome-wide significant fibrinogen loci; of which, 18 were newly identified. There were no genome-wide significant signals on the X-chromosome. The lead variants of five significant loci were indels. We further identified six additional independent signals, including three rare variants, at two previously characterized loci: FGB and IRF1. Together the 41 loci explain 3% of the variance in plasma fibrinogen concentration.
Funded by: Biotechnology and Biological Sciences Research Council: BB/F019394/1; Chief Scientist Office: CZB/4/505, ETM/55; Medical Research Council: G1000143, G1001799, MC_PC_U127561128, MR/K026992/1; NCATS NIH HHS: UL1 TR000124; NHLBI NIH HHS: R01 HL059367; NIDDK NIH HHS: P30 DK063491
Human molecular genetics 2016;25;2;358-70
CD4-Transgenic Zebrafish Reveal Tissue-Resident Th2- and Regulatory T Cell-like Populations and Diverse Mononuclear Phagocytes.
Faculty of Life Sciences, The University of Manchester, Manchester M13 9PT, United Kingdom.
CD4+ T cells are at the nexus of the innate and adaptive arms of the immune system. However, little is known about the evolutionary history of CD4+ T cells, and it is unclear whether their differentiation into specialized subsets is conserved in early vertebrates. In this study, we have created transgenic zebrafish with vibrantly labeled CD4+ cells allowing us to scrutinize the development and specialization of teleost CD4+ leukocytes in vivo. We provide further evidence that CD4+ macrophages have an ancient origin and had already emerged in bony fish. We demonstrate the utility of this zebrafish resource for interrogating the complex behavior of immune cells at cellular resolution by the imaging of intimate contacts between teleost CD4+ T cells and mononuclear phagocytes. Most importantly, we reveal the conserved subspecialization of teleost CD4+ T cells in vivo. We demonstrate that the ancient and specialized tissues of the gills contain a resident population of il-4/13b-expressing Th2-like cells, which do not coexpress il-4/13a. Additionally, we identify a contrasting population of regulatory T cell-like cells resident in the zebrafish gut mucosa, in marked similarity to that found in the intestine of mammals. Finally, we show that, as in mammals, zebrafish CD4+ T cells will infiltrate melanoma tumors and obtain a phenotype consistent with a type 2 immune microenvironment. We anticipate that this unique resource will prove invaluable for future investigation of T cell function in biomedical research, the development of vaccination and health management in aquaculture, and for further research into the evolution of adaptive immunity.
Funded by: Biotechnology and Biological Sciences Research Council: BB/L007401/1; Cancer Research UK: A14953; European Research Council: 282059; Medical Research Council: MC_PC_12009, MR/J009156/1; Wellcome Trust
Journal of immunology (Baltimore, Md. : 1950) 2016;197;9;3520-3530
Discrete distributional differential expression (D3E)--a tool for gene expression analysis of single-cell RNA-seq data.
Department of Plant Sciences, University of Cambridge, Downing Street, Cambridge, CB2 3EA, UK. firstname.lastname@example.org.
Background: The advent of high throughput RNA-seq at the single-cell level has opened up new opportunities to elucidate the heterogeneity of gene expression. One of the most widespread applications of RNA-seq is to identify genes which are differentially expressed between two experimental conditions.
Results: We present a discrete, distributional method for differential gene expression (D(3)E), a novel algorithm specifically designed for single-cell RNA-seq data. We use synthetic data to evaluate D(3)E, demonstrating that it can detect changes in expression, even when the mean level remains unchanged. Since D(3)E is based on an analytically tractable stochastic model, it provides additional biological insights by quantifying biologically meaningful properties, such as the average burst size and frequency. We use D(3)E to investigate experimental data, and with the help of the underlying model, we directly test hypotheses about the driving mechanism behind changes in gene expression.
Conclusion: Evaluation using synthetic data shows that D(3)E performs better than other methods for identifying differentially expressed genes since it is designed to take full advantage of the information available from single-cell RNA-seq experiments. Moreover, the analytical model underlying D(3)E makes it possible to gain additional biological insights.
Funded by: Biotechnology and Biological Sciences Research Council; Wellcome Trust
BMC bioinformatics 2016;17;110
Tracing the origin of disseminated tumor cells in breast cancer using single-cell sequencing.
The Francis Crick Institute, London, UK.
Background: Single-cell micro-metastases of solid tumors often occur in the bone marrow. These disseminated tumor cells (DTCs) may resist therapy and lie dormant or progress to cause overt bone and visceral metastases. The molecular nature of DTCs remains elusive, as well as when and from where in the tumor they originate. Here, we apply single-cell sequencing to identify and trace the origin of DTCs in breast cancer.
Results: We sequence the genomes of 63 single cells isolated from six non-metastatic breast cancer patients. By comparing the cells' DNA copy number aberration (CNA) landscapes with those of the primary tumors and lymph node metastasis, we establish that 53% of the single cells morphologically classified as tumor cells are DTCs disseminating from the observed tumor. The remaining cells represent either non-aberrant "normal" cells or "aberrant cells of unknown origin" that have CNA landscapes discordant from the tumor. Further analyses suggest that the prevalence of aberrant cells of unknown origin is age-dependent and that at least a subset is hematopoietic in origin. Evolutionary reconstruction analysis of bulk tumor and DTC genomes enables ordering of CNA events in molecular pseudo-time and traces the origin of the DTCs to either the main tumor clone, primary tumor subclones, or subclones in an axillary lymph node metastasis.
Conclusions: Single-cell sequencing of bone marrow epithelial-like cells, in parallel with intra-tumor genetic heterogeneity profiling from bulk DNA, is a powerful approach to identify and study DTCs, yielding insight into metastatic processes. A heterogeneous population of CNA-positive cells is present in the bone marrow of non-metastatic breast cancer patients, only part of which are derived from the observed tumor lineages.
Funded by: Cancer Research UK: FC001202; Medical Research Council: FC001202; Wellcome Trust: FC001202
Genome biology 2016;17;1;250
Somatic, positive and negative domains of the Center for Epidemiological Studies Depression (CES-D) scale: a meta-analysis of genome-wide association studies.
Genetic Epidemiology Unit, Departments of Epidemiology and Clinical Genetics, Erasmus MC, Rotterdam, The Netherlands.
Background: Major depressive disorder (MDD) is moderately heritable; however, genome-wide association studies (GWAS) for MDD, as well as for related continuous outcomes, have not shown consistent results. Attempts to elucidate the genetic basis of MDD may be hindered by heterogeneity in diagnosis. The Center for Epidemiological Studies Depression (CES-D) scale provides a widely used tool for measuring depressive symptoms clustered in four different domains which can be combined together into a total score but also can be analysed as separate symptom domains.
Method: We performed a meta-analysis of GWAS of the CES-D symptom clusters. We recruited 12 cohorts with the 20- or 10-item CES-D scale (32 528 persons).
Results: One single nucleotide polymorphism (SNP), rs713224, located near the brain-expressed melatonin receptor (MTNR1A) gene, was associated with the somatic complaints domain of depression symptoms, with borderline genome-wide significance (p discovery = 3.82 × 10^-8). The SNP was analysed in an additional five cohorts comprising the replication sample (6813 persons). However, the association was not consistent among the replication sample (p discovery+replication = 1.10 × 10^-6) with evidence of heterogeneity.
Conclusions: Despite the effort to harmonize the phenotypes across cohorts and participants, our study is still underpowered to detect consistent association for depression, even by means of symptom classification. On the contrary, the SNP-based heritability and co-heritability estimation results suggest that a very minor part of the variation could be captured by GWAS, explaining the reason of sparse findings.
Funded by: Department of Health; Intramural NIH HHS: Z99 AG999999; Medical Research Council: G0802462, G0901254; NCRR NIH HHS: UL1 RR025005; NHGRI NIH HHS: HHSN268200782096C, U01 HG004402; NHLBI NIH HHS: N01HC25195, N01HC55015, N01HC55016, N01HC55018, N01HC55019, N01HC55020, N01HC55021, N01HC55022, N02HL64278, R01 HL070825, R01 HL087641, R01 HL093029; NIA NIH HHS: N01 AG062101, N01 AG062103, N01 AG062106, P30 AG010161, R01 AG015819, R01 AG017917, R01 AG029451, R01 AG032098, RC2 AG036495, U01 AG009740, ZIA AG000932; Wellcome Trust: WT089062
Psychological medicine 2016;46;8;1613-23
Catalog of genetic progression of human cancers: breast cancer.
J.-C. Heuson Breast Cancer Translational Research Laboratory, Institut Jules Bordet, Université Libre de Bruxelles, Boulevard de Waterloo 121, 1000, Brussels, Belgium. email@example.com.
With the rapid development of next-generation sequencing, deeper insights are being gained into the molecular evolution that underlies the development and clinical progression of breast cancer. It is apparent that during evolution, breast cancers acquire thousands of mutations including single base pair substitutions, insertions, deletions, copy number aberrations, and structural rearrangements. As a consequence, at the whole genome level, no two cancers are identical and few cancers even share the same complement of "driver" mutations. Indeed, two samples from the same cancer may also exhibit extensive differences due to constant remodeling of the genome over time. In this review, we summarize recent studies that extend our understanding of the genomic basis of cancer progression. Key biological insights include the following: subclonal diversification begins early in cancer evolution, being detectable even in in situ lesions; geographical stratification of subclonal structure is frequent in primary tumors and can include therapeutically targetable alterations; multiple distant metastases typically arise from a common metastatic ancestor following a "metastatic cascade" model; systemic therapy can unmask preexisting resistant subclones or influence further treatment sensitivity and disease progression. We conclude the review by describing novel approaches such as the analysis of circulating DNA and patient-derived xenografts that promise to further our understanding of the genomic changes occurring during cancer evolution and guide treatment decision making.
Cancer metastasis reviews 2016;35;1;49-62
Genomic Characterization of Primary Invasive Lobular Breast Cancer.
Christine Desmedt, Gabriele Zoppoli, Denis Larsimont, Debora Fumagalli, David Brown, Françoise Rothé, Delphine Vincent, Naima Kheddoumi, Ghizlane Rouas, Samira Majjaj, Sylvain Brohée, Roberto Salgado, Martine Piccart-Gebhart, and Christos Sotiriou, Institut Jules Bordet; Christine Galant, Cliniques Universitaires Saint Luc, Brussels; Peter Van Loo, University of Leuven; Thomas Van Brussel and Diether Lambrechts, VIB Vesalius Research Center, Leuven, Belgium; Gabriele Zoppoli, University of Genoa and Istituto di Ricerca a Carattere Clinico-Scientifico San Martino-National Cancer Institute, Genoa; Giancarlo Pruneri, Patrick Maisonneuve, and Giuseppe Viale, European Institute of Oncology; Marco Fornili and Elia Biganzoli, University of Milan, Fondazione Istituto di Ricovero e Cura a Carattere Scientifico Istituto Nazionale Tumori, Milan, Italy; Gunes Gundem and Peter J. Campbell, Wellcome Trust Sanger Institute, Cambridgeshire; Peter Van Loo, The Francis Crick Institute, London, United Kingdom; Ron Bose, Washington University School of Medicine, St Louis, MO; Otto Metzger, Dana-Farber Cancer Institute, Boston, MA; and François Bertucci, Institut Paoli-Calmettes, Marseille, France. firstname.lastname@example.org.
Purpose: Invasive lobular breast cancer (ILBC) is the second most common histologic subtype after invasive ductal breast cancer (IDBC). Despite clinical and pathologic differences, ILBC is still treated as IDBC. We aimed to identify genomic alterations in ILBC with potential clinical implications.
Methods: From an initial 630 ILBC primary tumors, we interrogated oncogenic substitutions and insertions and deletions of 360 cancer genes and genome-wide copy number aberrations in 413 and 170 ILBC samples, respectively, and correlated those findings with clinicopathologic and outcome features.
Results: Besides the high mutation frequency of CDH1 in 65% of tumors, alterations in one of the three key genes of the phosphatidylinositol 3-kinase pathway, PIK3CA, PTEN, and AKT1, were present in more than one-half of the cases. HER2 and HER3 were mutated in 5.1% and 3.6% of the tumors, with most of these mutations having a proven role in activating the human epidermal growth factor receptor/ERBB pathway. Mutations in FOXA1 and ESR1 copy number gains were detected in 9% and 25% of the samples. All these alterations were more frequent in ILBC than in IDBC. The histologic diversity of ILBC was associated with specific alterations, such as enrichment for HER2 mutations in the mixed, nonclassic, and ESR1 gains in the solid subtype. Survival analyses revealed that chromosome 1q and 11p gains showed independent prognostic value in ILBC and that HER2 and AKT1 mutations were associated with increased risk of early relapse.
Conclusion: This study demonstrates that we can now begin to individualize the treatment of ILBC, with HER2, HER3, and AKT1 mutations representing high-prevalence therapeutic targets and FOXA1 mutations and ESR1 gains deserving urgent dedicated clinical investigation, especially in the context of endocrine treatment.
Funded by: Wellcome Trust
Journal of clinical oncology : official journal of the American Society of Clinical Oncology 2016;34;16;1872-81
Zygotes segregate entire parental genomes in distinct blastomere lineages causing cleavage-stage chimerism and mixoploidy.
Laboratory of Cytogenetics and Genome Research, Center of Human Genetics, KU Leuven, Leuven, 3000, Belgium.
Dramatic genome dynamics, such as chromosome instability, contribute to the remarkable genomic heterogeneity among the blastomeres comprising a single embryo during human preimplantation development. This heterogeneity, when compatible with life, manifests as constitutional mosaicism, chimerism, and mixoploidy in live-born individuals. Chimerism and mixoploidy are defined by the presence of cell lineages with different parental genomes or different ploidy states in a single individual, respectively. Our knowledge of their mechanistic origin results from indirect observations, often when the cell lineages have been subject to rigorous selective pressure during development. Here, we applied haplarithmisis to infer the haplotypes and the copy number of parental genomes in 116 single blastomeres comprising entire preimplantation bovine embryos (n = 23) following in vitro fertilization. We not only demonstrate that chromosome instability is conserved between bovine and human cleavage embryos, but we also discovered that zygotes can spontaneously segregate entire parental genomes into different cell lineages during the first post-zygotic cleavage division. Parental genome segregation was not exclusively triggered by abnormal fertilizations leading to triploid zygotes, but also normally fertilized zygotes can spontaneously segregate entire parental genomes into different cell lineages during cleavage of the zygote. We coin the term "heterogoneic division" to indicate the events leading to noncanonical zygotic cytokinesis, segregating the parental genomes into distinct cell lineages. Persistence of those cell lines during development is a likely cause of chimerism and mixoploidy in mammals.
Genome research 2016;26;5;567-78
The role of folate transport in antifolate drug action in Trypanosoma brucei.
University of Dundee, United Kingdom.
The aim of this study was to identify and characterise mechanisms of resistance to antifolate drugs in African trypanosomes. Genome-wide RNAi library screens were undertaken in bloodstream form Trypanosoma brucei exposed to the antifolates methotrexate and raltitrexed. RNAi knockdown, in conjunction with drug susceptibility and folate transport studies, was used to validate the functions of the putative folate transporters. The transport kinetics of folate and methotrexate were further characterised in whole cells. RNA interference target sequencing (RIT-seq) experiments identified a tandem array of genes encoding a folate transporter family, TbFT1-3, as major contributors to antifolate drug uptake. RNAi knockdown of TbFT1-3 substantially reduced folate transport into trypanosomes and reduced the parasite's susceptibility to the classical antifolates methotrexate and raltitrexed. In contrast, knockdown of TbFT1-3 increased susceptibility to the non-classical antifolates pyrimethamine and nolatrexed. Both folate and methotrexate transport were inhibited by classical antifolates, but not by non-classical antifolates or biopterin. Thus, TbFT1-3 mediate the uptake of folate and classical antifolates in trypanosomes and TbFT1-3 loss-of-function is a mechanism of anti-folate drug resistance.
The Journal of biological chemistry 2016
Coalescent Inference Using Serially Sampled, High-Throughput Sequencing Data from Intra-Host HIV Infection.
University of Oxford.
Human immunodeficiency virus (HIV) is a rapidly evolving pathogen that causes chronic infections, so genetic diversity within a single infection can be very high. High-throughput "deep" sequencing can now measure this diversity in unprecedented detail, particularly since it can be performed at different timepoints during an infection, and this offers a potentially powerful way to infer the evolutionary dynamics of the intra-host viral population. However, population genomic inference from HIV sequence data is challenging because of high rates of mutation and recombination, rapid demographic changes, and ongoing selective pressures. In this paper we develop a new method for inference using HIV deep sequencing data using an approach based on importance sampling of ancestral recombination graphs under a multi-locus coalescent model. The approach further extends recent progress in the approximation of so-called conditional sampling distributions, a quantity of key interest when approximating coalescent likelihoods. The chief novelties of our method are that it is able to infer rates of recombination and mutation, as well as the effective population size, while handling sampling over different timepoints and missing data without extra computational difficulty. We apply our method to a dataset of HIV-1, in which several hundred sequences were obtained from an infected individual at seven timepoints over two years. We find mutation rate and effective population size estimates to be comparable to those produced by the software BEAST. Additionally, our method is able to produce local recombination rate estimates. The software underlying our method, Coalescenator, is freely available.
BCL11A Haploinsufficiency Causes an Intellectual Disability Syndrome and Dysregulates Transcription.
Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.
Intellectual disability (ID) is a common condition with considerable genetic heterogeneity. Next-generation sequencing of large cohorts has identified an increasing number of genes implicated in ID, but their roles in neurodevelopment remain largely unexplored. Here we report an ID syndrome caused by de novo heterozygous missense, nonsense, and frameshift mutations in BCL11A, encoding a transcription factor that is a putative member of the BAF swi/snf chromatin-remodeling complex. Using a comprehensive integrated approach to ID disease modeling, involving human cellular analyses coupled to mouse behavioral, neuroanatomical, and molecular phenotyping, we provide multiple lines of functional evidence for phenotypic effects. The etiological missense variants cluster in the amino-terminal region of human BCL11A, and we demonstrate that they all disrupt its localization, dimerization, and transcriptional regulatory activity, consistent with a loss of function. We show that Bcl11a haploinsufficiency in mice causes impaired cognition, abnormal social behavior, and microcephaly in accordance with the human phenotype. Furthermore, we identify shared aberrant transcriptional profiles in the cortex and hippocampus of these mouse models. Thus, our work implicates BCL11A haploinsufficiency in neurodevelopmental disorders and defines additional targets regulated by this gene, with broad relevance for our understanding of ID and related syndromes.
Funded by: Department of Health; Wellcome Trust: WT098051
American journal of human genetics 2016;99;2;253-74
High-throughput discovery of novel developmental phenotypes.
Department of Molecular Physiology and Biophysics, Houston, Texas 77030, USA.
Approximately one-third of all mammalian genes are essential for life. Phenotypes resulting from knockouts of these genes in mice have provided tremendous insight into gene function and congenital disorders. As part of the International Mouse Phenotyping Consortium effort to generate and phenotypically characterize 5,000 knockout mouse lines, here we identify 410 lethal genes during the production of the first 1,751 unique gene knockouts. Using a standardized phenotyping platform that incorporates high-resolution 3D imaging, we identify phenotypes at multiple time points for previously uncharacterized genes and additional phenotypes for genes with previously reported mutant phenotypes. Unexpectedly, our analysis reveals that incomplete penetrance and variable expressivity are common even on a defined genetic background. In addition, we show that human disease genes are enriched for essential genes, thus providing a dataset that facilitates the prioritization and validation of mutations identified in clinical sequencing efforts.
Funded by: Cancer Research UK: 13031; Medical Research Council: MC_U142684171, MC_U142684172; NCI NIH HHS: P30 CA034196, P30 CA093373; NEI NIH HHS: P30 EY002520; NHGRI NIH HHS: U54 HG006332, U54 HG006348, U54 HG006364, U54 HG006370, UM1 HG006348, UM1 HG006370; NIDDK NIH HHS: U2C DK092993; NIH HHS: U42 OD011174, U42 OD011175, U42 OD011185, U42 OD012210, UM1 OD023221, UM1 OD023222; Wellcome Trust
Genomic Analysis and Comparison of Two Gonorrhea Outbreaks.
Unlabelled: Gonorrhea is a sexually transmitted disease causing growing concern, with a substantial increase in reported incidence over the past few years in the United Kingdom and rising levels of resistance to a wide range of antibiotics. Understanding its epidemiology is therefore of major biomedical importance, not only on a population scale but also at the level of direct transmission. However, the molecular typing techniques traditionally used for gonorrhea infections do not provide sufficient resolution to investigate such fine-scale patterns. Here we sequenced the genomes of 237 isolates from two local collections of isolates from Sheffield and London, each of which was resolved into a single type using traditional methods. The two data sets were selected to have different epidemiological properties: the Sheffield data were collected over 6 years from a predominantly heterosexual population, whereas the London data were gathered within half a year and strongly associated with men who have sex with men. Based on contact tracing information between individuals in Sheffield, we found that transmission is associated with a median time to most recent common ancestor of 3.4 months, with an upper bound of 8 months, which we used as a criterion to identify likely transmission links in both data sets. In London, we found that transmission happened predominantly between individuals of similar age, sexual orientation, and location and also with the same HIV serostatus, which may reflect serosorting and associated risk behaviors. Comparison of the two data sets suggests that the London epidemic involved about ten times more cases than the Sheffield outbreak.
Importance: The recent increases in gonorrhea incidence and antibiotic resistance are cause for public health concern. Successful intervention requires a better understanding of transmission patterns, which is not uncovered by traditional molecular epidemiology techniques. Here we studied two outbreaks that took place in Sheffield and London, United Kingdom. We show that whole-genome sequencing provides the resolution to investigate direct gonorrhea transmission between infected individuals. Combining genome sequencing with rich epidemiological information about infected individuals reveals the importance of several transmission routes and risk factors, which can be used to design better control measures.
Funded by: Medical Research Council: MR/K010174/1
Perturbed hematopoietic stem and progenitor cell hierarchy in myelodysplastic syndromes patients with monosomy 7 as the sole cytogenetic abnormality.
Center for Hematology and Regenerative Medicine, Karolinska Institutet, Department of Medicine, Karolinska University Hospital Huddinge, Stockholm, Sweden.
The stem and progenitor cell compartments in low- and intermediate-risk myelodysplastic syndromes (MDS) have recently been described, and shown to be highly conserved when compared to those in acute myeloid leukemia (AML). Much less is known about the characteristics of the hematopoietic hierarchy of subgroups of MDS with a high risk of transforming to AML. Immunophenotypic analysis of immature stem and progenitor cell compartments from patients with an isolated loss of the entire chromosome 7 (isolated -7), an independent high-risk genetic event in MDS, showed expansion and dominance of the malignant -7 clone in the granulocyte and macrophage progenitors (GMP), and other CD45RA+ progenitor compartments, and a significant reduction of the LIN-CD34+CD38low/-CD90+CD45RA- hematopoietic stem cell (HSC) compartment, highly reminiscent of what is typically seen in AML, and distinct from low-risk MDS. Established functional in vitro and in vivo stem cell assays showed a poor readout for -7 MDS patients irrespective of marrow blast counts. Moreover, while the -7 clone dominated at all stages of GM differentiation, it had a competitive disadvantage in erythroid differentiation. In azacitidine-treated -7 MDS patients with a clinical response, the decreased clonal involvement in mononuclear bone marrow cells was not accompanied by a parallel reduction in clonal involvement in the dominant CD45RA+ progenitor populations, suggesting a selective azacitidine resistance of these distinct -7 progenitor compartments. Our data demonstrate that, in a subgroup of high-risk MDS with monosomy 7, the perturbed stem and progenitor cell compartments resemble those of AML more than those of low-risk MDS.
Pitfalls in genetic testing: the story of missed SCN1A mutations.
Neurogenetics Group, Department of Molecular Genetics, VIB, Antwerp, Belgium; Laboratory of Neurogenetics, Institute Born-Bunge, University of Antwerp, Antwerp, Belgium.
Background: Sanger sequencing, still the standard technique for genetic testing in most diagnostic laboratories and until recently widely used in research, is gradually being complemented by next-generation sequencing (NGS). No single mutation detection technique is however perfect in identifying all mutations. Therefore, we wondered to what extent inconsistencies between Sanger sequencing and NGS affect the molecular diagnosis of patients. Since mutations in SCN1A, the major gene implicated in epilepsy, are found in the majority of Dravet syndrome (DS) patients, we focused on missed SCN1A mutations.
Methods: We sent out a survey to 16 genetic centers performing SCN1A testing.
Results: We collected data on 28 mutations initially missed using Sanger sequencing. All patients were falsely reported as SCN1A mutation-negative, both due to technical limitations and human errors.
Conclusion: We illustrate the pitfalls of Sanger sequencing and most importantly provide evidence that SCN1A mutations are an even more frequent cause of DS than already anticipated.
Molecular genetics & genomic medicine 2016;4;4;457-64
Identification, Validation, and Application of Molecular Diagnostics for Insecticide Resistance in Malaria Vectors.
Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool L3 5QA, UK; Malaria Programme, Wellcome Trust Sanger Institute, Cambridge, UK. Electronic address: email@example.com.
Insecticide resistance is a major obstacle to control of Anopheles malaria mosquitoes in sub-Saharan Africa and requires an improved understanding of the underlying mechanisms. Efforts to discover resistance genes and DNA markers have been dominated by candidate gene and quantitative trait locus studies of laboratory strains, but with greater availability of genome sequences a shift toward field-based agnostic discovery is anticipated. Mechanisms evolve continually to produce elevated resistance yielding multiplicative diagnostic markers, co-screening of which can give high predictive value. With a shift toward prospective analyses, identification and screening of resistance marker panels will boost monitoring and programmatic decision making.
Trends in parasitology 2016;32;3;197-206
Deep genome sequencing and variation analysis of 13 inbred mouse strains defines candidate phenotypic alleles, private variation and homozygous truncating mutations.
Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1HH, UK.
Background: The Mouse Genomes Project is an ongoing collaborative effort to sequence the genomes of the common laboratory mouse strains. In 2011, the initial analysis of sequence variation across 17 strains found 56.7 M unique single nucleotide polymorphisms (SNPs) and 8.8 M indels. We carry out deep sequencing of 13 additional inbred strains (BUB/BnJ, C57BL/10J, C57BR/cdJ, C58/J, DBA/1J, I/LnJ, KK/HiJ, MOLF/EiJ, NZB/B1NJ, NZW/LacJ, RF/J, SEA/GnJ and ST/bJ), cataloguing molecular variation within and across the strains. These strains include important models for immune response, leukaemia, age-related hearing loss and rheumatoid arthritis. We now have several examples of fully sequenced closely related strains that are divergent for several disease phenotypes.
Results: Approximately 27.4 M unique SNPs and 5 M indels are identified across these strains compared to the C57BL/6J reference genome (GRCm38). The amount of variation found in the inbred laboratory mouse genome has increased to 71 M SNPs and 12 M indels. We investigate the genetic basis of highly penetrant cancer susceptibility in RF/J, finding private novel missense mutations in DNA damage repair and highly cancer-associated genes. We use two highly related strains (DBA/1J and DBA/2J) to investigate the genetic basis of collagen-induced arthritis susceptibility.
Conclusions: This paper significantly expands the catalogue of fully sequenced laboratory mouse strains and now contains several examples of highly genetically similar strains with divergent phenotypes. We show how studying private missense mutations can lead to insights into the genetic mechanism for a highly penetrant phenotype.
Funded by: Biotechnology and Biological Sciences Research Council: BB/M000281/1; Cancer Research UK: 13031; Medical Research Council: MR/L007428/1; Wellcome Trust
Genome biology 2016;17;1;167
DNA supercoiling is a fundamental regulatory principle in the control of bacterial gene expression.
Department of Microbiology, Moyne Institute of Preventive Medicine, Trinity College Dublin, Dublin 2, Ireland. firstname.lastname@example.org.
Although it has become routine to consider DNA in terms of its role as a carrier of genetic information, it is also an important contributor to the control of gene expression. This regulatory principle arises from its structural properties. DNA is maintained in an underwound state in most bacterial cells and this has important implications both for DNA storage in the nucleoid and for the expression of genetic information. Underwinding of the DNA through reduction in its linking number potentially imparts energy to the duplex that is available to drive DNA transactions, such as transcription, replication and recombination. The topological state of DNA also influences its affinity for some DNA binding proteins, especially in DNA sequences that have a high A + T base content. The underwinding of DNA by the ATP-dependent topoisomerase DNA gyrase creates a continuum between metabolic flux, DNA topology and gene expression that underpins the global response of the genome to changes in the intracellular and external environments. These connections describe a fundamental and generalised mechanism affecting global gene expression that underlies the specific control of transcription operating through conventional transcription factors. This mechanism also provides a basal level of control for genes acquired by horizontal DNA transfer, assisting microbial evolution, including the evolution of pathogenic bacteria.
Biophysical reviews 2016;8;3;209-220
DNAH11 Localization in the Proximal Region of Respiratory Cilia Defines Distinct Outer Dynein Arm Complexes.
1 Department of General Pediatrics and.
Primary ciliary dyskinesia (PCD) is a recessively inherited disease that leads to chronic respiratory disorders owing to impaired mucociliary clearance. Conventional transmission electron microscopy (TEM) is a diagnostic standard to identify ultrastructural defects in respiratory cilia but is not useful in approximately 30% of PCD cases, which have normal ciliary ultrastructure. DNAH11 mutations are a common cause of PCD with normal ciliary ultrastructure and hyperkinetic ciliary beating, but its pathophysiology remains poorly understood. We therefore characterized DNAH11 in human respiratory cilia by immunofluorescence microscopy (IFM) in the context of PCD. We used whole-exome and targeted next-generation sequence analysis as well as Sanger sequencing to identify and confirm eight novel loss-of-function DNAH11 mutations. We designed and validated a monoclonal antibody specific to DNAH11 and performed high-resolution IFM of both control and PCD-affected human respiratory cells, as well as samples from green fluorescent protein (GFP)-left-right dynein mice, to determine the ciliary localization of DNAH11. IFM analysis demonstrated native DNAH11 localization in only the proximal region of wild-type human respiratory cilia and loss of DNAH11 in individuals with PCD with certain loss-of-function DNAH11 mutations. GFP-left-right dynein mice confirmed proximal DNAH11 localization in tracheal cilia. DNAH11 retained proximal localization in respiratory cilia of individuals with PCD with distinct ultrastructural defects, such as the absence of outer dynein arms (ODAs). TEM tomography detected a partial reduction of ODAs in DNAH11-deficient cilia. DNAH11 mutations result in a subtle ODA defect in only the proximal region of respiratory cilia, which is detectable by IFM and TEM tomography.
Funded by: NCATS NIH HHS: UL1 TR001863; NHLBI NIH HHS: R01 HL093280; NIDDK NIH HHS: R01 DK072301
American journal of respiratory cell and molecular biology 2016;55;2;213-24
RESEARCH ETHICS. Ethics review for international data-intensive research.
J. Kenyon Mason Institute for Medicine, Life Sciences and the Law, School of Law, University of Edinburgh, UK. email@example.com.
Funded by: Wellcome Trust: 099313, 103360
Science (New York, N.Y.) 2016;351;6280;1399-400
Identification of a germline F692L drug resistance variant in cis with Flt3-ITD in knock-in mice.
The Wellcome Trust Sanger Institute; firstname.lastname@example.org.
Phylogenetic Analysis of Invasive Serotype 1 Pneumococcus in South Africa, 1989-2013.
Centre for Respiratory Diseases and Meningitis, National Institute for Communicable Diseases, National Health Laboratory Service, Johannesburg, South Africa School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa email@example.com.
Background: Serotype 1 is an important cause of invasive pneumococcal disease in South Africa and has declined following introduction of the 13-valent pneumococcal conjugate vaccine in 2011.
Methods: We genetically characterized 912 invasive serotype 1 isolates from 1989-2013. Simpson's diversity index and recombination ratios were calculated. Factors associated with sequence types (ST) were assessed.
Results: Clonal complex 217 represented 96% (872/912) of sampled isolates. Post PCV13, ST diversity increased in children <5 years (0.39 to 0.63, p=0.002) and individuals >14 years (0.35 to 0.54, p<0.001): ST-217 declined proportionately in children <5 years [153/203 (75%) vs. 21/37 (57%), p=0.027], and individuals >14 years [242/305 (79%) vs. 96/148 (65%), p=0.001], whereas ST-9067 increased [4/684 (0.6%) vs. 24/228 (11%), p<0.001]. Three sub-clades were identified within ST-217: ST-217C1 (353/382, 92%), ST-217C2 (15/382, 4%) and ST-217C3 (14/382, 4%). ST-217C2, ST-217C3 and single-locus variant (SLV) ST-8314 (20/912, 2%) were associated with non-susceptibility to chloramphenicol, tetracycline and co-trimoxazole. ST-8314 (20/912, 2%) was also associated with increased non-susceptibility to penicillin (p<0.001). ST-217C3 and newly reported ST-9067 had higher recombination ratios compared to ST-217C1 (4.344 vs. 0.091, p<0.001 and 0.086 vs. 0.013, p<0.001, respectively).
Conclusions: Increases in genetic diversity were noted post PCV13, and lineages associated with antimicrobial non-susceptibility were identified.
Journal of clinical microbiology 2016
Bacterial pathogenesis: Getting all tangled up.
Nature reviews. Microbiology 2016
Wheat bran promotes enrichment within the human colonic microbiota of butyrate-producing bacteria that release ferulic acid.
Rowett Institute of Nutrition and Health, University of Aberdeen, Aberdeen, UK.
Cereal fibres such as wheat bran are considered to offer human health benefits via their impact on the intestinal microbiota. We show here by 16S rRNA gene-based community analysis that providing amylase-pretreated wheat bran as the sole added energy source to human intestinal microbial communities in anaerobic fermentors leads to the selective and progressive enrichment of a small number of bacterial species. In particular, OTUs corresponding to uncultured Lachnospiraceae (Firmicutes) related to Eubacterium xylanophilum and Butyrivibrio spp. were strongly enriched (5- to 160-fold) over 48 h in four independent experiments performed with different faecal inocula, while nine other Firmicutes OTUs showed >5-fold enrichment in at least one experiment. Ferulic acid was released from the wheat bran during degradation but was rapidly converted to phenylpropionic acid derivatives via hydrogenation, demethylation and dehydroxylation to give metabolites that are detected in human faecal samples. Pure culture work using bacterial isolates related to the enriched OTUs, including several butyrate-producers, demonstrated that the strains caused substrate weight loss and released ferulic acid, but with limited further conversion. We conclude that breakdown of wheat bran involves specialist primary degraders while the conversion of released ferulic acid is likely to involve a multi-species pathway.
Environmental microbiology 2016;18;7;2214-25
Consent Codes: Upholding Standard Data Use Conditions.
Centre of Genomics and Policy, Faculty of Medicine, McGill University, Montreal, Quebec, Canada.
A systematic way of recording data use conditions that are based on consent permissions as found in the datasets of the main public genome archives (NCBI dbGaP and EMBL-EBI/CRG EGA).
Funded by: Canadian Institutes of Health Research: EP1-120608, EP2-120609
PLoS genetics 2016;12;1;e1005772
Alternative Splice Forms Influence Functions of Whirlin in Mechanosensory Hair Cell Stereocilia.
Wolfson Centre for Age-Related Diseases, King's College London, Guy's Campus, London SE1 1UL, UK.
WHRN (DFNB31) mutations cause diverse hearing disorders: profound deafness (DFNB31) or variable hearing loss in Usher syndrome type II. The known role of WHRN in stereocilia elongation does not explain these different pathophysiologies. Using spontaneous and targeted Whrn mutants, we show that the major long (WHRN-L) and short (WHRN-S) isoforms of WHRN have distinct localizations within stereocilia and also across hair cell types. Lack of both isoforms causes abnormally short stereocilia and profound deafness and vestibular dysfunction. WHRN-S expression, however, is sufficient to maintain stereocilia bundle morphology and function in a subset of hair cells, resulting in some auditory response and no overt vestibular dysfunction. WHRN-S interacts with EPS8, and both are required at stereocilia tips for normal length regulation. WHRN-L localizes midway along the shorter stereocilia, at the level of inter-stereociliary links. We propose that differential isoform expression underlies the variable auditory and vestibular phenotypes associated with WHRN mutations.
Cell reports 2016;15;5;935-43
MERVL/Zscan4 Network Activation Results in Transient Genome-wide DNA Demethylation of mESCs.
Epigenetics Programme, Babraham Institute, Cambridge CB22 3AT, UK. Electronic address: firstname.lastname@example.org.
Mouse embryonic stem cells are dynamic and heterogeneous. For example, rare cells cycle through a state characterized by decondensed chromatin and expression of transcripts, including the Zscan4 cluster and MERVL endogenous retrovirus, which are usually restricted to preimplantation embryos. Here, we further characterize the dynamics and consequences of this transient cell state. Single-cell transcriptomics identified the earliest upregulated transcripts as cells enter the MERVL/Zscan4 state. The MERVL/Zscan4 transcriptional network was also upregulated during induced pluripotent stem cell reprogramming. Genome-wide DNA methylation and chromatin analyses revealed global DNA hypomethylation accompanying increased chromatin accessibility. This transient DNA demethylation was driven by a loss of DNA methyltransferase proteins in the cells and occurred genome-wide. While methylation levels were restored once cells exit this state, genomic imprints remained hypomethylated, demonstrating a potential global and enduring influence of endogenous retroviral activation on the epigenome.
Cell reports 2016;17;1;179-92
The genetics of blood pressure regulation and its target organs from association studies in 342,415 individuals.
Center for Complex Disease Genomics, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.
To dissect the genetic architecture of blood pressure and assess effects on target organ damage, we analyzed 128,272 SNPs from targeted and genome-wide arrays in 201,529 individuals of European ancestry, and genotypes from an additional 140,886 individuals were used for validation. We identified 66 blood pressure-associated loci, of which 17 were new; 15 harbored multiple distinct association signals. The 66 index SNPs were enriched for cis-regulatory elements, particularly in vascular endothelial cells, consistent with a primary role in blood pressure control through modulation of vascular tone across multiple tissues. The 66 index SNPs combined in a risk score showed comparable effects in 64,421 individuals of non-European descent. The 66-SNP blood pressure risk score was significantly associated with target organ damage in multiple tissues but with minor effects in the kidney. Our findings expand current knowledge of blood pressure-related pathways and highlight tissues beyond the classical renal system in blood pressure regulation.
Funded by: Arthritis Research UK; British Heart Foundation: FS/13/6/29977, PG/02/128, RG/07/005/23633, RG/10/12/28456, RG/13/2/30098, RG/14/5/30893, RG08/008, RG2008/014, SP/04/002, SP/08/005/25115; Chief Scientist Office; FIC NIH HHS: R01 TW005596, R01 TW008288, RC1 TW008485; Medical Research Council: 85374, G0000934, G0401527, G0500539, G0600237, G0600705, G0601261, G0601966, G0700931, G1000143, G1002319, G9521010D, MC_PC_U127561128, MC_UU_12013/5, MC_UU_12015/1, MC_UU_12019/1, MR/K006584/1, MR/K013351/1, MR/L003120/1, MR/L01341X/1; NCATS NIH HHS: UL1 TR000124; NCI NIH HHS: UM1 CA182913; NCRR NIH HHS: M01 RR000052, M01 RR000425, M01 RR010284, M01 RR016500, P20 RR020649, U54 RR020278, UL1 RR024156, UL1 RR024975, UL1 RR025005, UL1 RR025774, UL1 RR033176; NEI NIH HHS: R01 EY014684, R01 EY018246, ZIA EY000401; NHGRI NIH HHS: HHSN268200782096C, N01HG65403, U01 HG004402, U01 HG007416, Z01 HG000024, Z01 HG200362; NHLBI NIH HHS: HHSN268201100005C, HHSN268201100006C, HHSN268201100007C, HHSN268201100008C, HHSN268201100009C, HHSN268201100010C, HHSN268201100011C, HHSN268201100012C, HHSN268201200036C, K23 HL080025, K24 HL004334, K99 HL094535, N01 HC005187, N01 HC015103, N01 HC035129, N01 HC045133, N01 HC045134, N01 HC045204, N01 HC045205, N01 HC048047, N01 HC048048, N01 HC048049, N01 HC048050, N01 HC055015, N01 HC055018, N01 HC055019, N01 HC085084, N01 HC085085, N01 HC095095, N01 WH42114, N01HC25195, N01HC55016, N01HC55020, N01HC55021, N01HC55222, N01HC65226, N01HC75150, N01HC85079, N01HC85080, N01HC85081, N01HC85082, N01HC85083, N01HC85086, N01HC95159, N01HC95160, N01HC95161, N01HC95162, N01HC95163, N01HC95164, N01HC95165, N01HC95166, N01HC95167, N01HC95168, N01HC95169, N01WH32102, N02HL64278, R01 HL036310, R01 HL043851, R01 HL046380, R01 HL053353, R01 HL055673, R01 HL059367, R01 HL059684, R01 HL071025, R01 HL074166, R01 HL075366, R01 HL077477, R01 HL080295, R01 HL080467, R01 HL085144, R01 HL085251, R01 HL086694, R01 HL087641, R01 HL087647, R01 HL087652, R01 HL087660, R01 HL087679, R01 HL087698, R01 HL088119, R01 HL093029, R01 HL093328, R01 HL098283, R01 HL103612, R01 HL105756, R01 HL109512, R01 HL109946, R01 HL113933, R01 HL120393, R01 HL122684, R21 HL123677, RC1 HL100245, RC2 HL101834, T32 HL007208, T32 HL007902, T32 HL098049, U01 HL054471, U01 HL054472, U01 HL054473, U01 HL054495, U01 HL054496, U01 HL054497, U01 HL054509, U01 HL069757, U01 HL072515, U01 HL072518, U01 HL080295, U01 HL084756, U01 HL096917; NIA NIH HHS: HHSN271201200022C, N01 AG012100, N01 AG062101, N01 AG062103, N01 AG062106, N01AG12109, R01 AG010175, R01 AG013196, R01 AG015928, R01 AG016592, R01 AG018728, R01 AG020098, R01 AG023629, R01 AG025941, R01 AG027058, R01 AG028555, R01 AG032098, T32 AG000219, Z01 AG000513, ZIA AG007380; NICHD NIH HHS: P2C HD050924, R03 HD061437, R03 HD062783; NIDDK NIH HHS: P30 DK020572, P30 DK056350, P30 DK063491, P30 DK072488, P60 DK079637, R01 DK054261, R01 DK062370, R01 DK072193, R01 DK078150, R01 DK079888, R01 DK084350, R01 DK093757, R01 DK101478, U01 DK062370; NIEHS NIH HHS: P30 ES007033, P30 ES010126; NIGMS NIH HHS: S06 GM008016, U01 GM074518; NIH HHS: N01 WH42124; NIMH NIH HHS: R01 MH063706, RC2 MH089951, RL1 MH083268; NINDS NIH HHS: R21 NS064908, U01 NS041588; WHI NIH HHS: N01 WH032105, N01 WH032118, N01 WH032119, N01 WH032122, N01 WH042107, N01 WH042109, N01 WH042110, N01 WH042111, N01 WH042112, N01 WH042115, N01 WH042116, N01 WH042117, N01 WH042118, N01 WH042119, N01 WH042120, N01 WH042121, N01 WH042122, N01 WH042123, N01 WH042125, N01 WH042126, N01 WH042129, N01 WH042130, N01 WH042131, N01 WH042132, N01 WH044221, N01WH22110, N01WH24152, N01WH32100, N01WH32101, N01WH32106, N01WH32108, N01WH32109, N01WH32111, N01WH32112, N01WH32113, N01WH32115, N01WH42108, N01WH42113; Wellcome Trust: 068545/Z/02, 081917/Z/07/Z, 084723/Z/08/Z, 085475/B/08/Z, 090532/Z/09/Z, 098051, WT098017
Nature genetics 2016;48;10;1171-1184
Community dynamics and the lower airway microbiota in stable chronic obstructive pulmonary disease, smokers and healthy non-smokers.
Halo, Queen's University Belfast, Belfast, UK Centre for Infection and Immunity, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, Belfast, UK.
Rationale: The role bacteria play in the progression of COPD has increasingly been highlighted in recent years. However, the microbial community complexity in the lower airways of patients with COPD is poorly characterised.
Objectives: To compare the lower airway microbiota in patients with COPD, smokers and non-smokers.
Methods: Bronchial wash samples from adults with COPD (n=18), smokers with no airways disease (n=8) and healthy individuals (n=11) were analysed by extended-culture and culture-independent Illumina MiSeq sequencing. We determined aerobic and anaerobic microbiota load and evaluated differences in bacteria associated with the three cohorts. Culture-independent analysis was used to determine differences in microbiota between comparison groups including taxonomic richness, diversity, relative abundance, 'core' microbiota and co-occurrence.
Measurement and main results: Extended-culture showed no difference in total load of aerobic and anaerobic bacteria between the three cohorts. Culture-independent analysis revealed that the prevalence of members of Pseudomonas spp. was greater in the lower airways of patients with COPD; however, the majority of the sequence reads for this taxon were attributed to three patients. Furthermore, members of Bacteroidetes, such as Prevotella spp., were observed to be more abundant in the 'healthy' comparison groups. Community diversity (α and β) was significantly lower in COPD compared with healthy groups. Co-occurrence of bacterial taxa and a putative 'core' community within the lower airways were also observed.
Conclusions: Microbial community composition in the lower airways of patients with COPD is significantly different to that found in smokers and non-smokers, indicating that a component of the disease is associated with changes in microbiological status.
Funded by: Wellcome Trust
Involvement of astrocyte and oligodendrocyte gene sets in migraine.
Department of Human Genetics, Leiden University Medical Centre, The Netherlands.
Background: Migraine is a common episodic brain disorder characterized by recurrent attacks of severe unilateral headache and additional neurological symptoms. Two main migraine types can be distinguished based on the presence of aura symptoms that can accompany the headache: migraine with aura and migraine without aura. Multiple genetic and environmental factors confer disease susceptibility. Recent genome-wide association studies (GWAS) indicate that migraine susceptibility genes are involved in various pathways, including neurotransmission, which have already been implicated in genetic studies of monogenic familial hemiplegic migraine, a subtype of migraine with aura.
Methods: To further explore the genetic background of migraine, we performed a gene set analysis of migraine GWAS data of 4954 clinic-based patients with migraine, as well as 13,390 controls. Curated sets of synaptic genes and sets of genes predominantly expressed in three glial cell types (astrocytes, microglia and oligodendrocytes) were investigated.
Discussion: Our results show that gene sets containing astrocyte- and oligodendrocyte-related genes are associated with migraine, which is especially true for gene sets involved in protein modification and signal transduction. Observed differences between migraine with aura and migraine without aura indicate that both migraine types, at least in part, seem to have a different genetic background.
Cephalalgia : an international journal of headache 2016;36;7;640-7
H3Africa multi-centre study of the prevalence and environmental and genetic determinants of type 2 diabetes in sub-Saharan Africa: study protocol.
Department of Medicine, University of Cambridge, Cambridge, UK.
The burden and aetiology of type 2 diabetes (T2D) and its microvascular complications may be influenced by varying behavioural and lifestyle environments as well as by genetic susceptibility. These aspects of the epidemiology of T2D have not been reliably clarified in sub-Saharan Africa (SSA), highlighting the need for context-specific epidemiological studies with the statistical resolution to inform potential preventative and therapeutic strategies. Therefore, as part of the Human Heredity and Health in Africa (H3Africa) initiative, we designed a multi-site study comprising case collections and population-based surveys at 11 sites in eight countries across SSA. The goal is to recruit up to 6000 T2D participants and 6000 control participants. We will collect questionnaire data, biophysical measurements and biological samples for chronic disease traits, risk factors and genetic data on all study participants. Through integrating epidemiological and genomic techniques, the study provides a framework for assessing the burden, spectrum and environmental and genetic risk factors for T2D and its complications across SSA. With established mechanisms for fieldwork, data and sample collection and management, data-sharing and consent for re-approaching participants, the study will be a resource for future research studies, including longitudinal studies, prospective case ascertainment of incident disease and interventional studies.
Funded by: Wellcome Trust
Global health, epidemiology and genomics 2016;1;e5
The role of hepatocyte nuclear factor 1β in disease and development.
Wellcome Trust-Medical Research Council Stem Cell Institute, Anne McLaren Laboratory, Department of Surgery, University of Cambridge, Cambridge, UK.
Heterozygous mutations in the gene that encodes the transcription factor hepatocyte nuclear factor 1β (HNF1B) result in a multi-system disorder. HNF1B was initially discovered as a monogenic diabetes gene; however, renal cysts are the most frequently detected feature. Other clinical features include pancreatic hypoplasia and exocrine insufficiency, genital tract malformations, abnormal liver function, cholestasis and early-onset gout. Heterozygous mutations and complete gene deletions in HNF1B each account for approximately 50% of all cases of HNF1B-associated disease and may show autosomal dominant inheritance or arise spontaneously. There is no clear genotype-phenotype correlation, indicating that haploinsufficiency is the main disease mechanism. Data from animal models suggest that HNF1B is essential for several stages of pancreas and liver development. However, mice with heterozygous mutations in HNF1B show no phenotype, in contrast to the phenotype seen in humans. This suggests that mouse models do not fully replicate the features of human disease and that complementary studies in human systems are necessary to determine the molecular mechanisms underlying HNF1B-associated disease. This review discusses the role of HNF1B in human and murine pancreas and liver development, summarizes the disease phenotypes and identifies areas for future investigations in HNF1B-associated diabetes and liver disease.
Funded by: Medical Research Council: MC_PC_12009; Wellcome Trust
Diabetes, obesity & metabolism 2016;18 Suppl 1;23-32
Analysis of five chronic inflammatory diseases identifies 27 new associations and highlights disease-specific patterns at shared loci.
Institute of Clinical Molecular Biology, Christian Albrechts University of Kiel, Kiel, Germany.
We simultaneously investigated the genetic landscape of ankylosing spondylitis, Crohn's disease, psoriasis, primary sclerosing cholangitis and ulcerative colitis to investigate pleiotropy and the relationship between these clinically related diseases. Using high-density genotype data from more than 86,000 individuals of European ancestry, we identified 244 independent multidisease signals, including 27 new genome-wide significant susceptibility loci and 3 unreported shared risk loci. Complex pleiotropy was supported when contrasting multidisease signals with expression data sets from human, rat and mouse together with epigenetic and expressed enhancer profiles. The comorbidities among the five immune diseases were best explained by biological pleiotropy rather than heterogeneity (a subgroup of cases genetically identical to those with another disease, possibly owing to diagnostic misclassification, molecular subtypes or excessive comorbidity). In particular, the strong comorbidity between primary sclerosing cholangitis and inflammatory bowel disease is likely the result of a unique disease, which is genetically distinct from classical inflammatory bowel disease phenotypes.
Nature genetics 2016
Beegle: from literature mining to disease-gene discovery.
Department of Electrical Engineering (ESAT) STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics Department, KU Leuven, Leuven 3001, Belgium; iMinds Future Health Department, KU Leuven, Leuven 3001, Belgium.
Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/.
Nucleic acids research 2016;44;2;e18
Phenotypic Characterization of Genetically Lowered Human Lipoprotein(a) Levels.
Center for Human Genetic Research, Cardiovascular Research Center and Cardiology Division, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts; Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts.
Background: Genomic analyses have suggested that the LPA gene and its associated plasma biomarker, lipoprotein(a) (Lp[a]), represent a causal risk factor for coronary heart disease (CHD). As such, lowering Lp(a) levels has emerged as a therapeutic strategy. Beyond target identification, human genetics may contribute to the development of new therapies by defining the full spectrum of beneficial and adverse consequences and by developing a dose-response curve of target perturbation.
Objectives: The goal of this study was to establish the full phenotypic impact of LPA gene variation and to estimate a dose-response curve between genetically altered plasma Lp(a) and risk for CHD.
Methods: We leveraged genetic variants at the LPA gene from 3 data sources: individual-level data from 112,338 participants in the U.K. Biobank; summary association results from large-scale genome-wide association studies; and LPA gene sequencing results from case subjects with CHD and control subjects free of CHD.
Results: One SD genetically lowered Lp(a) level was associated with a 29% lower risk of CHD (odds ratio [OR]: 0.71; 95% confidence interval [CI]: 0.69 to 0.73), a 31% lower risk of peripheral vascular disease (OR: 0.69; 95% CI: 0.59 to 0.80), a 13% lower risk of stroke (OR: 0.87; 95% CI: 0.79 to 0.96), a 17% lower risk of heart failure (OR: 0.83; 95% CI: 0.73 to 0.94), and a 37% lower risk of aortic stenosis (OR: 0.63; 95% CI: 0.47 to 0.83). We observed no association with 31 other disorders, including type 2 diabetes and cancer. Variants that led to gain of LPA gene function increased the risk for CHD, whereas those that led to loss of gene function reduced the CHD risk.
Conclusions: Beyond CHD, genetically lowered Lp(a) levels are associated with a lower risk of peripheral vascular disease, stroke, heart failure, and aortic stenosis. As such, pharmacological lowering of plasma Lp(a) may influence a range of atherosclerosis-related diseases.
Funded by: Medical Research Council: MC_QA137853, MR/L003120/1; NCATS NIH HHS: KL2 TR001100; NHGRI NIH HHS: U54 HG003067; NHLBI NIH HHS: K01 HL125751, K08 HL114642, R01 HL127564, R01 HL131961, RC2 HL102923, RC2 HL102924, RC2 HL102925, RC2 HL102926, RC2 HL103010, T32 HL007734
Journal of the American College of Cardiology 2016;68;25;2761-2772
Generation and Characterisation of a Pax8-CreERT2 Transgenic Line and a Slc22a6-CreERT2 Knock-In Line for Inducible and Specific Genetic Manipulation of Renal Tubular Epithelial Cells.
Department of Oncology, University of Cambridge, CRUK Cambridge institute, Cambridge, United Kingdom.
Genetically relevant mouse models need to recapitulate the hallmarks of human disease by permitting spatiotemporal gene targeting. This is especially important for replicating the biology of complex diseases like cancer, where genetic events occur in a sporadic fashion within developed somatic tissues. Though a number of renal tubule-targeting mouse lines have been developed, their utility for the study of renal disease is limited by lack of inducibility and specificity. In this study we describe the generation and characterisation of two novel mouse lines directing CreERT2 expression to renal tubular epithelia. The Pax8-CreERT2 transgenic line uses the mouse Pax8 promoter to direct expression of CreERT2 to all renal tubular compartments (proximal and distal tubules as well as collecting ducts) whilst the Slc22a6-CreERT2 knock-in line utilises the endogenous mouse Slc22a6 locus to specifically target the epithelium of proximal renal tubules. Both lines show high organ and tissue specificity with no extrarenal activity detected. To establish the utility of these lines for the study of renal cancer biology, Pax8-CreERT2 and Slc22a6-CreERT2 mice were crossed to conditional Vhl knockout mice to induce long-term renal tubule-specific Vhl deletion. These models exhibited renal-specific activation of the hypoxia inducible factor pathway (a VHL target). Our results establish Pax8-CreERT2 and Slc22a6-CreERT2 mice as valuable tools for the investigation and modelling of complex renal biology and disease.
Funded by: Cancer Research UK: 13031, C37839/A12177
PloS one 2016;11;2;e0148055
Genomic variations leading to alterations in cell morphology of Campylobacter spp.
Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge, UK.
Campylobacter jejuni, the most common cause of bacterial diarrhoeal disease, is normally helical. However, it can also adopt straight rod, elongated helical and coccoid forms. Studying how helical morphology is generated, and how it switches between its different forms, is an important objective for understanding this pathogen. Here, we aimed to determine the genetic factors involved in generating the helical shape of Campylobacter. A C. jejuni transposon (Tn) mutant library was screened for non-helical mutants with inconsistent results. Whole genome sequence variation and morphological trends within this Tn library, and in various C. jejuni wild type strains, were compared and correlated to detect genomic elements associated with helical and rod morphologies. All rod-shaped C. jejuni Tn mutants and all rod-shaped laboratory, clinical and environmental C. jejuni and Campylobacter coli contained genetic changes within the pgp1 or pgp2 genes, which encode peptidoglycan modifying enzymes. We therefore confirm the importance of Pgp1 and Pgp2 in the maintenance of helical shape and extended this to a wide range of C. jejuni and C. coli isolates. Genome sequence analysis revealed variation in the sequence and length of homopolymeric tracts found within these genes, providing a potential mechanism of phase variation of cell shape.
Scientific reports 2016;6;38303
The genome of the yellow potato cyst nematode, Globodera rostochiensis, reveals insights into the basis of parasitism and virulence.
Division of Plant Sciences, College of Life Sciences, University of Dundee, Dundee, DD1 5EH, UK.
Background: The yellow potato cyst nematode, Globodera rostochiensis, is a devastating plant pathogen of global economic importance. This biotrophic parasite secretes effectors from pharyngeal glands, some of which were acquired by horizontal gene transfer, to manipulate host processes and promote parasitism. G. rostochiensis is classified into pathotypes with different plant resistance-breaking phenotypes.
Results: We generate a high quality genome assembly for G. rostochiensis pathotype Ro1, identify putative effectors and horizontal gene transfer events, map gene expression through the life cycle focusing on key parasitic transitions and sequence the genomes of eight populations including four additional pathotypes to identify variation. Horizontal gene transfer contributes 3.5 % of the predicted genes, of which approximately 8.5 % are deployed as effectors. Over one-third of all effector genes are clustered in 21 putative 'effector islands' in the genome. We identify a dorsal gland promoter element motif (termed DOG Box) present upstream in representatives from 26 out of 28 dorsal gland effector families, and predict a putative effector superset associated with this motif. We validate gland cell expression in two novel genes by in situ hybridisation and catalogue dorsal gland promoter element-containing effectors from available cyst nematode genomes. Comparison of effector diversity between pathotypes highlights correlation with plant resistance-breaking.
Conclusions: These G. rostochiensis genome resources will facilitate major advances in understanding nematode plant-parasitism. Dorsal gland promoter element-containing effectors are at the front line of the evolutionary arms race between plant and parasite and the ability to predict gland cell expression a priori promises rapid advances in understanding their roles and mechanisms of action.
Funded by: Biotechnology and Biological Sciences Research Council: BB/F000642/1, BB/F00334X/1, BB/G007071/1; Wellcome Trust: 098051
Genome biology 2016;17;1;124
DNA Methylation Dynamics of Human Hematopoietic Stem Cell Differentiation.
CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090 Vienna, Austria.
Hematopoietic stem cells give rise to all blood cells in a differentiation process that involves widespread epigenome remodeling. Here we present genome-wide reference maps of the associated DNA methylation dynamics. We used a meta-epigenomic approach that combines DNA methylation profiles across many small pools of cells and performed single-cell methylome sequencing to assess cell-to-cell heterogeneity. The resulting dataset identified characteristic differences between HSCs derived from fetal liver, cord blood, bone marrow, and peripheral blood. We also observed lineage-specific DNA methylation between myeloid and lymphoid progenitors, characterized immature multi-lymphoid progenitors, and detected progressive DNA methylation differences in maturing megakaryocytes. We linked these patterns to gene expression, histone modifications, and chromatin accessibility, and we used machine learning to derive a model of human hematopoietic differentiation directly from DNA methylation data. Our results contribute to a better understanding of human hematopoietic stem cell differentiation and provide a framework for studying blood-linked diseases.
Cell stem cell 2016
Complete Whole-Genome Sequence of Salmonella enterica subsp. enterica Serovar Java NCTC5706.
Culture Collections, Public Health England, London, United Kingdom.
Salmonellae are a significant cause of morbidity and mortality globally. Here, we report the first complete genome sequence for Salmonella enterica subsp. enterica serovar Java strain NCTC5706. This strain is of historical significance, having been isolated in the pre-antibiotic era and deposited into the National Collection of Type Cultures in 1939.
Genome announcements 2016;4;6
Distinct Salmonella Enteritidis lineages associated with enterocolitis in high-income settings and invasive disease in low-income settings.
Liverpool School of Tropical Medicine, Liverpool, UK.
An epidemiological paradox surrounds Salmonella enterica serovar Enteritidis. In high-income settings, it has been responsible for an epidemic of poultry-associated, self-limiting enterocolitis, whereas in sub-Saharan Africa it is a major cause of invasive nontyphoidal Salmonella disease, associated with high case fatality. By whole-genome sequence analysis of 675 isolates of S. Enteritidis from 45 countries, we show the existence of a global epidemic clade and two new clades of S. Enteritidis that are geographically restricted to distinct regions of Africa. The African isolates display genomic degradation, a novel prophage repertoire, and an expanded multidrug resistance plasmid. S. Enteritidis is a further example of a Salmonella serotype that displays niche plasticity, with distinct clades that enable it to become a prominent cause of gastroenteritis in association with the industrial production of eggs and of multidrug-resistant, bloodstream-invasive infection in Africa.
Funded by: Biotechnology and Biological Sciences Research Council: BB/M014088/1; NIAID NIH HHS: R01 AI099525; Wellcome Trust: 092152, 098051, 100891, 101113/Z/13/Z
Nature genetics 2016;48;10;1211-1217
Genome-wide association analysis identifies three new susceptibility loci for childhood body mass index.
The Generation R Study Group, Department of Pediatrics, Department of Epidemiology.
A large number of genetic loci are associated with adult body mass index. However, the genetics of childhood body mass index are largely unknown. We performed a meta-analysis of genome-wide association studies of childhood body mass index, using sex- and age-adjusted standard deviation scores. We included 35 668 children from 20 studies in the discovery phase and 11 873 children from 13 studies in the replication phase. In total, 15 loci reached genome-wide significance (P-value < 5 × 10⁻⁸) in the joint discovery and replication analysis, of which 12 are previously identified loci in or close to ADCY3, GNPDA2, TMEM18, SEC16B, FAIM2, FTO, TFAP2B, TNNI3K, MC4R, GPR61, LMX1B and OLFM4 associated with adult body mass index or childhood obesity. We identified three novel loci: rs13253111 near ELP3, rs8092503 near RAB27B and rs13387838 near ADAM23. Per additional risk allele, body mass index increased 0.04 Standard Deviation Score (SDS) [Standard Error (SE) 0.007], 0.05 SDS (SE 0.008) and 0.14 SDS (SE 0.025), for rs13253111, rs8092503 and rs13387838, respectively. A genetic risk score combining all 15 SNPs showed that each additional average risk allele was associated with a 0.073 SDS (SE 0.011, P-value = 3.12 × 10⁻¹⁰) increase in childhood body mass index in a population of 1955 children. This risk score explained 2% of the variance in childhood body mass index. This study highlights the shared genetic background between childhood and adult body mass index and adds three novel loci. These loci likely represent age-related differences in strength of the associations with body mass index.
Funded by: Wellcome Trust: 098381
Human molecular genetics 2016;25;2;389-403
A whole-genome sequence and transcriptome perspective on HER2-positive breast cancers.
Synergie Lyon Cancer, Plateforme de bioinformatique 'Gilles Thomas' Centre Léon Bérard, 28 rue Laënnec, 69008 Lyon, France.
HER2-positive breast cancer has long proven to be a clinically distinct class of breast cancers for which several targeted therapies are now available. However, resistance to the treatment associated with specific gene expressions or mutations has been observed, revealing the underlying diversity of these cancers. Therefore, understanding the full extent of the HER2-positive disease heterogeneity still remains challenging. Here we carry out an in-depth genomic characterization of 64 HER2-positive breast tumour genomes that exhibit four subgroups, based on the expression data, with distinctive genomic features in terms of somatic mutations, copy-number changes or structural variations. The results suggest that, despite being clinically defined by a specific gene amplification, HER2-positive tumours melt into the whole luminal-basal breast cancer spectrum rather than standing apart. The results also lead to a refined ERBB2 amplicon of 106 kb and show that several cases of amplifications are compatible with a breakage-fusion-bridge mechanism.
Nature communications 2016;7;12222
Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing.
Bioinformatics and Genomics, Center for Genomic Regulation (CRG), 08003 Barcelona, Catalonia, Spain.
Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing (alternative splice sites, introns, and cleavage sites), which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts.
Scientific reports 2016;6;32406
The Pfam protein families database: towards a more sustainable future.
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
In the last two years the Pfam database (http://pfam.xfam.org) has undergone a substantial reorganisation to reduce the effort involved in making a release, thereby permitting more frequent releases. Arguably the most significant of these changes is that Pfam is now primarily based on the UniProtKB reference proteomes, with the counts of matched sequences and species reported on the website restricted to this smaller set. Building families on reference proteomes sequences brings greater stability, which decreases the amount of manual curation required to maintain them. It also reduces the number of sequences displayed on the website, whilst still providing access to many important model organisms. Matches to the full UniProtKB database are, however, still available and Pfam annotations for individual UniProtKB sequences can still be retrieved. Some Pfam entries (1.6%) which have no matches to reference proteomes remain; we are working with UniProt to see if sequences from them can be incorporated into reference proteomes. Pfam-B, the automatically-generated supplement to Pfam, has been removed. The current release (Pfam 29.0) includes 16 295 entries and 559 clans. The facility to view the relationship between families within a clan has been improved by the introduction of a new tool.
Funded by: Biotechnology and Biological Sciences Research Council: BB/L024136/1; Howard Hughes Medical Institute; Wellcome Trust: 108433/Z/15/Z
Nucleic acids research 2016;44;D1;D279-85
The diversity of Klebsiella pneumoniae surface polysaccharides.
LimmaTech Biologics AG, Schlieren, Switzerland.
Klebsiella pneumoniae is considered an urgent health concern due to the emergence of multi-drug-resistant strains for which vaccination offers a potential remedy. Vaccines based on surface polysaccharides are highly promising but need to address the high diversity of surface-exposed polysaccharides, synthesized as O-antigens (lipopolysaccharide, LPS) and K-antigens (capsule polysaccharide, CPS), present in K. pneumoniae. We present a comprehensive and clinically relevant study of the diversity of O- and K-antigen biosynthesis gene clusters across a global collection of over 500 K. pneumoniae whole-genome sequences and the seroepidemiology of human isolates from different infection types. Our study defines the genetic diversity of O- and K-antigen biosynthesis cluster sequences across this collection, identifying sequences for known serotypes as well as identifying novel LPS and CPS gene clusters found in circulating contemporary isolates. Serotypes O1, O2 and O3 were most prevalent in our sample set, accounting for approximately 80 % of all infections. In contrast, K serotypes showed an order of magnitude higher diversity and differ among infection types. In addition we investigated a potential association of O or K serotypes with phylogenetic lineage, infection type and the presence of known virulence genes. K1 and K2 serotypes, which are associated with hypervirulent K. pneumoniae, were associated with a higher abundance of virulence genes and more diverse O serotypes compared to other common K serotypes.
Microbial genomics 2016;2;8;e000073
COSMIC: High-Resolution Cancer Genetics Using the Catalogue of Somatic Mutations in Cancer.
Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom.
COSMIC (http://cancer.sanger.ac.uk) is an expert-curated database of somatic mutations in human cancer. Broad and comprehensive in scope, recent releases in 2016 describe over 4 million coding mutations across all human cancer disease types. Mutations are annotated across the entire genome, but expert curation is focused on over 400 key cancer genes. Now encompassing the majority of molecular mutation mechanisms in oncogenetics, COSMIC additionally describes 10 million non-coding mutations, 1 million copy-number aberrations, 9 million gene-expression variants, and almost 8 million differentially methylated CpGs. This information combines a consistent interpretation of the data from the major cancer genome consortia and cancer genome literature with exhaustive hand curation of over 22,000 gene-specific literature publications. This unit describes the graphical Web site in detail; alternative protocols overview other ways the entire database can be accessed, analyzed, and downloaded. © 2016 by John Wiley & Sons, Inc.
Current protocols in human genetics 2016;91;10.11.1-10.11.37
HPMCD: the database of human microbial communities from metagenomic datasets and microbial reference genomes.
Host Microbiota Interactions Laboratory, Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK; Centre for Innate Immunity and Infectious Diseases, Hudson Institute of Medical Research, Clayton 3168, Australia; Department of Molecular and Translational Sciences, Monash University, Clayton 3800, Australia.
The Human Pan-Microbe Communities (HPMC) database (http://www.hpmcd.org/) provides a manually curated, searchable, metagenomic resource to facilitate investigation of human gastrointestinal microbiota. Over the past decade, the application of metagenome sequencing to elucidate the microbial composition and functional capacity present in the human microbiome has revolutionized many concepts in our basic biology. When sufficient high quality reference genomes are available, whole genome metagenomic sequencing can provide direct biological insights and high-resolution classification. The HPMC database provides species level, standardized phylogenetic classification of over 1800 human gastrointestinal metagenomic samples. This is achieved by combining a manually curated list of bacterial genomes from human faecal samples with over 21000 additional reference genomes representing bacteria, viruses, archaea and fungi with manually curated species classification and enhanced sample metadata annotation. A user-friendly, web-based interface provides the ability to search for (i) microbial groups associated with health or disease state, (ii) health or disease states and community structure associated with a microbial group, (iii) the enrichment of a microbial gene or sequence and (iv) enrichment of a functional annotation. The HPMC database enables detailed analysis of human microbial communities and supports research from basic microbiology and immunology to therapeutic development in human health and disease.
Funded by: Biotechnology and Biological Sciences Research Council: BB/M011755/1; Medical Research Council: 1091097; Wellcome Trust: 098051
Nucleic acids research 2016;44;D1;D604-9
Resistance of Transmitted Founder HIV-1 to IFITM-Mediated Restriction.
Department of Infectious Diseases, King's College London Faculty of Life Sciences and Medicine, Guy's Hospital, London SE1 9RT, UK.
Interferon-induced transmembrane proteins (IFITMs) restrict the entry of diverse enveloped viruses through incompletely understood mechanisms. While IFITMs are reported to inhibit HIV-1, their in vivo relevance is unclear. We show that IFITM sensitivity of HIV-1 strains is determined by the co-receptor usage of the viral envelope glycoproteins as well as IFITM subcellular localization within the target cell. Importantly, we find that transmitted founder HIV-1, which establishes de novo infections, is uniquely resistant to the antiviral activity of IFITMs. However, viral sensitivity to IFITMs, particularly IFITM2 and IFITM3, increases over the first 6 months of infection, primarily as a result of neutralizing antibody escape mutations. Additionally, the ability to evade IFITM restriction contributes to the different interferon sensitivities of transmitted founder and chronic viruses. Together, these data indicate that IFITMs constitute an important barrier to HIV-1 transmission and that escape from adaptive immune responses exposes the virus to antiviral restriction.
Cell host & microbe 2016
Variant Exported Blood-Stage Proteins Encoded by Plasmodium Multigene Families Are Expressed in Liver Stages Where They Are Exported into the Parasitophorous Vacuole.
Leiden Malaria Research Group, Parasitology, Center of infectious Diseases, Leiden University Medical Center (LUMC), Leiden, The Netherlands.
Many variant proteins encoded by Plasmodium-specific multigene families are exported into red blood cells (RBC). P. falciparum-specific variant proteins encoded by the var, stevor and rifin multigene families are exported onto the surface of infected red blood cells (iRBC) and mediate interactions between iRBC and host cells resulting in tissue sequestration and rosetting. However, the precise function of most other Plasmodium multigene families encoding exported proteins is unknown. To understand the role of RBC-exported proteins of rodent malaria parasites (RMP) we analysed the expression and cellular location by fluorescent-tagging of members of the pir, fam-a and fam-b multigene families. Furthermore, we performed phylogenetic analyses of the fam-a and fam-b multigene families, which indicate that both families have a history of functional differentiation unique to RMP. We demonstrate for all three families that expression of family members in iRBC is not mutually exclusive. Most tagged proteins were transported into the iRBC cytoplasm but not onto the iRBC plasma membrane, indicating that they are unlikely to play a direct role in iRBC-host cell interactions. Unexpectedly, most family members are also expressed during the liver stage, where they are transported into the parasitophorous vacuole. This suggests that these protein families promote parasite development in both the liver and blood, either by supporting parasite development within hepatocytes and erythrocytes and/or by manipulating the host immune response. Indeed, in the case of Fam-A, which have a steroidogenic acute regulatory-related lipid transfer (START) domain, we found that several family members can transfer phosphatidylcholine in vitro. These observations indicate that these proteins may transport (host) phosphatidylcholine for membrane synthesis. This is the first demonstration of a biological function of any exported variant protein family of rodent malaria parasites.
PLoS pathogens 2016;12;11;e1005917
An Antibody Screen of a Plasmodium vivax Antigen Library Identifies Novel Merozoite Proteins Associated with Clinical Protection.
Population Health and Immunity Division, Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia.
Background: Elimination of Plasmodium vivax malaria would be greatly facilitated by the development of an effective vaccine. A comprehensive and systematic characterization of antibodies to P. vivax antigens in exposed populations is useful in guiding rational vaccine design.
Methodology/principal findings: In this study, we investigated antibodies to a large library of P. vivax entire ectodomain merozoite proteins in 2 Asia-Pacific populations, analysing the relationship of antibody levels with markers of current and cumulative malaria exposure, and socioeconomic and clinical indicators. 29 antigenic targets of natural immunity were identified. Of these, 12 highly-immunogenic proteins were strongly associated with age and thus cumulative lifetime exposure in Solomon Islanders (P<0.001-0.027). A subset of 6 proteins, selected on the basis of immunogenicity and expression levels, were used to examine antibody levels in plasma samples from a population of young Papua New Guinean children with well-characterized individual differences in exposure. This analysis identified a strong association between reduced risk of clinical disease and antibody levels to P12, P41, and a novel hypothetical protein that has not previously been studied, PVX_081550 (IRR 0.46-0.74; P<0.001-0.041).
Conclusion/significance: These data emphasize the benefits of an unbiased screening approach in identifying novel vaccine candidate antigens. Functional studies are now required to establish whether PVX_081550 is a key component of the naturally-acquired protective immune response, a biomarker of immune status, or both.
Funded by: Medical Research Council: MR/J002283/1, MR/L012170/1; NIAID NIH HHS: U19 AI089686
PLoS neglected tropical diseases 2016;10;5;e0004639
A single dividing cell population with imbalanced fate drives oesophageal tumour growth.
Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.
Understanding the cellular mechanisms of tumour growth is key for designing rational anticancer treatment. Here we used genetic lineage tracing to quantify cell behaviour during neoplastic transformation in a model of oesophageal carcinogenesis. We found that cell behaviour was convergent across premalignant tumours, which contained a single proliferating cell population. The rate of cell division was not significantly different in the lesions and the surrounding epithelium. However, dividing tumour cells had a uniform, small bias in cell fate so that, on average, slightly more dividing than non-dividing daughter cells were generated at each round of cell division. In invasive cancers induced by Kras(G12D) expression, dividing cell fate became more strongly biased towards producing dividing over non-dividing cells in a subset of clones. These observations argue that agents that restore the balance of cell fate may prove effective in checking tumour growth, whereas those targeting cycling cells may show little selectivity.
Funded by: Cancer Research UK: C609/A17257; Medical Research Council: MC_PC_12009; Wellcome Trust: 098357/Z/12/Z
Nature cell biology 2016;18;9;967-78
Rapid phenotyping of knockout mice to identify genetic determinants of bone strength.
B Freudenthal, Medicine, Imperial College London, London, United Kingdom of Great Britain and Northern Ireland.
The genetic determinants of osteoporosis remain poorly understood and there is a large unmet need for new treatments in our aging society. Thus, new approaches for gene discovery in skeletal disease are required to complement the current genome wide association studies in human populations. The International Knockout Mouse Consortium (IKMC) and International Mouse Phenotyping Consortium (IMPC) provide such an opportunity. The IKMC is generating knockout mice representing each of the known protein-coding genes in C57BL/6 mice and, as part of the IMPC initiative, the Origins of Bone and Cartilage Disease project is identifying mutants with significant outlier skeletal phenotypes. This initiative will add value to data from large human cohorts and provide a new understanding of bone and cartilage pathophysiology, ultimately leading to the identification of novel drug targets for the treatment of skeletal disease.
The Journal of endocrinology 2016
HUWE1 mutations in Juberg-Marsidi and Brooks syndromes: the results of an X-chromosome exome sequencing study.
Greenwood Genetic Center, Greenwood, South Carolina, USA.
Background: X linked intellectual disability (XLID) syndromes account for a substantial number of males with ID. Much progress has been made in identifying the genetic cause in many of the syndromes described 20-40 years ago. Next generation sequencing (NGS) has contributed to the rapid discovery of XLID genes and identifying novel mutations in known XLID genes for many of these syndromes.
Methods: Two NGS approaches were employed to identify mutations in X linked genes in families with XLID disorders. The first involved exome sequencing of genes on the X chromosome using the Agilent SureSelect Human X Chromosome Kit. The second approach was to conduct targeted NGS sequencing of 90 known XLID genes.
Results: We identified the same mutation, a c.12928 G>C transversion in the HUWE1 gene, which gives rise to a p.G4310R missense mutation in 2 XLID disorders: Juberg-Marsidi syndrome (JMS) and Brooks syndrome. Although the original families with these disorders were considered separate entities, they indeed overlap clinically. A third family was also found to have a novel HUWE1 mutation.
Conclusions: As we identified a HUWE1 mutation in an affected male from the original family reported by Juberg and Marsidi, it is evident the syndrome does not result from a mutation in ATRX as reported in the literature. Additionally, our data indicate that JMS and Brooks syndromes are allelic having the same HUWE1 mutation.
Funded by: NICHD NIH HHS: 2R01HD026202, R01 HD026202; NINDS NIH HHS: 1R01NS73854, R01 NS073854
BMJ open 2016;6;4;e009537
Tyrosine kinase 2 is not limiting human antiviral type III interferon responses.
Center for Chronic Immunodeficiency, Faculty of Medicine, Medical Center-University of Freiburg, Freiburg, Germany.
Tyrosine kinase 2 (TYK2) associates with interferon (IFN) alpha receptor, IL-10 receptor (IL-10R) beta and other cytokine receptor subunits for signal transduction, in response to various cytokines, including type-I and type-III IFNs, IL-6, IL-10, IL-12 and IL-23. Data on TYK2 dependence on cytokine responses and in vivo consequences of TYK2 deficiency are inconsistent. We investigated a TYK2 deficient patient, presenting with eczema, skin abscesses, respiratory infections and IgE levels >1000 U/mL, without viral or mycobacterial infections and a corresponding cellular model to analyze the role of TYK2 in type-III IFN mediated responses and NK-cell function. We established a novel simple diagnostic monocyte assay to show that the mutation completely abolishes the IFN-α mediated antiviral response. It also partly reduces IL-10 but not IL-6 mediated signaling associated with reduced IL-10Rβ expression. However, we found almost normal type-III IFN signaling associated with minimal impairment of virus control in a TYK2 deficient human cell line. Contrary to observations in TYK2 deficient mice, NK-cell phenotype and function, including IL-12/IL-18 mediated responses, were normal in the patient. Thus, preserved type-III IFN responses and normal NK-cell function may contribute to antiviral protection in TYK2 deficiency leading to a surprisingly mild human phenotype.
European journal of immunology 2016;46;11;2639-2649
The genetic architecture of type 2 diabetes.
Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, USA.
The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.
Funded by: British Heart Foundation: RG/14/5/30893, SP/04/002, SP/09/002; CIHR; Medical Research Council: G0601261, G0601966, G0700931, G0800270, G0900747-91070, MC_UU_12012/5, MC_UU_12015/1, MR/K002414/1, MR/L01341X/1; NCI NIH HHS: K12CA139160; NHGRI NIH HHS: R01 HG000376, R56 HG000376, U01 HG005773, U01HG005773, U54 HG003067, U54HG003067; NHLBI NIH HHS: HHSN268201300046C, HHSN268201300047C, HHSN268201300048C, HHSN268201300049C, HHSN268201300050C, R01 HL102830, R01HL102830, T32 HL007055; NIA NIH HHS: 1R01AG042188, P01 AG027734, P01AG027734, P30 AG038072, P30AG038072, R01 AG042188, R01 AG046949, R01AG046949; NIDDK NIH HHS: 1RC2DK088389, DK072193, DK085501, DK085524, DK085526, DK085545, DK085584, DK088389, DK093757, DK098032, K24 DK080140, K24 DK110550, K24DK080140, P30 DK020572, P30 DK020595, P30DK020595, P60 DK020595, P60DK20595, R00 DK092251, R00 DK099240, R00DK092251, R01 DK066358, R01 DK072193, R01 DK073541, R01 DK093757, R01 DK098032, R01 DK101478, R01 DK106236, R01DK062370, R01DK066358, R01DK073541, R01DK098032, RC2 DK088389, RC2-DK088389, RC2DK088389, U01 DK062370, U01 DK078616, U01 DK085501, U01 DK085524, U01 DK085526, U01 DK085545, U01 DK085584, U01DK085501, U01DK085526; NIGMS NIH HHS: T32 GM007753, T32GM007753; NIH HHS: S10 OD018522; NIMH NIH HHS: R01 MH090937, R01 MH101820, R01MH090937, R01MH101820; NIMHD NIH HHS: U54 MD007588; Wellcome Trust: 064890, 083948, 084723, 085475, 086596, 090367, 090532, 092447, 095101, 095552, 098017, 098051, 098381, 100956
Lymphoid-Tissue-Resident Commensal Bacteria Promote Members of the IL-10 Cytokine Family to Establish Mutualism.
Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Joan and Sanford I. Weill Department of Medicine, Division of Gastroenterology, Weill Cornell Medicine, New York, NY 10021 USA; Department of Microbiology and Immunology, Weill Cornell Medicine, New York, NY 10021 USA; Jill Roberts Institute for Research in Inflammatory Bowel Disease, Weill Cornell Medicine, New York, NY 10021, USA.
Physical separation between the mammalian immune system and commensal bacteria is necessary to limit chronic inflammation. However, selective species of commensal bacteria can reside within intestinal lymphoid tissues of healthy mammals. Here, we demonstrate that lymphoid-tissue-resident commensal bacteria (LRC) colonized murine dendritic cells and modulated their cytokine production. In germ-free and antibiotic-treated mice, LRCs colonized intestinal lymphoid tissues and induced multiple members of the IL-10 cytokine family, including dendritic-cell-derived IL-10 and group 3 innate lymphoid cell (ILC3)-derived IL-22. Notably, IL-10 limited the development of pro-inflammatory Th17 cell responses, and IL-22 production enhanced LRC colonization in the steady state. Furthermore, LRC colonization protected mice from lethal intestinal damage in an IL-10-IL-10R-dependent manner. Collectively, our data reveal a unique host-commensal-bacteria dialog whereby selective subsets of commensal bacteria interact with dendritic cells to facilitate tissue-specific responses that are mutually beneficial for both the host and the microbe.
Funded by: Medical Research Council: PF451; NCI NIH HHS: P30 CA008748; NIAID NIH HHS: R01 AI123368, R01AI123368, R56 AI114724, R56AI114724; NIDDK NIH HHS: P01 DK094779, P30 DK034987, P30-DK034987; NIH HHS: 5-P40-OD010995, DP5 OD012116, DP5OD012116, P40 OD010995; Wellcome Trust: 098051, 105644
RUNX1 mutations in acute myeloid leukemia are associated with distinct clinico-pathologic and genetic features.
Klinik für Innere Medizin III, Universitätsklinikum Ulm, Ulm, Germany.
We evaluated the frequency, genetic architecture, clinico-pathologic features and prognostic impact of RUNX1 mutations in 2439 adult patients with newly-diagnosed acute myeloid leukemia (AML). RUNX1 mutations were found in 245 of 2439 (10%) patients; were almost mutually exclusive of AML with recurrent genetic abnormalities; and they co-occurred with a complex pattern of gene mutations, frequently involving mutations in epigenetic modifiers (ASXL1, IDH2, KMT2A, EZH2), components of the spliceosome complex (SRSF2, SF3B1) and STAG2, PHF6, BCOR. RUNX1 mutations were associated with older age (16-59 years: 8.5%; ⩾60 years: 15.1%), male gender, more immature morphology and secondary AML evolving from myelodysplastic syndrome. In univariable analyses, RUNX1 mutations were associated with inferior event-free (EFS, P<0.0001), relapse-free (RFS, P=0.0007) and overall survival (OS, P<0.0001) in all patients, remaining significant when age was considered. In multivariable analysis, RUNX1 mutations predicted for inferior EFS (P=0.01). The effect of co-mutation varied by partner gene, where patients with the secondary genotypes RUNX1(mut)/ASXL1(mut) (OS, P=0.004), RUNX1(mut)/SRSF2(mut) (OS, P=0.007) and RUNX1(mut)/PHF6(mut) (OS, P=0.03) did significantly worse, whereas patients with the genotype RUNX1(mut)/IDH2(mut) (OS, P=0.04) had a better outcome. In conclusion, RUNX1-mutated AML is associated with a complex mutation cluster and is correlated with distinct clinico-pathologic features and inferior prognosis.
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
This month's Genome Watch examines how the increased availability of mammalian genomes provides new insights into the interactions of endogenous retroviruses with other viruses and various hosts.
Nature reviews. Microbiology 2016;14;2;66
tRNA fragments: novel players in intergenerational inheritance.
The Gurdon Institute and Department of Genetics, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK.
Non-genetic inheritance is an evocative topic; in the past few years, the debate around potential inheritance of life-time experiences independent of social factors in mammals has become highly prominent due to increasing evidence for phenotypes in the offspring after paternal environmental exposures. Strikingly, two independent studies published in Science newly implicate a special class of RNA, transfer RNA fragments, in the intergenerational effects of paternal dietary intervention.
Funded by: Cancer Research UK: 11832
Cell research 2016;26;4;395-6
Interleukin-13 Activates Distinct Cellular Pathways Leading to Ductular Reaction, Steatosis, and Fibrosis.
Immunopathogenesis Section, Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD 20852, USA; Wellcome Trust-Medical Research Council Stem Cell Institute, Anne McLaren Laboratory, Department of Surgery, University of Cambridge, Cambridge CB2 0SZ, UK.
Fibroproliferative diseases are driven by dysregulated tissue repair responses and are a major cause of morbidity and mortality because they affect nearly every organ system. Type 2 cytokine responses are critically involved in tissue repair; however, the mechanisms that regulate beneficial regeneration versus pathological fibrosis are not well understood. Here, we have shown that the type 2 effector cytokine interleukin-13 simultaneously, yet independently, directed hepatic fibrosis and the compensatory proliferation of hepatocytes and biliary cells in progressive models of liver disease induced by interleukin-13 overexpression or after infection with Schistosoma mansoni. Using transgenic mice with interleukin-13 signaling genetically disrupted in hepatocytes, cholangiocytes, or resident tissue fibroblasts, we have revealed direct and distinct roles for interleukin-13 in fibrosis, steatosis, cholestasis, and ductular reaction. Together, these studies show that these mechanisms are simultaneously controlled but distinctly regulated by interleukin-13 signaling. Thus, it may be possible to promote interleukin-13-dependent hepatobiliary expansion without generating pathological fibrosis.
Funded by: Intramural NIH HHS: Z01 AI000829-11, Z01 AI001019-01
Cytokine profiles during invasive nontyphoidal Salmonella disease predict outcome in African children.
Wellcome Trust Centre for Human Genetics, University of Oxford, UK; Department of Paediatrics, University of Oxford, UK.
Nontyphoidal Salmonellae are a leading cause of sepsis in African children. Cytokine responses are central to the pathophysiology of sepsis and predict sepsis outcome in other settings. In this study we investigated cytokine responses to invasive nontyphoidal Salmonella (iNTS) disease in Malawian children. We determined serum concentrations of 48 cytokines with multiplexed immunoassays in Malawian children during acute iNTS disease (n = 111) and in convalescence (n = 77). Principal components analysis and logistic regression were used to identify cytokine signatures of acute iNTS disease. We further investigated whether these responses are altered by HIV co-infection or severe malnutrition, and whether cytokine responses predict inpatient mortality. Cytokine changes in acute iNTS disease were associated with two distinct cytokine signatures. The first is characterized by increased concentrations of mediators known to be associated with macrophage function, and the second by raised pro- and anti-inflammatory cytokines typical of responses reported in sepsis secondary to diverse pathogens. These cytokine responses were largely unaltered by either severe malnutrition or HIV co-infection. Children with fatal disease had a distinctive cytokine profile, characterized by raised mediators known to be associated with neutrophil function. In conclusion, cytokine responses to acute iNTS infection in Malawian children are reflective of both the cytokine storm typical of sepsis secondary to diverse pathogens, and the intra-macrophage replicative niche of NTS. The cytokine profile predictive of fatal disease supports a key role of neutrophils in the pathogenesis of NTS sepsis.
Clinical and vaccine immunology : CVI 2016
Very low-depth sequencing in a founder population identifies a cardioprotective APOC3 signal missed by genome-wide imputation.
Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.
Cohort-wide very low-depth whole-genome sequencing (WGS) can comprehensively capture low-frequency sequence variation for the cost of a dense genome-wide genotyping array. Here, we analyse 1x sequence data across the APOC3 gene in a founder population from the island of Crete in Greece (n = 1239) and find significant evidence for association with blood triglyceride levels with the previously reported R19X cardioprotective null mutation (β = -1.09, σ = 0.163, P = 8.2 × 10(-11)) and a second loss of function mutation, rs138326449 (β = -1.17, σ = 0.188, P = 1.14 × 10(-9)). The signal cannot be recapitulated by imputing genome-wide genotype data on a large reference panel of 5122 individuals including 249 with 4x WGS data from the same population. Gene-level meta-analysis with other studies reporting burden signals at APOC3 provides robust evidence for a replicable cardioprotective rare variant aggregation (P = 3.2 × 10(-31), n = 13 480).
Funded by: Medical Research Council: MC_PC_15018; Wellcome Trust: 098051, 102215, WT091310
Human molecular genetics 2016;25;11;2360-2365
Rapid Karyotype Evolution in Lasiopodomys Involved at Least Two Autosome - Sex Chromosome Translocations.
Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia.
The generic status of Lasiopodomys and its division into subgenera Lasiopodomys (L. mandarinus, L. brandtii) and Stenocranius (L. gregalis, L. raddei) are not generally accepted because of contradictions between the morphological and molecular data. To obtain cytogenetic evidence for the Lasiopodomys genus and its subgenera and to test the autosome to sex chromosome translocation hypothesis of sex chromosome complex origin in L. mandarinus proposed previously, we hybridized chromosome painting probes from the field vole (Microtus agrestis, MAG) and the Arctic lemming (Dicrostonyx torquatus, DTO) onto the metaphases of a female Mandarin vole (L. mandarinus, 2n = 47) and a male Brandt's vole (L. brandtii, 2n = 34). In addition, we hybridized Arctic lemming painting probes onto chromosomes of a female narrow-headed vole (L. gregalis, 2n = 36). Cross-species painting revealed three cytogenetic signatures (MAG12/18, 17a/19, and 22/24) that could validate the genus Lasiopodomys and indicate the evolutionary affinity of L. gregalis to the genus. Moreover, all three species retained the associations MAG1bc/17b and 2/8a detected previously in karyotypes of all arvicolins studied. The associations MAG2a/8a/19b, 8b/21, 9b/23, 11/13b, 12b/18, 17a/19a, and 5 fissions of ancestral segments appear to be characteristic for the subgenus Lasiopodomys. We also validated the autosome to sex chromosome translocation hypothesis on the origin of complex sex chromosomes in L. mandarinus. Two translocations of autosomes onto the ancestral X chromosome in L. mandarinus led to a complex of neo-X1, neo-X2, and neo-X3 elements. Our results demonstrate that genus Lasiopodomys represents a striking example of rapid chromosome evolution involving both autosomes and sex chromosomes. Multiple reshuffling events including Robertsonian fusions, chromosomal fissions, inversions and heterochromatin expansion have led to the formation of modern species karyotypes in a very short time, about 2.4 MY.
PloS one 2016;11;12;e0167653
GENOMICS. A federated ecosystem for sharing genomic, clinical data.
Science (New York, N.Y.) 2016;352;6291;1278-80
Chromosomal phylogeny of Vampyressine bats (Chiroptera, Phyllostomidae) with description of two new sex chromosome systems.
Laboratório de Citogenética, CEABIO, ICB, Universidade Federal do Pará, Belém, Brazil.
Background: The subtribe Vampyressina (sensu Baker et al. 2003) encompasses approximately 43 species and seven genera and is a recent and diversified group of New World leaf-nosed bats specialized in fruit eating. The systematics of this group continues to be debated mainly because of the lack of congruence between topologies generated by molecular and morphological data. We analyzed seven species of all genera of vampyressine bats by multidirectional chromosome painting, using whole-chromosome-painting probes from Carollia brevicauda and Phyllostomus hastatus. Phylogenetic analyses were performed using shared discrete chromosomal segments as characters and the Phylogenetic Analysis Using Parsimony (PAUP) software package, using Desmodontinae as outgroup. We also used the Tree Analysis Using New Technology (TNT) software.
Results: The results showed a well-supported phylogeny congruent with molecular topologies regarding the sister taxa relationship of the genera Vampyressa and Mesophylla, as well as the close relationship between the genera Chiroderma and Vampyriscus.
Conclusions: Our results supported the hypothesis that all genera of this subtribe have compound sex chromosome systems that originated from an X-autosome translocation, an ancestral condition observed in the Stenodermatinae. Additional rearrangements occurred independently in the genera Vampyressa and Mesophylla, yielding the X1X1X2X2/X1X2Y sex chromosome system. This work presents additional data supporting the hypothesis based on molecular studies regarding the polyphyly of the genus Vampyressa and its sister relationship to Mesophylla.
BMC evolutionary biology 2016;16;1;119
Commitment issues in Plasmodium.
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Nature reviews. Microbiology 2016;14;1;4
Standardized Welfare Terms for the Zebrafish Community.
Research Support Facility, Wellcome Trust Sanger Institute, Cambridge, United Kingdom.
Managing the welfare of laboratory animals is critical to animal health, vital to understanding phenotypes created by treatment or genetic alteration, and necessary to ensure compliance with regulations. Part of an animal welfare assessment is the requirement to record observations, ensuring all those responsible for the animals are aware of their health status and can act accordingly. Although the use of zebrafish in research continues to increase, guidelines for conducting welfare assessments and reporting observations are considered unclear compared with those for mammalian species. To support the movement of zebrafish between facilities, significant improvement would be achieved through the use of standardized terms to ensure clarity and consistency between facilities. Improving the clarity of terminology around welfare not only addresses our ethical obligation but also supports the research goals and provides a searchable description of the phenotypes. A collaboration between the Wellcome Trust Sanger Institute and Cambridge University (Department of Medicine-Laboratory of Molecular Biology) has led to the creation of the zebrafish welfare terms from which standardization of terminology can be achieved.
Evaluating and Optimizing Fish Health and Welfare During Experimental Procedures.
Research Support Facility, Wellcome Trust Sanger Institute, Cambridge, United Kingdom.
Many facilities house fish in separate static containers post-procedure, for example, while awaiting genotyping results. This ensures fish can be easily identified, but it does not allow for provision of continuous filtered water or diet. At the Wellcome Trust Sanger Institute, concern over the housing conditions led to the development of an individual housing system (GeneS) enabling feeding and water filtration. Trials to compare the water quality measures between the various systems found that fish housed in static containers experienced rapid deterioration in water quality. By day 1, measures of ammonia were outside the Institute's prescribed values and continued to rise until they were 25-fold higher than recommended levels. Nitrite levels were also outside recommended levels for all fish by day 9 and were twofold higher by the end of the trial. The water quality measures for tanks held on the recirculating system were stable even though food was provided. These results indicate that for housing zebrafish, running water or appropriately timed water changes are a critical component to ensure that the ethical obligations are met.
Heterogeneity in Oct4 and Sox2 Targets Biases Cell Fate in 4-Cell Mouse Embryos.
Department of Physiology, Development & Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3EG, UK.
The major and essential objective of pre-implantation development is to establish embryonic and extra-embryonic cell fates. To address when and how this fundamental process is initiated in mammals, we characterize transcriptomes of all individual cells throughout mouse pre-implantation development. This identifies targets of master pluripotency regulators Oct4 and Sox2 as being highly heterogeneously expressed between blastomeres of the 4-cell embryo, with Sox21 showing one of the most heterogeneous expression profiles. Live-cell tracking demonstrates that cells with decreased Sox21 yield more extra-embryonic than pluripotent progeny. Consistently, decreasing Sox21 results in premature upregulation of the differentiation regulator Cdx2, suggesting that Sox21 helps safeguard pluripotency. Furthermore, Sox21 is elevated following increased expression of the histone H3R26-methylase CARM1 and is lowered following CARM1 inhibition, indicating the importance of epigenetic regulation. Therefore, our results indicate that heterogeneous gene expression, as early as the 4-cell stage, initiates cell-fate decisions by modulating the balance of pluripotency and differentiation.
Funded by: Wellcome Trust
Meta-analysis of 375,000 individuals identifies 38 susceptibility loci for migraine.
Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA.
Migraine is a debilitating neurological disorder affecting around one in seven people worldwide, but its molecular mechanisms remain poorly understood. There is some debate about whether migraine is a disease of vascular dysfunction or a result of neuronal dysfunction with secondary vascular changes. Genome-wide association (GWA) studies have thus far identified 13 independent loci associated with migraine. To identify new susceptibility loci, we carried out a genetic study of migraine on 59,674 affected subjects and 316,078 controls from 22 GWA studies. We identified 44 independent single-nucleotide polymorphisms (SNPs) significantly associated with migraine risk (P < 5 × 10(-8)) that mapped to 38 distinct genomic loci, including 28 loci not previously reported and a locus that to our knowledge is the first to be identified on chromosome X. In subsequent computational analyses, the identified loci showed enrichment for genes expressed in vascular and smooth muscle tissues, consistent with a predominant theory of migraine that highlights vascular etiologies.
Nature genetics 2016
Invasion of hepatocytes by Plasmodium sporozoites requires cGMP-dependent protein kinase and calcium dependent protein kinase 4.
Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers - New Jersey Medical School, Newark, NJ, USA.
Invasion of hepatocytes by sporozoites is essential for Plasmodium to initiate infection of the mammalian host. The parasite's subsequent intracellular differentiation in the liver is the first developmental step of its mammalian cycle. Despite their biological significance, surprisingly little is known of the signalling pathways required for sporozoite invasion. We report that sporozoite invasion of hepatocytes requires signalling through two second-messengers - cGMP mediated by the parasite's cGMP-dependent protein kinase (PKG), and Ca(2+), mediated by the parasite's calcium-dependent protein kinase 4 (CDPK4). Sporozoites expressing a mutated form of Plasmodium berghei PKG or carrying a deletion of the CDPK4 gene are defective in invasion of hepatocytes. Using specific and potent inhibitors of Plasmodium PKG and CDPK4, we demonstrate that PKG and CDPK4 are required for sporozoite motility, and that PKG regulates the secretion of TRAP, an adhesin that is essential for motility. Chemical inhibition of PKG decreases parasite egress from hepatocytes by inhibiting either the formation or release of merosomes. In contrast, genetic inhibition of CDPK4 does not significantly decrease the number of merosomes. By revealing the requirement for PKG and CDPK4 in Plasmodium sporozoite invasion, our work enables a better understanding of kinase pathways that act in different Plasmodium stages.
Funded by: NIAID NIH HHS: R21 AI094167; Wellcome Trust: WT098051
Molecular microbiology 2016;102;2;349-363
Genomic epidemiology of gonococcal resistance to extended spectrum cephalosporins, macrolides, and fluoroquinolones in the US, 2000-2013.
Department of Immunology and Infectious Diseases, Harvard TH Chan School of Public Health, Boston MA, USA; Division of Infectious Diseases, Brigham and Women's Hospital, Harvard Medical School, Boston MA, USA.
Background: Treatment of Neisseria gonorrhoeae infection is empiric and based on population-wide susceptibilities. Increasing antimicrobial resistance underscores the potential importance of rapid diagnostics, including sequence-based tests, to guide therapy. However, the utility of sequence-based diagnostics depends on the prevalence and dynamics of the resistance mechanisms.
Methods: We define the prevalence and dynamics of resistance markers to extended spectrum cephalosporins (ESC), macrolides, and fluoroquinolones in 1102 resistant and susceptible clinical N. gonorrhoeae isolates collected from 2000-2013 via the CDC's Gonococcal Isolate Surveillance Project (GISP).
Results: Reduced ESC susceptibility (ESC(RS)) is predominantly clonal and associated with the mosaic penA XXXIV allele and derivatives (sensitivity 98% for cefixime, 91% for ceftriaxone), but alternative resistance mechanisms have sporadically emerged. Reduced azithromycin susceptibility (Azi(RS)) has arisen through multiple mechanisms and shows limited clonal spread; the basis for resistance in 36% of Azi(RS) isolates is unclear. Quinolone resistant N. gonorrhoeae (QRNG) have arisen multiple times, with extensive clonal spread.
Conclusion: QRNG and reduced cefixime susceptibility appear amenable to development of sequence-based diagnostics, whereas the undefined mechanisms of resistance to ceftriaxone and azithromycin underscore the importance of phenotypic surveillance. The identification of multidrug-resistant isolates highlights the need for additional measures to respond to the threat of untreatable gonorrhea.
The Journal of infectious diseases 2016
Genes Required for the Fitness of Salmonella enterica Serovar Typhimurium during Infection of Immunodeficient gp91-/- phox Mice.
Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom.
Salmonella enterica causes systemic diseases (typhoid and paratyphoid fever), nontyphoidal septicemia (NTS), and gastroenteritis in humans and other animals worldwide. An important but underrecognized emerging infectious disease problem in sub-Saharan Africa is NTS in children and immunocompromised adults. A current goal is to identify Salmonella mutants that are not pathogenic in the absence of key components of the immune system such as might be found in immunocompromised hosts. Such attenuated strains have the potential to be used as live vaccines. We have used transposon-directed insertion site sequencing (TraDIS) to screen mutants of Salmonella enterica serovar Typhimurium for their ability to infect and grow in the tissues of wild-type and immunodeficient mice. This was to identify bacterial genes that might be deleted for the development of live attenuated vaccines that would be safer to use in situations and/or geographical areas where immunodeficiencies are prevalent. The relative fitness of each of 9,356 transposon mutants, representing mutations in 3,139 different genes, was determined in gp91(-/-) phox mice. Mutations in certain genes led to reduced fitness in both wild-type and mutant mice. To validate these results, these genes were mutated by allelic replacement, and resultant mutants were retested for fitness in the mice. A defined deletion mutant of cysE was attenuated in C57BL/6 wild-type mice and immunodeficient gp91(-/-) phox mice and was effective as a live vaccine in wild-type mice.
Funded by: Biotechnology and Biological Sciences Research Council: APG19115; Medical Research Council: G1100102; Wellcome Trust: WT098051
Infection and immunity 2016;84;4;989-97
Modeling the evolution space of breakage fusion bridge cycles with a stochastic folding process.
School of Computing Sciences, University of East Anglia, Norwich, UK. C.Greenman@uea.ac.uk.
Breakage-fusion-bridge cycles in cancer arise when a broken segment of DNA is duplicated and an end from each copy joined together. This structure then 'unfolds' into a new piece of palindromic DNA. This is one mechanism responsible for the localised amplicons observed in cancer genome data. Here we study the evolution space of breakage-fusion-bridge structures in detail. We firstly consider discrete representations of this space with 2-d trees to demonstrate that there are [Formula: see text] qualitatively distinct evolutions involving [Formula: see text] breakage-fusion-bridge cycles. Secondly we consider the stochastic nature of the process to show these evolutions are not equally likely, and also describe how amplicons become localized. Finally we highlight these methods by inferring the evolution of breakage-fusion-bridge cycles with data from primary tissue cancer samples.
Journal of mathematical biology 2016;72;1-2;47-86
Genetic invalidation of Lp-PLA2 as a therapeutic target: Large-scale study of five functional Lp-PLA2-lowering alleles.
MRC/BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, UK.
Aims: Darapladib, a potent inhibitor of lipoprotein-associated phospholipase A2 (Lp-PLA2), has not reduced risk of cardiovascular disease outcomes in recent randomized trials. We aimed to test whether Lp-PLA2 enzyme activity is causally relevant to coronary heart disease.
Methods: In 72,657 patients with coronary heart disease and 110,218 controls in 23 epidemiological studies, we genotyped five functional variants: four rare loss-of-function mutations (c.109+2T > C (rs142974898), Arg82His (rs144983904), Val279Phe (rs76863441), Gln287Ter (rs140020965)) and one common modest-impact variant (Val379Ala (rs1051931)) in PLA2G7, the gene encoding Lp-PLA2. We supplemented de-novo genotyping with information on a further 45,823 coronary heart disease patients and 88,680 controls in publicly available databases and other previous studies. We conducted a systematic review of randomized trials to compare effects of darapladib treatment on soluble Lp-PLA2 activity, conventional cardiovascular risk factors, and coronary heart disease risk with corresponding effects of Lp-PLA2-lowering alleles.
Results: Lp-PLA2 activity was decreased by 64% (p = 2.4 × 10^-25) with carriage of any of the four loss-of-function variants, by 45% (p < 10^-300) for every allele inherited at Val279Phe, and by 2.7% (p = 1.9 × 10^-12) for every allele inherited at Val379Ala. Darapladib 160 mg once-daily reduced Lp-PLA2 activity by 65% (p < 10^-300). Causal risk ratios for coronary heart disease per 65% lower Lp-PLA2 activity were: 0.95 (0.88-1.03) with Val279Phe; 0.92 (0.74-1.16) with carriage of any loss-of-function variant; 1.01 (0.68-1.51) with Val379Ala; and 0.95 (0.89-1.02) with darapladib treatment.
Conclusions: In a large-scale human genetic study, none of a series of Lp-PLA2-lowering alleles was related to coronary heart disease risk, suggesting that Lp-PLA2 is unlikely to be a causal risk factor.
European journal of preventive cardiology 2016
Rapid parallel acquisition of somatic mutations after NPM1 in acute myeloid leukaemia evolution.
Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, UK.
British journal of haematology 2016
Role of Plasmodium vivax Duffy-binding protein 1 in invasion of Duffy-null Africans.
Laboratory of Malaria and Vector Research, National Institutes of Allergy and Infectious Diseases, National Institutes of Health, Rockville, MD 20852;
The ability of the malaria parasite Plasmodium vivax to invade erythrocytes is dependent on the expression of the Duffy blood group antigen on erythrocytes. Consequently, Africans who are null for the Duffy antigen are not susceptible to P. vivax infections. Recently, P. vivax infections in Duffy-null Africans have been documented, raising the possibility that P. vivax, a virulent pathogen in other parts of the world, may expand malarial disease in Africa. P. vivax binds the Duffy blood group antigen through its Duffy-binding protein 1 (DBP1). To determine if mutations in DBP1 resulted in the ability of P. vivax to bind Duffy-null erythrocytes, we analyzed P. vivax parasites obtained from two Duffy-null individuals living in Ethiopia where Duffy-null and -positive Africans live side-by-side. We determined that, although the DBP1s from these parasites contained unique sequences, they failed to bind Duffy-null erythrocytes, indicating that mutations in DBP1 did not account for the ability of P. vivax to infect Duffy-null Africans. However, an unusual DNA expansion of DBP1 (three and eight copies) in the two Duffy-null P. vivax infections suggests that an expansion of DBP1 may have been selected to allow low-affinity binding to another receptor on Duffy-null erythrocytes. Indeed, we show that Salvador (Sal) I P. vivax infects Squirrel monkeys independently of DBP1 binding to Squirrel monkey erythrocytes. We conclude that P. vivax Sal I and perhaps P. vivax in Duffy-null patients may have adapted to use new ligand-receptor pairs for invasion.
Funded by: NIAID NIH HHS: R21 AI101802; Wellcome Trust
Proceedings of the National Academy of Sciences of the United States of America 2016;113;22;6271-6
Naive Pluripotent Stem Cells Derived Directly from Isolated Cells of the Human Inner Cell Mass.
Wellcome Trust - Medical Research Council Stem Cell Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK.
Conventional generation of stem cells from human blastocysts produces a developmentally advanced, or primed, stage of pluripotency. In vitro resetting to a more naive phenotype has been reported. However, whether the reset culture conditions of selective kinase inhibition can enable capture of naive epiblast cells directly from the embryo has not been determined. Here, we show that in these specific conditions individual inner cell mass cells grow into colonies that may then be expanded over multiple passages while retaining a diploid karyotype and naive properties. The cells express hallmark naive pluripotency factors and additionally display features of mitochondrial respiration, global gene expression, and genome-wide hypomethylation distinct from primed cells. They transition through primed pluripotency into somatic lineage differentiation. Collectively these attributes suggest classification as human naive embryonic stem cells. Human counterparts of canonical mouse embryonic stem cells would argue for conservation in the phased progression of pluripotency in mammals.
Stem cell reports 2016;6;4;437-46
Functional analysis of an unusual type IV pilus in the Gram-positive Streptococcus sanguinis.
MRC Centre for Molecular Bacteriology and Infection, Imperial College London, London, UK.
Type IV pili (Tfp), which have been studied extensively in a few Gram-negative species, are the paradigm of a group of widespread and functionally versatile nano-machines. Here, we performed the most detailed molecular characterisation of Tfp in a Gram-positive bacterium. We demonstrate that the naturally competent Streptococcus sanguinis produces retractable Tfp, which like their Gram-negative counterparts can generate hundreds of piconewton of tensile force and promote intense surface-associated motility. Tfp power 'train-like' directional motion parallel to the long axis of chains of cells, leading to spreading zones around bacteria grown on plates. However, S. sanguinis Tfp are not involved in DNA uptake, which is mediated by a related but distinct nano-machine, and are unusual because they are composed of two pilins in comparable amounts, rather than one as normally seen. Whole genome sequencing identified a locus encoding all the genes involved in Tfp biology in S. sanguinis. A systematic mutational analysis revealed that Tfp biogenesis in S. sanguinis relies on a more basic machinery (only 10 components) than in Gram-negative species and that a small subset of four proteins dispensable for pilus biogenesis are essential for motility. Intriguingly, one of the piliated mutants that does not exhibit spreading retains microscopic motility but moves sideways, which suggests that the corresponding protein controls motion directionality. Besides establishing S. sanguinis as a useful new model for studying Tfp biology, these findings have important implications for our understanding of these widespread filamentous nano-machines.
Molecular microbiology 2016;99;2;380-92
Atlas of prostate cancer heritability in European and African-American men pinpoints tissue-specific regulation.
Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA.
Although genome-wide association studies have identified over 100 risk loci that explain ∼33% of familial risk for prostate cancer (PrCa), their functional effects on risk remain largely unknown. Here we use genotype data from 59,089 men of European and African American ancestries combined with cell-type-specific epigenetic data to build a genomic atlas of single-nucleotide polymorphism (SNP) heritability in PrCa. We find significant differences in heritability between variants in prostate-relevant epigenetic marks defined in normal versus tumour tissue as well as between tissue and cell lines. The majority of SNP heritability lies in regions marked by H3K27 acetylation in a prostate adenocarcinoma cell line (LNCaP) or by DNaseI hypersensitive sites in cancer cell lines. We find a high degree of similarity between European and African American ancestries suggesting a similar genetic architecture from common variation underlying PrCa risk. Our findings showcase the power of integrating functional annotation with genetic data to understand the genetic basis of PrCa.
Funded by: Cancer Research UK: 10119, 10124, 13065, 14136, C12292/A11174, C1281/A12014, C1287/A10118, C1287/A10710, C16913/A6135, C490/A10124, C5047/A10692, C5047/A15007, C5047/A3354, C5047/A7357, C5047/A8384, C522/A8649, C8197/A10123, C8197/A10865, G0500966/75466; Medical Research Council: 75466, G0401527, G0500966, MC_PC_15018; NCI NIH HHS: 1 U19 CA148537-01, 1U19 CA148065, 1U19 CA148112, 1U19 CA148537, CA098758, CA128978, CA54281, CA63464, P30 CA016672, P30 CA042014, P30 CA060553, P30 CA068485, P30 CA68485, P30CA042014, R01 CA054281, R01 CA056678, R01 CA063464, R01 CA072818, R01 CA082664, R01 CA092447, R01 CA092579, R01 CA128813, R01 CA128978, R01 CA188392, R01 CA193910, R01CA056678, R01CA082664, R01CA092579, R01CA128813, R01CA72818, R35 CA197449, R37 CA054281, U01 CA063464, U01 CA098758, U01 CA164973, U01 CA188392, U01 CA194393, U10 CA037429, U19 CA148065, U19 CA148112, U19 CA148537, UG1 CA189974, UM1 CA182883; NIAID NIH HHS: U19 AI111224; NIAMS NIH HHS: R01 AR063759; NIEHS NIH HHS: R01 ES011126; NIGMS NIH HHS: F32 GM106584, R01 GM105857, R01 GM107427; NIMH NIH HHS: R01 MH101244; Wellcome Trust: 076113, 102215
Nature communications 2016;7;10979
Functional implications of disease-specific variants in loci jointly associated with coeliac disease and rheumatoid arthritis.
Department of Genetics, University Medical Centre Groningen, University of Groningen, Groningen, The Netherlands.
Hundreds of genomic loci have been associated with a significant number of immune-mediated diseases, and a large proportion of these associated loci are shared among traits. Both the molecular mechanisms by which these loci confer disease susceptibility and the extent to which shared loci are implicated in a common pathogenesis are unknown. We therefore sought to dissect the functional components at loci shared between two autoimmune diseases: coeliac disease (CeD) and rheumatoid arthritis (RA). We used a cohort of 12 381 CeD cases and 7827 controls, and another cohort of 13 819 RA cases and 12 897 controls, all genotyped with the Immunochip platform. In the joint analysis, we replicated 19 previously identified loci shared by CeD and RA and discovered five new non-HLA loci shared by CeD and RA. Our fine-mapping results indicate that in nine of 24 shared loci the associated variants are distinct in the two diseases. Using cell-type-specific histone markers, we observed that loci which pointed to the same variants in both diseases were enriched for marks of promoters active in CD14+ and CD34+ immune cells (P < 0.001), while loci pointing to distinct variants in one of the two diseases showed enrichment for marks of more specialized cell types, like CD4+ regulatory T cells in CeD (P < 0.0001) compared with Th17 and CD15+ in RA (P = 0.0029).
Funded by: Wellcome Trust: WT098051
Human molecular genetics 2016;25;1;180-90
Chad Genetic Diversity Reveals an African History Marked by Multiple Holocene Eurasian Migrations.
Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK. Electronic address: firstname.lastname@example.org.
Understanding human genetic diversity in Africa is important for interpreting the evolution of all humans, yet vast regions in Africa, such as Chad, remain genetically poorly investigated. Here, we use genotype data from 480 samples from Chad, the Near East, and southern Europe, as well as whole-genome sequencing from 19 of them, to show that many populations today derive their genomes from ancient African-Eurasian admixtures. We found evidence of early Eurasian backflow to Africa in people speaking the unclassified isolate Laal language in southern Chad and estimate from linkage-disequilibrium decay that this occurred 4,750-7,200 years ago. It brought to Africa a Y chromosome lineage (R1b-V88) whose closest relatives are widespread in present-day Eurasia; we estimate from sequence data that the Chad R1b-V88 Y chromosomes coalesced 5,700-7,300 years ago. This migration could thus have originated among Near Eastern farmers during the African Humid Period. We also found that the previously documented Eurasian backflow into Africa, which occurred ∼3,000 years ago and was thought to be mostly limited to East Africa, had a more westward impact affecting populations in northern Chad, such as the Toubou, who have 20%-30% Eurasian ancestry today. We observed a decline in heterozygosity in admixed Africans and found that the Eurasian admixture can bias inferences on their coalescent history and confound genetic signals from adaptation and archaic introgression.
Funded by: Wellcome Trust
American journal of human genetics 2016;99;6;1316-1324
Ancient DNA and the rewriting of human history: be sparing with Occam's razor.
The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.
Ancient DNA research is revealing a human history far more complex than that inferred from parsimonious models based on modern DNA. Here, we review some of the key events in the peopling of the world in the light of the findings of work on ancient DNA.
Funded by: Wellcome Trust: 098051
Genome biology 2016;17;1
Genetic evidence for an origin of the Armenians from Bronze Age mixing of multiple populations.
The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, UK.
The Armenians are a culturally isolated population who historically inhabited a region in the Near East bounded by the Mediterranean and Black seas and the Caucasus, but remain under-represented in genetic studies and have a complex history including a major geographic displacement during World War I. Here, we analyse genome-wide variation in 173 Armenians and compare them with 78 other worldwide populations. We find that Armenians form a distinctive cluster linking the Near East, Europe, and the Caucasus. We show that Armenian diversity can be explained by several mixtures of Eurasian populations that occurred between ~3000 and ~2000 bce, a period characterized by major population migrations after the domestication of the horse, appearance of chariots, and the rise of advanced civilizations in the Near East. However, genetic signals of population mixture cease after ~1200 bce when Bronze Age civilizations in the Eastern Mediterranean world suddenly and violently collapsed. Armenians have since remained isolated and genetic structure within the population developed ~500 years ago when Armenia was divided between the Ottomans and the Safavid Empire in Iran. Finally, we show that Armenians have higher genetic affinity to Neolithic Europeans than other present-day Near Easterners, and that 29% of Armenian ancestry may originate from an ancestral population that is best represented by Neolithic Europeans.
Funded by: Wellcome Trust: 077009
European journal of human genetics : EJHG 2016;24;6;931-6
Wide distribution and altitude correlation of an archaic high-altitude-adaptive EPAS1 haplotype in the Himalayas.
The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.
High-altitude adaptation in Tibetans is influenced by introgression of a 32.7-kb haplotype from the Denisovans, an extinct branch of archaic humans, lying within the endothelial PAS domain protein 1 (EPAS1), and has also been reported in Sherpa. We genotyped 19 variants in this genomic region in 1507 Eurasian individuals, including 1188 from Bhutan and Nepal residing at altitudes between 86 and 4550 m above sea level. Derived alleles for five SNPs characterizing the core Denisovan haplotype (AGGAA) were present at high frequency not only in Tibetans and Sherpa, but also among many populations from the Himalayas, showing a significant correlation with altitude (Spearman's correlation coefficient = 0.75, p value 3.9 × 10^-11). Seven East- and South-Asian 1000 Genomes Project individuals shared the Denisovan haplotype extending beyond the 32-kb region, enabling us to refine the haplotype structure and identify a candidate regulatory variant (rs370299814) that might be interacting in an additive manner with the derived G allele of rs150877473, the variant previously associated with high-altitude adaptation in Tibetans. Denisovan-derived alleles were also observed at frequencies of 3-14% in the 1000 Genomes Project African samples. The closest African haplotype is, however, separated from the Asian high-altitude haplotype by 22 mutations whereas only three mutations, including rs150877473, separate the Asians from the Denisovan, consistent with distant shared ancestry for African and Asian haplotypes and Denisovan adaptive introgression.
Funded by: Wellcome Trust: 087576, 098051
Human genetics 2016;135;4;393-402
A bit of a mouthful.
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
This month's Genome Watch explores recent advances in the identification of species-level and strain-level diversity in microbiome studies, and highlights how these have provided insights into the tropism and persistence of Neisseria spp. in the human oral cavity.
Nature reviews. Microbiology 2016;14;9;548
Great ape Y Chromosome and mitochondrial DNA phylogenies reflect subspecies structure and patterns of mating and dispersal.
Department of Genetics, University of Leicester, Leicester LE1 7RH, United Kingdom; Institute of Molecular and Cell Biology, University of Tartu, Tartu 51010, Estonia;
The distribution of genetic diversity in great ape species is likely to have been affected by patterns of dispersal and mating. This has previously been investigated by sequencing autosomal and mitochondrial DNA (mtDNA), but large-scale sequence analysis of the male-specific region of the Y Chromosome (MSY) has not yet been undertaken. Here, we use the human MSY reference sequence as a basis for sequence capture and read mapping in 19 great ape males, combining the data with sequences extracted from the published whole genomes of 24 additional males to yield a total sample of 19 chimpanzees, four bonobos, 14 gorillas, and six orangutans, in which interpretable MSY sequence ranges from 2.61 to 3.80 Mb. This analysis reveals thousands of novel MSY variants and defines unbiased phylogenies. We compare these with mtDNA-based trees in the same individuals, estimating time-to-most-recent common ancestor (TMRCA) for key nodes in both cases. The two loci show high topological concordance and are consistent with accepted (sub)species definitions, but time depths differ enormously between loci and (sub)species, likely reflecting different dispersal and mating patterns. Gorillas and chimpanzees/bonobos present generally low and high MSY diversity, respectively, reflecting polygyny versus multimale-multifemale mating. However, particularly marked differences exist among chimpanzee subspecies: The western chimpanzee MSY phylogeny has a TMRCA of only 13.2 (10.8-15.8) thousand years, but that for central chimpanzees exceeds 1 million years. Cross-species comparison within a single MSY phylogeny emphasizes the low human diversity, and reveals species-specific branch length variation that may reflect differences in long-term generation times.
Genome research 2016;26;4;427-39
Powerful decomposition of complex traits in a diploid model.
Institute for Research on Cancer and Aging, Nice (IRCAN), CNRS UMR7284, INSERM U1081, University of Nice Sophia Antipolis, 06107 Nice, France.
Explaining trait differences between individuals is a core and challenging aim of life sciences. Here, we introduce a powerful framework for complete decomposition of trait variation into its underlying genetic causes in diploid model organisms. We sequence and systematically pair the recombinant gametes of two intercrossed natural genomes into an array of diploid hybrids with fully assembled and phased genomes, termed Phased Outbred Lines (POLs). We demonstrate the capacity of this approach by partitioning fitness traits of 6,642 Saccharomyces cerevisiae POLs across many environments, achieving near complete trait heritability and precisely estimating additive (73%), dominance (10%), second (7%) and third (1.7%) order epistasis components. We map quantitative trait loci (QTLs) and find nonadditive QTLs to outnumber (3:1) additive loci, dominant contributions to heterosis to outnumber overdominant, and extensive pleiotropy. The POL framework offers the most complete decomposition of diploid traits to date and can be adapted to most model organisms.
Nature communications 2016;7;13311
Exploitation of the Apoptosis-Primed State of MYCN-Amplified Neuroblastoma to Develop a Potent and Specific Targeted Therapy Combination.
Philips Institute for Oral Health Research, VCU School of Dentistry and Massey Cancer Center, Virginia Commonwealth University, Perkinson Building, Richmond, VA 23298, USA.
Fewer than half of children with high-risk neuroblastoma survive. Many of these tumors harbor high-level amplification of MYCN, which correlates with poor disease outcome. Using data from our large drug screen we predicted, and subsequently demonstrated, that MYCN-amplified neuroblastomas are sensitive to the BCL-2 inhibitor ABT-199. This sensitivity occurs in part through low anti-apoptotic BCL-xL expression, high pro-apoptotic NOXA expression, and paradoxical, MYCN-driven upregulation of NOXA. Screening for enhancers of ABT-199 sensitivity in MYCN-amplified neuroblastomas, we demonstrate that the Aurora Kinase A inhibitor MLN8237 combines with ABT-199 to induce widespread apoptosis. In diverse models of MYCN-amplified neuroblastoma, including a patient-derived xenograft model, this combination uniformly induced tumor shrinkage, and in multiple instances led to complete tumor regression.
Cancer cell 2016;29;2;159-72
Association of breast cancer risk in BRCA1 and BRCA2 mutation carriers with genetic variants showing differential allelic expression: identification of a modifier of breast cancer risk at locus 11q22.3.
Genomics Center, Centre Hospitalier Universitaire de Québec Research Center and Laval University, 2705 Laurier Boulevard, Quebec, QC, G1V 4G2, Canada.
Purpose: Cis-acting regulatory SNPs resulting in differential allelic expression (DAE) may, in part, explain the underlying phenotypic variation associated with many complex diseases. To investigate whether common variants associated with DAE were involved in breast cancer susceptibility among BRCA1 and BRCA2 mutation carriers, a list of 175 genes was developed based on their involvement in cancer-related pathways.
Methods: Using data from a genome-wide map of SNPs associated with allelic expression, we assessed the association of ~320 SNPs located in the vicinity of these genes with breast and ovarian cancer risks in 15,252 BRCA1 and 8211 BRCA2 mutation carriers ascertained from 54 studies participating in the Consortium of Investigators of Modifiers of BRCA1/2.
Results: We identified a region on 11q22.3 that is significantly associated with breast cancer risk in BRCA1 mutation carriers (most significant SNP rs228595 p = 7 × 10^-6). This association was absent in BRCA2 carriers (p = 0.57). The 11q22.3 region notably encompasses genes such as ACAT1, NPAT, and ATM. Expression quantitative trait loci associations were observed in both normal breast and tumors across this region, namely for ACAT1, ATM, and other genes. In silico analysis revealed some overlap between top risk-associated SNPs and relevant biological features in mammary cell data, which suggests potential functional significance.
Conclusion: We identified 11q22.3 as a new modifier locus in BRCA1 carriers. Replication in larger studies using estrogen receptor (ER)-negative or triple-negative (i.e., ER-, progesterone receptor-, and HER2-negative) cases could therefore be helpful to confirm the association of this locus with breast cancer risk.
Breast cancer research and treatment 2016
A small Acinetobacter plasmid carrying the tet39 tetracycline resistance determinant.
School of Molecular Bioscience, The University of Sydney, NSW 2006, Australia email@example.com.
The Journal of antimicrobial chemotherapy 2016;71;1;269-71
Rubinstein-Taybi syndrome type 2: report of nine new cases that extend the phenotypic and genotypic spectrum.
Department of Clinical Genetics, Nottingham City Hospital, Nottingham; Department of Clinical Genetics, University Hospitals Bristol, Bristol; Clinical Genetics Service; Viapath Analytics LLP, Guy's and St Thomas' Hospital; Clinical Genetics Unit, Great Ormond Street Hospital for Children, London; West of Scotland Clinical Genetics Service, Queen Elizabeth University Hospital, Glasgow; Yorkshire Regional Genetics Service, Chapel Allerton Hospital, Leeds; Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK; Department of Clinical Genetics, Our Lady's Hospital for Children; ACoRD, University College Dublin, Dublin, Ireland.
Rubinstein-Taybi syndrome (RTS) is an autosomal dominant neurodevelopmental disorder characterized by growth deficiency, broad thumbs and great toes, intellectual disability and characteristic craniofacial appearance. Mutations in CREBBP account for around 55% of cases, with a further 8% attributed to the paralogous gene EP300. Comparatively few reports exist describing the phenotype of Rubinstein-Taybi because of EP300 mutations. Clinical and genetic data were obtained from nine patients from the UK and Ireland with pathogenic EP300 mutations, identified either by targeted testing or by exome sequencing. All patients had mild or moderate intellectual impairment. Behavioural or social difficulties were noted in eight patients, including three with autistic spectrum disorders. Typical dysmorphic features of Rubinstein-Taybi were only variably present. Additional observations include maternal pre-eclampsia (2/9), syndactyly (3/9), feeding or swallowing issues (3/9), delayed bone age (2/9) and scoliosis (2/9). Six patients had truncating mutations in EP300, with pathogenic missense mutations identified in the remaining three. The findings support previous observations that microcephaly, maternal pre-eclampsia, mild growth restriction and a mild to moderate intellectual disability are key pointers to the diagnosis of EP300-related RTS. Variability in the presence of typical facial features of Rubinstein-Taybi further highlights clinical heterogeneity, particularly among patients identified by exome sequencing. Features that overlap with Floating-Harbor syndrome, including craniofacial dysmorphism and delayed osseous maturation, were observed in three patients. Previous reports have only described mutations predicted to cause haploinsufficiency of EP300, whereas this cohort includes the first described pathogenic missense mutations in EP300.
Clinical dysmorphology 2016;25;4;135-45
Public health interventions to protect against falsified medicines: a systematic review of international, national and local policies.
University of Cambridge School of Clinical Medicine, Addenbrooke's Hospital, Hills Road, Cambridge CB2 0SP, UK firstname.lastname@example.org.
Background: Falsified medicines are deliberately fraudulent drugs that pose a direct risk to patient health and undermine healthcare systems, causing global morbidity and mortality.
Objective: To produce an overview of anti-falsifying public health interventions deployed at international, national and local scales in low and middle income countries (LMIC).
Data sources: We conducted a systematic search of the PubMed, Web of Science, Embase and Cochrane Central Register of Controlled Trials databases for healthcare or pharmaceutical policies relevant to reducing the burden of falsified medicines in LMIC.
Results: Our initial search identified 660 unique studies, of which 203 met title/abstract inclusion criteria and were categorised according to their primary focus: international; national; local pharmacy; internet pharmacy; drug analysis tools. Eighty-four were included in the qualitative synthesis, along with 108 articles and website links retrieved through secondary searches.
Discussion: On the international stage, we discuss the need for accessible pharmacovigilance (PV) global reporting systems, international leadership and funding incorporating multiple stakeholders (healthcare, pharmaceutical, law enforcement) and multilateral trade agreements that emphasise public health. On the national level, we explore the importance of establishing adequate medicine regulatory authorities and PV capacity, with drug screening along the supply chain. This requires interdepartmental coordination, drug certification and criminal justice legislation and enforcement that recognise the severity of medicine falsification. Local healthcare professionals can receive training on medicine quality assessments, drug registration and pharmacological testing equipment. Finally, we discuss novel technologies for drug analysis which allow rapid identification of fake medicines in low-resource settings. Innovative point-of-purchase systems like mobile phone verification allow consumers to check the authenticity of their medicines.
Conclusions: Combining anti-falsifying strategies targeting different levels of the pharmaceutical supply chain provides multiple barriers of protection from falsified medicines. This requires the political will to drive policy implementation; otherwise, people around the world remain at risk.
Health policy and planning 2016;31;10;1448-1466
Divergent evolution of vitamin B9 binding underlies Juno-mediated adhesion of mammalian gametes.
Department of Biosciences and Nutrition & Center for Innovative Medicine, Karolinska Institutet, Huddinge, SE-141 83, Sweden.
The interaction between egg and sperm is the first necessary step of fertilization in all sexually reproducing organisms. A decade-long search for a protein pair mediating this event in mammals culminated in the identification of the glycosylphosphatidylinositol (GPI)-anchored glycoprotein Juno as the egg plasma membrane receptor of sperm Izumo1 [1,2]. The Juno-Izumo1 interaction was shown to be essential for fertilization since mice lacking either gene exhibit sex-specific sterility, making these proteins promising non-hormonal contraceptive targets [1,3]. No structural information is available on how gamete membranes interact at fertilization, and it is unclear how Juno - which was previously named folate receptor (FR) 4, based on sequence similarity considerations - triggers membrane adhesion by binding Izumo1. Here, we report the crystal structure of Juno and find that the overall fold is similar to that of FRα and FRβ but with significant flexibility within the area that corresponds to the rigid ligand-binding site of these bona fide folate receptors. This explains both the inability of Juno to bind vitamin B9/folic acid, and why mutations within the flexible region can either abolish or change the species specificity of this interaction. Furthermore, structural similarity between Juno and the cholesterol-binding Niemann-Pick disease type C1 protein (NPC1) suggests how the modified binding surface of Juno may recognize the helical structure of the amino-terminal domain of Izumo1. As Juno appears to be a mammalian innovation, our study indicates that a key evolutionary event in mammalian reproduction originated from the neofunctionalization of the vitamin B9-binding pocket of an ancestral folate receptor molecule.
Funded by: European Research Council: 260759; Medical Research Council: MR/M012468/1; Wellcome Trust: 098051
Current biology : CB 2016;26;3;R100-1
Fast, Accurate and Automatic Ancient Nucleosome and Methylation Maps with epiPALEOMIX.
Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark.
The first epigenomes from archaic hominins (AH) and ancient anatomically modern humans (AMH) have recently been characterized, based, however, on a limited number of samples. The extent to which ancient genome-wide epigenetic landscapes can be reconstructed thus remains contentious. Here, we present epiPALEOMIX, an open-source and user-friendly pipeline that exploits post-mortem DNA degradation patterns to reconstruct ancient methylomes and nucleosome maps from shotgun and/or capture-enrichment data. Applying epiPALEOMIX to the sequence data underlying 35 ancient genomes including AMH, AH, equids and aurochs, we investigate the temporal, geographical and preservation range of ancient epigenetic signatures. We first assess the quality of inferred ancient epigenetic signatures within well-characterized genomic regions. We find that tissue-specific methylation signatures can be obtained across a wider range of DNA preparation types than previously thought, including when no particular experimental procedures have been used to remove deaminated cytosines prior to sequencing. We identify a large subset of samples for which DNA associated with nucleosomes is protected from post-mortem degradation, and nucleosome positioning patterns can be reconstructed. Finally, we describe parameters and conditions such as DNA damage levels and sequencing depth that limit the preservation of epigenetic signatures in ancient samples. When such conditions are met, we propose that epigenetic profiles of CTCF binding regions can be used to help data authentication. Our work, including epiPALEOMIX, opens the way for further investigations of ancient epigenomes through time, especially those aimed at tracking possible epigenetic changes during major evolutionary, environmental, socioeconomic, and cultural shifts.
Molecular biology and evolution 2016;33;12;3284-3298
Germline TERT promoter mutations are rare in familial melanoma.
Section of Epidemiology and Biostatistics, Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, LS9 7TF, UK. email@example.com.
Germline CDKN2A mutations occur in 40 % of 3-or-more case melanoma families while mutations of CDK4, BAP1, and genes involved in telomere function (ACD, TERF2IP, POT1), have also been implicated in melanomagenesis. Mutation of the promoter of the telomerase reverse transcriptase (TERT) gene (c.-57 T>G variant) has been reported in one family. We tested for the TERT promoter variant in 675 multicase families wild-type for the known high penetrance familial melanoma genes, 1863 UK population-based melanoma cases and 529 controls. Germline lymphocyte telomere length was estimated in carriers. The c.-57 T>G TERT promoter variant was identified in one 7-case family with multiple primaries and early age of onset (earliest, 15 years) but not among population cases or controls. One family member had multiple primary melanomas, basal cell carcinomas and a bladder tumour. The blood leukocyte telomere length of a carrier was similar to wild-type cases. We provide evidence confirming that a rare promoter variant of TERT (c.-57 T>G) is associated with high penetrance, early onset melanoma and potentially other cancers, and explains <1 % of UK melanoma multicase families. The identification of POT1 and TERT germline mutations highlights the importance of telomere integrity in melanoma biology.
Funded by: Cancer Research UK: 13031, C588/A19167, C8197/A16565, C8216/A6129; Intramural NIH HHS; NCI NIH HHS: CA83115, R01 CA083115
Familial cancer 2016;15;1;139-44
TRAIP promotes DNA damage response during genome replication and is mutated in primordial dwarfism.
MRC Human Genetics Unit, IGMM, University of Edinburgh, Edinburgh, EH4 2XU, UK.
DNA lesions encountered by replicative polymerases threaten genome stability and cell cycle progression. Here we report the identification of mutations in TRAIP, encoding an E3 RING ubiquitin ligase, in patients with microcephalic primordial dwarfism. We establish that TRAIP relocalizes to sites of DNA damage, where it is required for optimal phosphorylation of H2AX and RPA2 during S-phase in response to ultraviolet (UV) irradiation, as well as fork progression through UV-induced DNA lesions. TRAIP is necessary for efficient cell cycle progression and mutations in TRAIP therefore limit cellular proliferation, providing a potential mechanism for microcephaly and dwarfism phenotypes. Human genetics thus identifies TRAIP as a component of the DNA damage response to replication-blocking DNA lesions.
Funded by: Cancer Research UK: 11224, 13030, C17183/A13030, C6/A11224; European Research Council: 281847; Medical Research Council: MC_PC_U127580972
Nature genetics 2016;48;1;36-43
The Human Gut Microbiota.
Department of Medical Microbiology, University of Groningen, University Medical Center Groningen, 30001, 9700, Groningen, The Netherlands. H.J.M.Harmsen@med.umcg.nl.
The microbiota in our gut performs many different essential functions that help us to stay healthy. These functions include vitamin production, regulation of lipid metabolism and short chain fatty acid production as fuel for epithelial cells and regulation of gene expression. There is a very numerous and diverse microbial community present in the gut, especially in the colon, with reported numbers of species that vary between 400 and 1500, for some of which we do not yet even have cultured representatives. A healthy gut microbiota is important for maintaining a healthy host. An aberrant microbiota can cause diseases of different natures and at different ages, ranging from allergies at an early age to IBD in young adults. This shows that our gut microbiota needs to be treated well to stay healthy. In this chapter we describe what we consider a healthy microbiota and discuss the role of the microbiota in various diseases. Research into these dysbiosis conditions could lead to new strategies for treatment and/or management of our microbiota to improve health.
Advances in experimental medicine and biology 2016;902;95-108
PBP2a substitutions linked to ceftaroline resistance in MRSA isolates from the UK.
Department of Medicine, University of Cambridge, Cambridge, UK firstname.lastname@example.org.
Funded by: Medical Research Council: G1000803, G1001787, G1001787/1; Wellcome Trust
The Journal of antimicrobial chemotherapy 2016;71;1;268-9
Validation of self-administered nasal swabs and postage for the isolation of Staphylococcus aureus.
Department of Veterinary Medicine, University of Cambridge, Cambridge, UK.
Staphylococcus aureus carriers are at higher risk of S. aureus infection and are a reservoir for transmission to others. Detection of nasal S. aureus carriage is important for both targeted decolonization and epidemiological studies. Self-administered nasal swabbing has been reported previously, but the effects of posting swabs prior to culture on S. aureus yield have not been investigated. A longitudinal cohort study was performed in which healthy volunteers were recruited, trained in the swabbing procedure and asked to take weekly nasal swabs for 6 weeks (median: 3 weeks, range 1-6 weeks). Two swabs were taken at each sampling episode and randomly assigned for immediate processing on arrival to the laboratory (Swab A) or second class postage prior to processing (Swab B). S. aureus was isolated using standard methods. A total of 95 participants were recruited, who took 944 swabs (472 pairs) over a median of 5 weeks. Of these, 459 swabs were positive for S. aureus. We found no significant difference (P=0.25) between 472 pairs of nasal self-swabs processed immediately or following standard postage from 95 study participants (51.4 % vs. 48.6 %, respectively). We also provide further evidence that persistent carriers can be detected by two weekly swabs with high degrees of sensitivity [92.3 % (95 % CI 74.8-98.8 %)] and specificity [95.6 % (95 % CI 84.8-99.3 %)] compared with a gold standard of five weekly swabs. Self-swabbing and postage of nasal swabs prior to processing has no effect on yield of S. aureus, and could facilitate large community-based carriage studies.
Funded by: Medical Research Council: G0800270, G1000803, G1001787
Journal of medical microbiology 2016;65;12;1434-1437
Transmission of methicillin-resistant Staphylococcus aureus in long-term care facilities and their related healthcare networks.
Department of Medicine, University of Cambridge, Addenbrooke's Hospital, Box 157, Hills Road, Cambridge, CB2 0QQ, UK. email@example.com.
Background: Long-term care facilities (LTCF) are potential reservoirs for methicillin-resistant Staphylococcus aureus (MRSA), control of which may reduce MRSA transmission and infection elsewhere in the healthcare system. Whole-genome sequencing (WGS) has been used successfully to understand MRSA epidemiology and transmission in hospitals and has the potential to identify transmission between these and LTCF.
Methods: Two prospective observational studies of MRSA carriage were conducted in LTCF in England and Ireland. MRSA isolates were whole-genome sequenced and analyzed using established methods. Genomic data were available for MRSA isolated in the local healthcare systems (isolates submitted by hospitals and general practitioners).
Results: We sequenced a total of 181 MRSA isolates from the two study sites. The majority of MRSA were multilocus sequence type (ST)22. WGS identified one likely transmission event between residents in the English LTCF and three putative transmission events in the Irish LTCF. WGS also identified closely related isolates present in colonized Irish residents and their immediate environment. Based on phylogenetic reconstruction, closely related MRSA clades were identified between the LTCF and their healthcare referral network, together with putative MRSA acquisition by LTCF residents during hospital admission.
Conclusions: These data confirm that MRSA is transmitted between residents of LTCF and is both acquired and transmitted to others in referral hospitals and beyond. Our data present compelling evidence for the importance of environmental contamination in MRSA transmission, reinforcing the importance of environmental cleaning. The use of WGS in this study highlights the need to consider infection control in hospitals and community healthcare facilities as a continuum.
Genome medicine 2016;8;1;102
Differential Killing of Salmonella enterica Serovar Typhi by Antibodies Targeting Vi and Lipopolysaccharide O:9 Antigen.
School of Immunity and Infection, College of Medicine and Dental Sciences, University of Birmingham, Birmingham, United Kingdom.
Salmonella enterica serovar Typhi expresses a capsule of Vi polysaccharide, while most Salmonella serovars, including S. Enteritidis and S. Typhimurium, do not. Both S. Typhi and S. Enteritidis express the lipopolysaccharide O:9 antigen, yet there is little evidence of cross-protection from anti-O:9 antibodies. Vaccines based on Vi polysaccharide have efficacy against typhoid fever, indicating that antibodies against Vi confer protection. Here we investigate the role of Vi capsule and antibodies against Vi and O:9 in antibody-dependent complement- and phagocyte-mediated killing of Salmonella. Using isogenic Vi-expressing and non-Vi-expressing derivatives of S. Typhi and S. Typhimurium, we show that S. Typhi is inherently more sensitive to serum and blood than S. Typhimurium. Vi expression confers increased resistance to both complement- and phagocyte-mediated modalities of antibody-dependent killing in human blood. The Vi capsule is associated with reduced C3 and C5b-9 deposition, and decreased overall antibody binding to S. Typhi. However, purified human anti-Vi antibodies in the presence of complement are able to kill Vi-expressing Salmonella, while killing by anti-O:9 antibodies is inversely related to Vi expression. Human serum depleted of antibodies to antigens other than Vi retains the ability to kill Vi-expressing bacteria. Our findings support a protective role for Vi capsule in preventing complement and phagocyte killing of Salmonella that can be overcome by specific anti-Vi antibodies, but only to a limited extent by anti-O:9 antibodies.
Funded by: Biotechnology and Biological Sciences Research Council: BB/F022778/1; Medical Research Council: G0701275, G9818340
PloS one 2016;11;1;e0145945
Fluorescence-Based Flow Sorting in Parallel with Transposon Insertion Site Sequencing Identifies Multidrug Efflux Systems in Acinetobacter baumannii.
Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, NSW, Australia firstname.lastname@example.org.
Unlabelled: Multidrug efflux pumps provide clinically significant levels of drug resistance in a number of Gram-negative hospital-acquired pathogens. These pathogens frequently carry dozens of genes encoding putative multidrug efflux pumps. However, it can be difficult to determine how many of these pumps actually mediate antimicrobial efflux, and it can be even more challenging to identify the regulatory proteins that control expression of these pumps. In this study, we developed an innovative high-throughput screening method, combining transposon insertion sequencing and cell sorting methods (TraDISort), to identify the genes encoding major multidrug efflux pumps, regulators, and other factors that may affect the permeation of antimicrobials, using the nosocomial pathogen Acinetobacter baumannii. A dense library of more than 100,000 unique transposon insertion mutants was treated with ethidium bromide, a common substrate of multidrug efflux pumps that is differentially fluorescent inside and outside the bacterial cytoplasm. Populations of cells displaying aberrant accumulations of ethidium were physically enriched using fluorescence-activated cell sorting, and the genomic locations of transposon insertions within these strains were determined using transposon-directed insertion sequencing. The relative abundance of mutants in the input pool compared to the selected mutant pools indicated that the AdeABC, AdeIJK, and AmvA efflux pumps are the major ethidium efflux systems in A. baumannii. Furthermore, the method identified a new transcriptional regulator that controls expression of amvA. In addition to the identification of efflux pumps and their regulators, TraDISort identified genes that are likely to control cell division, cell morphology, or aggregation in A. baumannii.
Importance: Transposon-directed insertion sequencing (TraDIS) and related technologies have emerged as powerful methods to identify genes required for bacterial survival or competitive fitness under various selective conditions. We applied fluorescence-activated cell sorting (FACS) to physically enrich for phenotypes of interest within a mutant population prior to TraDIS. To our knowledge, this is the first time that a physical selection method has been applied in parallel with TraDIS rather than a fitness-induced selection. The results demonstrate the feasibility of this combined approach to generate significant results and highlight the major multidrug efflux pumps encoded in an important pathogen. This FACS-based approach, TraDISort, could have a range of future applications, including the characterization of efflux pump inhibitors, the identification of regulatory factors controlling gene or protein expression using fluorescent reporters, and the identification of genes involved in cell replication, morphology, and aggregation.
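As a rough illustration of how the "relative abundance of mutants in the input pool compared to the selected mutant pools" might be scored, the sketch below computes a per-gene log2 ratio of depth-normalized insertion read counts. It is not the authors' pipeline: the function name, the pseudocount, the counts-per-million normalization and all read counts are assumptions made up for this example. It also assumes that mutants lacking an efflux pump accumulate more ethidium and are therefore enriched in a high-fluorescence sorted pool.

```python
import math

def log2_enrichment(input_counts, sorted_counts, pseudocount=1.0):
    """Per-gene log2 ratio of normalized insertion reads (sorted pool vs input pool).

    Hypothetical helper: counts are scaled to reads per million so that pools
    sequenced to different depths can be compared; the pseudocount avoids
    division by zero for genes with no reads in one pool.
    """
    in_total = sum(input_counts.values())
    out_total = sum(sorted_counts.values())
    scores = {}
    for gene, in_reads in input_counts.items():
        out_reads = sorted_counts.get(gene, 0)
        in_cpm = (in_reads + pseudocount) / in_total * 1e6
        out_cpm = (out_reads + pseudocount) / out_total * 1e6
        scores[gene] = math.log2(out_cpm / in_cpm)
    return scores

# Made-up read counts for illustration only; these are not data from the study.
input_pool = {"adeB": 500, "adeJ": 480, "amvA": 510, "geneX": 500}
high_fluorescence_pool = {"adeB": 2000, "adeJ": 1900, "amvA": 2100, "geneX": 490}

scores = log2_enrichment(input_pool, high_fluorescence_pool)
for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{gene}: {score:+.2f}")
```

With only a handful of genes, total-count normalization is dominated by the enriched genes themselves; a real analysis over thousands of genes would likely use a more robust normalization and a statistical test across replicates.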
Funded by: Medical Research Council: G1100100; Wellcome Trust: 098051, 100087/Z/12/Z
Genome-wide time-to-event analysis on smoking progression stages in a family-based study.
Department of Public Health, University of Helsinki, Helsinki, Finland.
Background: Various pivotal stages in smoking behavior can be identified, including initiation, conversion from experimenting to established use, development of tolerance, and cessation. Previous studies have shown high heritability for age of smoking initiation and cessation; however, time-to-event genome-wide association studies aiming to identify underpinning genes that accelerate or delay these transitions are missing to date.
Methods: We investigated which single nucleotide polymorphisms (SNPs) across the whole genome contribute to the hazard ratio of transition between different stages of smoking behavior by performing time-to-event analyses within a large Finnish twin family cohort (N = 1962), and further conducted mediation analyses of plausible intermediate traits for significant SNPs.
Results: Genome-wide significant signals were detected for three of the four transitions: (1) for smoking cessation on 10p14 (P = 4.47e-08 for rs72779075 flanked by RP11-575N15 and GATA3), (2) for tolerance on 11p13 (P = 1.29e-08 for rs11031684 in RP1-65P5.1), mediated by smoking quantity, and on 9q34.12 (P = 3.81e-08 for rs2304808 in FUBP3), independent of smoking quantity, and (3) for smoking initiation on 19q13.33 (P = 3.37e-08 for rs73050610 flanked by TRPM4 and SLC6A16) in analysis adjusted for first time sensations. Although our top SNPs did not replicate, another SNP in the TRPM4-SLC6A16 gene region showed statistically significant association after region-based multiple testing correction in an independent Australian twin family sample.
Conclusion: Our results suggest that the functional effect of the TRPM4-SLC6A16 gene region deserves further investigation, and that complex neurotransmitter networks including dopamine and glutamate may play a critical role in smoking initiation. Moreover, comparison of these results implies that genetic contributions to the complex smoking behavioral phenotypes vary among the transitions.
Brain and behavior 2016;e00462
Linear mixed model for heritability estimation that explicitly addresses environmental variation.
Microsoft Research, Los Angeles, CA 90024; email@example.com.
The linear mixed model (LMM) is now routinely used to estimate heritability. Unfortunately, as we demonstrate, LMM estimates of heritability can be inflated when using a standard model. To help reduce this inflation, we used a more general LMM with two random effects: one based on genomic variants and one based on easily measured spatial location as a proxy for environmental effects. We investigated this approach with simulated data and with data from a Uganda cohort of 4,778 individuals for 34 phenotypes including anthropometric indices, blood factors, glycemic control, blood pressure, lipid tests, and liver function tests. For the genomic random effect, we used identity-by-descent estimates from accurately phased genome-wide data. For the environmental random effect, we constructed a covariance matrix based on a Gaussian radial basis function. Across the simulated and Ugandan data, narrow-sense heritability estimates were lower using the more general model. Thus, our approach addresses, in part, the issue of "missing heritability" in the sense that much of the heritability previously thought to be missing was fictional. Software is available at https://github.com/MicrosoftGenomics/FaST-LMM.
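The environmental covariance matrix described above can be sketched in a few lines. This is a minimal illustration, not the FaST-LMM implementation: the function names, the `length_scale` parameter, the coordinates and the variance components are all hypothetical. It assumes the common form K[i][j] = exp(-d²/(2·length_scale²)) for the Gaussian radial basis function and the usual variance-components reading of narrow-sense heritability as the genetic share of total variance.

```python
import math

def gaussian_rbf_covariance(locations, length_scale):
    """Covariance between individuals from their spatial coordinates.

    K[i][j] = exp(-d_ij^2 / (2 * length_scale^2)), where d_ij is the Euclidean
    distance between individuals i and j. The length_scale is a hypothetical
    tuning parameter controlling how fast similarity decays with distance.
    """
    n = len(locations)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(locations[i], locations[j]))
            K[i][j] = math.exp(-d2 / (2.0 * length_scale ** 2))
    return K

def narrow_sense_h2(var_genetic, var_environment, var_residual):
    """Genetic variance as a fraction of total variance (assumed definition)."""
    return var_genetic / (var_genetic + var_environment + var_residual)

# Made-up coordinates (arbitrary units) for three individuals.
coords = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
K_env = gaussian_rbf_covariance(coords, length_scale=2.0)

# Made-up variance components: omitting the environmental term from the
# denominator gives a larger ratio than including it.
print(narrow_sense_h2(0.3, 0.0, 0.5), narrow_sense_h2(0.3, 0.2, 0.5))
```

Nearby individuals receive a covariance close to 1 and distant ones close to 0, which is what lets spatial location act as a proxy for shared environment.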
Funded by: Medical Research Council: G0801566, G0901213, MR/K013491/1; Wellcome Trust
Proceedings of the National Academy of Sciences of the United States of America 2016;113;27;7377-82
Conserved Features in the Structure, Mechanism, and Biogenesis of the Inverse Autotransporter Protein Family.
Department of Microbiology, Infection & Immunity Program, Biomedicine Discovery Institute, Monash University, Clayton, Australia Wellcome Trust Sanger Institute, Hinxton, United Kingdom.
The bacterial cell surface proteins intimin and invasin are virulence factors that share a common domain structure and bind selectively to host cell receptors in the course of bacterial pathogenesis. The β-barrel domains of intimin and invasin show significant sequence and structural similarities. Conversely, a variety of proteins with sometimes limited sequence similarity have also been annotated as "intimin-like" and "invasin" in genome datasets, while other recent work on apparently unrelated virulence-associated proteins ultimately revealed similarities to intimin and invasin. Here we characterize the sequence and structural relationships across this complex protein family. Surprisingly, intimins and invasins represent a very small minority of the sequence diversity in what has previously been termed the "intimin/invasin protein family". Analysis of the assembly pathway for expression of the classic intimin, EaeA, and a characteristic example of the most prevalent members of the group, FdeC, revealed a dependence on the translocation and assembly module as a common feature for both these proteins. While the majority of the sequences in the grouping are most similar to FdeC, a further and widespread group is two-partner secretion systems that use the β-barrel domain as the delivery device for secretion of a variety of virulence factors. This comprehensive analysis supports the adoption of the "inverse autotransporter protein family" as the most accurate nomenclature for the family and, in turn, has important consequences for our overall understanding of the Type V secretion systems of bacterial pathogens.
Genome biology and evolution 2016;8;6;1690-705
Ensembl comparative genomics resources.
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, Bill Lyons Informatics Centre, UCL Cancer Institute, University College London, London WC1E 6DD, firstname.lastname@example.org email@example.com firstname.lastname@example.org.
Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org.
Database : the journal of biological databases and curation 2016;2016
Burden of Diabetes and First Evidence for the Utility of HbA1c for Diagnosis and Detection of Diabetes in Urban Black South Africans: The Durban Diabetes Study.
Department of Medicine, University of Cambridge, Cambridge, United Kingdom.
Objective: Glycated haemoglobin (HbA1c) is recommended as an additional tool to glucose-based measures (fasting plasma glucose [FPG] and 2-hour plasma glucose [2PG] during oral glucose tolerance test [OGTT]) for the diagnosis of diabetes; however, its use in sub-Saharan African populations is not established. We assessed prevalence estimates and the diagnosis and detection of diabetes based on OGTT, FPG, and HbA1c in an urban black South African population.
Research design and methods: We conducted a population-based cross-sectional survey using multistage cluster sampling of adults aged ≥18 years in Durban (eThekwini municipality), KwaZulu-Natal. All participants had a 75-g OGTT and HbA1c measurements. Receiver operating characteristic (ROC) analysis was used to assess the overall diagnostic accuracy of HbA1c, using OGTT as the reference, and to determine optimal HbA1c cut-offs.
Results: Among 1190 participants (851 women, 92.6% response rate), the age-standardised prevalence of diabetes was 12.9% based on OGTT, 11.9% based on FPG, and 13.1% based on HbA1c. In participants without a previous history of diabetes (n = 1077), using OGTT as the reference, an HbA1c ≥48 mmol/mol (6.5%) detected diabetes with 70.3% sensitivity (95%CI 52.7-87.8) and 98.7% specificity (95%CI 97.9-99.4) (AUC 0.94 [95%CI 0.89-1.00]). Additional analyses suggested the optimal HbA1c cut-off for detection of diabetes in this population was 42 mmol/mol (6.0%) (sensitivity 89.2% [95%CI 78.6-99.8], specificity 92.0% [95%CI: 90.3-93.7]).
Conclusions: In an urban black South African population, we found a high prevalence of diabetes and provide the first evidence for the utility of HbA1c for the diagnosis and detection of diabetes in black Africans in sub-Saharan Africa.
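The cut-off evaluation described in the methods and results can be illustrated with a short sketch that computes sensitivity and specificity of an HbA1c threshold against a reference diagnosis and then picks the candidate cut-off maximizing Youden's J (sensitivity + specificity - 1). The abstract does not state which optimality criterion was used, so Youden's J is an assumption here, as are the function names and the toy data.

```python
def sens_spec(hba1c, has_diabetes, cutoff):
    """Sensitivity and specificity of 'HbA1c >= cutoff' against a reference diagnosis."""
    tp = fn = tn = fp = 0
    for value, diseased in zip(hba1c, has_diabetes):
        test_positive = value >= cutoff
        if diseased and test_positive:
            tp += 1
        elif diseased:
            fn += 1
        elif test_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

def best_cutoff(hba1c, has_diabetes, candidates):
    """Candidate cut-off with the highest Youden's J (an assumed criterion)."""
    def youden_j(cutoff):
        sensitivity, specificity = sens_spec(hba1c, has_diabetes, cutoff)
        return sensitivity + specificity - 1.0
    return max(candidates, key=youden_j)

# Toy HbA1c values (mmol/mol) and reference status; not data from the study.
values = [35, 38, 40, 41, 43, 44, 46, 47, 50, 55]
status = [False, False, False, False, False, True, False, True, True, True]

for cutoff in (42, 44, 48):
    sensitivity, specificity = sens_spec(values, status, cutoff)
    print(f"cut-off {cutoff}: sensitivity {sensitivity:.2f}, specificity {specificity:.2f}")
print("best by Youden's J:", best_cutoff(values, status, [42, 44, 48]))
```

Lowering the cut-off trades specificity for sensitivity, which is the pattern the reported results show when moving from 48 mmol/mol to 42 mmol/mol.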
Funded by: Medical Research Council: MR/K013491/1
PloS one 2016;11;8;e0161966
Study profile: the Durban Diabetes Study (DDS): a platform for chronic disease research.
Department of Medicine, University of Cambridge, Cambridge, UK.
The Durban Diabetes Study (DDS) is a population-based cross-sectional survey of an urban black population in the eThekwini Municipality (city of Durban) in South Africa. The survey combines health, lifestyle and socioeconomic questionnaire data with standardised biophysical measurements, biomarkers for non-communicable and infectious diseases, and genetic data. Data collection for the study is currently underway and the target sample size is 10 000 participants. The DDS has an established infrastructure for survey fieldwork, data collection and management, sample processing and storage, managed data sharing and consent for re-approaching participants, which can be utilised for further research studies. As such, the DDS represents a rich platform for investigating the distribution, interrelation and aetiology of chronic diseases and their risk factors, which is critical for developing health care policies for disease management and prevention. For data access enquiries please contact the African Partnership for Chronic Disease Research (APCDR) at email@example.com or the corresponding author.
Funded by: Medical Research Council: MR/K013491/1; Wellcome Trust
Global health, epidemiology and genomics 2016;1;e2
Genomic Analysis of Companion Rabbit Staphylococcus aureus.
Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom.
In addition to being an important human pathogen, Staphylococcus aureus is able to cause a variety of infections in numerous other host species. While the S. aureus strains causing infection in several of these hosts have been well characterised, this is not the case for companion rabbits (Oryctolagus cuniculus), where little data are available on S. aureus strains from this host. To address this deficiency we have performed antimicrobial susceptibility testing and genome sequencing on a collection of S. aureus isolates from companion rabbits. The findings show a diverse S. aureus population is able to cause infection in this host, and while antimicrobial resistance was uncommon, the isolates possess a range of known and putative virulence factors consistent with a diverse clinical presentation in companion rabbits including severe abscesses. We additionally show that companion rabbit isolates carry polymorphisms within dltB as described as underlying host-adaption of S. aureus to farmed rabbits. The availability of S. aureus genome sequences from companion rabbits provides an important aid to understanding the pathogenesis of disease in this host and in the clinical management and surveillance of these infections.
PloS one 2016;11;3;e0151458
Five decades of genome evolution in the globally distributed, extensively antibiotic-resistant Acinetobacter baumannii global clone 1.
Department of Biochemistry & Molecular Biology, The University of Melbourne, Royal Parade, Parkville, Victoria, Australia.
The majority of Acinetobacter baumannii isolates that are multiply, extensively and pan-antibiotic resistant belong to two globally disseminated clones, GC1 and GC2, that were first noticed in the 1970s. Here, we investigated microevolution and phylodynamics within GC1 via analysis of 45 whole-genome sequences, including 23 sequenced for this study. The most recent common ancestor of GC1 arose around 1960 and later diverged into two phylogenetically distinct lineages. In the 1970s, the main lineage acquired the AbaR resistance island, conferring resistance to older antibiotics, via a horizontal gene transfer event. We estimate a mutation rate of ∼5 SNPs genome⁻¹ year⁻¹ and detected extensive recombination within GC1 genomes, introducing nucleotide diversity into the population at >20 times the substitution rate (the ratio of SNPs introduced by recombination compared with mutation was 22). The recombination events were non-randomly distributed in the genome and created significant diversity within loci encoding outer surface molecules (including the capsular polysaccharide, the outer core lipooligosaccharide and the outer membrane protein CarO), and spread antimicrobial resistance-conferring mutations affecting the gyrA and parC genes and insertion sequence insertions activating the ampC gene. Both GC1 lineages accumulated resistance to newer antibiotics through various genetic mechanisms, including the acquisition of plasmids and transposons or mutations in chromosomal genes. Our data show that GC1 has diversified into multiple successful extensively antibiotic-resistant subclones that differ in their surface structures. This has important implications for all avenues of control, including epidemiological tracking, antimicrobial therapy and vaccination.
Microbial genomics 2016;2;2;e000052
Palmitoyl Transferases have Critical Roles in the Development of Mosquito and Liver Stages of Plasmodium.
Department of Molecular Microbiology & Immunology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA.
As the Plasmodium parasite transitions between mammalian and mosquito host, it has to adjust quickly to new environments. Palmitoylation, a reversible and dynamic lipid posttranslational modification, plays a central role in regulating this process and has been implicated in parasite morphology, motility and host cell invasion. While proteins associated with the gliding motility machinery have been described to be palmitoylated, no palmitoyl transferase responsible for regulating gliding motility has previously been identified. Here, we characterize two palmitoyl transferases with gene tagging and gene deletion approaches. We identify DHHC3, a palmitoyl transferase, as a mediator of ookinete development, with a crucial role for gliding motility in ookinetes and sporozoites, and we co-localize the protein with a marker for the inner membrane complex in the ookinete stage. Ookinetes and sporozoites lacking DHHC3 are impaired in gliding motility and exhibit a strong phenotype in vivo, with ookinetes being significantly less infectious to their mosquito host and sporozoites being non-infectious to mice. Importantly, genetic complementation of the DHHC3-ko parasite completely restored virulence. We generated parasites lacking both DHHC3 and the palmitoyl transferase DHHC9, and found an enhanced phenotype for these double knockout parasites, allowing insights into the functional overlap and compensational nature of the large family of PbDHHCs. These findings contribute to our understanding of the organization and mechanism of the gliding motility machinery, which, as is becoming increasingly clear, is mediated by palmitoylation.
Cellular microbiology 2016
Retinol and ascorbate drive erasure of epigenetic memory and enhance reprogramming to naïve pluripotency by complementary mechanisms.
Epigenetics Programme, Babraham Institute, Cambridge CB22 3AT, United Kingdom; Department of Anatomy, University of Otago, Dunedin 9016, New Zealand.
Epigenetic memory, in particular DNA methylation, is established during development in differentiating cells and must be erased to create naïve (induced) pluripotent stem cells. The ten-eleven translocation (TET) enzymes can catalyze the oxidation of 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) and further oxidized derivatives, thereby actively removing this memory. Nevertheless, the mechanism by which the TET enzymes are regulated, and the extent to which they can be manipulated, are poorly understood. Here we report that retinoic acid (RA) or retinol (vitamin A) and ascorbate (vitamin C) act as modulators of TET levels and activity. RA or retinol enhances 5hmC production in naïve embryonic stem cells by activation of TET2 and TET3 transcription, whereas ascorbate potentiates TET activity and 5hmC production through enhanced Fe(2+) recycling, and not as a cofactor as reported previously. We find that both ascorbate and RA or retinol promote the derivation of induced pluripotent stem cells synergistically and enhance the erasure of epigenetic memory. This mechanistic insight has significance for the development of cell treatments for regenerative medicine, and enhances our understanding of how intrinsic and extrinsic signals shape the epigenome.
Proceedings of the National Academy of Sciences of the United States of America 2016
Genome-wide associations for birth weight and correlations with adult disease.
Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK.
Birth weight (BW) has been shown to be influenced by both fetal and maternal factors and in observational studies is reproducibly associated with future risk of adult metabolic diseases including type 2 diabetes (T2D) and cardiovascular disease. These life-course associations have often been attributed to the impact of an adverse early life environment. Here, we performed a multi-ancestry genome-wide association study (GWAS) meta-analysis of BW in 153,781 individuals, identifying 60 loci where fetal genotype was associated with BW (P < 5 × 10(-8)). Overall, approximately 15% of variance in BW was captured by assays of fetal genetic variation. Using genetic association alone, we found strong inverse genetic correlations between BW and systolic blood pressure (Rg = -0.22, P = 5.5 × 10(-13)), T2D (Rg = -0.27, P = 1.1 × 10(-6)) and coronary artery disease (Rg = -0.30, P = 6.5 × 10(-9)). In addition, using large-cohort datasets, we demonstrated that genetic factors were the major contributor to the negative covariance between BW and future cardiometabolic risk. Pathway analyses indicated that the protein products of genes within BW-associated regions were enriched for diverse processes including insulin signalling, glucose homeostasis, glycogen biosynthesis and chromatin remodelling. There was also enrichment of associations with BW in known imprinted regions (P = 1.9 × 10(-4)). We demonstrate that life-course associations between early growth phenotypes and adult cardiometabolic disease are in part the result of shared genetic effects and identify some of the pathways through which these causal genetic effects are mediated.
Transancestral fine-mapping of four type 2 diabetes susceptibility loci highlights potential causal regulatory mechanisms.
Wellcome Trust Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK, Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, UK.
To gain insight into potential regulatory mechanisms through which the effects of variants at four established type 2 diabetes (T2D) susceptibility loci (CDKAL1, CDKN2A-B, IGF2BP2 and KCNQ1) are mediated, we undertook transancestral fine-mapping in 22 086 cases and 42 539 controls of East Asian, European, South Asian, African American and Mexican American descent. Through high-density imputation and conditional analyses, we identified seven distinct association signals at these four loci, each with allelic effects on T2D susceptibility that were homogeneous across ancestry groups. By leveraging differences in the structure of linkage disequilibrium between diverse populations, and increased sample size, we localised the variants most likely to drive each distinct association signal. We demonstrated that integration of these genetic fine-mapping data with genomic annotation can highlight potential causal regulatory elements in T2D-relevant tissues. These analyses provide insight into the mechanisms through which T2D association signals are mediated, and suggest future routes to understanding the biology of specific disease susceptibility loci.
Funded by: NIDDK NIH HHS: R01 DK072193, R01 DK078616, U01 DK078616, U01 DK105535
Human molecular genetics 2016
Independent Origin and Global Distribution of Distinct Plasmodium vivax Duffy Binding Protein Gene Duplications.
Malaria Programme, Wellcome Trust Sanger Institute, Hinxton, United Kingdom.
Background: Plasmodium vivax causes the majority of malaria episodes outside Africa, but remains a relatively understudied pathogen. The pathology of P. vivax infection depends critically on the parasite's ability to recognize and invade human erythrocytes. This invasion process involves an interaction between P. vivax Duffy Binding Protein (PvDBP) in merozoites and the Duffy antigen receptor for chemokines (DARC) on the erythrocyte surface. Whole-genome sequencing of clinical isolates recently established that some P. vivax genomes contain two copies of the PvDBP gene. The frequency of this duplication is particularly high in Madagascar, where there is also evidence for P. vivax infection in DARC-negative individuals. The functional significance and global prevalence of this duplication, and whether there are other copy number variations at the PvDBP locus, are unknown.
Methodology/principal findings: Using whole-genome sequencing and PCR to study the PvDBP locus in P. vivax clinical isolates, we found that PvDBP duplication is widespread in Cambodia. The boundaries of the Cambodian PvDBP duplication differ from those previously identified in Madagascar, meaning that current molecular assays were unable to detect it. The Cambodian PvDBP duplication did not associate with parasite density or DARC genotype, and ranged in prevalence from 20% to 38% over four annual transmission seasons in Cambodia. This duplication was also present in P. vivax isolates from Brazil and Ethiopia, but not India.
Conclusions/significance: PvDBP duplications are much more widespread and complex than previously thought, and at least two distinct duplications are circulating globally. The same duplication boundaries were identified in parasites from three continents, and were found at high prevalence in human populations where DARC-negativity is essentially absent. It is therefore unlikely that PvDBP duplication is associated with infection of DARC-negative individuals, but functional tests will be required to confirm this hypothesis.
Funded by: NCATS NIH HHS: UL1 TR001414; NIAID NIH HHS: U19 AI089688
PLoS neglected tropical diseases 2016;10;10;e0005091
Structure and evolutionary history of a large family of NLR proteins in the zebrafish.
Wellcome Trust Sanger Institute, Cambridge, UK.
Multicellular eukaryotes have evolved a range of mechanisms for immune recognition. A widespread family involved in innate immunity are the NACHT-domain and leucine-rich-repeat-containing (NLR) proteins. Mammals have small numbers of NLR proteins, whereas in some species, mostly those without adaptive immune systems, NLRs have expanded into very large families. We describe a family of nearly 400 NLR proteins encoded in the zebrafish genome. The proteins share a defining overall structure, which arose in fishes after a fusion of the core NLR domains with a B30.2 domain, but can be subdivided into four groups based on their NACHT domains. Gene conversion acting differentially on the NACHT and B30.2 domains has shaped the family and created the groups. Evidence of positive selection in the B30.2 domain indicates that this domain rather than the leucine-rich repeats acts as the pathogen recognition module. In an unusual chromosomal organization, the majority of the genes are located on one chromosome arm, interspersed with other large multigene families, including a new family encoding zinc-finger proteins. The NLR-B30.2 proteins represent a new family with diversity in the specific recognition module that is present in fishes in spite of the parallel existence of an adaptive immune system.
Funded by: European Research Council: 335980; Howard Hughes Medical Institute: 55007424; NHGRI NIH HHS: HG002659; Wellcome Trust
Open biology 2016;6;4;160009
WormBase 2016: expanding to enable helminth genomic research.
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
WormBase (www.wormbase.org) is a central repository for research data on the biology, genetics and genomics of Caenorhabditis elegans and other nematodes. The project has evolved from its original remit to collect and integrate all data for a single species, and now extends to numerous nematodes, ranging from evolutionary comparators of C. elegans to parasitic species that threaten plant, animal and human health. Research activity using C. elegans as a model system is as vibrant as ever, and we have created new tools for community curation in response to the ever-increasing volume and complexity of data. To better allow users to navigate their way through these data, we have made a number of improvements to our main website, including new tools for browsing genomic features and ontology annotations. Finally, we have developed a new portal for parasitic worm genomes. WormBase ParaSite (parasite.wormbase.org) contains all publicly available nematode and platyhelminth annotated genome sequences, and is designed specifically to support helminth genomic research.
Nucleic acids research 2016;44;D1;D774-80
WormBase ParaSite - a comprehensive resource for helminth genomics.
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
The number of publicly available parasitic worm genome sequences has increased dramatically in the past three years, and research interest in helminth functional genomics is now quickly gathering pace in response to the foundation that has been laid by these collective efforts. A systematic approach to the organisation, curation, analysis and presentation of these data is clearly vital for maximising the utility of these data to researchers. We have developed a portal called WormBase ParaSite (http://parasite.wormbase.org) for interrogating helminth genomes on a large scale. Data from over 100 nematode and platyhelminth species are integrated, adding value by way of systematic and consistent functional annotation (e.g. protein domains and Gene Ontology terms), gene expression analysis (e.g. alignment of life-stage specific transcriptome data sets), and comparative analysis (e.g. orthologues and paralogues). We provide several ways of exploring the data, including genome browsers, genome and gene summary pages, text search, sequence search, a query wizard, bulk downloads, and programmatic interfaces. In this review, we provide an overview of the back-end infrastructure and analysis behind WormBase ParaSite, and the displays and tools available to users for interrogating helminth genomic data.
Molecular and biochemical parasitology 2016
Insulin resistance uncoupled from dyslipidemia due to C-terminal PIK3R1 mutations.
The University of Cambridge Metabolic Research Laboratories, Wellcome Trust-MRC Institute of Metabolic Science, Cambridge, United Kingdom.
Obesity-related insulin resistance is associated with fatty liver, dyslipidemia, and low plasma adiponectin. Insulin resistance due to insulin receptor (INSR) dysfunction is associated with none of these, but when due to dysfunction of the downstream kinase AKT2 phenocopies obesity-related insulin resistance. We report 5 patients with SHORT syndrome and C-terminal mutations in PIK3R1, encoding the p85α/p55α/p50α subunits of PI3K, which act between INSR and AKT in insulin signaling. Four of 5 patients had extreme insulin resistance without dyslipidemia or hepatic steatosis. In 3 of these 4, plasma adiponectin was preserved, as in insulin receptor dysfunction. The fourth patient and her healthy mother had low plasma adiponectin associated with a potentially novel mutation, p.Asp231Ala, in adiponectin itself. Cells studied from one patient with the p.Tyr657X PIK3R1 mutation expressed abundant truncated PIK3R1 products and showed severely reduced insulin-stimulated association of mutant but not WT p85α with IRS1, but normal downstream signaling. In 3T3-L1 preadipocytes, mutant p85α overexpression attenuated insulin-induced AKT phosphorylation and adipocyte differentiation. Thus, PIK3R1 C-terminal mutations impair insulin signaling only in some cellular contexts and produce a subphenotype of insulin resistance resembling INSR dysfunction but unlike AKT2 dysfunction, implicating PI3K in the pathogenesis of key components of the metabolic syndrome.
Funded by: Biotechnology and Biological Sciences Research Council: BBS/E/B/0000H213, BBS/E/B/0000S227, BBS/E/B/000C0413; Medical Research Council: MC_PC_15018, MC_UU_12012/5, MRC_MC_UU_12012/5; NHLBI NIH HHS: RC2 HL102923, RC2 HL102924, RC2 HL102925, RC2 HL102926, RC2 HL103010, UC2 HL102923, UC2 HL102924, UC2 HL102925, UC2 HL102926, UC2 HL103010; Wellcome Trust: WT091310, WT095515, WT098051, WT098498, WT107064
JCI insight 2016;1;17;e88766
The genomic basis of parasitism in the Strongyloides clade of nematodes.
School of Biological Sciences, University of Bristol, Bristol, UK.
Soil-transmitted nematodes, including the Strongyloides genus, cause one of the most prevalent neglected tropical diseases. Here we compare the genomes of four Strongyloides species, including the human pathogen Strongyloides stercoralis, and their close relatives that are facultatively parasitic (Parastrongyloides trichosuri) and free-living (Rhabditophanes sp. KR3021). A significant paralogous expansion of key gene families--families encoding astacin-like and SCP/TAPS proteins--is associated with the evolution of parasitism in this clade. Exploiting the unique Strongyloides life cycle, we compare the transcriptomes of the parasitic and free-living stages and find that these same gene families are upregulated in the parasitic stages, underscoring their role in nematode parasitism.
Funded by: NCRR NIH HHS: P40 RR002512, RR02512; NIAID NIH HHS: AI050668, AI060516, AI105856, R01 AI050668, R21 AI105856, R33 AI105856, T32 AI060516; Wellcome Trust: 094462/Z/10/Z, 098051
Nature genetics 2016;48;3;299-307
GWAS for executive function and processing speed suggests involvement of the CADM2 gene.
Genetic Epidemiology Unit, Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands.
To identify common variants contributing to normal variation in two specific domains of cognitive functioning, we conducted a genome-wide association study (GWAS) of executive functioning and information processing speed in non-demented older adults from the CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology) consortium. Neuropsychological testing was available for 5429-32 070 subjects of European ancestry aged 45 years or older, free of dementia and clinical stroke at the time of cognitive testing from 20 cohorts in the discovery phase. We analyzed performance on the Trail Making Test parts A and B, the Letter Digit Substitution Test (LDST), the Digit Symbol Substitution Task (DSST), semantic and phonemic fluency tests, and the Stroop Color and Word Test. Replication was sought in 1311-21860 subjects from 20 independent cohorts. A significant association was observed in the discovery cohorts for the single-nucleotide polymorphism (SNP) rs17518584 (discovery P-value=3.12 × 10(-8)) and in the joint discovery and replication meta-analysis (P-value=3.28 × 10(-9) after adjustment for age, gender and education) in an intron of the gene cell adhesion molecule 2 (CADM2) for performance on the LDST/DSST. Rs17518584 is located about 170 kb upstream of the transcription start site of the major transcript for the CADM2 gene, but is within an intron of a variant transcript that includes an alternative first exon. The variant is associated with expression of CADM2 in the cingulate cortex (P-value=4 × 10(-4)). The protein encoded by CADM2 is involved in glutamate signaling (P-value=7.22 × 10(-15)), gamma-aminobutyric acid (GABA) transport (P-value=1.36 × 10(-11)) and neuron cell-cell adhesion (P-value=1.48 × 10(-13)). Our findings suggest that genetic variation in the CADM2 gene is associated with individual differences in information processing speed.
Funded by: Biotechnology and Biological Sciences Research Council: BB/F019394/1; Medical Research Council: G0700704, MR/K026992/1; NCATS NIH HHS: UL1 TR000124; NCI NIH HHS: P01 CA055075, P01 CA087969, R01 CA047988, R01 CA049449, R01 CA050385, R01 CA065725, R01 CA067262, R01 CA134958, U01 CA067262, U01 CA098233; NEI NIH HHS: R01 EY009611, R01 EY015473; NHGRI NIH HHS: U01 HG004399, U01 HG004402, U01 HG004728; NHLBI NIH HHS: HHSN268200900020C, HHSN268201100005C, HHSN268201100006C, HHSN268201100007C, HHSN268201100008C, HHSN268201100009C, HHSN268201100010C, HHSN268201100011C, HHSN268201100012C, HHSN268201200036C, N01 HC015103, N01 HC025195, N01 HC035129, N01 HC045133, N01 HC075150, N01 HC085082, N01 HC085084, N01 HC085085, N01 HC085086, N01HC55222, N01HC85079, N01HC85080, N01HC85081, N01HC85082, N01HC85083, R01 HL034594, R01 HL035464, R01 HL043851, R01 HL059367, R01 HL070825, R01 HL071917, R01 HL080295, R01 HL080467, R01 HL086694, R01 HL087641, R01 HL087652, R01 HL087660, R01 HL093029, R01 HL105756, U01 HL054457, U01 HL054463, U01 HL054464, U01 HL054481, U01 HL096917; NIA NIH HHS: K08 AG034290, K25 AG041906, N01 AG012100, N01 AG062101, N01 AG062103, N01 AG062106, N01 AG821336, N01 AG916413, P30 AG010161, P50 AG005133, R01 AG008122, R01 AG015819, R01 AG015928, R01 AG016495, R01 AG017917, R01 AG020098, R01 AG023629, R01 AG027058, R01 AG030146, R01 AG032098, R01 AG033193, U01 AG049505; NIDDK NIH HHS: P01 DK070756, P30 DK063491, R01 DK058845; NIMHD NIH HHS: 263 MD821336, 263 MD9164 13; NINDS NIH HHS: R01 NS017950, R01 NS041558
Molecular psychiatry 2016;21;2;189-97
Classification of low quality cells from single-cell RNA-seq data.
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
Single-cell RNA sequencing (scRNA-seq) has broad applications across biomedical research. One of the key challenges is to ensure that only single, live cells are included in downstream analysis, as the inclusion of compromised cells inevitably affects data interpretation. Here, we present a generic approach for processing scRNA-seq data and detecting low quality cells, using a curated set of over 20 biological and technical features. Our approach improves classification accuracy by over 30 % compared to traditional methods when tested on over 5,000 cells, including CD4+ T cells, bone marrow dendritic cells, and mouse embryonic stem cells.
Funded by: Biotechnology and Biological Sciences Research Council
Genome biology 2016;17;29
Evolutionary genomics of epidemic visceral leishmaniasis in the Indian subcontinent.
Department of Biomedical Sciences, Institute of Tropical Medicine, Antwerp, Belgium.
Leishmania donovani causes visceral leishmaniasis (VL), the second most deadly vector-borne parasitic disease. A recent epidemic in the Indian subcontinent (ISC) caused up to 80% of global VL and over 30,000 deaths per year. Resistance against antimonial drugs has probably been a contributing factor in the persistence of this epidemic. Here we use whole genome sequences from 204 clinical isolates to track the evolution and epidemiology of L. donovani from the ISC. We identify independent radiations that have emerged since a bottleneck coincident with 1960s DDT spraying campaigns. A genetically distinct population frequently resistant to antimonials has a two base-pair insertion in the aquaglyceroporin gene LdAQP1 that prevents the transport of trivalent antimonials. We find evidence of genetic exchange between ISC populations, and show that the mutation in LdAQP1 has spread by recombination. Our results reveal the complexity of L. donovani evolution in the ISC in response to drug treatment.
Genome-wide association studies in the Japanese population identify seven novel loci for type 2 diabetes.
Laboratory for Endocrinology, Metabolism and Kidney Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan.
Genome-wide association studies (GWAS) have identified more than 80 susceptibility loci for type 2 diabetes (T2D), but most of its heritability still remains to be elucidated. In this study, we conducted a meta-analysis of GWAS for T2D in the Japanese population. Combined data from discovery and subsequent validation analyses (23,399 T2D cases and 31,722 controls) identify 7 new loci with genome-wide significance (P<5 × 10(-8)), rs1116357 near CCDC85A, rs147538848 in FAM60A, rs1575972 near DMRTA1, rs9309245 near ASB3, rs67156297 near ATP8B2, rs7107784 near MIR4686 and rs67839313 near INAFM2. Of these, the association of 4 loci with T2D is replicated in multi-ethnic populations other than Japanese (up to 65,936 T2Ds and 158,030 controls, P<0.007). These results indicate that expansion of single ethnic GWAS is still useful to identify novel susceptibility loci to complex traits not only for ethnicity-specific loci but also for common loci across different ethnicities.
Funded by: British Heart Foundation: SP/09/002; European Research Council: 268834; FIC NIH HHS: K01 TW006087, KO1TW006087; Medical Research Council: G0800270; NCI NIH HHS: BC050791, R01 CA064277, R01 CA124558, R01CA124558, R01CA64277, R37 CA070867, R37CA070867, UM1 CA182910; NCRR NIH HHS: UL1 RR024975; NIDDK NIH HHS: R01 DK082766, R01DK082766
Nature communications 2016;7;10531
Comparative Antibody Responses Against three Antimalarial Vaccine Candidate Antigens from Urban and Rural Exposed Individuals in Gabon.
Unité de Parasitologie Médicale (UPARAM), Centre International de Recherches Médicales de Franceville (CIRMF), BP 769 Franceville, Gabon; Molécules de Communication et Adaptation des Microorganismes (MCAM, UMR 7245), Sorbonne Universités, Muséum National d'Histoire Naturelle, CNRS, CP52, 57 rue Cuvier 75005 Paris, France; Ecole Doctorale Régionale en Infectiologie Tropicale d'Afrique Centrale (ECODRAC), BP 876 Franceville, Gabon.
The analysis of immune responses in diverse malaria endemic regions provides more information to understand the host's immune response to Plasmodium falciparum. Several plasmodial antigens have been reported as targets of human immunity. PfAMA1 is one of the most studied vaccine candidates; PfRH5 and Pf113 are new promising vaccine candidates. The aim of this study was to evaluate the humoral response against these three antigens among children of Lastourville (rural area) and Franceville (urban area). Malaria was diagnosed using rapid diagnostic tests. Plasma samples were tested against these antigens by enzyme-linked immunosorbent assay (ELISA). We found that malaria prevalence was five times higher in the rural area than in the urban area (p < 0.0001). The anti-PfAMA1 and anti-PfRH5 response levels were significantly higher in Lastourville than in Franceville (p < 0.0001; p = 0.005). The anti-PfAMA1 response was higher than the anti-Pf113 response, which in turn was higher than the anti-PfRH5 response in both sites. Anti-PfAMA1 levels were significantly higher in infected children than those in uninfected children (p = 0.001) in Franceville. Anti-Pf113 and anti-PfRH5 antibody levels were lowest in children presenting severe malarial anemia. These three antigens are targets of immunity in Gabon. Further studies on the role of Pf113 in antimalarial protection against severe anemia are needed.
European journal of microbiology & immunology 2016;6;4;287-297
S1PR2 variants associated with auditory function in humans and endocochlear potential decline in mouse.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
Progressive hearing loss is very common in the population but we still know little about the underlying pathology. A new spontaneous mouse mutation (stonedeaf, stdf) leading to recessive, early-onset progressive hearing loss was detected and exome sequencing revealed a Thr289Arg substitution in Sphingosine-1-Phosphate Receptor-2 (S1pr2). Mutants aged 2 weeks had normal hearing sensitivity, but at 4 weeks most showed variable degrees of hearing impairment, which became severe or profound in all mutants by 14 weeks. Endocochlear potential (EP) was normal at 2 weeks old but was reduced by 4 and 8 weeks old in mutants, and the stria vascularis, which generates the EP, showed degenerative changes. Three independent mouse knockout alleles of S1pr2 have been described previously, but this is the first time that a reduced EP has been reported. Genomic markers close to the human S1PR2 gene were significantly associated with auditory thresholds in the 1958 British Birth Cohort (n = 6099), suggesting involvement of S1P signalling in human hearing loss. The finding of early onset loss of EP gives new mechanistic insight into the disease process and suggests that therapies for humans with hearing loss due to S1P signalling defects need to target strial function.
Funded by: Medical Research Council: G0000934, G0300212, MC_QA137918; NIDDK NIH HHS: U01 DK062418; Wellcome Trust: 068545/Z/02, 076113/B/04/Z, 079895, 089622AIA, 098051, 100699
Scientific reports 2016;6;28964
Evolution of atypical enteropathogenic E. coli by repeated acquisition of LEE pathogenicity island variants.
Department of Microbiology and Immunology, The University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Victoria 3010, Australia.
Atypical enteropathogenic Escherichia coli (aEPEC) is an umbrella term given to E. coli that possess a type III secretion system encoded in the locus of enterocyte effacement (LEE), but lack the virulence factors (stx, bfpA) that characterize enterohaemorrhagic E. coli and typical EPEC, respectively. The burden of disease caused by aEPEC has recently increased in industrialized and developing nations, yet the population structure and virulence profile of this emerging pathogen are poorly understood. Here, we generated whole-genome sequences of 185 aEPEC isolates collected during the Global Enteric Multicenter Study from seven study sites in Asia and Africa, and compared them with publicly available E. coli genomes. Phylogenomic analysis revealed ten distinct widely distributed aEPEC clones. Analysis of genetic variation in the LEE pathogenicity island identified 30 distinct LEE subtypes divided into three major lineages. Each LEE lineage demonstrated a preferred chromosomal insertion site and different complements of non-LEE encoded effector genes, indicating distinct patterns of evolution of these lineages. This study provides the first detailed genomic framework for aEPEC in the context of the EPEC pathotype and will facilitate further studies into the epidemiology and pathogenicity of EPEC by enabling the detection and tracking of specific clones and LEE variants.
Funded by: Medical Research Council: MC_U190074190, MC_U190081991, MC_UP_A900_1122
Nature microbiology 2016;1;15010
Molecular Surveillance Identifies Multiple Transmissions of Typhoid in West Africa.
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom.
Background: The burden of typhoid in sub-Saharan African (SSA) countries has been difficult to estimate, in part, due to suboptimal laboratory diagnostics. However, surveillance blood cultures at two sites in Nigeria have identified typhoid associated with Salmonella enterica serovar Typhi (S. Typhi) as an important cause of bacteremia in children.
Methods: A total of 128 S. Typhi isolates from these studies in Nigeria were whole-genome sequenced, and the resulting data was used to place these Nigerian isolates into a worldwide context based on their phylogeny and carriage of molecular determinants of antibiotic resistance.
Results: Several distinct S. Typhi genotypes were identified in Nigeria that were related to other clusters of S. Typhi isolates from north, west and central regions of Africa. The rapidly expanding S. Typhi clade 4.3.1 (H58) previously associated with multiple antimicrobial resistances in Asia and in east, central and southern Africa, was not detected in this study. However, antimicrobial resistance was common amongst the Nigerian isolates and was associated with several plasmids, including the IncHI1 plasmid commonly associated with S. Typhi.
Conclusions: These data indicate that typhoid in Nigeria was established through multiple independent introductions into the country, with evidence of regional spread. MDR typhoid appears to be evolving independently of the haplotype H58 found in other typhoid endemic countries. This study highlights an urgent need for routine surveillance to monitor the epidemiology of typhoid and evolution of antimicrobial resistance within the bacterial population as a means to facilitate public health interventions to reduce the substantial morbidity and mortality of typhoid.
PLoS neglected tropical diseases 2016;10;9;e0004781
A Landscape of Pharmacogenomic Interactions in Cancer.
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SA, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Cambridge CB10 1SA, UK.
Systematic studies of cancer genomes have provided unprecedented insights into the molecular nature of cancer. Using this information to guide the development and application of therapies in the clinic is challenging. Here, we report how cancer-driven alterations identified in 11,289 tumors from 29 tissues (integrating somatic mutations, copy number alterations, DNA methylation, and gene expression) can be mapped onto 1,001 molecularly annotated human cancer cell lines and correlated with sensitivity to 265 drugs. We find that cell lines faithfully recapitulate oncogenic alterations identified in tumors, find that many of these associate with drug sensitivity/resistance, and highlight the importance of tissue lineage in mediating drug response. Logic-based modeling uncovers combinations of alterations that sensitize to drugs, while machine learning demonstrates the relative importance of different data types in predicting drug response. Our analysis and datasets are rich resources to link genotypes with cellular phenotypes and to identify therapeutic options for selected cancer sub-populations.
Funded by: Cancer Research UK; European Research Council: 268626; Marie Curie; NCI NIH HHS: U24 CA143835; Wellcome Trust: 086375, 102696
Discovery and refinement of genetic loci associated with cardiometabolic risk using dense imputation maps.
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
Large-scale whole-genome sequence data sets offer novel opportunities to identify genetic variation underlying human traits. Here we apply genotype imputation based on whole-genome sequence data from the UK10K and 1000 Genomes Project into 35,981 study participants of European ancestry, followed by association analysis with 20 quantitative cardiometabolic and hematological traits. We describe 17 new associations, including 6 rare (minor allele frequency (MAF) < 1%) or low-frequency (1% < MAF < 5%) variants with platelet count (PLT), red blood cell indices (MCH and MCV) and HDL cholesterol. Applying fine-mapping analysis to 233 known and new loci associated with the 20 traits, we resolve the associations of 59 loci to credible sets of 20 or fewer variants and describe trait enrichments within regions of predicted regulatory function. These findings improve understanding of the allelic architecture of risk factors for cardiometabolic and hematological diseases and provide additional functional insights with the identification of potentially novel biological targets.
Funded by: British Heart Foundation: SP/04/002; Medical Research Council: G0601966, G0700931, G0800270, MC_PC_15018, MC_U106179471, MC_UU_12013/1-9, MC_UU_12015/1, MC_UU_12015/2; NHLBI NIH HHS: HHSN268201100046C, R21 HL121422; NIA NIH HHS: HHSN271201100004C; NIH HHS: S10 OD020069; WHI NIH HHS: HHSN268201100001C, HHSN268201100002C, HHSN268201100003C, HHSN268201100004C; Wellcome Trust: 084723/Z/08/Z, 091310, 092731, WT091310, WT092447/B/10/Z, WT098051
Nature genetics 2016;48;11;1303-1312
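The frequency classes quoted in the abstract above (rare: MAF < 1%; low-frequency: 1% < MAF < 5%) can be illustrated with a minimal Python sketch. The function names, the "common" label for MAF of 5% or more, and the choice to place a MAF of exactly 1% in the low-frequency bin are assumptions made for illustration; they are not taken from the paper.

```python
def minor_allele_frequency(alt_count: int, total_alleles: int) -> float:
    """Return the minor allele frequency from an alternate-allele count.

    The minor allele is whichever allele is less frequent, so the result
    is always between 0 and 0.5.
    """
    if total_alleles <= 0 or not 0 <= alt_count <= total_alleles:
        raise ValueError("counts must satisfy 0 <= alt_count <= total_alleles, total_alleles > 0")
    alt_freq = alt_count / total_alleles
    return min(alt_freq, 1.0 - alt_freq)


def classify_maf(maf: float) -> str:
    """Bin a variant by MAF using the thresholds quoted in the abstract.

    Assumptions (not from the paper): a MAF of exactly 1% is treated as
    low-frequency, and anything at or above 5% is labelled "common".
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must lie between 0 and 0.5")
    if maf < 0.01:
        return "rare"
    if maf < 0.05:
        return "low-frequency"
    return "common"


if __name__ == "__main__":
    # 30 alternate alleles among 2,000 chromosomes gives a MAF of 1.5%.
    maf = minor_allele_frequency(30, 2000)
    print(maf, classify_maf(maf))
```

Because the minimum of the two allele frequencies is taken, the classification does not depend on which allele happens to be the reference.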
In vivo genome-wide profiling reveals a tissue-specific role for 5-formylcytosine.
The Babraham Institute, Epigenetics Programme, Cambridge, CB22 3AT, UK.
Background: Genome-wide methylation of cytosine can be modulated in the presence of TET and thymine DNA glycosylase (TDG) enzymes. TET is able to oxidise 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). TDG can excise the oxidative products 5fC and 5caC, initiating base excision repair. These modified bases are stable and detectable in the genome, suggesting that they could have epigenetic functions in their own right. However, functional investigation of the genome-wide distribution of 5fC has been restricted to cell culture-based systems, while its in vivo profile remains unknown.
Results: Here, we describe the first analysis of the in vivo genome-wide profile of 5fC across a range of tissues from both wild-type and Tdg-deficient E11.5 mouse embryos. Changes in the formylation profile of cytosine upon depletion of TDG suggest TET/TDG-mediated active demethylation occurs preferentially at intron-exon boundaries and reveals a major role for TDG in shaping 5fC distribution at CpG islands. Moreover, we find that active enhancer regions specifically exhibit high levels of 5fC, resulting in characteristic tissue-diagnostic patterns, which suggest a role in embryonic development.
Conclusions: The tissue-specific distribution of 5fC can be regulated by the collective contribution of TET-mediated oxidation and excision by TDG. The in vivo profile of 5fC during embryonic development resembles that of embryonic stem cells, sharing key features including enrichment of 5fC in enhancer and intragenic regions. Additionally, by investigating mouse embryo 5fC profiles in a tissue-specific manner, we identify targeted enrichment at active enhancers involved in tissue development.
Funded by: Biotechnology and Biological Sciences Research Council: BBS/E/B/0000H112; Cancer Research UK; Medical Research Council; Wellcome Trust
Genome biology 2016;17;1;141
Kinetoplastid Phylogenomics Reveals the Evolutionary Innovations Associated with the Origins of Parasitism.
Department of Infection Biology, Institute of Infection and Global Health, University of Liverpool, Liverpool Science Park Ic2, 146 Brownlow Hill, Liverpool L3 5RF, UK.
The evolution of parasitism is a recurrent event in the history of life and a core problem in evolutionary biology. Trypanosomatids are important parasites and include the human pathogens Trypanosoma brucei, Trypanosoma cruzi, and Leishmania spp., which in humans cause African trypanosomiasis, Chagas disease, and leishmaniasis, respectively. Genome comparison between trypanosomatids reveals that these parasites have evolved specialized cell-surface protein families, overlaid on a well-conserved cell template. Understanding how these features evolved and which ones are specifically associated with parasitism requires comparison with related non-parasites. We have produced genome sequences for Bodo saltans, the closest known non-parasitic relative of trypanosomatids, and a second bodonid, Trypanoplasma borreli. Here we show how genomic reduction and innovation contributed to the character of trypanosomatid genomes. We show that gene loss has "streamlined" trypanosomatid genomes, particularly with respect to macromolecular degradation and ion transport, but consistent with a widespread loss of functional redundancy, while adaptive radiations of gene families involved in membrane function provide the principal innovations in trypanosomatid evolution. Gene gain and loss continued during trypanosomatid diversification, resulting in the asymmetric assortment of ancestral characters such as peptidases between Trypanosoma and Leishmania, genomic differences that were subsequently amplified by lineage-specific innovations after divergence. Finally, we show how species-specific, cell-surface gene families (DGF-1 and PSA) with no apparent structural similarity are independent derivations of a common ancestral form, which we call "bodonin." This new evidence defines the parasitic innovations of trypanosomatid genomes, revealing how a free-living phagotroph became adapted to exploiting hostile host environments.
Current biology : CB 2016;26;2;161-172
DNA REPAIR. Drugging DNA repair.
The Wellcome Trust/Cancer Research UK Gurdon Institute and Department of Biochemistry, University of Cambridge, Cambridge CB2 1QN, UK. The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.
Science (New York, N.Y.) 2016;352;6290;1178-9
WGS analysis and molecular resistance mechanisms of azithromycin-resistant (MIC >2 mg/L) Neisseria gonorrhoeae isolates in Europe from 2009 to 2014.
Örebro University, Örebro, Sweden.
Objectives: To elucidate the genome-based epidemiology and phylogenomics of azithromycin-resistant (MIC >2 mg/L) Neisseria gonorrhoeae strains collected in 2009-14 in Europe and clarify the azithromycin resistance mechanisms.
Methods: Seventy-five azithromycin-resistant (MIC 4 to >256 mg/L) N. gonorrhoeae isolates collected in 17 European countries during 2009-14 were examined using antimicrobial susceptibility testing and WGS.
Results: Thirty-six N. gonorrhoeae multi-antigen sequence typing STs and five phylogenomic clades, including 4-22 isolates from several countries per clade, were identified. The azithromycin target mutation A2059G (Escherichia coli numbering) was found in all four alleles of the 23S rRNA gene in all isolates with high-level azithromycin resistance (n = 4; MIC ≥256 mg/L). The C2611T mutation was identified in two to four alleles of the 23S rRNA gene in the remaining 71 isolates. Mutations in mtrR and its promoter were identified in 43 isolates, comprising isolates within the whole azithromycin MIC range. No mutations associated with azithromycin resistance were found in the rplD gene or the rplV gene and none of the macrolide resistance-associated genes [mef(A/E), ere(A), ere(B), erm(A), erm(B), erm(C) and erm(F)] were identified in any isolate.
Conclusions: Clonal spread of relatively few N. gonorrhoeae strains accounts for the majority of the azithromycin resistance (MIC >2 mg/L) in Europe. The four isolates with high-level resistance to azithromycin (MIC ≥256 mg/L) were widely separated in the phylogenomic tree and did not belong to any of the main clades. The main azithromycin resistance mechanisms were the A2059G mutation (high-level resistance) and the C2611T mutation (low- and moderate-level resistance) in the 23S rRNA gene.
The Journal of antimicrobial chemotherapy 2016;71;11;3109-3116
Pan-genomic perspective on the evolution of the Staphylococcus aureus USA300 epidemic.
The Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK.
Staphylococcus aureus USA300 represents the dominant community-associated methicillin-resistant S. aureus lineage in the USA, where it is a major cause of skin and soft tissue infections. Previous comparative genomic studies have described the population structure and evolution of USA300 based on geographically restricted isolate collections. Here, we investigated the USA300 population by sequencing genomes of a geographically distributed panel of 191 clinical S. aureus isolates belonging to clonal complex 8 (CC8), derived from the Tigecycline Evaluation and Surveillance Trial program. Isolates were collected at 12 healthcare centres across nine USA states in 2004, 2009 or 2010. Reconstruction of evolutionary relationships revealed that CC8 was dominated by USA300 isolates (154/191, 81 %), which were heterogeneous and demonstrated limited phylogeographic clustering. Analysis of the USA300 core genomes revealed an increase in median pairwise SNP distance from 62 to 98 between 2004 and 2010, with a stable pattern of above average dN/dS ratios. The phylogeny of the USA300 population indicated that early diversification events led to the formation of nested clades, which arose through cumulative acquisition of predominantly non-synonymous SNPs in various coding sequences. The accessory genome of USA300 was largely homogenous and consisted of elements previously associated with this lineage. We observed an emergence of SCCmec negative and ACME negative USA300 isolates amongst more recent samples, and an increase in the prevalence of ϕSa5 prophage. Together, the analysed S. aureus USA300 collection revealed an evolving pan-genome through increased core genome heterogeneity and temporal variation in the frequency of certain accessory elements.
Funded by: Wellcome Trust: 098051
Microbial genomics 2016;2;5;e000058
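The "median pairwise SNP distance" statistic reported in the abstract above can be illustrated with a small Python sketch. It assumes the isolates are already represented as equal-length aligned core-genome sequences; the function names and the toy sequences are hypothetical, and positions where either sequence has a gap or ambiguous base are skipped as a simplifying assumption. This is not the pipeline used in the paper.

```python
from itertools import combinations
from statistics import median

UNAMBIGUOUS = set("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where two sequences carry different bases.

    Positions with a gap or ambiguity code in either sequence are ignored
    (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in UNAMBIGUOUS and b in UNAMBIGUOUS
    )


def median_pairwise_snp_distance(sequences: list[str]) -> float:
    """Return the median SNP distance over all unordered pairs of sequences."""
    distances = [snp_distance(a, b) for a, b in combinations(sequences, 2)]
    if not distances:
        raise ValueError("at least two sequences are required")
    return median(distances)


if __name__ == "__main__":
    # Toy alignment of three hypothetical isolates (not real data).
    toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]
    print(median_pairwise_snp_distance(toy_alignment))
```

Computing the statistic separately for isolates from each sampling year would give the kind of year-on-year comparison the abstract describes.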
Lineage-Specific Genome Architecture Links Enhancers and Non-coding Disease Variants to Target Gene Promoters.
Nuclear Dynamics Programme, The Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK.
Long-range interactions between regulatory elements and gene promoters play key roles in transcriptional regulation. The vast majority of interactions are uncharted, constituting a major missing link in understanding genome control. Here, we use promoter capture Hi-C to identify interacting regions of 31,253 promoters in 17 human primary hematopoietic cell types. We show that promoter interactions are highly cell type specific and enriched for links between active promoters and epigenetically marked enhancers. Promoter interactomes reflect lineage relationships of the hematopoietic tree, consistent with dynamic remodeling of nuclear architecture during differentiation. Interacting regions are enriched in genetic variants linked with altered expression of genes they contact, highlighting their functional role. We exploit this rich resource to connect non-coding disease variants to putative target promoters, prioritizing thousands of disease-candidate genes and implicating disease pathways. Our results demonstrate the power of primary cell promoter interactomes to reveal insights into genomic regulatory mechanisms underlying common diseases.
Molecular characterisation of the Chlamydia pecorum plasmid from porcine, ovine, bovine, and koala strains indicates plasmid-strain co-evolution.
Centre for Animal Health Innovation, University of the Sunshine Coast , Sippy Downs, Queensland , Australia.
Background. Highly stable, evolutionarily conserved, small, non-integrative plasmids are commonly found in members of the Chlamydiaceae and, in some species, these plasmids have been strongly linked to virulence. To date, evidence for such a plasmid in Chlamydia pecorum has been ambiguous. In a recent comparative genomic study of porcine, ovine, bovine, and koala C. pecorum isolates, we identified plasmids (pCpec) in a pig and three koala strains, respectively. Screening of further porcine, ovine, bovine, and koala C. pecorum isolates for pCpec showed that pCpec is common, but not ubiquitous in C. pecorum from all of the infected hosts. Methods. We used a combination of (i) bioinformatic mining of previously sequenced C. pecorum genome data sets and (ii) pCpec PCR-amplicon sequencing to characterise a further 17 novel pCpecs in C. pecorum isolates obtained from livestock, including pigs, sheep, and cattle, as well as those from koala. Results and Discussion. This analysis revealed that pCpec is conserved with all eight coding domain sequences (CDSs) present in isolates from each of the hosts studied. Sequence alignments revealed that the 21 pCpecs show 99% nucleotide sequence identity, with 83 single nucleotide polymorphisms (SNPs) shown to differentiate all of the plasmids analysed in this study. SNPs were found to be mostly synonymous and were distributed evenly across all eight pCpec CDSs as well as in the intergenic regions. Although conserved, analyses of the 21 pCpec sequences resolved plasmids into 12 distinct genotypes, with five shared between pCpecs from different isolates, and the remaining seven genotypes being unique to a single pCpec. Phylogenetic analysis revealed congruency and co-evolution of pCpecs with their cognate chromosome, further supporting polyphyletic origin of the koala C. pecorum. 
This study provides further understanding of the complex epidemiology of this pathogen in livestock and koala hosts and paves the way for studies to evaluate the function of this putative C. pecorum virulence factor.
Whole-exome sequencing in an isolated population from the Dalmatian island of Vis.
Department of Research in Biomedicine and Health, University of Split School of Medicine, Split, Croatia.
We have whole-exome sequenced 176 individuals from the isolated population of the island of Vis in Croatia in order to describe exonic variation architecture. We found 290 577 single nucleotide variants (SNVs), 65% of which are singletons, low frequency or rare variants. A total of 25 430 (9%) SNVs are novel, previously not catalogued in NHLBI GO Exome Sequencing Project, UK10K-Generation Scotland, 1000Genomes Project, ExAC or NCBI Reference Assembly dbSNP. The majority of these variants (76%) are singletons. Comparable to data obtained from UK10K-Generation Scotland that were sequenced and analysed using the same protocols, we detected an enrichment of potentially damaging variants (non-synonymous and loss-of-function) in the low frequency and common variant categories. On average 115 (range 93-140) genotypes with loss-of-function variants, 23 (15-34) of which were homozygous, were identified per person. The landscape of loss-of-function variants across an exome revealed that variants mainly accumulated in genes on the xenobiotic-related pathways, the majority of which coded for enzymes. The frequency of loss-of-function variants was additionally increased in Vis runs of homozygosity regions where variants mainly affected signalling pathways. This work confirms the isolate status of the Vis population by means of whole-exome sequencing and reveals the pattern of loss-of-function mutations, which resembles the trails of adaptive evolution that were found in other species. By cataloguing the exomic variants and describing the allelic structure of the Vis population, this study will serve as a valuable resource for future genetic studies of human diseases, population genetics and evolution in this population.
Funded by: Medical Research Council: MC_PC_U127561128; Wellcome Trust: 098051
European journal of human genetics : EJHG 2016;24;10;1479-87
Heterogeneity of CD34 and CD38 expression in acute B lymphoblastic leukemia cells is reversible and not hierarchically organized.
State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, 190 Kaiyuan Avenue, Science Park, Guangzhou, Guangdong, 510530, China.
The existence and identification of leukemia-initiating cells in adult acute B lymphoblastic leukemia (B-ALL) remain controversial. We examined whether adult B-ALL is hierarchically organized into phenotypically distinct subpopulations of leukemogenic and non-leukemogenic cells or whether most B-ALL cells retain leukemogenic capacity, irrespective of their immunophenotype profiles. Our results suggest that adult B-ALL follows the stochastic stem cell model and that the expression of CD34 and CD38 in B-ALL is reversibly and not hierarchically organized.
Journal of hematology & oncology 2016;9;1;94
Anti-GPC3-CAR T Cells Suppress the Growth of Tumor Cells in Patient-Derived Xenografts of Hepatocellular Carcinoma.
State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China; Key Laboratory of Regenerative Biology, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China; Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China.
Background: The lack of a general clinic-relevant model for human cancer is a major impediment to the acceleration of novel therapeutic approaches for clinical use. We propose to establish and characterize primary human hepatocellular carcinoma (HCC) xenografts that can be used to evaluate the cytotoxicity of adoptive chimeric antigen receptor (CAR) T cells and accelerate the clinical translation of CAR T cells used in HCC.
Methods: Primary HCCs were used to establish the xenografts. The morphology, immunological markers, and gene expression characteristics of xenografts were detected and compared to those of the corresponding primary tumors. CAR T cells were adoptively transplanted into patient-derived xenograft (PDX) models of HCC. The cytotoxicity of CAR T cells in vivo was evaluated.
Results: PDX1, PDX2, and PDX3 were established using primary tumors from three individual HCC patients. All three PDXs maintained original tumor characteristics in their morphology, immunological markers, and gene expression. Tumors in PDX1 grew more slowly than those in PDX2 and PDX3. Glypican 3 (GPC3)-CAR T cells efficiently suppressed tumor growth in PDX3 and impressively eradicated tumor cells from PDX1 and PDX2, in which GPC3 proteins were highly expressed.
Conclusion: GPC3-CAR T cells were capable of effectively eliminating tumors in PDX models of HCC. Therefore, GPC3-CAR T cell therapy is a promising candidate for HCC treatment.
Frontiers in immunology 2016;7;690
Identification of new heat-stable (STa) enterotoxin allele variants produced by human enterotoxigenic Escherichia coli (ETEC).
Department of Microbiology and Immunology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden; Institute of Molecular Biology and Biotechnology, Universidad Mayor de San Andrés, La Paz, Bolivia.
We describe natural variants of the heat stable toxin (STa) produced by enterotoxigenic Escherichia coli (ETEC) isolates collected worldwide. Previous studies of ETEC isolated from human diarrheal cases have reported the existence of three natural STa gene variants estA1, estA2 and estA3/4 where the first variant encodes STp (porcine, bovine, and human origin) and the two latter ones encode STh (human origin). We identified STa sequences by BLASTn and profiled ST amino acid polymorphisms in a collection of 118 clinical ETEC isolates from children and adults from Asia, Africa, and Latin America that were characterized by whole genome sequencing. Three novel variants of STp and STh were found and designated STa5 and STa6, and STa7, respectively. Presence of glucose significantly decreased the production of STh and STp toxin variants (p<0.05) as well as downregulated the gene expression (STh: p<0.001, STp: p<0.05). We found that the ETEC isolates producing the most common STp variant, STa5, co-expressed coli surface antigen CS6 and were significantly associated with disease in adults in this data set (p<0.001). Expression of mature STa5 peptide as well as gene expression of tolC, involved in ST secretion, increased in response to bile (p<0.05). ETEC expressing the common STh variant STa3/4 was associated with disease in children (p<0.05). The crp gene, which positively regulates estA3/4 encoding STa3/4, and estA3/4 itself had decreased transcriptional levels in the presence of bile. Since bile levels in the intestine are lower in children than adults, these results may suggest differences in pathogenicity of ETEC in children and adult populations.
International journal of medical microbiology : IJMM 2016
Targeting the RB-E2F pathway in breast cancer.
Division of Molecular Carcinogenesis and Cancer Genomics Netherlands, The Netherlands Cancer Institute, Amsterdam, The Netherlands.
Mutations of the retinoblastoma tumor-suppressor gene (RB1) or components regulating the CDK-RB-E2F pathway have been identified in nearly every human malignancy. Re-establishing cell cycle control through cyclin-dependent kinase (CDK) inhibition has therefore emerged as an attractive option in the development of targeted cancer therapy. The most successful example of this today is the use of the CDK4/6 inhibitor palbociclib combined with aromatase inhibitors for the treatment of estrogen receptor-positive breast cancers. Multiple studies have demonstrated that the CDK-RB-E2F pathway is critical for the control of cell proliferation. More recently, studies have highlighted additional roles of this pathway, especially E2F transcription factors themselves, in tumor progression, angiogenesis and metastasis. Specific E2Fs also have prognostic value in breast cancer, independent of clinical parameters. We discuss here recent advances in understanding of the RB-E2F pathway in breast cancer. We also discuss the application of genome-wide genetic screening efforts to gain insight into synthetic lethal interactions of CDK4/6 inhibitors in breast cancer for the development of more effective combination therapies.
Funded by: Wellcome Trust: 102696
cgpCaVEManWrapper: Simple Execution of CaVEMan in Order to Detect Somatic Single Nucleotide Variants in NGS Data.
Cancer Genome Project, Wellcome Trust Sanger Institute, Cambridge, United Kingdom.
CaVEMan is an expectation maximization-based somatic substitution-detection algorithm that is written in C. The algorithm analyzes sequence data from a test sample, such as a tumor relative to a reference normal sample from the same patient and the reference genome. It performs a comparative analysis of the tumor and normal sample to derive a probabilistic estimate for putative somatic substitutions. When combined with a set of validated post-hoc filters, CaVEMan generates a set of somatic substitution calls with high recall and positive predictive value. Here we provide instructions for using a wrapper script called cgpCaVEManWrapper, which runs the CaVEMan algorithm and additional downstream post-hoc filters. We describe both a simple one-shot run of cgpCaVEManWrapper and a more in-depth implementation suited to large-scale compute farms. © 2016 by John Wiley & Sons, Inc.
Funded by: Wellcome Trust: 098051
Current protocols in bioinformatics 2016;56;15.10.1-15.10.18
Salmonella Enteritidis Isolate Harboring Multiple Efflux Pumps and Pathogenicity Factors, Shows Absence of O Antigen Polymerase Gene.
National Reference Laboratory of Antibiotic Resistances and Healthcare Associated Infections, Department of Infectious Diseases, National Health Institute Doutor Ricardo Jorge (INSA), Lisbon, Portugal; Centre for the Studies of Animal Science, Institute of Agrarian and Agri-Food Sciences and Technologies, University of Porto, Porto, Portugal.
Frontiers in microbiology 2016;7;1130
Heterozygous KIDINS220/ARMS nonsense variants cause spastic paraplegia, intellectual disability, nystagmus, and obesity.
Department of Clinical Genetics, Guy's and St. Thomas' Hospital, London SE1 7EH, UK.
We identified de novo nonsense variants in KIDINS220/ARMS in three unrelated patients with spastic paraplegia, intellectual disability, nystagmus, and obesity (SINO). KIDINS220 is an essential scaffold protein coordinating neurotrophin signal pathways in neurites and is spatially and temporally regulated in the brain. Molecular analysis of patients' variants confirmed expression and translation of truncated transcripts similar to recently characterized alternative terminal exon splice isoforms of KIDINS220. KIDINS220 undergoes extensive alternative splicing in specific neuronal populations and developmental time points, reflecting its complex role in neuronal maturation. In mice and humans, KIDINS220 is alternatively spliced in the middle region as well as in the last exon. These full-length and KIDINS220 splice variants occur at precise moments in cortical, hippocampal, and motor neuron development, with splice variants similar to the variants seen in our patients and lacking the last exon of KIDINS220 occurring in adult rather than in embryonic brain. We conducted tissue-specific expression studies in zebrafish that resulted in spasms, confirming a functional link with disruption of the KIDINS220 levels in developing neurites. This work reveals a crucial physiological role of KIDINS220 in development and provides insight into how perturbation of the complex interplay of KIDINS220 isoforms and their relative expression can affect neuron control and human metabolism. Altogether, we here show that de novo protein-truncating KIDINS220 variants cause a new syndrome, SINO. This is the first report of KIDINS220 variants causing a human disease.
Funded by: Wellcome Trust: WT098051
Human molecular genetics 2016;25;11;2158-2167
New native South American Y chromosome lineages.
Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil.
Many single-nucleotide polymorphisms (SNPs) in the non-recombining region of the human Y chromosome have been described in the last decade. High-coverage sequencing has helped to characterize new SNPs, which has in turn increased the level of detail in paternal phylogenies. However, these paternal lineages still provide insufficient information on population history and demography, especially for Native Americans. The present study aimed to identify informative paternal sublineages derived from the main founder lineage of the Americas (haplogroup Q-L54) in a sample of 1841 native South Americans. For this purpose, we used a Y-chromosomal genotyping multiplex platform and conventional genotyping methods to validate 34 new SNPs that were identified in the present study by sequencing, together with many Y-SNPs previously described in the literature. We updated the haplogroup Q phylogeny and identified two new Q-M3 and three new Q-L54*(xM3) sublineages defined by five informative SNPs, designated SA04, SA05, SA02, SA03 and SA29. Within the Q-M3, sublineage Q-SA04 was mostly found in individuals from ethnic groups belonging to the Tukanoan linguistic family in the northwest Amazon, whereas sublineage Q-SA05 was found in Peruvian and Bolivian Amazon ethnic groups. Within Q-L54*, the derived sublineages Q-SA03 and Q-SA02 were exclusively found among Coyaima individuals (Cariban linguistic family) from Colombia, while Q-SA29 was found only in Maxacali individuals (Jean linguistic family) from southeast Brazil. Furthermore, we validated the usefulness of several published SNPs among indigenous South Americans. This new Y chromosome haplogroup Q phylogeny offers an informative paternal genealogy to investigate the pre-Columbian history of South America. Journal of Human Genetics advance online publication, 31 March 2016; doi:10.1038/jhg.2016.26.
Journal of human genetics 2016;61;7;593-603
Deficiency of the zinc finger protein ZFP106 causes motor and sensory neurodegeneration.
MRC Mammalian Genetics Unit, Harwell, Oxfordshire OX11 0RD, UK.
Zinc finger motifs are distributed amongst many eukaryotic protein families, directing nucleic acid-protein and protein-protein interactions. Zinc finger protein 106 (ZFP106) has previously been associated with roles in immune response, muscle differentiation, testes development and DNA damage, although little is known about its specific function. To further investigate the function of ZFP106, we performed an in-depth characterization of Zfp106 deficient mice (Zfp106(-/-)), and we report a novel role for ZFP106 in motor and sensory neuronal maintenance and survival. Zfp106(-/-) mice develop severe motor abnormalities, major deficits in muscle strength and histopathological changes in muscle. Intriguingly, despite being highly expressed throughout the central nervous system, Zfp106(-/-) mice undergo selective motor and sensory neuronal and axonal degeneration specific to the spinal cord and peripheral nervous system. Neurodegeneration does not occur during development of Zfp106(-/-) mice, suggesting that ZFP106 is likely required for the maintenance of mature peripheral motor and sensory neurons. Analysis of embryonic Zfp106(-/-) motor neurons revealed deficits in mitochondrial function, with an inhibition of Complex I within the mitochondrial electron transport chain. Our results highlight a vital role for ZFP106 in sensory and motor neuron maintenance and reveal a novel player in mitochondrial dysfunction and neurodegeneration.
Human molecular genetics 2016;25;2;291-307
Mutations at protein-protein interfaces: Small changes over big surfaces have large impacts on human health.
Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
Many essential biological processes including cell regulation and signalling are mediated through the assembly of protein complexes. Changes to protein-protein interaction (PPI) interfaces can affect the formation of multiprotein complexes, and consequently lead to disruptions in interconnected networks of PPIs within and between cells, further leading to phenotypic changes as functional interactions are created or disrupted. Mutations altering PPIs have been linked to the development of genetic diseases including cancer and rare Mendelian diseases, and to the development of drug resistance. The importance of these protein mutations has led to the development of many resources for understanding and predicting their effects. We propose that a better understanding of how these mutations affect the structure, function, and formation of multiprotein complexes provides novel opportunities for tackling them, including the development of small-molecule drugs targeted specifically to mutated PPIs.
Progress in biophysics and molecular biology 2016
Comparison of bacterial genome assembly software for MinION data and their applicability to medical microbiology.
Department of Medicine, University of Cambridge, Level 5, Addenbrookes Hospital, CB2 0QQ Cambridge, UK.
Translating the Oxford Nanopore MinION sequencing technology into medical microbiology requires on-going analysis that keeps pace with technological improvements to the instrument and release of associated analysis software. Here, we use a multidrug-resistant Enterobacter kobei isolate as a model organism to compare open source software for the assembly of genome data, and relate this to the time taken to generate actionable information. Three software tools (PBcR, Canu and miniasm) were used to assemble MinION data and a fourth (SPAdes) was used to combine MinION and Illumina data to produce a hybrid assembly. All four had a similar number of contigs and were more contiguous than the assembly using Illumina data alone, with SPAdes producing a single chromosomal contig. Evaluation of the four assemblies to represent the genome structure revealed a single large inversion in the SPAdes assembly, which also incorrectly integrated a plasmid into the chromosomal contig. Almost 50 %, 80 % and 90 % of MinION pass reads were generated in the first 6, 9 and 12 h, respectively. Using data from the first 6 h alone led to a less accurate, fragmented assembly, but data from the first 9 or 12 h generated similar assemblies to that from 48 h sequencing. Assemblies were generated in 2 h using Canu, indicating that going from isolate to assembled data is possible in less than 48 h. MinION data identified that genes responsible for resistance were carried by two plasmids encoding resistance to carbapenem and to sulphonamides, rifampicin and aminoglycosides, respectively.
Funded by: Department of Health; Wellcome Trust: WT098600
Microbial genomics 2016;2;9;e000085
Efficient gene targeting in mouse zygotes mediated by CRISPR/Cas9-protein.
University of California, San Francisco Benioff Children's Hospital Oakland Research Institute, Oakland, CA, 94609, USA.
The CRISPR/Cas9 system has rapidly advanced targeted genome editing technologies. However, its efficiency in targeting with constructs in mouse zygotes via homology directed repair (HDR) remains low. Here, we systematically explored optimal parameters for targeting constructs in mouse zygotes via HDR using mouse embryonic stem cells as a model system. We characterized several parameters, including single guide RNA cleavage activity and the length and symmetry of homology arms in the construct, and we compared the targeting efficiency between Cas9, Cas9 nickase, and dCas9-FokI. We then applied the optimized conditions to zygotes, delivering Cas9 as either mRNA or protein. We found that Cas9 nucleo-protein complex promotes highly efficient, multiplexed targeting of circular constructs containing reporter genes and floxed exons. This approach allows for a one-step zygote injection procedure targeting multiple genes to generate conditional alleles via homologous recombination, and simultaneous knockout of corresponding genes in non-targeted alleles via non-homologous end joining.
Transgenic research 2016
Targeting Chromatin Regulators Inhibits Leukemogenic Gene Expression in NPM1 Mutant Leukemia.
Cancer Biology and Genetics Program, Memorial Sloan Kettering Cancer Center.
Homeobox (HOX) proteins and the receptor tyrosine kinase FLT3 are frequently highly expressed and mutated in acute myeloid leukemia (AML). Aberrant HOX expression is found in nearly all AMLs that harbor a mutation in the Nucleophosmin (NPM1) gene, and FLT3 is concomitantly mutated in approximately 60% of these cases. Little is known about how mutant NPM1 (NPM1mut) cells maintain aberrant gene expression. Here, we demonstrate that the histone modifiers MLL1 and DOT1L control HOX and FLT3 expression and differentiation in NPM1mut AML. Using a CRISPR-Cas9 genome editing domain screen, we show NPM1mut AML to be exceptionally dependent on the menin binding site in MLL1. Pharmacological small-molecule inhibition of the menin-MLL1 protein interaction had profound anti-leukemic activity in human and murine models of NPM1mut AML. Combined pharmacological inhibition of menin-MLL1 and DOT1L resulted in dramatic suppression of HOX and FLT3 expression, induction of differentiation, and superior activity against NPM1mut leukemia. Statement of significance: MLL1 and DOT1L are chromatin regulators that control HOX, MEIS1 and FLT3 expression and are therapeutic targets in NPM1mut AML. Combinatorial small-molecule inhibition has synergistic on-target activity and constitutes a novel therapeutic concept for this common AML subtype.
Cancer discovery 2016
Epstein-Barr virus nuclear protein EBNA3C directly induces expression of AID and somatic mutations in B cells.
Molecular Virology, Department of Medicine, Imperial College London, London W2 1PG, England, UK.
Activation-induced cytidine deaminase (AID), the enzyme responsible for induction of sequence variation in immunoglobulins (Igs) during the process of somatic hypermutation (SHM) and also Ig class switching, can have a potent mutator phenotype in the development of lymphoma. Using various Epstein-Barr virus (EBV) recombinants, we provide definitive evidence that the viral nuclear protein EBNA3C is essential in EBV-infected primary B cells for the induction of AID mRNA and protein. Using lymphoblastoid cell lines (LCLs) established with EBV recombinants conditional for EBNA3C function, this was confirmed, and it was shown that transactivation of the AID gene (AICDA) is associated with EBNA3C binding to highly conserved regulatory elements located proximal to and upstream of the AICDA transcription start site. EBNA3C binding initiated epigenetic changes to chromatin at specific sites across the AICDA locus. Deep sequencing of cDNA corresponding to the IgH V-D-J region from the conditional LCL was used to formally show that SHM is activated by functional EBNA3C and induction of AID. These data, showing the direct targeting and induction of functional AID by EBNA3C, suggest a novel role for EBV in the etiology of B cell cancers, including endemic Burkitt lymphoma.
Funded by: Wellcome Trust: 097005, 099273/Z/12/Z
The Journal of experimental medicine 2016;213;6;921-8
EPEC: a cocktail of virulence.
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Genomics studies are prompting a re-evaluation of the diversity of Escherichia coli pathovars and how this diversity corresponds to virulence.
Funded by: Medical Research Council: G1100100
Nature reviews. Microbiology 2016;14;4;196
Genome wide conditional mouse knockout resources
Drug Discovery Today: Disease Models 2016;20;3;12
Analysis with the exome array identifies multiple new independent variants in lipid loci.
William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK.
It has been hypothesised that low frequency (1-5% MAF) and rare (<1% MAF) variants with large effect sizes may contribute to the missing heritability in complex traits. Here we report an association analysis of lipid traits (total cholesterol, LDL-cholesterol, HDL-cholesterol, triglycerides) in up to 27,312 individuals with a comprehensive set of low frequency coding variants (ExomeChip), combined with conditional analysis in the known lipid loci. No new locus reached genome-wide significance. However, we found a new lead variant in 26 known lipid association regions, of which 16 were >1000-fold more significant than the previous sentinel variant and not in close LD (6 had MAF < 5%). Furthermore, conditional analysis revealed multiple independent signals (ranging from 1-5) in a third of the 98 lipid loci tested, including rare variants. Addition of our novel associations resulted in a 1.5- to 2.5-fold increase in the proportion of heritability explained for the different lipid traits. Our findings suggest that rare coding variants contribute to the genetic architecture of lipid traits.
Human molecular genetics 2016
The Ecological Dynamics of Fecal Contamination and Salmonella Typhi and Salmonella Paratyphi A in Municipal Kathmandu Drinking Water.
Oxford University Clinical Research Unit, Patan Academy of Health Sciences, Kathmandu, Nepal.
One of the UN sustainable development goals is to achieve universal access to safe and affordable drinking water by 2030. It is locations like Kathmandu, Nepal, a densely populated city in South Asia with endemic typhoid fever, where this goal is most pertinent. Aiming to understand the public health implications of water quality in Kathmandu we subjected weekly water samples from 10 sources for one year to a range of chemical and bacteriological analyses. We additionally aimed to detect the etiological agents of typhoid fever and longitudinally assess microbial diversity by 16S rRNA gene surveying. We found that the majority of water sources exhibited chemical and bacterial contamination exceeding WHO guidelines. Further analysis of the chemical and bacterial data indicated site-specific pollution, symptomatic of highly localized fecal contamination. Rainfall was found to be a key driver of this fecal contamination, correlating with nitrates and evidence of S. Typhi and S. Paratyphi A, for which DNA was detectable in 333 (77%) and 303 (70%) of 432 water samples, respectively. 16S rRNA gene surveying outlined a spectrum of fecal bacteria in the contaminated water, forming complex communities again displaying location-specific temporal signatures. Our data signify that the municipal water in Kathmandu is a predominant vehicle for the transmission of S. Typhi and S. Paratyphi A. This study represents the first extensive spatiotemporal investigation of water pollution in an endemic typhoid fever setting and implicates highly localized human waste as the major contributor to poor water quality in the Kathmandu Valley.
Funded by: Medical Research Council: G0902420, MR/K010174/1; NIGMS NIH HHS: U01 GM110721; Wellcome Trust: 098051, 100087, 100087/Z/12/Z
PLoS neglected tropical diseases 2016;10;1;e0004346
Retrospective Analysis of Serotype Switching of Vibrio cholerae O1 in a Cholera Endemic Region Shows It Is a Non-random Process.
Department of Microbiology and Immunology, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden.
Genomic data generated from clinical Vibrio cholerae O1 isolates collected over a five-year period in an area of Kolkata, India, with seasonal cholera outbreaks allowed a detailed genetic analysis of serotype switching that occurred from Ogawa to Inaba and back to Ogawa. The change from Ogawa to Inaba resulted from mutational disruption of the methyltransferase encoded by the wbeT gene. Re-emergence of the Ogawa serotype was found to result either from expansion of an already existing Ogawa clade or reversion of the mutation in an Inaba clade. Our data suggest that such transitions are not random events but rather driven by as yet unidentified selection mechanisms based on differences in the structure of the O1 antigen or in the serotype-determining wbeT gene.
PLoS neglected tropical diseases 2016;10;10;e0005044
Improving the Identification of Phenotypic Abnormalities and Sexual Dimorphism in Mice When Studying Rare Event Categorical Characteristics.
Wellcome Trust Sanger Institute.
Biological research frequently involves the study of phenotyping data. Many of these studies focus on rare event categorical data, and in functional genomics typically study the presence or absence of an abnormal phenotype. With the growing interest in the role of sex, there is a need to assess the phenotype for sexual dimorphism. The identification of abnormal phenotypes for downstream research is challenged by the small sample size, the rare event nature, and the multiple testing problem, as many variables are monitored simultaneously. Here we develop a statistical pipeline to assess statistical and biological significance whilst managing the multiple testing problem. We propose a two-step pipeline to initially assess for a treatment effect, in our case example genotype, and then test for an interaction with sex. We compare multiple statistical methods and use simulations to investigate the control of the type one error rate and power. To maximize the power whilst addressing the multiple testing issue we implement filters to remove datasets where the hypotheses to be tested cannot achieve significance. A motivating case study utilizing a large scale high throughput mouse phenotyping dataset from the Wellcome Trust Sanger Institute Mouse Genetics Project, where the treatment is a gene ablation, demonstrates the benefits of the new pipeline on the downstream biological calls.
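The filtering idea in the abstract above (dropping datasets whose hypotheses cannot reach significance) can be illustrated with a minimal sketch. It assumes, purely for illustration, a two-sided Fisher's exact test on a 2x2 table of abnormal/normal calls by knockout/control; the abstract does not state which test or threshold the pipeline uses, and the function names and the 0.05 cut-off here are hypothetical.

```python
from math import comb

def table_prob(a, row1, row2, col1):
    """Hypergeometric probability of a 2x2 table with top-left cell a and fixed margins."""
    return comb(row1, a) * comb(row2, col1 - a) / comb(row1 + row2, col1)

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = table_prob(a, row1, row2, col1)
    probs = [table_prob(x, row1, row2, col1) for x in range(lo, hi + 1)]
    # Sum every table that is no more probable than the observed one.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))

def min_attainable_p(row1, row2, col1):
    """Smallest p-value any table with these margins could produce."""
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return min(
        fisher_two_sided(x, row1 - x, col1 - x, row2 - (col1 - x))
        for x in range(lo, hi + 1)
    )

def can_reach_significance(row1, row2, col1, alpha=0.05):
    """Hypothetical filter: keep a dataset only if significance is attainable."""
    return min_attainable_p(row1, row2, col1) <= alpha

# Illustrative numbers only: 7 knockout mice, 100 controls.
print(can_reach_significance(7, 100, 1))  # one abnormal animal in total
print(can_reach_significance(7, 100, 2))  # two abnormal animals in total
```

With a single abnormal animal among 107, even the most extreme table gives p = 7/107, so such a dataset could be filtered out before multiple-testing correction without losing any attainable discovery.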
BRAF(V600E) Kinase Domain Duplication Identified in Therapy-Refractory Melanoma Patient-Derived Xenografts.
Division of Molecular Oncology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands.
The therapeutic landscape of melanoma is improving rapidly. Targeted inhibitors show promising results, but drug resistance often limits durable clinical responses. There is a need for in vivo systems that allow for mechanistic drug resistance studies and (combinatorial) treatment optimization. Therefore, we established a large collection of patient-derived xenografts (PDXs), derived from BRAF(V600E), NRAS(Q61), or BRAF(WT)/NRAS(WT) melanoma metastases prior to treatment with BRAF inhibitor and after resistance had occurred. Taking advantage of PDXs as a limitless source, we screened tumor lysates for resistance mechanisms. We identified a BRAF(V600E) protein harboring a kinase domain duplication (BRAF(V600E/DK)) in ∼10% of the cases, both in PDXs and in an independent patient cohort. While BRAF(V600E/DK) depletion restored sensitivity to BRAF inhibition, a pan-RAF dimerization inhibitor effectively eliminated BRAF(V600E/DK)-expressing cells. These results illustrate the utility of this PDX platform and warrant clinical validation of BRAF dimerization inhibitors for this group of melanoma patients.
Funded by: Cancer Research UK: 13031; Wellcome Trust: WT098051
Cell reports 2016;16;1;263-277
The nucleosome landscape of Plasmodium falciparum reveals chromatin architecture and dynamics of regulatory sequences.
Department of Molecular Biology, Radboud University, 6525GA Nijmegen, The Netherlands.
In eukaryotes, the chromatin architecture has a pivotal role in regulating all DNA-associated processes and it is central to the control of gene expression. For Plasmodium falciparum, a causative agent of human malaria, the nucleosome positioning profile of regulatory regions deserves particular attention because of their extreme AT-content. With the aid of a highly controlled MNase-seq procedure we reveal how positioning of nucleosomes provides a structural and regulatory framework to the transcriptional unit by demarcating landmark sites (transcription/translation start and end sites). In addition, our analysis provides strong indications for the function of positioned nucleosomes in splice site recognition. Transcription start sites (TSSs) are bordered by a small nucleosome-depleted region, but lack the stereotypic downstream nucleosome arrays, highlighting a key difference in chromatin organization compared to model organisms. Furthermore, we observe transcription-coupled eviction of nucleosomes on strong TSSs during intraerythrocytic development and demonstrate that nucleosome positioning and dynamics can be predictive for the functionality of regulatory DNA elements. Collectively, the strong nucleosome positioning over splice sites and surrounding putative transcription factor binding sites highlights the regulatory capacity of the nucleosome landscape in this deadly human pathogen.
Funded by: Wellcome Trust: WT 098051
Nucleic acids research 2016;44;5;2110-24
Polymorphism in a lincRNA Associates with a Doubled Risk of Pneumococcal Bacteremia in Kenyan Children.
Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK.
Bacteremia (bacterial bloodstream infection) is a major cause of illness and death in sub-Saharan Africa but little is known about the role of human genetics in susceptibility. We conducted a genome-wide association study of bacteremia susceptibility in more than 5,000 Kenyan children as part of the Wellcome Trust Case Control Consortium 2 (WTCCC2). Both the blood-culture-proven bacteremia case subjects and healthy infants as controls were recruited from Kilifi, on the east coast of Kenya. Streptococcus pneumoniae is the most common cause of bacteremia in Kilifi and was thus the focus of this study. We identified an association between polymorphisms in a long intergenic non-coding RNA (lincRNA) gene (AC011288.2) and pneumococcal bacteremia and replicated the results in the same population (p combined = 1.69 × 10^-9; OR = 2.47, 95% CI = 1.84-3.31). The susceptibility allele is African specific, derived rather than ancestral, and occurs at low frequency (2.7% in control subjects and 6.4% in case subjects). Our further studies showed AC011288.2 expression only in neutrophils, a cell type that is known to play a major role in pneumococcal clearance. Identification of this novel association will further focus research on the role of lincRNAs in human infectious disease.
American journal of human genetics 2016;98;6;1092-100
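As a rough consistency check on the entry above, the allele frequencies quoted in the abstract (2.7% in controls, 6.4% in cases) can be turned into a crude allelic odds ratio. This is only a back-of-envelope sketch: the published OR of 2.47 comes from the study's own association model, so an unadjusted calculation from rounded frequencies is expected to land close to it rather than match exactly. The function name is hypothetical.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Crude odds ratio from allele frequencies in cases and controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls

# Frequencies as quoted in the abstract.
crude_or = allelic_odds_ratio(0.064, 0.027)
print(f"crude allelic OR = {crude_or:.2f}")  # prints "crude allelic OR = 2.46"
```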
High-throughput DNA methylation analysis in anorexia nervosa confirms TNXB hypermethylation.
Clinical Epidemiology, Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Jena, Germany.
Objectives: Patients with anorexia nervosa (AN) are ideally suited to identify differentially methylated genes in response to starvation.
Methods: We examined high-throughput DNA methylation derived from whole blood of 47 females with AN, 47 lean females without AN and 100 population-based females to compare AN with both controls. To account for different cell type compositions, we applied two reference-free methods (FastLMM-EWASher, RefFreeEWAS) and searched for consensus CpG sites identified by both methods. We used a validation sample of five monozygotic AN-discordant twin pairs.
Results: Fifty-one consensus sites were identified in AN vs. lean and 81 in AN vs. population-based comparisons. These sites have not been reported in AN methylation analyses, but for the latter comparison 54/81 sites showed directionally consistent differential methylation effects in the AN-discordant twins. For a single nucleotide polymorphism rs923768 in CSGALNACT1 a nearby site was nominally associated with AN. At the gene level, we confirmed hypermethylated sites at TNXB. We found support for a locus at NR1H3 in the AN vs. lean control comparison, but the methylation direction was opposite to the one previously reported.
Conclusions: We confirm genes like TNXB previously described to comprise differentially methylated sites, and highlight further sites that might be specifically involved in AN starvation processes.
The world journal of biological psychiatry : the official journal of the World Federation of Societies of Biological Psychiatry 2016;1-13
Genome-wide study for circulating metabolites identifies 62 loci and reveals novel systemic effects of LPA.
Computational Medicine, Faculty of Medicine, University of Oulu, PO Box 5000, 90014 Oulu, Finland.
Genome-wide association studies have identified numerous loci linked with complex diseases, for which the molecular mechanisms remain largely unclear. Comprehensive molecular profiling of circulating metabolites captures highly heritable traits, which can help to uncover metabolic pathophysiology underlying established disease variants. We conduct an extended genome-wide association study of genetic influences on 123 circulating metabolic traits quantified by nuclear magnetic resonance metabolomics from up to 24,925 individuals and identify eight novel loci for amino acids, pyruvate and fatty acids. The LPA locus link with cardiovascular risk exemplifies how detailed metabolic profiling may inform underlying aetiology via extensive associations with very-low-density lipoprotein and triglyceride metabolism. Genetic fine mapping and Mendelian randomization uncover wide-spread causal effects of lipoprotein(a) on overall lipoprotein metabolism and we assess potential pleiotropic consequences of genetically elevated lipoprotein(a) on diverse morbidities via electronic health-care records. Our findings strengthen the argument for safe LPA-targeted intervention to reduce cardiovascular risk.
Nature communications 2016;7;11122
Diagnostic Yield of Sequencing Familial Hypercholesterolemia Genes in Patients with Severe Hypercholesterolemia.
Center for Human Genetic Research, Cardiovascular Research Center and Cardiology Division, Massachusetts General Hospital, Harvard Medical School, Boston MA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA.
Background: About 7% of US adults have severe hypercholesterolemia (untreated LDL cholesterol ≥190 mg/dl). Such high LDL levels may be due to familial hypercholesterolemia (FH), a condition caused by a single mutation in any of three genes. Lifelong elevations in LDL cholesterol in FH mutation carriers may confer CAD risk beyond that captured by a single LDL cholesterol measurement.
Objectives: Assess the prevalence of an FH mutation among those with severe hypercholesterolemia and determine whether CAD risk varies according to mutation status beyond the observed LDL cholesterol.
Methods: Three genes causative for FH (LDLR, APOB, PCSK9) were sequenced in 26,025 participants from 7 case-control studies (5,540 CAD cases, 8,577 CAD-free controls) and 5 prospective cohort studies (11,908 participants). FH mutations included loss-of-function variants in LDLR, missense mutations in LDLR predicted to be damaging, and variants linked to FH in ClinVar, a clinical genetics database.
Results: Among 8,577 CAD-free control participants, 430 had LDL cholesterol ≥190 mg/dl; of these, only eight (1.9%) carried an FH mutation. Similarly, among 11,908 participants from 5 prospective cohorts, 956 had LDL cholesterol ≥190 mg/dl and of these, only 16 (1.7%) carried an FH mutation. Within any stratum of observed LDL cholesterol, risk of CAD was higher among FH mutation carriers when compared with non-carriers. When compared to a reference group with LDL cholesterol <130 mg/dl and no mutation, participants with LDL cholesterol ≥190 mg/dl and no FH mutation had six-fold higher risk for CAD (OR 6.0; 95%CI 5.2-6.9) whereas those with LDL cholesterol ≥190 mg/dl as well as an FH mutation demonstrated twenty-two-fold increased risk (OR 22.3; 95%CI 10.7-53.2).
Conclusions: Among individuals with LDL cholesterol ≥190 mg/dl, gene sequencing identified an FH mutation in <2%. However, for any given observed LDL cholesterol, FH mutation carriers are at substantially increased risk for CAD.
Journal of the American College of Cardiology 2016
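The counts reported in the entry above are internally consistent, which a few lines of arithmetic can confirm. The sketch below uses only numbers quoted in the abstract; the helper name is hypothetical.

```python
# Participant counts as quoted in the Methods.
cad_cases, cad_free_controls, cohort_participants = 5540, 8577, 11908
total = cad_cases + cad_free_controls + cohort_participants
print(total)  # 26025, matching the stated number of sequenced participants

def carrier_pct(carriers, n_severe):
    """Percentage of severe-hypercholesterolemia participants carrying an FH mutation."""
    return round(100 * carriers / n_severe, 1)

print(carrier_pct(8, 430))   # 1.9 (CAD-free controls)
print(carrier_pct(16, 956))  # 1.7 (prospective cohorts)
```

Both percentages fall below 2%, in line with the stated conclusion.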
Evolutionary dynamics of Anolis sex chromosomes revealed by sequencing of flow sorting-derived microchromosome-specific DNA.
Institute of Molecular and Cellular Biology SB RAS, Novosibirsk, 630090, Russia.
Squamate reptiles show a striking diversity in modes of sex determination, including both genetic (XY or ZW) and temperature-dependent sex determination systems. The genomes of only a handful of species have been sequenced, analyzed and assembled, including the genome of Anolis carolinensis. Despite a high genome coverage, only macrochromosomes of A. carolinensis were assembled whereas the content of most microchromosomes remained unclear. Most of the Anolis species have a homomorphic XY sex chromosome system. However, some species have large heteromorphic XY chromosomes (e.g., A. sagrei) and even multiple sex chromosome systems (e.g., A. pogus) that were shown to be derived from fusions of the ancestral XY with microautosomes. We applied next generation sequencing of flow sorting-derived chromosome-specific DNA pools to characterize the content and composition of microchromosomes in A. carolinensis and A. sagrei. Comparative analysis of sequenced chromosome-specific DNA pools revealed that the A. sagrei XY sex chromosomes contain regions homologous to several microautosomes of A. carolinensis. We suggest that the sex chromosomes of A. sagrei are derived by fusions of the ancestral sex chromosome with three microautosomes and subsequent loss of some genetic content on the Y chromosome.
Molecular genetics and genomics : MGG 2016;291;5;1955-66
Genome-wide meta-analysis uncovers novel loci influencing circulating leptin levels.
The Novo Nordisk Foundation Center for Basic Metabolic Research, Section of Metabolic Genetics, Faculty of Health and Medical Sciences, University of Copenhagen, Universitetsparken 1, DIKU Building, Copenhagen 2100, Denmark.
Leptin is an adipocyte-secreted hormone, the circulating levels of which correlate closely with overall adiposity. Although rare mutations in the leptin (LEP) gene are well known to cause leptin deficiency and severe obesity, no common loci regulating circulating leptin levels have been uncovered. Therefore, we performed a genome-wide association study (GWAS) of circulating leptin levels from 32,161 individuals and followed up loci reaching P<10^-6 in 19,979 additional individuals. We identify five loci robustly associated (P<5 × 10^-8) with leptin levels in/near LEP, SLC32A1, GCKR, CCNL1 and FTO. Although the association of the FTO obesity locus with leptin levels is abolished by adjustment for BMI, associations of the four other loci are independent of adiposity. The GCKR locus was found associated with multiple metabolic traits in previous GWAS and the CCNL1 locus with birth weight. Knockdown experiments in mouse adipose tissue explants show convincing evidence for adipogenin, a regulator of adipocyte differentiation, as the novel causal gene in the SLC32A1 locus influencing leptin levels. Our findings provide novel insights into the regulation of leptin production by adipose tissue and open new avenues for examining the influence of variation in leptin levels on adiposity and metabolic health.
Funded by: British Heart Foundation: PG/07/131/24254, PG/13/66/30442; Canadian Institutes of Health Research: FRCN-CCT-83028; Intramural NIH HHS; Medical Research Council: G0701863, G9815508, MC_U106179471, MC_U106179472, MC_U147574242, MC_UP_A620_1016, MC_UU_12011/3, MC_UU_12011/4, MC_UU_12013/3, MC_UU_12013/8, MC_UU_12015/1, MC_UU_12015/2, MR/J012165/1; NCATS NIH HHS: UL1 TR000040, UL1 TR000124, UL1 TR001079, UL1-TR-000040, UL1-TR-001079; NCI NIH HHS: CA047988, CA055075, CA087969, CA49449, CA50385, CA65725, CA67262, P01 CA055075, P01 CA087969, R01 CA047988, R01 CA049449, R01 CA050385, R01 CA065725, R01 CA067262, U01 CA049449, U01 CA067262, U01CA098233, UM1 CA182913; NCRR NIH HHS: UL1 RR024156, UL1 RR025005, UL1-RR-24156, UL1-RR-25005; NHGRI NIH HHS: HG004399, HG004446, HHSN268200782096C, U01 HG004399, U01 HG007033, U01-HG007033; NHLBI NIH HHS: 5R01HL068891, 5R01HL087700, HL-043851, HL-045670, HL080467, N01-HC-65226, N01-HC-95160, N01-HC-95161, N01-HC-95162, N01-HC-95163, N01-HC-95164, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, N01-HC95159, N01HC65226, N01HC95159, N01HC95160, N01HC95161, N01HC95162, N01HC95163, N01HC95164, N01HC95165, N01HC95166, N01HC95167, N01HC95168, N01HC95169, N02HL64278, R00 HL098459, R00-HL-098459, R01 HL043851, R01 HL045670, R01 HL068891, R01 HL071051, R01 HL071205, R01 HL071250, R01 HL071251, R01 HL071252, R01 HL071258, R01 HL071259, R01 HL080467, R01 HL087700, R01 HL088451, R01 HL117078, R01-HL-071051, R01-HL-071205, R01-HL-071250, R01-HL-071251, R01-HL-071252, R01-HL-071258, R01-HL-071259, R01-HL-088451, R01HL117078; NIA NIH HHS: 1R01AG032098-01A1, N01 AG062101, N01 AG062106, N01AG62103, R01 AG032098; NIDDK NIH HHS: 1R01DK080015, 5R01DK068336, 5R01DK075681, 5R01DK07568102, DK-26687, DK058845, DK52431, P30 DK020541, P30 DK026687, P30 DK063491, R01 DK052431, R01 DK058845, R01 DK068336, R01 DK075681, R01 DK080015, R01 DK089256, R01DK089256; NIMHD NIH HHS: 263 MD 821336, 263MD9164, R01 MD009164; PHS HHS: HHSN26800625226C, HHSN268200782096C; Wellcome Trust: 081917/Z/07/Z, 086596/Z/08/Z, 090532, WT064890, WT089062, WT090532, WT091551, WT098017, WT098051
Nature communications 2016;7;10494
De Novo Mutations in SON Disrupt RNA Splicing of Genes Essential for Brain Development and Metabolism, Causing an Intellectual-Disability Syndrome.
Mitchell Cancer Institute, University of South Alabama, Mobile, AL 36604, USA.
The overall understanding of the molecular etiologies of intellectual disability (ID) and developmental delay (DD) is increasing as next-generation sequencing technologies identify genetic variants in individuals with such disorders. However, detailed analyses conclusively confirming these variants, as well as the underlying molecular mechanisms explaining the diseases, are often lacking. Here, we report on an ID syndrome caused by de novo heterozygous loss-of-function (LoF) mutations in SON. The syndrome is characterized by ID and/or DD, malformations of the cerebral cortex, epilepsy, vision problems, musculoskeletal abnormalities, and congenital malformations. Knockdown of son in zebrafish resulted in severe malformation of the spine, brain, and eyes. Importantly, analyses of RNA from affected individuals revealed that genes critical for neuronal migration and cortex organization (TUBG1, FLNA, PNKP, WDR62, PSMD3, and HDAC6) and metabolism (PCK2, PFKL, IDH2, ACY1, and ADA) are significantly downregulated because of the accumulation of mis-spliced transcripts resulting from erroneous SON-mediated RNA splicing. Our data highlight SON as a master regulator governing neurodevelopment and demonstrate the importance of SON-mediated RNA splicing in human development.
Funded by: NCI NIH HHS: R01 CA190688, R21 CA185818; NHGRI NIH HHS: U54 HG006493, UM1 HG006493; NIGMS NIH HHS: R15 GM084407
American journal of human genetics 2016;99;3;711-719
Advances in Understanding Bacterial Pathogenesis Gained from Whole-Genome Sequencing and Phylogenetics.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
The development of next-generation sequencing as a cost-effective technology has facilitated the analysis of bacterial population structure at a whole-genome level and at scale. From these data, phylogenetic trees have been constructed that define population structures at a local, national, and global level, providing a framework for genetic analysis. Although still at an early stage, these approaches have yielded progress in several areas, including pathogen transmission mapping, the genetics of niche colonization and host adaptation, as well as gene-to-phenotype association studies. Antibiotic resistance has proven to be a major challenge in the early 21st century, and phylogenetic analyses have uncovered the dramatic effect that the use of antibiotics has had on shaping bacterial population structures. An update on insights into bacterial evolution from comparative genomics is provided in this review.
Funded by: Wellcome Trust: 100891
Cell host & microbe 2016;19;5;599-610
Emergence of host-adapted Salmonella Enteritidis through rapid evolution in an immunocompromised host.
The Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.
Host adaptation is a key factor contributing to the emergence of new bacterial, viral and parasitic pathogens. Many pathogens are considered promiscuous because they cause disease across a range of host species, while others are host-adapted, infecting particular hosts [1]. Host adaptation can potentially progress to host restriction where the pathogen is strictly limited to a single host species and is frequently associated with more severe symptoms. Host-adapted and host-restricted bacterial clades evolve from within a broader host-promiscuous species and sometimes target different niches within their specialist hosts, such as adapting from a mucosal to a systemic lifestyle. Genome degradation, marked by gene inactivation and deletion, is a key feature of host adaptation, although the triggers initiating genome degradation are not well understood. Here, we show that a chronic systemic non-typhoidal Salmonella infection in an immunocompromised human patient resulted in genome degradation targeting genes that are expendable for a systemic lifestyle. We present a genome-based investigation of a recurrent blood-borne Salmonella enterica serotype Enteritidis (S. Enteritidis) infection covering 15 years in an interleukin (IL)-12 β-1 receptor-deficient individual that developed into an asymptomatic chronic infection. The infecting S. Enteritidis harbored a mutation in the mismatch repair gene mutS that accelerated the genomic mutation rate. Phylogenetic analysis and phenotyping of multiple patient isolates provide evidence for a remarkable level of within-host evolution that parallels genome changes present in successful host-restricted bacterial pathogens but never before observed on this timescale. Our analysis identifies common pathways of host adaptation and demonstrates the role that immunocompromised individuals can play in this process.
Funded by: Wellcome Trust: 098051
Nature microbiology 2016;1;3
Logic models to predict continuous outputs based on binary inputs with an application to personalized cancer therapy.
Institute for Systems Biology, Seattle, US.
Mining large datasets using machine learning approaches often leads to models that are hard to interpret and not amenable to the generation of hypotheses that can be experimentally tested. We present 'Logic Optimization for Binary Input to Continuous Output' (LOBICO), a computational approach that infers small and easily interpretable logic models of binary input features that explain a continuous output variable. Applying LOBICO to a large cancer cell line panel, we find that logic combinations of multiple mutations are more predictive of drug response than single gene predictors. Importantly, we show that the use of the continuous information leads to robust and more accurate logic models. LOBICO implements the ability to uncover logic models around predefined operating points in terms of sensitivity and specificity. As such, it represents an important step towards practical application of interpretable logic models.
Funded by: NCI NIH HHS: U24 CA143835
Scientific reports 2016;6;36812
A novel signalling screen demonstrates that CALR mutations activate essential MAPK signalling and facilitate megakaryocyte differentiation.
Cambridge Institute for Medical Research and Wellcome Trust/MRC Stem Cell Institute, University of Cambridge, Cambridge, UK.
Most MPN patients lacking JAK2 mutations harbour somatic CALR mutations that are thought to activate cytokine signalling although the mechanism is unclear. To identify kinases important for survival of CALR-mutant cells we developed a novel strategy (KISMET) which utilises the full range of kinase selectivity data available from each inhibitor and thus takes advantage of off-target noise that limits conventional siRNA or inhibitor screens. KISMET successfully identified known essential kinases in haematopoietic and non-haematopoietic cell lines and identified the MAPK pathway as required for growth of the CALR-mutated MARIMO cells. Expression of mutant CALR in murine or human haematopoietic cell lines was accompanied by MPL-dependent activation of MAPK signalling, and MPN patients with CALR mutations showed increased MAPK activity in CD34-cells, platelets and megakaryocytes. Although CALR mutations resulted in protein instability and proteasomal degradation, mutant CALR was able to enhance megakaryopoiesis and pro-platelet production from human CD34+ progenitors. These data link aberrant MAPK activation to the MPN phenotype and identify it as a potential therapeutic target in CALR-mutant positive MPNs.
Leukemia accepted article preview online, 14 October 2016. doi:10.1038/leu.2016.280.
Bi-allelic Truncating Mutations in TANGO2 Cause Infancy-Onset Recurrent Metabolic Crises with Encephalocardiomyopathy.
Institute of Human Genetics, Technische Universität München, 81675 München, Germany; Institute of Human Genetics, Helmholtz Zentrum München, 85764 Neuherberg, Germany.
Molecular diagnosis of mitochondrial disorders is challenging because of extreme clinical and genetic heterogeneity. By exome sequencing, we identified three different bi-allelic truncating mutations in TANGO2 in three unrelated individuals with infancy-onset episodic metabolic crises characterized by encephalopathy, hypoglycemia, rhabdomyolysis, arrhythmias, and laboratory findings suggestive of a defect in mitochondrial fatty acid oxidation. Over the course of the disease, all individuals developed global brain atrophy with cognitive impairment and pyramidal signs. TANGO2 (transport and Golgi organization 2) encodes a protein with a putative function in redistribution of Golgi membranes into the endoplasmic reticulum in Drosophila and a mitochondrial localization has been confirmed in mice. Investigation of palmitate-dependent respiration in mutant fibroblasts showed evidence of a functional defect in mitochondrial β-oxidation. Our results establish TANGO2 deficiency as a clinically recognizable cause of pediatric disease with multi-organ involvement.
American journal of human genetics 2016;98;2;358-62
Integrated transcriptomic and proteomic analysis identifies protein kinase CK2 as a key signaling node in an inflammatory cytokine network in ovarian cancer cells.
Centre for Cancer and Inflammation, Barts Cancer Institute, Queen Mary University of London, London, UK.
We previously showed how key pathways in cancer-related inflammation and Notch signaling are part of an autocrine malignant cell network in ovarian cancer. This network, which we named the "TNF network", has paracrine actions within the tumor microenvironment, influencing angiogenesis and the immune cell infiltrate. The aim of this study was to identify critical regulators in the signaling pathways of the TNF network in ovarian cancer cells that might be therapeutic targets. To achieve our aim, we used a systems biology approach, combining data from phospho-proteomic mass spectrometry and gene expression array analysis. Among the potential therapeutic kinase targets identified was the protein kinase Casein kinase II (CK2). Knockdown of CK2 expression in malignant cells by siRNA or treatment with the specific CK2 inhibitor CX-4945 significantly decreased Notch signaling and reduced constitutive cytokine release in ovarian cancer cell lines that expressed the TNF network as well as malignant cells isolated from high grade serous ovarian cancer asci