Sanger Institute - Publications 2013
Number of papers published in 2013: 337
-
Bloomsbury report on mouse embryo phenotyping: recommendations from the IMPC workshop on embryonic lethal screening.
Wellcome Trust Sanger Institute, Cambridge, CB10 1SA, UK.
Identifying genes that are important for embryo development is a crucial first step towards understanding their many functions in driving the ordered growth, differentiation and organogenesis of embryos. It can also shed light on the origins of developmental disease and congenital abnormalities. Current international efforts to examine gene function in the mouse provide a unique opportunity to pinpoint genes that are involved in embryogenesis, owing to the emergence of embryonic lethal knockout mutants. Through internationally coordinated efforts, the International Knockout Mouse Consortium (IKMC) has generated a public resource of mouse knockout strains and, in April 2012, the International Mouse Phenotyping Consortium (IMPC), supported by the EU InfraCoMP programme, convened a workshop to discuss developing a phenotyping pipeline for the investigation of embryonic lethal knockout lines. This workshop brought together over 100 scientists, from 13 countries, who are working in the academic and commercial research sectors, including experts and opinion leaders in the fields of embryology, animal imaging, data capture, quality control and annotation, high-throughput mouse production, phenotyping, and reporter gene analysis. This article summarises the outcome of the workshop, including (1) the vital scientific importance of phenotyping embryonic lethal mouse strains for basic and translational research; (2) a common framework to harmonise international efforts within this context; (3) the types of phenotyping that are likely to be most appropriate for systematic use, with a focus on 3D embryo imaging; (4) the importance of centralising data in a standardised form to facilitate data mining; and (5) the development of online tools to allow open access to and dissemination of the phenotyping data.
Funded by: NICHD NIH HHS: P30 HD024064
Disease models & mechanisms 2013;6;3;571-9
PUBMED: 23519032; DOI: 10.1242/dmm.011833
-
Bacteriotherapy for the treatment of intestinal dysbiosis caused by Clostridium difficile infection.
Bacterial Pathogenesis Laboratory, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.
Faecal microbiota transplantation (FMT) has been used for more than five decades to treat a variety of intestinal diseases associated with pathological imbalances within the resident microbiota, termed dysbiosis. FMT has been particularly effective for treating patients with recurrent Clostridium difficile infection who are left with few clinical options other than continued antibiotic therapy. Our increasing knowledge of the structure and function of the human intestinal microbiota and C. difficile pathogenesis has led to the understanding that FMT promotes intestinal ecological restoration and highlights the microbiota as a viable therapeutic target. However, the use of undefined faecal samples creates a barrier for widespread clinical use because of safety and aesthetic issues. An emerging concept of bacteriotherapy, the therapeutic use of a defined mixture of harmless, health-associated bacteria, holds promise for the treatment of patients with severe C. difficile infection, and possibly represents a paradigm shift for the treatment of diseases linked to intestinal dysbiosis.
Current opinion in microbiology 2013
PUBMED: 23866975; DOI: 10.1016/j.mib.2013.06.009
-
Sequencing ancient calcified dental plaque shows changes in oral microbiota with dietary shifts of the Neolithic and Industrial revolutions.
Australian Centre for Ancient DNA, School of Earth and Environmental Sciences, The University of Adelaide, Adelaide, South Australia, Australia.
The importance of commensal microbes for human health is increasingly recognized, yet the impacts of evolutionary changes in human diet and culture on commensal microbiota remain almost unknown. Two of the greatest dietary shifts in human evolution involved the adoption of carbohydrate-rich Neolithic (farming) diets (beginning ∼10,000 years before the present) and the more recent advent of industrially processed flour and sugar (in ∼1850). Here, we show that calcified dental plaque (dental calculus) on ancient teeth preserves a detailed genetic record throughout this period. Data from 34 early European skeletons indicate that the transition from hunter-gatherer to farming shifted the oral microbial community to a disease-associated configuration. The composition of oral microbiota remained unexpectedly constant between Neolithic and medieval times, after which (the now ubiquitous) cariogenic bacteria became dominant, apparently during the Industrial Revolution. Modern oral microbiotic ecosystems are markedly less diverse than historic populations, which might be contributing to chronic oral (and other) disease in postindustrial lifestyles.
Funded by: Wellcome Trust: WT092799/Z/10/Z, WT098051
Nature genetics 2013;45;4;450-5, 455e1
PUBMED: 23416520; DOI: 10.1038/ng.2536
-
New insights into the genetic basis of TAR (thrombocytopenia-absent radii) syndrome.
Department of Haematology, University of Cambridge, UK; NHS Blood and Transplant, Cambridge, UK; Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK. Electronic address: c.albers@gen.umcn.nl.
Thrombocytopenia with absent radii (TAR) syndrome is a rare disorder combining specific skeletal abnormalities with a reduced platelet count. Rare proximal microdeletions of 1q21.1 are found in the majority of patients but are also found in unaffected parents. Recently it was shown that TAR syndrome is caused by the compound inheritance of a low-frequency noncoding SNP and a rare null allele in RBM8A, a gene encoding the exon-junction complex subunit member Y14 located in the deleted region. This finding provides new insight into the complex inheritance pattern and new clues to the molecular mechanisms underlying TAR syndrome. We discuss TAR syndrome in the context of abnormal phenotypes associated with proximal and distal 1q21.1 microdeletion and microduplications with incomplete penetrance and variable expressivity.
Current opinion in genetics & development 2013
PUBMED: 23602329; DOI: 10.1016/j.gde.2013.02.015
-
Specificity and heterogeneity of terahertz radiation effect on gene expression in mouse mesenchymal stem cells.
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA; Harvard Medical School, Beth Israel Deaconess Medical Center, Department of Medicine, Boston, MA 02215, USA.
We report that terahertz (THz) irradiation of mouse mesenchymal stem cells (mMSCs) with a single-frequency (SF) 2.52 THz laser or pulsed broadband (centered at 10 THz) source results in irradiation-specific heterogenic changes in gene expression. The THz effect depends on irradiation parameters such as the duration and type of THz source, and on the degree of stem cell differentiation. Our microarray survey and RT-PCR experiments demonstrate that prolonged broadband THz irradiation drives mMSCs toward differentiation, while 2-hour irradiation (regardless of THz sources) affects genes transcriptionally active in pluripotent stem cells. The strictly controlled experimental environment indicates minimal temperature changes, and the absence of any discernible response of heat shock and cellular stress genes implies a non-thermal response. Computer simulations of the core promoters of two pluripotency markers reveal an association between gene upregulation and propensity for DNA breathing. We propose that THz radiation has potential for non-contact control of cellular gene expression.
Scientific reports 2013;3;1184
PUBMED: 23378916; PMC: 3560359; DOI: 10.1038/srep01184
-
Signatures of mutational processes in human cancer.
Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.
All cancers are caused by somatic mutations; however, understanding of the biological processes generating these mutations is limited. The catalogue of somatic mutations from a cancer genome bears the signatures of the mutational processes that have been operative. Here we analysed 4,938,362 mutations from 7,042 cancers and extracted more than 20 distinct mutational signatures. Some are present in many cancer types, notably a signature attributed to the APOBEC family of cytidine deaminases, whereas others are confined to a single cancer class. Certain signatures are associated with age of the patient at cancer diagnosis, known mutagenic exposures or defects in DNA maintenance, but many are of cryptic origin. In addition to these genome-wide mutational signatures, hypermutation localized to small genomic regions, 'kataegis', is found in many cancer types. The results reveal the diversity of mutational processes underlying the development of cancer, with potential implications for understanding of cancer aetiology, prevention and therapy.
Funded by: Wellcome Trust: 098051
Nature 2013;500;7463;415-21
PUBMED: 23945592; PMC: 3776390; DOI: 10.1038/nature12477
-
Deciphering signatures of mutational processes operative in human cancer.
Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.
The genome of a cancer cell carries somatic mutations that are the cumulative consequences of the DNA damage and repair processes operative during the cellular lineage between the fertilized egg and the cancer cell. Remarkably, these mutational processes are poorly characterized. Global sequencing initiatives are yielding catalogs of somatic mutations from thousands of cancers, thus providing the unique opportunity to decipher the signatures of mutational processes operative in human cancer. However, until now there have been no theoretical models describing the signatures of mutational processes operative in cancer genomes and no systematic computational approaches are available to decipher these mutational signatures. Here, by modeling mutational processes as a blind source separation problem, we introduce a computational framework that effectively addresses these questions. Our approach provides a basis for characterizing mutational signatures from cancer-derived somatic mutational catalogs, paving the way to insights into the pathogenetic mechanism underlying all cancers.
Funded by: Wellcome Trust: 093867, 098051, WT088340MA
Cell reports 2013;3;1;246-59
PUBMED: 23318258; PMC: 3588146; DOI: 10.1016/j.celrep.2012.12.008
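The blind source separation idea in this abstract can be illustrated with a small sketch. It assumes the problem is posed as non-negative matrix factorisation (NMF) with multiplicative updates, which is one common way to separate non-negative sources and is not stated in the abstract itself. The function names, the toy catalogue and the choice of two signatures are all hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Frobenius norm of the residual V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


def nmf(v, n_signatures, n_iter=200, seed=0, eps=1e-12):
    """Factorise V (mutation types x samples) as W H with non-negative factors.

    W holds hypothetical signatures (one per column) and H holds their
    exposures in each sample. Multiplicative updates are an assumption made
    for this sketch, not the method described in the abstract.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_signatures)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(n_signatures)]
    for _ in range(n_iter):
        # Update exposures: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(n_signatures)]
        # Update signatures: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_signatures)]
             for i in range(n_rows)]
    return w, h


# Toy catalogue: 4 mutation types x 4 samples, built from 2 made-up signatures.
true_signatures = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [0.5, 3.0]]
true_exposures = [[3.0, 1.0, 0.0, 2.0], [0.0, 2.0, 4.0, 1.0]]
catalog = matmul(true_signatures, true_exposures)
signatures, exposures = nmf(catalog, n_signatures=2)
residual = frobenius_error(catalog, signatures, exposures)
```

In practice the number of signatures is unknown and would have to be chosen, for example by comparing the residual and the stability of the factors across several candidate values.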
-
Inappropriately low hepcidin levels in patients with myelodysplastic syndrome carrying a somatic mutation of SF3B1.
Somatic mutations of the RNA splicing machinery have been recently identified in myelodysplastic syndromes. In particular, a strong association has been found between SF3B1 mutation and refractory anemia with ring sideroblasts, a condition characterized by ineffective erythropoiesis and parenchymal iron overload. We studied the relationship between SF3B1 mutation, erythroid activity and hepcidin levels in myelodysplastic syndrome patients. Erythroid activity was evaluated through the proportion of marrow erythroblasts, soluble transferrin receptor and serum growth differentiation factor 15. Significant relationships were found between SF3B1 mutation and marrow erythroblasts (P=0.001), soluble transferrin receptor (P=0.003) and serum growth differentiation factor 15 (P=0.033). Serum hepcidin varied considerably, and multivariable analysis showed that the hepcidin to ferritin ratio, a measure of adequacy of hepcidin levels relative to body iron stores, was inversely related to the SF3B1 mutation (P=0.013). These observations suggest that patients with SF3B1 mutation have inappropriately low hepcidin levels, which may explain their propensity to parenchymal iron loading.
Haematologica 2013;98;3;420-3
PUBMED: 23300182; DOI: 10.3324/haematol.2012.077446
-
The African coelacanth genome provides insights into tetrapod evolution.
Molecular Genetics Program, Benaroya Research Institute, Seattle, Washington 98101, USA. camemiya@benaroyaresearch.org
The discovery of a living coelacanth specimen in 1938 was remarkable, as this lineage of lobe-finned fish was thought to have become extinct 70 million years ago. The modern coelacanth looks remarkably similar to many of its ancient relatives, and its evolutionary proximity to our own fish ancestors provides a glimpse of the fish that first walked on land. Here we report the genome sequence of the African coelacanth, Latimeria chalumnae. Through a phylogenomic analysis, we conclude that the lungfish, and not the coelacanth, is the closest living relative of tetrapods. Coelacanth protein-coding genes are significantly more slowly evolving than those of tetrapods, unlike other genomic features. Analyses of changes in genes and regulatory elements during the vertebrate adaptation to land highlight genes involved in immunity, nitrogen excretion and the development of fins, tail, ear, eye, brain and olfaction. Functional assays of enhancers involved in the fin-to-limb transition and in the emergence of extra-embryonic tissues show the importance of the coelacanth genome as a blueprint for understanding tetrapod evolution.
Funded by: NCRR NIH HHS: R24 RR032670; NHGRI NIH HHS: R01 HG003474; NIEHS NIH HHS: R01 ES006272; NIH HHS: R01 OD011116, R24 OD011199
Nature 2013;496;7445;311-6
PUBMED: 23598338; PMC: 3633110; DOI: 10.1038/nature12027
-
The COMBREX Project: Design, Methodology, and Initial Results.
New England Biolabs, Ipswich, Massachusetts, United States of America.
Experimental data exists for only a vanishingly small fraction of sequenced microbial genes. This community page discusses the progress made by the COMBREX project to address this important issue using both computational and experimental resources.
PLoS biology 2013;11;8;e1001638
PUBMED: 24013487; PMC: 3754883; DOI: 10.1371/journal.pbio.1001638
-
Genome-wide meta-analysis identifies new susceptibility loci for migraine.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK. anttila@atgu.mgh.harvard.edu
Migraine is the most common brain disorder, affecting approximately 14% of the adult population, but its molecular mechanisms are poorly understood. We report the results of a meta-analysis across 29 genome-wide association studies, including a total of 23,285 individuals with migraine (cases) and 95,425 population-matched controls. We identified 12 loci associated with migraine susceptibility (P < 5 × 10⁻⁸). Five loci are new: near AJAP1 at 1p36, near TSPAN2 at 1p13, within FHL5 at 6q16, within C7orf10 at 7p14 and near MMP16 at 8q21. Three of these loci were identified in disease subgroup analyses. Brain tissue expression quantitative trait locus analysis suggests potential functional candidate genes at four loci: APOA1BP, TBC1D7, FUT9, STAT6 and ATP5B.
Nature genetics 2013;45;8;912-7
PUBMED: 23793025; DOI: 10.1038/ng.2676
-
Genome-wide, whole mount in situ analysis of transcriptional regulators in zebrafish embryos.
Institute of Toxicology and Genetics, Karlsruhe Institute of Technology, Postfach 3640, 76021 Karlsruhe, Germany.
Transcription is the primary step in the retrieval of genetic information. A substantial proportion of the protein repertoire of each organism consists of transcriptional regulators (TRs). It is believed that the differential expression and combinatorial action of these TRs is essential for vertebrate development and body homeostasis. We mined the zebrafish genome exhaustively for genes encoding TRs and determined their expression in the zebrafish embryo by sequencing to saturation and in situ hybridisation. At the evolutionarily conserved phylotypic stage, 75% of the 3302 TR genes encoded in the genome are already expressed. The number of expressed TR genes increases only marginally in subsequent stages and is maintained during adulthood, suggesting important roles of the TR genes in body homeostasis. Fewer than half of the TR genes (45%, n=1711 genes) are expressed in a tissue-restricted manner in the embryo. Transcripts of 207 genes were detected in a single tissue in the 24 h embryo, potentially acting as regulators of specific processes. Other TR genes were expressed in multiple tissues. However, with the exception of certain territories in the nervous system, we did not find significant synexpression, suggesting that most tissue-restricted TRs act in a freely combinatorial fashion. Our data indicate that elaboration of body pattern and function from the phylotypic stage onward relies mostly on redeployment of TRs and post-transcriptional processes.
Developmental biology 2013;380;2;351-62
PUBMED: 23684812; DOI: 10.1016/j.ydbio.2013.05.006
-
The general population cohort in rural south-western Uganda: a platform for communicable and non-communicable disease studies.
Medical Research Council/Uganda Virus Research Institute (MRC/UVRI), Uganda Research Unit on AIDS, Entebbe, Uganda, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK, Wellcome Trust Sanger Institute, Hinxton, UK, London School of Hygiene and Tropical Medicine, London, UK, School of International Development, University of East Anglia, Norwich, UK and Wellcome Trust, UK (formerly with MRC/UVRI Uganda Research Unit on AIDS, Entebbe, Uganda).
The General Population Cohort (GPC) was set up in 1989 to examine trends in HIV prevalence and incidence, and their determinants in rural south-western Uganda. Recently, the research questions have included the epidemiology and genetics of communicable and non-communicable diseases (NCDs) to address the limited data on the burden and risk factors for NCDs in sub-Saharan Africa. The cohort comprises all residents (52% aged ≥13 years, men and women in equal proportions) within one-half of a rural sub-county, residing in scattered houses, and largely farmers of three major ethnic groups. Data collected through annual surveys include: mapping for spatial analysis and participant location; a census for individual socio-demographic and household socioeconomic status assessment; and a medical survey for health, lifestyle and biophysical and blood measurements to ascertain disease outcomes and risk factors for selected participants. This cohort offers a rich platform to investigate the interplay between communicable diseases and NCDs. There is robust infrastructure for data management, sample processing and storage, and diverse expertise in epidemiology, social and basic sciences. For any data access enquiries you may contact the director, MRC/UVRI, Uganda Research Unit on AIDS by email to mrc@mrcuganda.org or the corresponding author.
International journal of epidemiology 2013
PUBMED: 23364209; DOI: 10.1093/ije/dys234
-
Hospital outbreak of Middle East respiratory syndrome coronavirus.
Global Center for Mass Gatherings Medicine, Ministry of Health, Riyadh, Saudi Arabia.
Background: In September 2012, the World Health Organization reported the first cases of pneumonia caused by the novel Middle East respiratory syndrome coronavirus (MERS-CoV). We describe a cluster of health care-acquired MERS-CoV infections.
Methods: Medical records were reviewed for clinical and demographic information and determination of potential contacts and exposures. Case patients and contacts were interviewed. The incubation period and serial interval (the time between the successive onset of symptoms in a chain of transmission) were estimated. Viral RNA was sequenced.
Results: Between April 1 and May 23, 2013, a total of 23 cases of MERS-CoV infection were reported in the eastern province of Saudi Arabia. Symptoms included fever in 20 patients (87%), cough in 20 (87%), shortness of breath in 11 (48%), and gastrointestinal symptoms in 8 (35%); 20 patients (87%) presented with abnormal chest radiographs. As of June 12, a total of 15 patients (65%) had died, 6 (26%) had recovered, and 2 (9%) remained hospitalized. The median incubation period was 5.2 days (95% confidence interval [CI], 1.9 to 14.7), and the serial interval was 7.6 days (95% CI, 2.5 to 23.1). A total of 21 of the 23 cases were acquired by person-to-person transmission in hemodialysis units, intensive care units, or in-patient units in three different health care facilities. Sequencing data from four isolates revealed a single monophyletic clade. Among 217 household contacts and more than 200 health care worker contacts whom we identified, MERS-CoV infection developed in 5 family members (3 with laboratory-confirmed cases) and in 2 health care workers (both with laboratory-confirmed cases).
Conclusions: Person-to-person transmission of MERS-CoV can occur in health care settings and may be associated with considerable morbidity. Surveillance and infection-control measures are critical to a global public health response.
Funded by: NIGMS NIH HHS: U01 GM070708, U54 GM088491
The New England journal of medicine 2013;369;5;407-16
PUBMED: 23782161; DOI: 10.1056/NEJMoa1306742
-
Genome Sequencing Reveals Loci under Artificial Selection that Underlie Disease Phenotypes in the Laboratory Rat.
Physiological Genomic and Medicine Group, MRC Clinical Sciences Centre, Imperial College London, London W12 0NN, UK; National Heart and Lung Institute, Imperial College London, London W12 0NN, UK.
Large numbers of inbred laboratory rat strains have been developed for a range of complex disease phenotypes. To gain insights into the evolutionary pressures underlying selection for these phenotypes, we sequenced the genomes of 27 rat strains, including 11 models of hypertension, diabetes, and insulin resistance, along with their respective control strains. Altogether, we identified more than 13 million single-nucleotide variants, indels, and structural variants across these rat strains. Analysis of strain-specific selective sweeps and gene clusters implicated genes and pathways involved in cation transport, angiotensin production, and regulators of oxidative stress in the development of cardiovascular disease phenotypes in rats. Many of the rat loci that we identified overlap with previously mapped loci for related traits in humans, indicating the presence of shared pathways underlying these phenotypes in rats and humans. These data represent a step change in resources available for evolutionary analysis of complex traits in disease models.
Cell 2013
PUBMED: 23890820; DOI: 10.1016/j.cell.2013.06.040
-
Effective Preparation of Plasmodium vivax Field Isolates for High-Throughput Whole Genome Sequencing.
Global and Tropical Health Division, Menzies School of Health Research, Charles Darwin University, Darwin, Australia.
Whole genome sequencing (WGS) of Plasmodium vivax is problematic due to the reliance on clinical isolates which are generally low in parasitaemia and sample volume. Furthermore, clinical isolates contain a significant contaminating background of host DNA which confounds efforts to map short read sequence of the target P. vivax DNA. Here, we discuss a methodology to significantly improve the success of P. vivax WGS on natural (non-adapted) patient isolates. Using 37 patient isolates from Indonesia, Thailand, and travellers, we assessed the application of CF11-based white blood cell filtration alone and in combination with short term ex vivo schizont maturation. Although CF11 filtration reduced human DNA contamination in 8 Indonesian isolates tested, additional short-term culture increased the P. vivax DNA yield from a median of 0.15 to 6.2 ng µl⁻¹ packed red blood cells (pRBCs) (p = 0.001) and reduced the human DNA percentage from a median of 33.9% to 6.22% (p = 0.008). Furthermore, post-CF11 and culture samples from Thailand gave a median P. vivax DNA yield of 2.34 ng µl⁻¹ pRBCs, and 2.65% human DNA. In 22 P. vivax patient isolates prepared with the 2-step method, we demonstrate high depth (median 654X coverage) and breadth (≥89%) of coverage on the Illumina GAII and HiSeq platforms. In contrast to the A+T-rich P. falciparum genome, negligible bias was observed in coverage depth between coding and non-coding regions of the P. vivax genome. This uniform coverage will greatly facilitate the detection of SNPs and copy number variants across the genome, enabling unbiased exploration of the natural diversity in P. vivax populations.
PloS one 2013;8;1;e53160
PUBMED: 23308154; PMC: 3537768; DOI: 10.1371/journal.pone.0053160
-
Genomic triumph meets clinical reality.
The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. cts@sanger.ac.uk.
A report on the 'Genomic Disorders 2013: from 60 years of DNA to human genomes in the clinic' meeting, held at Homerton College, Cambridge, UK, April 10-12, 2013.
Genome biology 2013;14;5;307
PUBMED: 23714135; DOI: 10.1186/gb-2013-14-5-307
-
FOXP2 targets show evidence of positive selection in European populations.
The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. qa1@sanger.ac.uk
Forkhead box P2 (FOXP2) is a highly conserved transcription factor that has been implicated in human speech and language disorders and plays important roles in the plasticity of the developing brain. The pattern of nucleotide polymorphisms in FOXP2 in modern populations suggests that it has been the target of positive (Darwinian) selection during recent human evolution. In our study, we searched for evidence of selection that might have followed FOXP2 adaptations in modern humans. We examined whether or not putative FOXP2 targets identified by chromatin-immunoprecipitation genomic screening show evidence of positive selection. We developed an algorithm that, for any given gene list, systematically generates matched lists of control genes from the Ensembl database, collates summary statistics for three frequency-spectrum-based neutrality tests from the low-coverage resequencing data of the 1000 Genomes Project, and determines whether these statistics are significantly different between the given gene targets and the set of controls. Overall, there was strong evidence of selection of FOXP2 targets in Europeans, but not in the Han Chinese, Japanese, or Yoruba populations. Significant outliers included several genes linked to cellular movement, reproduction, development, and immune cell trafficking, and 13 of these constituted a significant network associated with cardiac arteriopathy. Strong signals of selection were observed for CNTNAP2 and RBFOX1, key neurally expressed genes that have been consistently identified as direct FOXP2 targets in multiple studies and that have themselves been associated with neurodevelopmental disorders involving language dysfunction.
Funded by: Wellcome Trust: 098051
American journal of human genetics 2013;92;5;696-706
PUBMED: 23602712; PMC: 3644635; DOI: 10.1016/j.ajhg.2013.03.019
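The matched-control comparison described in this abstract can be sketched as an empirical test. The sketch assumes that control genes are matched on a single covariate (gene length is used here as a stand-in) and that a lower value of the neutrality statistic is the direction of interest. The abstract states neither point, and the function names and toy data are hypothetical.

```python
import random
from statistics import mean


def matched_control_list(targets, pool, covariate, rng, tolerance=0.2):
    """Pick one control gene per target whose covariate lies within +/- tolerance.

    `covariate` maps gene name to the matching variable (assumed: gene length).
    """
    controls = []
    for gene in targets:
        low = covariate[gene] * (1 - tolerance)
        high = covariate[gene] * (1 + tolerance)
        candidates = [g for g in pool if low <= covariate[g] <= high]
        if not candidates:
            raise ValueError(f"no matched control available for {gene}")
        controls.append(rng.choice(candidates))
    return controls


def empirical_p_value(targets, pool, covariate, statistic, n_lists=99, seed=0):
    """Compare the mean statistic of the targets with that of matched control lists.

    Returns the observed mean and a one-sided empirical p-value: the share of
    control lists whose mean is as low as or lower than the observed mean,
    with the usual +1 correction in numerator and denominator.
    """
    rng = random.Random(seed)
    observed = mean(statistic[g] for g in targets)
    n_extreme = 0
    for _ in range(n_lists):
        controls = matched_control_list(targets, pool, covariate, rng)
        if mean(statistic[g] for g in controls) <= observed:
            n_extreme += 1
    return observed, (n_extreme + 1) / (n_lists + 1)


# Toy data: 200 control genes with statistics in [-1, 1], 10 targets at -2.
pool = [f"c{i}" for i in range(200)]
targets = [f"t{i}" for i in range(10)]
covariate = {f"c{i}": 1000 + 20 * i for i in range(200)}
covariate.update({f"t{i}": 2000 + 100 * i for i in range(10)})
statistic = {f"c{i}": ((i * 37) % 21 - 10) / 10 for i in range(200)}
statistic.update({g: -2.0 for g in targets})
observed_mean, p_value = empirical_p_value(targets, pool, covariate, statistic)
```

Drawing many matched lists rather than one guards against a single unrepresentative control set, at the cost of a p-value resolution limited by the number of lists drawn.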
-
Cooperativity of imprinted genes inactivated by acquired chromosome 20q deletions.
Large regions of recurrent genomic loss are common in cancers; however, with a few well-characterized exceptions, how they contribute to tumor pathogenesis remains largely obscure. Here we identified primate-restricted imprinting of a gene cluster on chromosome 20 in the region commonly deleted in chronic myeloid malignancies. We showed that a single heterozygous 20q deletion consistently resulted in the complete loss of expression of the imprinted genes L3MBTL1 and SGK2, indicative of a pathogenetic role for loss of the active paternally inherited locus. Concomitant loss of both L3MBTL1 and SGK2 dysregulated erythropoiesis and megakaryopoiesis, 2 lineages commonly affected in chronic myeloid malignancies, with distinct consequences in each lineage. We demonstrated that L3MBTL1 and SGK2 collaborated in the transcriptional regulation of MYC by influencing different aspects of chromatin structure. L3MBTL1 is known to regulate nucleosomal compaction, and we here showed that SGK2 inactivated BRG1, a key ATP-dependent helicase within the SWI/SNF complex that regulates nucleosomal positioning. These results demonstrate a link between an imprinted gene cluster and malignancy, reveal a new pathogenetic mechanism associated with acquired regions of genomic loss, and underline the complex molecular and cellular consequences of "simple" cancer-associated chromosome deletions.
The Journal of clinical investigation 2013;123;5;2169-82
PUBMED: 23543057; PMC: 3635733; DOI: 10.1172/JCI66113
-
Y-chromosome and mtDNA genetics reveal significant contrasts in affinities of modern Middle Eastern populations with European and African populations.
The Lebanese American University, Chouran, Beirut, Lebanon.
The Middle East was a funnel of human expansion out of Africa, a staging area for the Neolithic Agricultural Revolution, and the home to some of the earliest world empires. Post-LGM expansions into the region and subsequent population movements created a striking genetic mosaic with distinct sex-based genetic differentiation. While prior studies have examined the mtDNA and Y-chromosome contrast in focal populations in the Middle East, none have undertaken a broad-spectrum survey including North and sub-Saharan Africa, Europe, and Middle Eastern populations. In this study 5,174 mtDNA and 4,658 Y-chromosome samples were investigated using PCA, MDS, mean-linkage clustering, AMOVA, and Fisher exact tests of F(ST) values, R(ST) values, and haplogroup frequencies. Geographic differentiation in affinities of Middle Eastern populations with Africa and Europe showed distinct contrasts between mtDNA and Y-chromosome data. Specifically, Lebanon's mtDNA shows a very strong association to Europe, while Yemen shows very strong affinity with Egypt and North and East Africa. Previous Y-chromosome results showed a Levantine coastal-inland contrast marked by J1 and J2, and a very strong North African component was evident throughout the Middle East. Neither of these patterns was observed in the mtDNA. While J2 has penetrated into Europe, the pattern of Y-chromosome diversity in Lebanon does not show the widespread affinities with Europe indicated by the mtDNA data. Lastly, while each population shows evidence of connections with expansions that now define the Middle East, Africa, and Europe, many of the populations in the Middle East show distinctive mtDNA and Y-haplogroup characteristics that indicate long-standing settlement with relatively little impact from and movement into other populations.
PloS one 2013;8;1;e54616
PUBMED: 23382925; PMC: 3559847; DOI: 10.1371/journal.pone.0054616
-
Metagenomic study of the viruses of African straw-coloured fruit bats: Detection of a chiropteran poxvirus and isolation of a novel adenovirus.
University of Cambridge, Department of Veterinary Medicine, Madingley Rd, Cambridge, Cambridgeshire, CB3 0ES, United Kingdom; Institute of Zoology, Zoological Society of London, Regent's Park, NW1 4RY, United Kingdom. Electronic address: kf281@cam.ac.uk.
Viral emergence as a result of zoonotic transmission constitutes a continuous public health threat. Emerging viruses such as SARS coronavirus, hantaviruses and henipaviruses have wildlife reservoirs. Characterising the viruses of candidate reservoir species in geographical hot spots for viral emergence is a sensible approach to develop tools to predict, prevent, or contain emergence events. Here, we explore the viruses of Eidolon helvum, an Old World fruit bat species widely distributed in Africa that lives in close proximity to humans. We identified a great abundance and diversity of novel herpesviruses and papillomaviruses, described the isolation of a novel adenovirus, and detected, for the first time, sequences of a chiropteran poxvirus closely related to Molluscum contagiosum. In sum, E. helvum displays a wide variety of mammalian viruses, some of them genetically similar to known human pathogens, highlighting the possibility of zoonotic transmission.
Virology 2013
PUBMED: 23562481; DOI: 10.1016/j.virol.2013.03.014
-
Atypical Mitogen-Activated Protein Kinase Phosphatase Implicated in Regulating Transition from Pre-S-Phase Asexual Intraerythrocytic Development of Plasmodium falciparum.
Department of Global Health, College of Public Health, University of South Florida, Tampa, Florida, USA.
Intraerythrocytic development of the human malaria parasite Plasmodium falciparum appears as a continuous flow through growth and proliferation. To develop a greater understanding of the critical regulatory events, we utilized piggyBac insertional mutagenesis to randomly disrupt genes. Screening a collection of piggyBac mutants for slow growth, we isolated the attenuated parasite C9, which carried a single insertion disrupting the open reading frame (ORF) of PF3D7_1305500. This gene encodes a protein structurally similar to a mitogen-activated protein kinase (MAPK) phosphatase, except for two notable characteristics that alter the signature motif of the dual-specificity phosphatase domain, suggesting that it may be a low-activity phosphatase or pseudophosphatase. C9 parasites demonstrated a significantly lower growth rate with delayed entry into the S/M phase of the cell cycle, which follows the stage of maximum PF3D7_1305500 expression in intact parasites. Genetic complementation with the full-length PF3D7_1305500 rescued the wild-type phenotype of C9, validating the importance of the putative protein phosphatase PF3D7_1305500 as a regulator of pre-S-phase cell cycle progression in P. falciparum.
Eukaryotic cell 2013;12;9;1171-8
PUBMED: 23813392; DOI: 10.1128/EC.00028-13
-
Imputation-based meta-analysis of severe malaria in three African populations.
Wellcome Trust Centre for Human Genetics, Oxford, United Kingdom.
Combining data from genome-wide association studies (GWAS) conducted at different locations, using genotype imputation and fixed-effects meta-analysis, has been a powerful approach for dissecting complex disease genetics in populations of European ancestry. Here we investigate the feasibility of applying the same approach in Africa, where genetic diversity, both within and between populations, is far more extensive. We analyse genome-wide data from approximately 5,000 individuals with severe malaria and 7,000 population controls from three different locations in Africa. Our results show that the standard approach is well powered to detect known malaria susceptibility loci when sample sizes are large, and that modern methods for association analysis can control the potential confounding effects of population structure. We show that the pattern of association around the haemoglobin S allele differs substantially across populations due to differences in haplotype structure. Motivated by these observations we consider new approaches to association analysis that might prove valuable for multicentre GWAS in Africa: we relax the assumptions of SNP-based fixed effect analysis; we apply Bayesian approaches to allow for heterogeneity in the effect of an allele on risk across studies; and we introduce a region-based test to allow for heterogeneity in the location of causal alleles.
Funded by: Medical Research Council: G0600230, G0600718; Wellcome Trust: 075491/Z/04, 077012/Z/05/Z, 087285, 090532/Z/09/Z, 090770/Z/09/Z, 091758/Z/10/Z, 096527, 097364/Z/11/Z, WT077383/Z/05/Z, WT098051
PLoS genetics 2013;9;5;e1003509
PUBMED: 23717212; PMC: 3662650; DOI: 10.1371/journal.pgen.1003509
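For readers unfamiliar with the fixed-effects meta-analysis mentioned in the abstract above, the following is a minimal sketch of the standard inverse-variance weighted estimator for a single SNP. It is a generic textbook illustration, not the authors' pipeline; the function name `fixed_effects_meta` and the example numbers are hypothetical.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Pool per-study effect estimates with inverse-variance weights.

    betas: per-study effect sizes (for example, log odds ratios).
    standard_errors: matching per-study standard errors (all > 0).
    Returns (pooled_beta, pooled_se, z_score).
    """
    if not betas or len(betas) != len(standard_errors):
        raise ValueError("betas and standard_errors must be non-empty and equal length")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / (se * se) for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical per-site estimates for one SNP at three study sites.
pooled_beta, pooled_se, z = fixed_effects_meta([0.25, 0.31, 0.18], [0.08, 0.10, 0.12])
```

The estimator assumes one shared true effect across studies, which is the assumption the abstract describes relaxing when allele effects or causal-variant locations differ between populations.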
-
Approaches to querying bacterial genomes with transposon-insertion sequencing.
Wellcome Trust Sanger Institute; Hinxton, Cambridge, UK; EMBL-European Bioinformatics Institute; Hinxton, Cambridge, UK.
In this review we discuss transposon-insertion sequencing, variously known in the literature as TraDIS, Tn-seq, INSeq and HITS. By monitoring a large library of single transposon-insertion mutants with high-throughput sequencing, these methods can rapidly identify genomic regions that contribute to organismal fitness under any condition assayable in the laboratory with exquisite resolution. We discuss the various protocols that have been developed and methods for analysis. We provide an overview of studies that have examined the reproducibility and accuracy of these methods, as well as studies showing the advantages offered by the high resolution and dynamic range of high-throughput sequencing over previous methods. We review a number of applications in the literature, from predicting genes essential for in vitro growth to directly assaying requirements for survival under infective conditions in vivo. We also highlight recent progress in assaying non-coding regions of the genome in addition to known coding sequences, including the combining of RNA-seq with high-throughput transposon mutagenesis.
RNA biology 2013;10;7
PUBMED: 23635712
-
A comparison of dense transposon insertion libraries in the Salmonella serovars Typhi and Typhimurium.
Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK. lb14@sanger.ac.uk
Salmonella Typhi and Typhimurium diverged only ∼50 000 years ago, yet have very different host ranges and pathogenicity. Despite the availability of multiple whole-genome sequences, the genetic differences that have driven these changes in phenotype are only beginning to be understood. In this study, we use transposon-directed insertion-site sequencing to probe differences in gene requirements for competitive growth in rich media between these two closely related serovars. We identify a conserved core of 281 genes that are required for growth in both serovars, 228 of which are essential in Escherichia coli. We are able to identify active prophage elements through the requirement for their repressors. We also find distinct differences in requirements for genes involved in cell surface structure biogenesis and iron utilization. Finally, we demonstrate that transposon-directed insertion-site sequencing is not only applicable to the protein-coding content of the cell but also has sufficient resolution to generate hypotheses regarding the functions of non-coding RNAs (ncRNAs) as well. We are able to assign probable functions to a number of cis-regulatory ncRNA elements, as well as to infer likely differences in trans-acting ncRNA regulatory networks.
Funded by: Wellcome Trust: WT076964, WT079643, WT098051
Nucleic acids research 2013;41;8;4549-64
PUBMED: 23470992; PMC: 3632133; DOI: 10.1093/nar/gkt148
-
Identifying novel Plasmodium falciparum erythrocyte invasion receptors using systematic extracellular protein interaction screens.
Cell Surface Signalling Laboratory, Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1HH; Malaria Programme, Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA.
The invasion of host erythrocytes by the parasite Plasmodium falciparum initiates the blood stage of infection responsible for the symptoms of malaria. Invasion involves extracellular protein interactions between host erythrocyte receptors and ligands on the merozoite, the invasive form of the parasite. Despite significant research effort, many merozoite surface ligands have no known erythrocyte binding partner, most likely due to the intractable biochemical nature of membrane-tethered receptor proteins and their interactions. The few receptor-ligand pairs that have been described have largely relied on sourcing erythrocytes from patients with rare blood groups, a serendipitous approach that is unsatisfactory for systematically identifying novel receptors. We have recently developed a scalable assay called AVEXIS (for AVidity-based EXtracellular Interaction Screen), designed to circumvent the technical difficulties associated with the identification of extracellular protein interactions, and applied it to identify erythrocyte receptors for orphan Plasmodium falciparum merozoite ligands. Using this approach, we have recently identified Basigin (CD147) and Semaphorin-7A (CD108) as receptors for RH5 and MTRAP, respectively. In this essay, we review techniques used to identify Plasmodium receptors and discuss how they could be applied in the future to identify novel receptors both for Plasmodium parasites and for other pathogens.
Cellular microbiology 2013
PUBMED: 23617720; DOI: 10.1111/cmi.12151
-
Network properties derived from deep sequencing of human B-cell receptor repertoires delineate B-cell populations.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
The adaptive immune response selectively expands B- and T-cell clones following antigen recognition by B- and T-cell receptors (BCR and TCR), respectively. Next-generation sequencing is a powerful tool for dissecting the BCR and TCR populations at high resolution, but robust computational analyses are required to interpret such sequencing. Here, we develop a novel computational approach for BCR repertoire analysis using established next-generation sequencing methods coupled with network construction and population analysis. BCR sequences organize into networks based on sequence diversity, with differences in network connectivity clearly distinguishing between diverse repertoires of healthy individuals and clonally expanded repertoires from individuals with chronic lymphocytic leukemia (CLL) and other clonal blood disorders. Network population measures defined by the Gini Index and cluster sizes quantify the BCR clonality status and are robust to sampling and sequencing depths. BCR network analysis therefore allows the direct and quantifiable comparison of BCR repertoires between samples and intra-individual population changes between temporal or spatially separated samples and over the course of therapy.
Genome research 2013
PUBMED: 23742949; DOI: 10.1101/gr.154815.113
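The abstract above uses the Gini Index as a measure of BCR clonality. As a rough illustration of how such an index behaves, the sketch below computes the standard Gini index over a list of cluster sizes; it is a generic formula, not the authors' implementation, and the function name `gini_index` and the example inputs are hypothetical.

```python
def gini_index(sizes):
    """Gini index of non-negative cluster sizes.

    0 means all clusters are the same size (a diverse, even repertoire);
    values approaching 1 mean a few clusters dominate (clonal expansion).
    """
    values = sorted(sizes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero cluster size")
    # Rank-weighted sum over the ascending-sorted values.
    weighted = sum((rank + 1) * value for rank, value in enumerate(values))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Hypothetical repertoires: one even, one dominated by a single clone.
even_repertoire = gini_index([5, 5, 5, 5])
clonal_repertoire = gini_index([1, 1, 1, 97])
```

With this definition the even repertoire scores 0 and the clonally dominated one scores close to the upper bound of (n - 1) / n for n clusters.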
-
Peripheral administration of prokineticin 2 potently reduces food intake and body weight in mice via the brainstem.
Section of Investigative Medicine, Imperial College London, London, UK.
BACKGROUND AND PURPOSE: Prokineticin 2 (PK2) has recently been shown to acutely reduce food intake in rodents. We aimed to determine the CNS sites and receptors that mediate the anorectic effects of peripherally administered PK2 and its chronic effects on glucose and energy homeostasis. EXPERIMENTAL APPROACH: We investigated neuronal activation following i.p. administration of PK2 using c-Fos-like immunoreactivity (CFL-IR). The anorectic effect of PK2 was examined in mice with targeted deletion of either prokineticin receptor 1 (PKR1) or prokineticin receptor 2 (PKR2), and in wild-type mice following administration of the PKR1 antagonist, PC1. The effect of i.p. PK2 administration on glucose homeostasis was investigated. Finally, the effect of long-term administration of PK2 on glucose and energy homeostasis in diet-induced obese (DIO) mice was determined. KEY RESULTS: I.p. PK2 administration significantly increased CFL-IR in the dorsal motor vagal nucleus of the brainstem. The anorectic effect of PK2 was maintained in mice lacking PKR2 but abolished in mice lacking PKR1 and in wild-type mice pre-treated with PC1. DIO mice treated chronically with PK2 had no changes in glucose levels but significantly reduced food intake and body weight compared to controls. CONCLUSIONS AND IMPLICATIONS: Together, our data suggest that the anorectic effects of peripherally administered PK2 are mediated via the brainstem and this effect requires PKR1 but not PKR2 signalling. Chronic administration of PK2 reduces food intake and body weight in a mouse model of human obesity, suggesting that PKR1-selective agonists have potential to be novel therapeutics for the treatment of obesity.
British journal of pharmacology 2013;168;2;403-410
PUBMED: 22935107; DOI: 10.1111/j.1476-5381.2012.02191.x
-
Microbial genomes as cheat sheets.
Nature reviews. Microbiology 2013;11;5;302
PUBMED: 23563106; DOI: 10.1038/nrmicro3014
-
Genome-wide meta-analysis identifies 11 new loci for anthropometric traits and provides insights into genetic architecture.
[1] US Department of Health and Human Services, Division of Cancer Epidemiology and Genetics, National Cancer Institute, US National Institutes of Health, Bethesda, Maryland, USA. [2].
Approaches exploiting trait distribution extremes may be used to identify loci associated with common traits, but it is unknown whether these loci are generalizable to the broader population. In a genome-wide search for loci associated with the upper versus the lower 5th percentiles of body mass index, height and waist-to-hip ratio, as well as clinical classes of obesity, including up to 263,407 individuals of European ancestry, we identified 4 new loci (IGFBP4, H6PD, RSRC1 and PPP2R2A) influencing height detected in the distribution tails and 7 new loci (HNF4G, RPTOR, GNAT2, MRPS33P4, ADCY9, HS6ST3 and ZZZ3) for clinical classes of obesity. Further, we find a large overlap in genetic structure and the distribution of variants between traits based on extremes and the general population and little etiological heterogeneity between obesity subgroups.
Nature genetics 2013
PUBMED: 23563607; DOI: 10.1038/ng.2606
-
The evolutionary dynamics of influenza A virus adaptation to mammalian hosts.
Department of Zoology, University of Oxford, Oxford, UK.
Few questions on infectious disease are more important than understanding how and why avian influenza A viruses successfully emerge in mammalian populations, yet little is known about the rate and nature of the virus' genetic adaptation in new hosts. Here, we measure, for the first time, the genomic rate of adaptive evolution of swine influenza viruses (SwIV) that originated in birds. By using a curated dataset of more than 24 000 human and swine influenza gene sequences, including 41 newly characterized genomes, we reconstructed the adaptive dynamics of three major SwIV lineages (Eurasian, EA; classical swine, CS; triple reassortant, TR). We found that, following the transfer of the EA lineage from birds to swine in the late 1970s, EA virus genes have undergone substantially faster adaptive evolution than those of the CS lineage, which had circulated among swine for decades. Further, the adaptation rates of the EA lineage antigenic haemagglutinin and neuraminidase genes were unexpectedly high and similar to those observed in human influenza A. We show that the successful establishment of avian influenza viruses in swine is associated with raised adaptive evolution across the entire genome for many years after zoonosis, reflecting the contribution of multiple mutations to the coordinated optimization of viral fitness in a new environment. This dynamics is replicated independently in the polymerase genes of the TR lineage, which established in swine following separate transmission from non-swine hosts.
Philosophical transactions of the Royal Society of London. Series B, Biological sciences 2013;368;1614;20120382
PUBMED: 23382435; DOI: 10.1098/rstb.2012.0382
-
Uniparental markers in Italy reveal a sex-biased genetic structure and different historical strata.
Laboratorio di Antropologia Molecolare, Dipartimento di Scienze Biologiche, Geologiche e Ambientali, Università di Bologna, Bologna, Italy.
Located in the center of the Mediterranean landscape and with an extensive coastline, the territory of what is today Italy has played an important role in the history of human settlements and movements of Southern Europe and the Mediterranean Basin. Populated since Paleolithic times, the complexity of human movements during the Neolithic, the Metal Ages and the most recent history of the last two millennia (involving the overlapping of different cultural and demic strata) has shaped the pattern of the modern Italian genetic structure. With the aim of disentangling this pattern and understanding which processes more importantly shaped the distribution of diversity, we have analyzed the uniparentally-inherited markers in ∼900 individuals from an extensive sampling across the Italian peninsula, Sardinia and Sicily. Spatial PCAs and DAPCs revealed a sex-biased pattern indicating different demographic histories for males and females. Besides the genetic outlier position of Sardinians, a North West-South East Y-chromosome structure is found in continental Italy. Such structure is in agreement with recent archeological syntheses indicating two independent and parallel processes of Neolithisation. In addition, date estimates pinpoint the importance of the cultural and demographic events during the late Neolithic and Metal Ages. On the other hand, mitochondrial diversity is distributed more homogeneously in agreement with older population events that might be related to the presence of an Italian Refugium during the last glacial period in Europe.
PloS one 2013;8;5;e65441
PUBMED: 23734255; PMC: 3666984; DOI: 10.1371/journal.pone.0065441
-
Compression of FASTQ and SAM Format Sequencing Data.
Wellcome Trust Sanger Institute, Cambridge, United Kingdom.
Storage and transmission of the data produced by modern DNA sequencing instruments has become a major concern, which prompted the Pistoia Alliance to pose the SequenceSqueeze contest for compression of FASTQ files. We present several compression entries from the competition, Fastqz and Samcomp/Fqzcomp, including the winning entry. These are compared against existing algorithms for both reference based compression (CRAM, Goby) and non-reference based compression (DSRC, BAM) and other recently published competition entries (Quip, SCALCE). The tools are shown to be the new Pareto frontier for FASTQ compression, offering state of the art ratios at affordable CPU costs. All programs are freely available on SourceForge. Fastqz: https://sourceforge.net/projects/fastqz/, fqzcomp: https://sourceforge.net/projects/fqzcomp/, and samcomp: https://sourceforge.net/projects/samcomp/.
PloS one 2013;8;3;e59190
PUBMED: 23533605; DOI: 10.1371/journal.pone.0059190
-
A Single Multilocus Sequence Typing (MLST) Scheme for Seven Pathogenic Leptospira Species.
Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand; Department of Microbiology and Immunology, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand.
Background: The available Leptospira multilocus sequence typing (MLST) scheme supported by an MLST website is limited to L. interrogans and L. kirschneri. Our aim was to broaden the utility of this scheme to incorporate a total of seven pathogenic species. We modified the existing scheme by replacing one of the seven MLST loci (fadD was changed to caiB), as the former gene did not appear to be present in some pathogenic species. Comparison of the original and modified schemes using data for L. interrogans and L. kirschneri demonstrated that the discriminatory power of the two schemes was not significantly different. The modified scheme was used to further characterize 325 isolates (L. alexanderi [n = 5], L. borgpetersenii [n = 34], L. interrogans [n = 222], L. kirschneri [n = 29], L. noguchii [n = 9], L. santarosai [n = 10], and L. weilii [n = 16]). Phylogenetic analysis using concatenated sequences of the 7 loci demonstrated that each species corresponded to a discrete clade, and that no strains were misclassified at the species level. Comparison between genotype and serovar was possible for 254 isolates. Of the 31 sequence types (STs) represented by at least two isolates, 18 STs included isolates assigned to two or three different serovars. Conversely, 14 serovars were identified that contained between 2 and 10 different STs. New observations were made on the global phylogeography of Leptospira spp., and the utility of MLST in making associations between human disease and specific maintenance hosts was demonstrated. Conclusion: The new MLST scheme, supported by an updated MLST website, allows the characterization and species assignment of isolates of the seven major pathogenic species associated with leptospirosis.
PLoS neglected tropical diseases 2013;7;1;e1954
PUBMED: 23359622; PMC: 3554523; DOI: 10.1371/journal.pntd.0001954
-
Platelet Genomics
Platelets 2013
DOI: 10.1016/B978-0-12-387837-3.00004-3; URL: http://dx.doi.org/10.1016/B97...8-0-12-387837-3.00004-3
-
A new method for high-resolution imaging of Ku foci to decipher mechanisms of DNA double-strand break repair.
The Wellcome Trust and Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, CB2 1QN, England, UK.
DNA double-strand breaks (DSBs) are the most toxic of all genomic insults, and pathways dealing with their signaling and repair are crucial to prevent cancer and for immune system development. Despite intense investigations, our knowledge of these pathways has been technically limited by our inability to detect the main repair factors at DSBs in cells. In this paper, we present an original method that involves a combination of ribonuclease- and detergent-based preextraction with high-resolution microscopy. This method allows direct visualization of previously hidden repair complexes, including the main DSB sensor Ku, at virtually any type of DSB, including those induced by anticancer agents. We demonstrate its broad range of applications by coupling it to laser microirradiation, super-resolution microscopy, and single-molecule counting to investigate the spatial organization and composition of repair factories. Furthermore, we use our method to monitor DNA repair and identify mechanisms of repair pathway choice, and we show its utility in defining cellular sensitivities and resistance mechanisms to anticancer agents.
Funded by: Cancer Research UK: A11224, C6/A11224, C6946/A14492; European Research Council: 268536; Wellcome Trust: WT092096
The Journal of cell biology 2013;202;3;579-95
PUBMED: 23897892; PMC: 3734090; DOI: 10.1083/jcb.201303073
-
Neolithic mitochondrial haplogroup H genomes and the genetic origins of Europeans.
[1] The Australian Centre for Ancient DNA, School of Earth and Environmental Sciences, University of Adelaide, Adelaide, South Australia 5005, Australia [2] Archaeogenetics Research Group, School of Applied Sciences, University of Huddersfield, Huddersfield HD1 3DH, UK [3].
Haplogroup H dominates present-day Western European mitochondrial DNA variability (>40%), yet was less common (~19%) among Early Neolithic farmers (~5450 BC) and virtually absent in Mesolithic hunter-gatherers. Here we investigate this major component of the maternal population history of modern Europeans and sequence 39 complete haplogroup H mitochondrial genomes from ancient human remains. We then compare this 'real-time' genetic data with cultural changes taking place between the Early Neolithic (~5450 BC) and Bronze Age (~2200 BC) in Central Europe. Our results reveal that the current diversity and distribution of haplogroup H were largely established by the Mid Neolithic (~4000 BC), but with substantial genetic contributions from subsequent pan-European cultures such as the Bell Beakers expanding out of Iberia in the Late Neolithic (~2800 BC). Dated haplogroup H genomes allow us to reconstruct the recent evolutionary history of haplogroup H and reveal a mutation rate 45% higher than current estimates for human mitochondria.
Nature communications 2013;4;1764
PUBMED: 23612305; DOI: 10.1038/ncomms2656
-
Translating the human microbiome.
GlaxoSmithKline, Collegeville, Pennsylvania, USA. james.r.brown@gsk.com
Nature biotechnology 2013;31;4;304-8
PUBMED: 23563424; DOI: 10.1038/nbt.2543
-
Culture-free club.
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Nature reviews. Microbiology 2013
PUBMED: 23748338; DOI: 10.1038/nrmicro3052
-
Whole-genome sequencing to identify transmission of Mycobacterium abscessus between patients with cystic fibrosis: a retrospective cohort study.
Wellcome Trust Sanger Institute, Hinxton, UK.
Background: Increasing numbers of individuals with cystic fibrosis are becoming infected with the multidrug-resistant non-tuberculous mycobacterium (NTM) Mycobacterium abscessus, which causes progressive lung damage and is extremely challenging to treat. How this organism is acquired is not currently known, but there is growing concern that person-to-person transmission could occur. We aimed to define the mechanisms of acquisition of M abscessus in individuals with cystic fibrosis.
Method: Whole genome sequencing and antimicrobial susceptibility testing were done on 168 consecutive isolates of M abscessus from 31 patients attending an adult cystic fibrosis centre in the UK between 2007 and 2011. In parallel, we undertook detailed environmental testing for NTM and defined potential opportunities for transmission between patients both in and out of hospital using epidemiological data and social network analysis.
Findings: Phylogenetic analysis revealed two clustered outbreaks of near-identical isolates of the M abscessus subspecies massiliense (from 11 patients), differing by less than ten base pairs. This variation represents less diversity than that seen within isolates from a single individual, strongly indicating between-patient transmission. All patients within these clusters had numerous opportunities for within-hospital transmission from other individuals, while comprehensive environmental sampling, initiated during the outbreak, failed to detect any potential point source of NTM infection. The clusters of M abscessus subspecies massiliense showed evidence of transmission of mutations acquired during infection of an individual to other patients. Thus, isolates with constitutive resistance to amikacin and clarithromycin were isolated from several individuals never previously exposed to long-term macrolides or aminoglycosides, further indicating cross-infection.
Interpretation: Whole genome sequencing has revealed frequent transmission of multidrug resistant NTM between patients with cystic fibrosis despite conventional cross-infection measures. Although the exact transmission route is yet to be established, our epidemiological analysis suggests that it could be indirect.
Funding: The Wellcome Trust, Papworth Hospital, NIHR Cambridge Biomedical Research Centre, UK Health Protection Agency, Medical Research Council, and the UKCRC Translational Infection Research Initiative.
Funded by: Medical Research Council; Wellcome Trust: 084953, 098051
Lancet 2013;381;9877;1551-60
PUBMED: 23541540; PMC: 3664974; DOI: 10.1016/S0140-6736(13)60632-7
-
Transmission of M abscessus in patients with cystic fibrosis - Authors' reply.
Lancet 2013;382;9891;504
PUBMED: 23931921; DOI: 10.1016/S0140-6736(13)61709-2
-
Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data.
BACKGROUND: Mycobacterium tuberculosis is characterised by limited genomic diversity, which makes the application of whole genome sequencing particularly attractive for clinical and epidemiological investigation. However, in order to confidently infer transmission events, an accurate knowledge of the rate of change in the genome over relevant timescales is required. METHODS: We attempted to estimate a molecular clock by sequencing 199 isolates from epidemiologically linked tuberculosis cases, collected in the Netherlands spanning almost 16 years. RESULTS: Multiple analyses support an average mutation rate of ~0.3 SNPs per genome per year. However, all analyses revealed a very high degree of variation around this mean, making the confirmation of links proposed by epidemiology, and inference of novel links, difficult. Despite this, in some cases, the phylogenetic context of other strains provided evidence supporting the confident exclusion of previously inferred epidemiological links. CONCLUSIONS: This in-depth analysis of the molecular clock revealed that it is slow and variable over short time scales, which limits its usefulness in transmission studies. However, the superior resolution of whole genome sequencing can provide the phylogenetic context to allow the confident exclusion of possible transmission events previously inferred via traditional DNA fingerprinting techniques and epidemiological cluster investigation. Despite the slow generation of variation even at the whole genome level we conclude that the investigation of tuberculosis transmission will benefit greatly from routine whole genome sequencing.
BMC infectious diseases 2013;13;1;110
PUBMED: 23446317; PMC: 3599118; DOI: 10.1186/1471-2334-13-110
-
Headbobber: a combined morphogenetic and cochleosaccular mouse model to study 10qter deletions in human deafness.
Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, United Kingdom; Wolfson Centre for Age-Related Diseases, King's College London, London, United Kingdom.
The recessive mouse mutant headbobber () displays the characteristic behavioural traits associated with vestibular defects including headbobbing, circling and deafness. This mutation was caused by the insertion of a transgene into distal chromosome 7 affecting expression of native genes. We show that the inner ear of mutants lacks semicircular canals and cristae, and the saccule and utricle are fused together in a single utriculosaccular sac. Moreover, we detect severe abnormalities of the cochlear sensory hair cells, the stria vascularis looks severely disorganised, Reissner's membrane is collapsed and no endocochlear potential is detected. Myo7a and Kcnj10 expression analysis show a lack of the melanocyte-like intermediate cells in stria vascularis, which can explain the absence of endocochlear potential. We use Trp2 as a marker of melanoblasts migrating from the neural crest at E12.5 and show that they do not interdigitate into the developing strial epithelium, associated with abnormal persistence of the basal lamina in the cochlea. We perform array CGH, deep sequencing as well as an extensive expression analysis of candidate genes in the headbobber region of and littermate controls, and conclude that the headbobber phenotype is caused by: 1) effect of a 648 kb deletion on distal Chr7, resulting in the loss of three protein coding genes (, and ) with expression in the inner ear but unknown function; and 2) indirect, long range effect of the deletion on the expression of neighboring genes on Chr7, associated with downregulation of and homeobox transcription factors. Interestingly, deletions of the orthologous region in humans, affecting the same genes, have been reported in nineteen patients with common features including sensorineural hearing loss and vestibular problems. Therefore, we propose that headbobber is a useful model to gain insight into the mechanisms underlying deafness in human 10qter deletion syndrome.
PloS one 2013;8;2;e56274
PUBMED: 23457544; DOI: 10.1371/journal.pone.0056274
-
Missense mutations in β-1,3-N-acetylglucosaminyltransferase 1 (B3GNT1) cause Walker-Warburg syndrome.
The authors wish it to be known that, in their opinion, the first five authors should be regarded as joint First Authors.
Several known or putative glycosyltransferases are required for the synthesis of laminin-binding glycans on alpha-dystroglycan (αDG), including POMT1, POMT2, POMGnT1, LARGE, Fukutin, FKRP, ISPD and GTDC2. Mutations in these glycosyltransferase genes result in defective αDG glycosylation and reduced ligand binding by αDG causing a clinically heterogeneous group of congenital muscular dystrophies, commonly referred to as dystroglycanopathies. The most severe clinical form, Walker-Warburg syndrome (WWS), is characterized by congenital muscular dystrophy and severe neurological and ophthalmological defects. Here, we report two homozygous missense mutations in the β-1,3-N-acetylglucosaminyltransferase 1 (B3GNT1) gene in a family affected with WWS. Functional studies confirmed the pathogenicity of the mutations. First, expression of wild-type but not mutant B3GNT1 in human prostate cancer (PC3) cells led to increased levels of αDG glycosylation. Second, morpholino knockdown of the zebrafish b3gnt1 orthologue caused characteristic muscular defects and reduced αDG glycosylation. These functional studies identify an important role of B3GNT1 in the synthesis of the uncharacterized laminin-binding glycan of αDG and implicate B3GNT1 as a novel causative gene for WWS.
Human molecular genetics 2013;22;9;1746-54
PUBMED: 23359570; PMC: 3613162; DOI: 10.1093/hmg/ddt021
-
A CRISPR view of genome sequences.
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. microbes@sanger.ac.uk.
This month's Genome Watch explores recent applications of the CRISPR immune system for bacterial phylogenetic analysis and genome editing.
Nature reviews. Microbiology 2013
PUBMED: 23474684; DOI: 10.1038/nrmicro2997
-
Large-scale association analysis identifies new risk loci for coronary artery disease.
Coronary artery disease (CAD) is the commonest cause of death. Here, we report an association analysis in 63,746 CAD cases and 130,681 controls identifying 15 loci reaching genome-wide significance, taking the number of susceptibility loci for CAD to 46, and a further 104 independent variants (r² < 0.2) strongly associated with CAD at a 5% false discovery rate (FDR). Together, these variants explain approximately 10.6% of CAD heritability. Of the 46 genome-wide significant lead SNPs, 12 show a significant association with a lipid trait, and 5 show a significant association with blood pressure, but none is significantly associated with diabetes. Network analysis with 233 candidate genes (loci at 10% FDR) generated 5 interaction networks comprising 85% of these putative genes involved in CAD. The four most significant pathways mapping to these networks are linked to lipid metabolism and inflammation, underscoring the causal role of these activities in the genetic etiology of CAD. Our study provides insights into the genetic basis of CAD and identifies key biological pathways.
Funded by: NHLBI NIH HHS: K24 HL107643, R00 HL094535, R01 HL111694; NIDDK NIH HHS: R01 DK062370
Nature genetics 2013;45;1;25-33
PUBMED: 23202125; PMC: 3679547; DOI: 10.1038/ng.2480
-
Pitpnm1 is expressed in hair cells during development but is not required for hearing.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, Cambs, CB10 1SA, United Kingdom.
Deafness is a genetically complex disorder with many contributing genes still unknown. Here we describe the expression of Pitpnm1 in the inner ear. It is expressed in the inner hair cells of the organ of Corti from late embryonic stages until adulthood, and transiently in the outer hair cells during early postnatal stages. Despite this specific expression, Pitpnm1 null mice showed no hearing defects, possibly due to redundancy with the paralogous genes Pitpnm2 and Pitpnm3.
Neuroscience 2013
PUBMED: 23820044; DOI: 10.1016/j.neuroscience.2013.06.045
-
Adipogenesis: new insights into brown adipose tissue differentiation.
S Carobbio, Wellcome Trust Genome Campus, Wellcome Trust Sanger Institute, Cambridge, United Kingdom.
Confirmation of the presence of functional brown adipose tissue (BAT) in humans has renewed the interest in investigating the potential therapeutic use of this tissue. The finding that its activity positively correlates with decreased BMI, fat content and augmented energy expenditure suggests that increasing BAT mass/activity or browning of WAT could be a strategy to prevent or treat obesity and its associated morbidities. The challenge now is to find a safe and efficient way to develop this idea. Whereas BAT has been widely studied in murine models both in vivo and in vitro, there is an urgent need for human cellular models to investigate BAT physiology and functionality from a molecular point of view. In our review, we focus on the latest insights surrounding BAT development and activation in rodents and humans. Then, we discuss how the availability of murine models has been essential to identify BAT progenitors and trace their lineage. Finally, we address how this information can be exploited to develop human cellular models for BAT differentiation/activation. In this context, human embryonic (hES) and induced pluripotent cells (hIPS)-based cellular models represent a resource of great potential value, as they can provide a virtually inexhaustible supply of starting material for functional genetic studies, -omics based analysis and validation of therapeutic approaches. Moreover, these cells can be easily genetically engineered, opening the possibility of generating patient-specific cellular models, allowing the investigation of the impact of different genetic backgrounds on BAT differentiation in both pathological and physiological states.
Journal of molecular endocrinology 2013
PUBMED: 24041933; DOI: 10.1530/JME-13-0158
-
Mutations in GDP-Mannose Pyrophosphorylase B Cause Congenital and Limb-Girdle Muscular Dystrophies Associated with Hypoglycosylation of α-Dystroglycan.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK.
Congenital muscular dystrophies with hypoglycosylation of α-dystroglycan (α-DG) are a heterogeneous group of disorders often associated with brain and eye defects in addition to muscular dystrophy. Causative variants in 14 genes thought to be involved in the glycosylation of α-DG have been identified thus far. Allelic mutations in these genes might also cause milder limb-girdle muscular dystrophy phenotypes. Using a combination of exome and Sanger sequencing in eight unrelated individuals, we present evidence that mutations in guanosine diphosphate mannose (GDP-mannose) pyrophosphorylase B (GMPPB) can result in muscular dystrophy variants with hypoglycosylated α-DG. GMPPB catalyzes the formation of GDP-mannose from GTP and mannose-1-phosphate. GDP-mannose is required for O-mannosylation of proteins, including α-DG, and it is the substrate of cytosolic mannosyltransferases. We found reduced α-DG glycosylation in the muscle biopsies of affected individuals and in available fibroblasts. Overexpression of wild-type GMPPB in fibroblasts from an affected individual partially restored glycosylation of α-DG. Whereas wild-type GMPPB localized to the cytoplasm, five of the identified missense mutations caused formation of aggregates in the cytoplasm or near membrane protrusions. Additionally, knockdown of the GMPPB ortholog in zebrafish caused structural muscle defects with decreased motility, eye abnormalities, and reduced glycosylation of α-DG. Together, these data indicate that GMPPB mutations are responsible for congenital and limb-girdle muscular dystrophies with hypoglycosylation of α-DG.
American journal of human genetics 2013
PUBMED: 23768512; DOI: 10.1016/j.ajhg.2013.05.009
-
Use of Vitek 2 Antimicrobial Susceptibility Profile To Identify mecC in Methicillin-Resistant Staphylococcus aureus.
Department of Medicine, University of Cambridge, Cambridge, United Kingdom.
The emergence of mecC methicillin-resistant Staphylococcus aureus (MRSA) poses a diagnostic challenge for clinical microbiology laboratories. Using the Vitek 2 system, we tested a panel of 896 Staphylococcus aureus isolates and found that an oxacillin-sensitive/cefoxitin-resistant profile had a sensitivity of 88.7% and a specificity of 99.5% for the identification of mecC MRSA isolates. The presence of the mecC gene, determined by bacterial whole-genome sequencing, was used as the gold standard. This profile could provide a zero-cost screening method for identification of mecC-positive MRSA strains.
Journal of clinical microbiology 2013;51;8;2732-4
PUBMED: 23720794; PMC: 3719650; DOI: 10.1128/JCM.00847-13
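The sensitivity and specificity quoted in the abstract above are standard confusion-matrix ratios, with whole-genome sequencing for mecC as the reference result. The following is a minimal sketch of that arithmetic. The function names and the isolate counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of reference-positive isolates that show the screening profile."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of reference-negative isolates that do not show the profile."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only; NOT taken from the paper.
tp, fn = 45, 5      # mecC-positive isolates with / without the profile
tn, fp = 190, 10    # mecC-negative isolates without / with the profile

print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"specificity = {specificity(tn, fp):.3f}")
```

A screening profile with high specificity but lower sensitivity, as reported here, will rarely flag a non-mecC isolate but can miss some true mecC isolates.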
-
BamView: visualizing and interpretation of next-generation sequencing read alignments.
Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK. artemis@sanger.ac.uk.
So-called next-generation sequencing (NGS) has provided the ability to sequence on a massive scale at low cost, enabling biologists to perform powerful experiments and gain insight into biological processes. BamView has been developed to visualize and analyse sequence reads from NGS platforms, which have been aligned to a reference sequence. It is a desktop application for browsing the aligned or mapped reads [Ruffalo, M, LaFramboise, T, Koyutürk, M. Comparative analysis of algorithms for next-generation sequencing read alignment. Bioinformatics 2011;27:2790-6] at different levels of magnification, from nucleotide level, where the base qualities can be seen, to genome or chromosome level where overall coverage is shown. To enable in-depth investigation of NGS data, various views are provided that can be configured to highlight interesting aspects of the data. Multiple read alignment files can be overlaid to compare results from different experiments, and filters can be applied to facilitate the interpretation of the aligned reads. As well as being a standalone application, BamView can be used as an integrated part of the Artemis genome browser, allowing the user to study NGS data in the context of the sequence and annotation of the reference genome. Single nucleotide polymorphism (SNP) density and candidate SNP sites can be highlighted and investigated, and read-pair information can be used to discover large structural insertions and deletions. The application will also perform simple analyses of the read mapping, including reporting the read counts and reads per kilobase per million mapped reads (RPKM) for genes selected by the user. Availability: BamView and Artemis are freely available software. These can be downloaded from their home pages: http://bamview.sourceforge.net/; http://www.sanger.ac.uk/resources/software/artemis/. Requirements: Java 1.6 or higher.
Briefings in bioinformatics 2013;14;2;203-12
PUBMED: 22253280; PMC: 3603209; DOI: 10.1093/bib/bbr073
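The RPKM value mentioned in the abstract above normalises a gene's read count by both gene length and sequencing depth. The sketch below shows the usual definition of that quantity. It is not BamView's own code (BamView is a Java application), and the function name and example numbers are hypothetical.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of gene per million mapped reads.

    RPKM = reads / (gene length in kb * mapped reads in millions)
         = reads * 1e9 / (gene length in bp * total mapped reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2 kb gene in a library of 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))
```

Because the total mapped read count appears in the denominator, values from alignment files of different depth can be compared on a common scale.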
-
Phosphoproteomics data classify hematological cancer cell lines according to tumor type and sensitivity to kinase inhibitors.
Analytical Signalling Group, Centre for Cell Signalling, Barts Cancer Institute, Queen Mary University of London, Charterhouse Square, London EC1B 6BQ, UK. pedro.cutillas@imperial.ac.uk.
BACKGROUND: Tumor classification based on their predicted responses to kinase inhibitors is a major goal for advancing targeted personalized therapies. Here, we used a phosphoproteomic approach to investigate biological heterogeneity across hematological cancer cell lines including acute myeloid leukemia, lymphoma, and multiple myeloma. RESULTS: Mass spectrometry was used to quantify 2,000 phosphorylation sites across three acute myeloid leukemia, three lymphoma, and three multiple myeloma cell lines in six biological replicates. The intensities of the phosphorylation sites grouped these cancer cell lines according to their tumor type. In addition, a phosphoproteomic analysis of seven acute myeloid leukemia cell lines revealed a battery of phosphorylation sites whose combined intensities correlated with the growth-inhibitory responses to three kinase inhibitors with remarkable correlation coefficients and fold changes (> 100 between the most resistant and sensitive cells). Modeling based on regression analysis indicated that a subset of phosphorylation sites could be used to predict response to the tested drugs. Quantitative analysis of phosphorylation motifs indicated that resistant and sensitive cells differed in their patterns of kinase activities, but, interestingly, phosphorylations correlating with responses were not on members of the pathway being targeted; instead, these mainly were on parallel kinase pathways. CONCLUSION: This study reveals that the information on kinase activation encoded in phosphoproteomics data correlates remarkably well with the phenotypic responses of cancer cells to compounds that target kinase signaling and could be useful for the identification of novel markers of resistance or sensitivity to drugs that target the signaling network.
Genome biology 2013;14;4;R37
PUBMED: 23628362; DOI: 10.1186/gb-2013-14-4-r37
-
Comprehensive assignment of roles for Salmonella typhimurium genes in intestinal colonization of food-producing animals.
Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom.
Chickens, pigs, and cattle are key reservoirs of Salmonella enterica, a foodborne pathogen of worldwide importance. Though a decade has elapsed since publication of the first Salmonella genome, thousands of genes remain of hypothetical or unknown function, and the basis of colonization of reservoir hosts is ill-defined. Moreover, previous surveys of the role of Salmonella genes in vivo have focused on systemic virulence in murine typhoid models, and the genetic basis of intestinal persistence and thus zoonotic transmission have received little study. We therefore screened pools of random insertion mutants of S. enterica serovar Typhimurium in chickens, pigs, and cattle by transposon-directed insertion-site sequencing (TraDIS). The identity and relative fitness in each host of 7,702 mutants was simultaneously assigned by massively parallel sequencing of transposon-flanking regions. Phenotypes were assigned to 2,715 different genes, providing a phenotype-genotype map of unprecedented resolution. The data are self-consistent in that multiple independent mutations in a given gene or pathway were observed to exert a similar fitness cost. Phenotypes were further validated by screening defined null mutants in chickens. Our data indicate that a core set of genes is required for infection of all three host species, and smaller sets of genes may mediate persistence in specific hosts. By assigning roles to thousands of Salmonella genes in key reservoir hosts, our data facilitate systems approaches to understand pathogenesis and the rational design of novel cross-protective vaccines and inhibitors. Moreover, by simultaneously assigning the genotype and phenotype of over 90% of mutants screened in complex pools, our data establish TraDIS as a powerful tool to apply rich functional annotation to microbial genomes with minimal animal use.
PLoS genetics 2013;9;4;e1003456
PUBMED: 23637626; PMC: 3630085; DOI: 10.1371/journal.pgen.1003456
-
Mcph1-Deficient Mice Reveal a Role for MCPH1 in Otitis Media.
Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, United Kingdom.
Otitis media is a common reason for hearing loss, especially in children. Otitis media is a multifactorial disease and environmental factors, anatomic dysmorphology and genetic predisposition can all contribute to its pathogenesis. However, the reasons for the variable susceptibility to otitis media are elusive. MCPH1 mutations cause primary microcephaly in humans. So far, no hearing impairment has been reported either in the MCPH1 patients or mouse models with Mcph1 deficiency. In this study, Mcph1-deficient (Mcph1(tm1a/tm1a)) mice were produced using embryonic stem cells with a targeted mutation by the Sanger Institute's Mouse Genetics Project. Auditory brainstem response measurements revealed that Mcph1(tm1a/tm1a) mice had mild to moderate hearing impairment with around 70% penetrance. We found otitis media with effusion in the hearing-impaired Mcph1(tm1a/tm1a) mice by anatomic and histological examinations. Expression of Mcph1 in the epithelial cells of middle ear cavities supported its involvement in the development of otitis media. Other defects of Mcph1(tm1a/tm1a) mice included small skull sizes, increased micronuclei in red blood cells, increased B cells and ocular abnormalities. These findings not only recapitulated the defects found in other Mcph1-deficient mice or MCPH1 patients, but also revealed an unexpected phenotype, otitis media with hearing impairment, which suggests Mcph1 is a new gene underlying genetic predisposition to otitis media.
PloS one 2013;8;3;e58156
PUBMED: 23516444; PMC: 3596415; DOI: 10.1371/journal.pone.0058156
-
Proteomic Comparison of Historic and Recently Emerged Hypervirulent Clostridium difficile Strains.
Department of Population Medicine and Diagnostic Sciences, Cornell University, Ithaca, New York 14853, United States.
Clostridium difficile in recent years has undergone rapid evolution and has emerged as a serious human pathogen. Proteomic approaches can improve the understanding of the diversity of this important pathogen, especially in comparing the adaptive ability of different C. difficile strains. In this study, TMT labeling and nanoLC-MS/MS driven proteomics were used to investigate the responses of four C. difficile strains to nutrient shift and osmotic shock. We detected 126 and 67 differentially expressed proteins in at least one strain under nutrient shift and osmotic shock, respectively. During nutrient shift, several components of the phosphotransferase system (PTS) were found to be differentially expressed, which indicated that the carbon catabolite repression (CCR) was relieved to allow the expression of enzymes and transporters responsible for the utilization of alternate carbon sources. Some classical osmotic shock associated proteins, such as GroEL, RecA, CspG, and CspF, and other stress proteins such as PurG and SerA were detected during osmotic shock. Furthermore, the recently emerged strains were found to contain a more robust gene network in response to both stress conditions. This work represents the first comparative proteomic analysis of historic and recently emerged hypervirulent C. difficile strains, complementing the previously published proteomics studies utilizing only one reference strain.
Journal of proteome research 2013;12;3;1151-61
PUBMED: 23298230; DOI: 10.1021/pr3007528
-
Hierarchical and spatially explicit clustering of DNA sequences with BAPS software.
Department of Mathematics and Statistics, University of Helsinki, 00014, Finland; Cardiff School of Biosciences, Cardiff University, Cardiff, CF10 3AX, UK; Department of Infectious Disease Epidemiology, Imperial College London, Norfolk Place, London, W2 1PG, UK; Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
Phylogeographical analyses have become commonplace for a myriad of organisms with the advent of cheap DNA sequencing technologies. Bayesian model-based clustering is a powerful tool for detecting important patterns in such data and can be used to decipher even quite subtle signals of systematic differences in molecular variation. Here we introduce two upgrades to the Bayesian Analysis of Population Structure (BAPS) software, which enable (1) spatially explicit modeling of variation in DNA sequences, and (2) hierarchical clustering of DNA sequence data to reveal nested genetic population structures. We provide a direct interface to map the results from spatial clustering with Google Maps using the portal http://www.spatialepidemiology.net/ and illustrate this approach using sequence data from Borrelia burgdorferi. The usefulness of hierarchical clustering is demonstrated through an analysis of the metapopulation structure within a bacterial population experiencing a high level of local horizontal gene transfer. The tools that are introduced are freely available at http://www.helsinki.fi/bsg/software/BAPS/.
Molecular biology and evolution 2013
PUBMED: 23408797; DOI: 10.1093/molbev/mst028
-
ISPD gene mutations are a common cause of congenital and limb-girdle muscular dystrophies.
Dubowitz Neuromuscular Centre, UCL Institute of Child Health, University College London, 30 Guilford Street, London WC1N 1EH, UK. f.muntoni@ucl.ac.uk.
Dystroglycanopathies are a clinically and genetically diverse group of recessively inherited conditions ranging from the most severe of the congenital muscular dystrophies, Walker-Warburg syndrome, to mild forms of adult-onset limb-girdle muscular dystrophy. Their hallmark is a reduction in the functional glycosylation of α-dystroglycan, which can be detected in muscle biopsies. An important part of this glycosylation is a unique O-mannosylation, essential for the interaction of α-dystroglycan with extracellular matrix proteins such as laminin-α2. Mutations in eight genes coding for proteins in the glycosylation pathway are responsible for ∼50% of dystroglycanopathy cases. Despite multiple efforts using traditional positional cloning, the causative genes for unsolved dystroglycanopathy cases have escaped discovery for several years. In a recent collaborative study, we discovered that loss-of-function recessive mutations in a novel gene, called isoprenoid synthase domain containing (ISPD), are a relatively common cause of Walker-Warburg syndrome. In this article, we report the involvement of the ISPD gene in milder dystroglycanopathy phenotypes ranging from congenital muscular dystrophy to limb-girdle muscular dystrophy and identified allelic ISPD variants in nine cases belonging to seven families. In two ambulant cases, there was evidence of structural brain involvement, whereas in seven, the clinical manifestation was restricted to a dystrophic skeletal muscle phenotype. Although the function of ISPD in mammals is not yet known, mutations in this gene clearly lead to a reduction in the functional glycosylation of α-dystroglycan, which not only causes the severe Walker-Warburg syndrome but is also a common cause of the milder forms of dystroglycanopathy.
Brain : a journal of neurology 2013;136;Pt 1;269-81
PUBMED: 23288328; PMC: 3562076; DOI: 10.1093/brain/aws312
-
Genome of Acanthamoeba castellanii highlights extensive lateral gene transfer and early evolution of tyrosine kinase signaling.
Conway Institute, University College Dublin, Belfield, Dublin 4, Ireland. brendan.loftus@ucd.ie.
BACKGROUND: The Amoebozoa constitute one of the primary divisions of eukaryotes, encompassing taxa of both biomedical and evolutionary importance, yet its genomic diversity remains largely unsampled. Here we present an analysis of a whole genome assembly of Acanthamoeba castellanii (Ac) the first representative from a solitary free-living amoebozoan. RESULTS: Ac encodes 15,455 compact intron-rich genes, a significant number of which are predicted to have arisen through inter-kingdom lateral gene transfer (LGT). A majority of the LGT candidates have undergone a substantial degree of intronization and Ac appears to have incorporated them into established transcriptional programs. Ac manifests a complex signaling and cell communication repertoire, including a complete tyrosine kinase signaling toolkit and a comparable diversity of predicted extracellular receptors to that found in the facultatively multicellular dictyostelids. An important environmental host of a diverse range of bacteria and viruses, Ac utilizes a diverse repertoire of predicted pattern recognition receptors, many with predicted orthologous functions in the innate immune systems of higher organisms. CONCLUSIONS: Our analysis highlights the important role of LGT in the biology of Ac and in the diversification of microbial eukaryotes. The early evolution of a key signaling facility implicated in the evolution of metazoan multicellularity strongly argues for its emergence early in the Unikont lineage. Overall, the availability of an Ac genome should aid in deciphering the biology of the Amoebozoa and facilitate functional genomic studies in this important model organism and environmental host.
Genome biology 2013;14;2;R11
PUBMED: 23375108; DOI: 10.1186/gb-2013-14-2-r11
-
Identification of seven loci affecting mean telomere length and their association with disease.
[1] Department of Cardiovascular Sciences, University of Leicester, Leicester, UK. [2] National Institute for Health Research Leicester Cardiovascular Biomedical Research Unit, Glenfield Hospital, Leicester, UK. [3].
Interindividual variation in mean leukocyte telomere length (LTL) is associated with cancer and several age-associated diseases. We report here a genome-wide meta-analysis of 37,684 individuals with replication of selected variants in an additional 10,739 individuals. We identified seven loci, including five new loci, associated with mean LTL (P < 5 × 10⁻⁸). Five of the loci contain candidate genes (TERC, TERT, NAF1, OBFC1 and RTEL1) that are known to be involved in telomere biology. Lead SNPs at two loci (TERC and TERT) associate with several cancers and other diseases, including idiopathic pulmonary fibrosis. Moreover, a genetic risk score analysis combining lead variants at all 7 loci in 22,233 coronary artery disease cases and 64,762 controls showed an association of the alleles associated with shorter LTL with increased risk of coronary artery disease (21% (95% confidence interval, 5-35%) per standard deviation in LTL, P = 0.014). Our findings support a causal role of telomere-length variation in some age-related diseases.
Nature genetics 2013;45;4;422-7
PUBMED: 23535734; DOI: 10.1038/ng.2528
-
Real-time genomic epidemiological evaluation of human Campylobacter isolates by use of whole-genome multilocus sequence typing.
Department of Zoology, University of Oxford, Oxford, United Kingdom.
Sequence-based typing is essential for understanding the epidemiology of Campylobacter infections, a major worldwide cause of bacterial gastroenteritis. We demonstrate the practical and rapid exploitation of whole-genome sequencing to provide routine definitive characterization of Campylobacter jejuni and Campylobacter coli for clinical and public health purposes. Short-read data from 384 Campylobacter clinical isolates collected over 4 months in Oxford, United Kingdom, were assembled de novo. Contigs were deposited at the pubMLST.org/campylobacter website and automatically annotated for 1,667 loci. Typing and phylogenetic information was extracted and comparative analyses were performed for various subsets of loci, up to the level of the whole genome, using the Genome Comparator and Neighbor-net algorithms. The assembled sequences (for 379 isolates) were diverse and resembled collections from previous studies of human campylobacteriosis. Small subsets of very closely related isolates originated mainly from repeated sampling from the same patients and, in one case, likely laboratory contamination. Much of the within-patient variation occurred in phase-variable genes. Clinically and epidemiologically informative data can be extracted from whole-genome sequence data in real time with straightforward, publicly available tools. These analyses are highly scalable, are transparent, do not require closely related genome reference sequences, and provide improved resolution (i) among Campylobacter clonal complexes and (ii) between very closely related isolates. Additionally, these analyses rapidly differentiated unrelated isolates, allowing the detection of single-strain clusters. The approach is widely applicable to analyses of human bacterial pathogens in real time in clinical laboratories, with little specialist training required.
Journal of clinical microbiology 2013;51;8;2526-34
PUBMED: 23698529; PMC: 3719633; DOI: 10.1128/JCM.00066-13
-
A genetic study of Wilson's disease in the United Kingdom.
Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, UK.
Previous studies have failed to identify mutations in the Wilson's disease gene ATP7B in a significant number of clinically diagnosed cases. This has led to concerns about genetic heterogeneity for this condition but also suggested the presence of unusual mutational mechanisms. We now present our findings in 181 patients from the United Kingdom with clinically and biochemically confirmed Wilson's disease. A total of 116 different ATP7B mutations were detected, 32 of which are novel. The overall mutation detection frequency was 98%. The likelihood of mutations in genes other than ATP7B causing a Wilson's disease phenotype is therefore very low. We report the first cases with Wilson's disease due to segmental uniparental isodisomy as well as three patients with three ATP7B mutations and three families with Wilson's disease in two consecutive generations. We determined the genetic prevalence of Wilson's disease in the United Kingdom by sequencing the entire coding region and adjacent splice sites of ATP7B in 1000 control subjects. The frequency of all single nucleotide variants with in silico evidence of pathogenicity (Class 1 variant) was 0.056 or 0.040 if only those single nucleotide variants that had previously been reported as mutations in patients with Wilson's disease were included in the analysis (Class 2 variant). The frequency of heterozygote, putative or definite disease-associated ATP7B mutations was therefore considerably higher than the previously reported occurrence of 1:90 (or 0.011) for heterozygote ATP7B mutation carriers in the general population (P < 2.2 × 10⁻¹⁶ for Class 1 variants or P < 5 × 10⁻¹¹ for Class 2 variants only). Subsequent exclusion of four Class 2 variants without additional in silico evidence of pathogenicity led to a further reduction of the mutation frequency to 0.024.
Using this most conservative approach, the calculated frequency of individuals predicted to carry two mutant pathogenic ATP7B alleles is 1:7026 and thus still considerably higher than the typically reported prevalence of Wilson's disease of 1:30 000 (P = 0.00093). Our study provides strong evidence for monogenic inheritance of Wilson's disease. It also has major implications for ATP7B analysis in clinical practice, namely the need to consider unusual genetic mechanisms such as uniparental disomy or the possible presence of three ATP7B mutations. The marked discrepancy between the genetic prevalence and the number of clinically diagnosed cases of Wilson's disease may be due to both reduced penetrance of ATP7B mutations and failure to diagnose patients with this eminently treatable disorder.
Brain : a journal of neurology 2013
PUBMED: 23518715; DOI: 10.1093/brain/awt035
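The step in the abstract above from a heterozygote carrier frequency to the expected frequency of individuals with two mutant alleles can be sketched with a simple calculation. The sketch assumes Hardy-Weinberg proportions and a rare allele, so that the allele frequency q is about half the carrier frequency and the two-allele frequency is q². The abstract does not state that this is the exact method used, and the function name is hypothetical. With the rounded carrier frequency of 0.024 the result is close to, but not identical with, the 1:7026 quoted in the abstract.

```python
def two_allele_frequency(carrier_frequency: float) -> float:
    """Expected frequency of individuals carrying two mutant alleles.

    Assumes Hardy-Weinberg proportions and a rare allele, so that
    q ~= carrier_frequency / 2 and the two-allele frequency is q**2.
    """
    q = carrier_frequency / 2.0
    return q * q


# Carrier frequencies quoted in the abstract (rounded values).
for label, carriers in [("conservative estimate", 0.024),
                        ("previously reported 1:90", 1 / 90)]:
    freq = two_allele_frequency(carriers)
    print(f"{label}: about 1 in {1 / freq:,.0f}")
```

Under the same assumptions, the previously reported carrier rate of 1:90 corresponds to roughly 1 in 32,400, in line with the typically reported prevalence of about 1:30 000.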
-
Genomic and proteomic dissection of the ubiquitous plant pathogen, Armillaria mellea: toward a new infection model system.
Department of Biology, National University of Ireland Maynooth, Maynooth, Co Kildare, Ireland.
Armillaria mellea is a major plant pathogen. Yet, no large-scale "-omics" data are available to enable new studies, and limited experimental models are available to investigate basidiomycete pathogenicity. Here we reveal that the A. mellea genome comprises 58.35 Mb, contains 14473 gene models, of average length 1575 bp (4.72 introns/gene). Tandem mass spectrometry identified 921 mycelial (n = 629 unique) and secreted (n = 183 unique) proteins. Almost 100 mycelial proteins were either species-specific or previously unidentified at the protein level. A number of proteins (n = 111) was detected in both mycelia and culture supernatant extracts. Signal sequence occurrence was 4-fold greater for secreted (50.2%) compared to mycelial (12%) proteins. Analyses revealed a rich reservoir of carbohydrate degrading enzymes, laccases, and lignin peroxidases in the A. mellea proteome, reminiscent of both basidiomycete and ascomycete glycodegradative arsenals. We discovered that A. mellea exhibits a specific killing effect against Candida albicans during coculture. Proteomic investigation of this interaction revealed the unique expression of defensive and potentially offensive A. mellea proteins (n = 30). Overall, our data reveal new insights into the origin of basidiomycete virulence and we present a new model system for further studies aimed at deciphering fungal pathogenic mechanisms.
Journal of proteome research 2013;12;6;2552-70
PUBMED: 23656496; PMC: 3679558; DOI: 10.1021/pr301131t
-
Small effective population size and genetic homogeneity in the Val Borbera isolate.
Institute of Genetics and Biophysics 'A. Buzzati-Traverso', National Research Council (CNR), Naples, Italy. vincenza.colonna@igb.cnr.it
Population isolates are a valuable resource for medical genetics because of their reduced genetic, phenotypic and environmental heterogeneity. Further, extended linkage disequilibrium (LD) allows accurate haplotyping and imputation. In this study, we use nuclear and mitochondrial DNA data to determine to what extent the geographically isolated population of the Val Borbera valley also presents features of genetic isolation. We performed a comparative analysis of population structure and estimated effective population size exploiting LD data. We also evaluated haplotype sharing through the analysis of segments of autozygosity. Our findings reveal that the valley has features characteristic of a genetic isolate, including reduced genetic heterogeneity and reduced effective population size. We show that this population has been subject to prolonged genetic drift and thus we expect many variants that are rare in the general population to reach significant frequency values in the valley, making this population suitable for the identification of rare variants underlying complex traits.
European journal of human genetics : EJHG 2013;21;1;89-94
PUBMED: 22713810; PMC: 3522197; DOI: 10.1038/ejhg.2012.113
-
Out-of-Africa migration and Neolithic coexpansion of Mycobacterium tuberculosis with modern humans.
[1] Genomics and Health Unit, Centre for Public Health Research (CSISP-FISABIO), Valencia, Spain. [2] CIBER (Centros de Investigación Biomédica en Red) in Epidemiology and Public Health, Barcelona, Spain.
Tuberculosis caused 20% of all human deaths in the Western world between the seventeenth and nineteenth centuries and remains a cause of high mortality in developing countries. In analogy to other crowd diseases, the origin of human tuberculosis has been associated with the Neolithic Demographic Transition, but recent studies point to a much earlier origin. We analyzed the whole genomes of 259 M. tuberculosis complex (MTBC) strains and used this data set to characterize global diversity and to reconstruct the evolutionary history of this pathogen. Coalescent analyses indicate that MTBC emerged about 70,000 years ago, accompanied migrations of anatomically modern humans out of Africa and expanded as a consequence of increases in human population density during the Neolithic period. This long coevolutionary history is consistent with MTBC displaying characteristics indicative of adaptation to both low and high host densities.
Nature genetics 2013;45;10;1176-82
PUBMED: 23995134; DOI: 10.1038/ng.2744
-
Detailed molecular characterisation of acute myeloid leukaemia with a normal karyotype using targeted DNA capture.
[1] Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK [2] EMBL-European Bioinformatics Institute, Cambridge, UK.
Advances in sequencing technologies are giving unprecedented insights into the spectrum of somatic mutations underlying acute myeloid leukaemia with a normal karyotype (AML-NK). It is clear that the prognosis of individual patients is strongly influenced by the combination of mutations in their leukaemia and that many leukaemias are composed of multiple subclones, with differential susceptibilities to treatment. Here, we describe a method, employing targeted capture coupled with next-generation sequencing and tailored bioinformatic analysis, for the simultaneous study of 24 genes recurrently mutated in AML-NK. Mutational analysis was performed using open source software and an in-house script (Mutation Identification and Analysis Software), which identified dominant clone mutations with 100% specificity. In each of seven cases of AML-NK studied, we identified and verified mutations in 2-4 genes in the main leukaemic clone. Additionally, high sequencing depth enabled us to identify putative subclonal mutations and detect leukaemia-specific mutations in DNA from remission marrow. Finally, we used normalised read depths to detect copy number changes and identified and subsequently verified a tandem duplication of exons 2-9 of MLL and at least one deletion involving PTEN. This methodology reliably detects sequence and copy number mutations, and can thus greatly facilitate the classification, clinical research, diagnosis and management of AML-NK.
Leukemia 2013;27;9;1820-5
PUBMED: 23702683; PMC: 3768109; DOI: 10.1038/leu.2013.117
-
Where genotype is not predictive of phenotype: towards an understanding of the molecular basis of reduced penetrance in human inherited disease.
Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK. cooperDN@cardiff.ac.uk
Some individuals with a particular disease-causing mutation or genotype fail to express most if not all features of the disease in question, a phenomenon that is known as 'reduced (or incomplete) penetrance'. Reduced penetrance is not uncommon; indeed, there are many known examples of 'disease-causing mutations' that fail to cause disease in at least a proportion of the individuals who carry them. Reduced penetrance may therefore explain not only why genetic diseases are occasionally transmitted through unaffected parents, but also why healthy individuals can harbour quite large numbers of potentially disadvantageous variants in their genomes without suffering any obvious ill effects. Reduced penetrance can be a function of the specific mutation(s) involved or of allele dosage. It may also result from differential allelic expression, copy number variation or the modulating influence of additional genetic variants in cis or in trans. The penetrance of some pathogenic genotypes is known to be age- and/or sex-dependent. Variable penetrance may also reflect the action of unlinked modifier genes, epigenetic changes or environmental factors. At least in some cases, complete penetrance appears to require the presence of one or more genetic variants at other loci. In this review, we summarize the evidence for reduced penetrance being a widespread phenomenon in human genetics and explore some of the molecular mechanisms that may help to explain this enigmatic characteristic of human inherited disease.
Human genetics 2013
PUBMED: 23820649; DOI: 10.1007/s00439-013-1331-2
-
Novel Mycobacterium tuberculosis complex isolate from a wild chimpanzee.
Swiss Tropical and Public Health Institute, Basel, Switzerland.
Tuberculosis (TB) is caused by gram-positive bacteria known as the Mycobacterium tuberculosis complex (MTBC). MTBC include several human-associated lineages and several variants adapted to domestic and, more rarely, wild animal species. We report an M. tuberculosis strain isolated from a wild chimpanzee in Côte d'Ivoire that was shown by comparative genomic and phylogenomic analyses to belong to a new lineage of MTBC, closer to the human-associated lineage 6 (also known as M. africanum West Africa 2) than to the other classical animal-associated MTBC strains. These results show that the general view of the genetic diversity of MTBC is limited and support the possibility that other MTBC variants exist, particularly in wild mammals in Africa. Exploring this diversity is crucial to the understanding of the biology and evolutionary history of this widespread infectious disease.
Funded by: NIAID NIH HHS: AI090928; PHS HHS: HHSN266200700022C; Wellcome Trust
Emerging infectious diseases 2013;19;6;969-76
PUBMED: 23735084; PMC: 3713819; DOI: 10.3201/eid1906.121012
-
Full-genome deep sequencing and phylogenetic analysis of novel human betacoronavirus.
Wellcome Trust Sanger Institute, Hinxton, UK.
A novel betacoronavirus associated with lethal respiratory and renal complications was recently identified in patients from several countries in the Middle East. We report the deep genome sequencing of the virus directly from a patient's sputum sample. Our high-throughput sequencing yielded a substantial depth of genome sequence assembly and showed the minority viral variants in the specimen. Detailed phylogenetic analysis of the virus genome (England/Qatar/2012) revealed its close relationship to European bat coronaviruses circulating among the bat species of the Vespertilionidae family. Molecular clock analysis showed that the 2 human infections of this betacoronavirus in June 2012 (EMC/2012) and September 2012 (England/Qatar/2012) share a common virus ancestor most likely considerably before early 2012, suggesting the human diversity is the result of multiple zoonotic events.
Funded by: Wellcome Trust
Emerging infectious diseases 2013;19;5;736-42B
PUBMED: 23693015; PMC: 3647518; DOI: 10.3201/eid1905.130057
-
Genome-wide association and longitudinal analyses reveal genetic loci linking pubertal height growth, pubertal timing and childhood adiposity.
A full list of members is provided in the Supplementary Material.
The pubertal height growth spurt is a distinctive feature of childhood growth reflecting both the central onset of puberty and local growth factors. Although little is known about the underlying genetics, growth variability during puberty correlates with adult risks for hormone-dependent cancer and adverse cardiometabolic health. The only gene so far associated with pubertal height growth, LIN28B, pleiotropically influences childhood growth, puberty and cancer progression, pointing to shared underlying mechanisms. To discover genetic loci influencing pubertal height and growth and to place them in context of overall growth and maturation, we performed genome-wide association meta-analyses in 18 737 European samples utilizing longitudinally collected height measurements. We found significant associations (P < 1.67 × 10(-8)) at 10 loci, including LIN28B. Five loci associated with pubertal timing, all impacting multiple aspects of growth. In particular, a novel variant correlated with expression of MAPK3, and associated both with increased prepubertal growth and earlier menarche. Another variant near ADCY3-POMC associated with increased body mass index, reduced pubertal growth and earlier puberty. Whereas epidemiological correlations suggest that early puberty marks a pathway from rapid prepubertal growth to reduced final height and adult obesity, our study shows that individual loci associating with pubertal growth have variable longitudinal growth patterns that may differ from epidemiological observations. Overall, this study uncovers part of the complex genetic architecture linking pubertal height growth, the timing of puberty and childhood obesity and provides new information to pinpoint processes linking these traits.
Human molecular genetics 2013;22;13;2735-47
PUBMED: 23449627; PMC: 3674797; DOI: 10.1093/hmg/ddt104
-
Large scale variation in DNA copy number in chicken breeds.
Animal Breeding and Genomics Centre, Wageningen University, P.O. Box 338, Wageningen 6700 AH, The Netherlands. richard.crooijmans@wur.nl
Background: Detecting genetic variation is a critical step in elucidating the molecular mechanisms underlying phenotypic diversity. Until recently, such detection has mostly focused on single nucleotide polymorphisms (SNPs) because of the ease in screening complete genomes. Another type of variant, copy number variation (CNV), is emerging as a significant contributor to phenotypic variation in many species. Here we describe a genome-wide CNV study using array comparative genomic hybridization (aCGH) in a wide variety of chicken breeds. Results: We identified 3,154 CNVs, grouped into 1,556 CNV regions (CNVRs). Thirty percent of the CNVs were detected in at least 2 individuals. The average size of the CNVs detected was 46.3 kb with the largest CNV, located on GGAZ, being 4.3 Mb. Approximately 75% of the CNVs are copy number losses relative to the Red Jungle Fowl reference genome. The genome coverage of CNVRs in this study is 60 Mb, which represents almost 5.4% of the chicken genome. In particular, large gene families such as the keratin gene family and the MHC show extensive CNV. Conclusions: A relatively large group of the CNVs are line-specific, several of which were previously shown to be related to the causative mutation for a number of phenotypic variants. The chance that inter-specific CNVs fall into CNVRs detected in chicken is related to the evolutionary distance between the species. Our results provide a valuable resource for the study of genetic and phenotypic variation in this phenotypically diverse species.
BMC genomics 2013;14;398
PUBMED: 23763846; PMC: 3751642; DOI: 10.1186/1471-2164-14-398
-
Identification of Null Alleles and Deletions from SNP Genotypes for an Intercross Between Domestic and Wild Chickens.
Wellcome Trust Sanger Institute.
We analyzed genotypes from ~10K SNPs in two families of an F2 intercross between Red Junglefowl and White Leghorn chickens. Possible null alleles were found by patterns of incompatible and missing genotypes. We estimated that 2.6% of SNPs had null alleles compared to 2.3% with genotyping errors and that 40% of SNPs where a parent and offspring were genotyped as different homozygotes had null alleles. Putative deletions were identified by null alleles at adjacent markers. We found two candidate deletions that were supported by fluorescence intensity data from a 60K SNP chip. One of the candidate deletions was from the Red Junglefowl and one was present in both the Red Junglefowl and White Leghorn. Both candidate deletions spanned protein-coding regions and were close to a previously detected QTL affecting body weight in this population. This study demonstrates that the ~50K SNP genotyping arrays now available for several agricultural species can be used to identify null alleles and deletions in data from large families. We suggest that our approach could be a useful complement to linkage analysis in experimental crosses.
G3 (Bethesda, Md.) 2013
PUBMED: 23708300; DOI: 10.1534/g3.113.006643
-
A library of functional recombinant cell surface and secreted Plasmodium falciparum merozoite proteins.
Wellcome Trust Sanger Institute, United Kingdom.
Malaria, an infectious disease caused by parasites of the Plasmodium genus, is one of the world's major public health concerns causing up to a million deaths annually, mostly due to P. falciparum infections. All of the clinical symptoms are associated with the obligatory blood stage of infection, when a form of the parasite called the merozoite recognises and invades host erythrocytes. During erythrocyte invasion, merozoites are directly exposed to the host humoral immune system making the blood stage a conceptually attractive therapeutic target. Progress in the functional and molecular characterisation of P. falciparum merozoite proteins, however, has been hampered by the technical challenges associated with expressing these proteins in a biochemically active recombinant form. This challenge is particularly acute for extracellular proteins, which are the likely targets of host antibody responses, because they contain structurally-critical posttranslational modifications that are not added by some recombinant expression systems. Here, we report the development of a method that uses a mammalian expression system to compile a protein resource containing the entire ectodomains of 42 P. falciparum merozoite secreted and cell surface proteins, many of which have not previously been characterised. Importantly, we are able to recapitulate known biochemical activities by showing that recombinant MSP1-MSP7 and P12-P41 directly interact, and that both recombinant EBA175 and EBA140 can bind human erythrocytes in a sialic acid-dependent manner. Finally, we use sera from malaria-exposed immune adults to profile the relative immunoreactivity of the proteins and show that the majority of the antigens contain conformational (heat-labile) epitopes. We envisage that this resource of recombinant proteins will make a valuable contribution towards a molecular understanding of the blood stage of P. falciparum infections and facilitate the comparative screening of antigens as blood-stage vaccine candidates.
Molecular & cellular proteomics : MCP 2013
PUBMED: 24043421; DOI: 10.1074/mcp.O113.028357
-
Genetic relationship between five psychiatric disorders estimated from genome-wide SNPs.
The University of Queensland, Queensland Brain Institute, Brisbane, Queensland, Australia.
Most psychiatric disorders are moderately to highly heritable. The degree to which genetic variation is unique to individual disorders or shared across disorders is unclear. To examine shared genetic etiology, we use genome-wide genotype data from the Psychiatric Genomics Consortium (PGC) for cases and controls in schizophrenia, bipolar disorder, major depressive disorder, autism spectrum disorders (ASD) and attention-deficit/hyperactivity disorder (ADHD). We apply univariate and bivariate methods for the estimation of genetic variation within and covariation between disorders. SNPs explained 17-29% of the variance in liability. The genetic correlation calculated using common SNPs was high between schizophrenia and bipolar disorder (0.68 ± 0.04 s.e.), moderate between schizophrenia and major depressive disorder (0.43 ± 0.06 s.e.), bipolar disorder and major depressive disorder (0.47 ± 0.06 s.e.), and ADHD and major depressive disorder (0.32 ± 0.07 s.e.), low between schizophrenia and ASD (0.16 ± 0.06 s.e.) and non-significant for other pairs of disorders as well as between psychiatric disorders and the negative control of Crohn's disease. This empirical evidence of shared genetic etiology for psychiatric disorders can inform nosology and encourages the investigation of common pathophysiologies for related disorders.
Nature genetics 2013;45;9;984-94
PUBMED: 23933821; DOI: 10.1038/ng.2711
-
Population genomics of post-vaccine changes in pneumococcal epidemiology.
Center for Communicable Disease Dynamics, Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA.
Whole-genome sequencing of 616 asymptomatically carried Streptococcus pneumoniae isolates was used to study the impact of the 7-valent pneumococcal conjugate vaccine. Comparison of closely related isolates showed the role of transformation in facilitating capsule switching to non-vaccine serotypes and the emergence of drug resistance. However, such recombination was found to occur at significantly different rates across the species, and the evolution of the population was primarily driven by changes in the frequency of distinct genotypes extant before the introduction of the vaccine. These alterations resulted in little overall effect on accessory genome composition at the population level, contrasting with the decrease in pneumococcal disease rates after the vaccine's introduction.
Funded by: NIAID NIH HHS: R01 AI066304, R01AI066304; Wellcome Trust: 098051
Nature genetics 2013;45;6;656-63
PUBMED: 23644493; PMC: 3725542; DOI: 10.1038/ng.2625
-
Bacterial genomes in epidemiology--present and future.
Department of Epidemiology, Center for Communicable Disease Dynamics, Harvard School of Public Health, Boston, MA, USA.
Sequence data are well established in the reconstruction of the phylogenetic and demographic scenarios that have given rise to outbreaks of viral pathogens. The application of similar methods to bacteria has been hindered in the main by the lack of high-resolution nucleotide sequence data from quality samples. Developing and already available genomic methods have greatly increased the amount of data that can be used to characterize an isolate and its relationship to others. However, differences in sequencing platforms and data analysis mean that these enhanced data come with a cost in terms of portability: results from one laboratory may not be directly comparable with those from another. Moreover, genomic data for many bacteria bear the mark of a history including extensive recombination, which has the potential to greatly confound phylogenetic and coalescent analyses. Here, we discuss the exacting requirements of genomic epidemiology, and means by which the distorting signal of recombination can be minimized to permit the leverage of growing datasets of genomic data from bacterial pathogens.
Funded by: NIAID NIH HHS: T32 AI007061; NIGMS NIH HHS: GM088558-01, U54 GM088558; Wellcome Trust: 098051
Philosophical transactions of the Royal Society of London. Series B, Biological sciences 2013;368;1614;20120202
PUBMED: 23382424; PMC: 3678326; DOI: 10.1098/rstb.2012.0202
-
SMIM1 underlies the Vel blood group and influences red blood cell traits.
Department of Haematology, University of Cambridge, Cambridge, UK. as889@cam.ac.uk
The blood group Vel was discovered 60 years ago, but the underlying gene is unknown. Individuals negative for the Vel antigen are rare and are required for the safe transfusion of patients with antibodies to Vel. To identify the responsible gene, we sequenced the exomes of five individuals negative for the Vel antigen and found that four were homozygous and one was heterozygous for a low-frequency 17-nucleotide frameshift deletion in the gene encoding the 78-amino-acid transmembrane protein SMIM1. A follow-up study showing that 59 of 64 Vel-negative individuals were homozygous for the same deletion and expression of the Vel antigen on SMIM1-transfected cells confirm SMIM1 as the gene underlying the Vel blood group. An expression quantitative trait locus (eQTL), the common SNP rs1175550, contributes to variable expression of the Vel antigen (P = 0.003) and influences the mean hemoglobin concentration of red blood cells (RBCs; P = 8.6 × 10(-15)). In vivo, zebrafish with smim1 knockdown showed a mild reduction in the number of RBCs, identifying SMIM1 as a new regulator of RBC formation. Our findings are of immediate relevance, as the homozygous presence of the deletion allows the unequivocal identification of Vel-negative blood donors.
Funded by: British Heart Foundation: RG/09/12/28096; Cancer Research UK: C45041/A14953; Wellcome Trust: 082597/Z/07/Z, 084183/Z/07/Z
Nature genetics 2013;45;5;542-5
PUBMED: 23563608; DOI: 10.1038/ng.2603
-
Horizontally acquired glycosyltransferase operons drive salmonellae lipopolysaccharide diversity.
Centre for Immunology and Infection, Hull York Medical School and the Department of Biology, University of York, York, United Kingdom.
The immunodominant lipopolysaccharide is a key antigenic factor for Gram-negative pathogens such as salmonellae where it plays key roles in host adaptation, virulence, immune evasion, and persistence. Variation in the lipopolysaccharide is also the major differentiating factor that is used to classify Salmonella into over 2600 serovars as part of the Kaufmann-White scheme. While lipopolysaccharide diversity is generally associated with sequence variation in the lipopolysaccharide biosynthesis operon, extraneous genetic factors such as those encoded by the glucosyltransferase (gtr) operons provide further structural heterogeneity by adding additional sugars onto the O-antigen component of the lipopolysaccharide. Here we identify and examine the O-antigen modifying glucosyltransferase genes from the genomes of Salmonella enterica and Salmonella bongori serovars. We show that Salmonella generally carries between 1 and 4 gtr operons that we have classified into 10 families on the basis of gtrC sequence with apparent O-antigen modification detected for five of these families. The gtr operons localize to bacteriophage-associated genomic regions and exhibit a dynamic evolutionary history driven by recombination and gene shuffling events leading to new gene combinations. Furthermore, evidence of Dam- and OxyR-dependent phase variation of gtr gene expression was identified within eight gtr families. Thus, as O-antigen modification generates significant intra- and inter-strain phenotypic diversity, gtr-mediated modification is fundamental in assessing Salmonella strain variability. This will inform appropriate vaccine and diagnostic approaches, in addition to contributing to our understanding of host-pathogen interactions.
PLoS genetics 2013;9;6;e1003568
PUBMED: 23818865; PMC: 3688519; DOI: 10.1371/journal.pgen.1003568
-
Structural and functional annotation of the porcine immunome.
USDA-ARS, Beltsville Human Nutrition Research Center, Diet, Genomics, Immunology Laboratory, Beltsville, MD 20705, USA.
Background: The domestic pig is known as an excellent model for human immunology and the two species share many pathogens. Susceptibility to infectious disease is one of the major constraints on swine performance, yet the structure and function of genes comprising the pig immunome are not well-characterized. The completion of the pig genome provides the opportunity to annotate the pig immunome, and compare and contrast pig and human immune systems.
Results: The Immune Response Annotation Group (IRAG) used computational curation and manual annotation of the swine genome assembly 10.2 (Sscrofa10.2) to refine the currently available automated annotation of 1,369 immunity-related genes through sequence-based comparison to genes in other species. Within these genes, we annotated 3,472 transcripts. Annotation provided evidence for gene expansions in several immune response families, and identified artiodactyl-specific expansions in the cathelicidin and type 1 Interferon families. We found gene duplications for 18 genes, including 13 immune response genes and five non-immune response genes discovered in the annotation process. Manual annotation provided evidence for many new alternative splice variants and 8 gene duplications. Over 1,100 transcripts without porcine sequence evidence were detected using cross-species annotation. We used a functional approach to discover and accurately annotate porcine immune response genes. A co-expression clustering analysis of transcriptomic data from selected experimental infections or immune stimulations of blood, macrophages or lymph nodes identified a large cluster of genes that exhibited a correlated positive response upon infection across multiple pathogens or immune stimuli. Interestingly, this gene cluster (cluster 4) is enriched for known general human immune response genes, yet contains many un-annotated porcine genes. A phylogenetic analysis of the encoded proteins of cluster 4 genes showed that 15% exhibited an accelerated evolution as compared to 4.1% across the entire genome.
Conclusions: This extensive annotation dramatically extends the genome-based knowledge of the molecular genetics and structure of a major portion of the porcine immunome. Our complementary functional approach using co-expression during immune response has provided new putative immune response annotation for over 500 porcine genes. Our phylogenetic analysis of this core immunome cluster confirms rapid evolutionary change in this set of genes, and that, as in other species, such genes are important components of the pig's adaptation to pathogen challenge over evolutionary time. These comprehensive and integrated analyses increase the value of the porcine genome sequence and provide important tools for global analyses and data-mining of the porcine immune response.
Funded by: Biotechnology and Biological Sciences Research Council: BB/E010520/1, BB/E010520/2, BB/G004013/1, BB/I025328/1, EC FP6; NCRR NIH HHS: P20-RR017686; NIAID NIH HHS: T32 AI83196; Wellcome Trust: 098051
BMC genomics 2013;14;332
PUBMED: 23676093; PMC: 3658956; DOI: 10.1186/1471-2164-14-332
-
Prelamin A causes progeria through cell-extrinsic mechanisms and prevents cancer invasion.
Instituto de Medicina Oncológica y Molecular de Asturias (IMOMA), 33193 Oviedo, Spain.
Defining the relationship between ageing and cancer is a crucial but challenging task. Mice deficient in Zmpste24, a metalloproteinase mutated in human progeria and involved in nuclear prelamin A maturation, recapitulate multiple features of ageing. However, their short lifespan and serious cell-intrinsic and cell-extrinsic alterations restrict the application and interpretation of carcinogenesis protocols. Here we present Zmpste24 mosaic mice that lack these limitations. Zmpste24 mosaic mice develop normally and keep similar proportions of Zmpste24-deficient (prelamin A-accumulating) and Zmpste24-proficient (mature lamin A-containing) cells throughout life, revealing that cell-extrinsic mechanisms are preeminent for progeria development. Moreover, prelamin A accumulation does not impair tumour initiation and growth, but it decreases the incidence of infiltrating oral carcinomas. Accordingly, silencing of ZMPSTE24 reduces human cancer cell invasiveness. Our results support the potential of cell-based and systemic therapies for progeria and highlight ZMPSTE24 as a new anticancer target.
Nature communications 2013;4;2268
PUBMED: 23917225; PMC: 3758871; DOI: 10.1038/ncomms3268
-
Mutational genomics for cancer pathway discovery
Lecture Notes in Computer Science 2013;7986;35-46
DOI: 10.1007/978-3-642-39159-0_4; URL: http://link.springer.com/chapter.../10.1007/978-3-642-39159-0_4
-
Identification of heart rate-associated loci and their effects on cardiac conduction and rhythm disorders.
[1] Medical Research Council (MRC) Epidemiology Unit, Institute of Metabolic Science, Addenbrooke's Hospital, Cambridge, UK. [2] Department of Medical Sciences, Molecular Epidemiology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
Elevated resting heart rate is associated with greater risk of cardiovascular disease and mortality. In a 2-stage meta-analysis of genome-wide association studies in up to 181,171 individuals, we identified 14 new loci associated with heart rate and confirmed associations with all 7 previously established loci. Experimental downregulation of gene expression in Drosophila melanogaster and Danio rerio identified 20 genes at 11 loci that are relevant for heart rate regulation and highlight a role for genes involved in signal transmission, embryonic cardiac development and the pathophysiology of dilated cardiomyopathy, congenital heart failure and/or sudden cardiac death. In addition, genetic susceptibility to increased heart rate is associated with altered cardiac conduction and reduced risk of sick sinus syndrome, and both heart rate-increasing and heart rate-decreasing variants associate with risk of atrial fibrillation. Our findings provide fresh insights into the mechanisms regulating heart rate and identify new therapeutic targets.
Funded by: NHLBI NIH HHS: R00 HL094535, R01 HL090620, R01 HL111314
Nature genetics 2013;45;6;621-31
PUBMED: 23583979; DOI: 10.1038/ng.2610
-
Multi-allelic Phenotyping - A systematic approach for the simultaneous analysis of multiple induced mutations.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
The zebrafish mutation project (ZMP) aims to generate a loss of function allele for every protein-coding gene, but importantly to also characterise the phenotypes of these alleles during the first five days of development. Such a large-scale screen requires a systematic approach both to identifying phenotypes, and also to linking those phenotypes to specific mutations. This phenotyping pipeline simultaneously assesses the consequences of multiple alleles in a two-step process. First, mutations that do not produce a visible phenotype during the first five days of development are identified, while a second round of phenotyping focuses on detailed analysis of those alleles that are suspected to cause a phenotype. Allele-specific PCR single nucleotide polymorphism (SNP) assays are used to genotype F2 parents and individual F3 fry for mutations known to be present in the F1 founder. With this method specific phenotypes can be linked to induced mutations. In addition a method is described for cryopreserving sperm samples of mutagenised males and their subsequent use for in vitro fertilisation to generate F2 families for phenotyping. Ultimately this approach will lead to the functional annotation of the zebrafish genome, which will deepen our understanding of gene function in development and disease.
Methods (San Diego, Calif.) 2013
PUBMED: 23624102; DOI: 10.1016/j.ymeth.2013.04.013
-
Back to the future!
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Nature reviews. Microbiology 2013;11;9;600
PUBMED: 23949601; DOI: 10.1038/nrmicro3099
-
Histone deacetylase 1 and 2 are essential for normal T-cell development and genomic stability in mice.
Department of Biochemistry, University of Leicester, Leicester, UK.
Histone deacetylase 1 and 2 (HDAC1/2) regulate chromatin structure as the catalytic core of the Sin3A, NuRD and CoREST co-repressor complexes. To better understand the key pathways regulated by HDAC1/2 in the adaptive immune system and inform their exploitation as drug targets, we have generated mice with a T-cell specific deletion. Loss of either HDAC1 or HDAC2 alone has little effect, while dual inactivation results in a 5-fold reduction in thymocyte cellularity, accompanied by developmental arrest at the double-negative to double-positive transition. Transcriptome analysis revealed 892 misregulated genes in Hdac1/2 knock-out thymocytes, including down-regulation of LAT, Themis and Itk, key components of the T-cell receptor (TCR) signaling pathway. Down-regulation of these genes suggests a model in which HDAC1/2 deficiency results in defective propagation of TCR signaling, thus blocking development. Furthermore, mice with reduced HDAC1/2 activity (Hdac1 deleted and a single Hdac2 allele) develop a lethal pathology by 3 months of age, caused by neoplastic transformation of immature T cells in the thymus. Tumor cells become aneuploid, express increased levels of c-Myc and show elevated levels of the DNA damage marker, γH2AX. These data demonstrate a crucial role for HDAC1/2 in T-cell development and the maintenance of genomic stability.
Funded by: Medical Research Council: G0600135
Blood 2013;121;8;1335-44
PUBMED: 23287868; DOI: 10.1182/blood-2012-07-441949
-
The presence of methylation quantitative trait Loci indicates a direct genetic influence on the level of DNA methylation in adipose tissue.
Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom.
Genetic variants that associate with DNA methylation at CpG sites (methylation quantitative trait loci, meQTLs) offer a potential biological mechanism of action for disease associated SNPs. We investigated whether meQTLs exist in abdominal subcutaneous adipose tissue (SAT) and if CpG methylation associates with metabolic syndrome (MetSyn) phenotypes. We profiled 27,718 genomic regions in abdominal SAT samples of 38 unrelated individuals using differential methylation hybridization (DMH) together with genotypes at 5,227,243 SNPs and expression of 17,209 mRNA transcripts. Validation and replication of significant meQTLs was pursued in an independent cohort of 181 female twins. We find that, at 5% false discovery rate, methylation levels of 149 DMH regions associate with at least one SNP in a ±500 kilobase cis-region in our primary study. We sought to validate 19 of these in the replication study and find that five of these significantly associate with the corresponding meQTL SNPs from the primary study. We find that none of the 149 meQTL top SNPs is a significant expression quantitative trait locus in our expression data, but we observed association between expression levels of two mRNA transcripts and cis-methylation status. Our results indicate that DNA CpG methylation in abdominal SAT is partly under genetic control. This study provides a starting point for future investigations of DNA methylation in adipose tissue.
PloS one 2013;8;2;e55923
PUBMED: 23431366; PMC: 3576415; DOI: 10.1371/journal.pone.0055923
-
Sequencing and Functional Annotation of Avian Pathogenic Escherichia coli Serogroup O78 Strains Reveal the Evolution of E. coli Lineages Pathogenic for Poultry via Distinct Mechanisms.
Enteric Bacterial Pathogens Laboratory, Institute for Animal Health, Compton, Berkshire, United Kingdom.
Avian pathogenic Escherichia coli (APEC) causes respiratory and systemic disease in poultry. Sequencing of a multilocus sequence type 95 (ST95) serogroup O1 strain previously indicated that APEC resembles E. coli causing extraintestinal human diseases. We sequenced the genomes of two strains of another dominant APEC lineage (ST23 serogroup O78 strains χ7122 and IMT2125) and compared them to each other and to the reannotated APEC O1 sequence. For comparison, we also sequenced a human enterotoxigenic E. coli (ETEC) strain of the same ST23 serogroup O78 lineage. Phylogenetic analysis indicated that the APEC O78 strains were more closely related to human ST23 ETEC than to APEC O1, indicating that separation of pathotypes on the basis of their extraintestinal or diarrheagenic nature is not supported by their phylogeny. The accessory genome of APEC ST23 strains exhibited limited conservation of APEC O1 genomic islands and a distinct repertoire of virulence-associated loci. In light of this diversity, we surveyed the phenotype of 2,185 signature-tagged transposon mutants of χ7122 following intra-air sac inoculation of turkeys. This procedure identified novel APEC ST23 genes that play strain- and tissue-specific roles during infection. For example, genes mediating group 4 capsule synthesis were required for the virulence of χ7122 and were conserved in IMT2125 but absent from APEC O1. Our data reveal the genetic diversity of E. coli strains adapted to cause the same avian disease and indicate that the core genome of the ST23 lineage serves as a chassis for the evolution of E. coli strains adapted to cause avian or human disease via acquisition of distinct virulence genes.
Infection and immunity 2013;81;3;838-49
PUBMED: 23275093; PMC: 3584874; DOI: 10.1128/IAI.00585-12
-
The SHOCT Domain: A Widespread Domain Under-Represented in Model Organisms.
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom ; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom.
We have identified a new protein domain, which we have named the SHOCT domain (Short C-terminal domain). This domain is widespread in bacteria, with over a thousand examples, but we found that it is missing from the most commonly studied model organisms, despite being present in closely related species. Its predominantly C-terminal location, co-occurrence with numerous other domains and short size are reminiscent of the Gram-positive anchor motif; however, it is present in a much wider range of species. We suggest several hypotheses about the function of SHOCT, including oligomerisation and nucleic acid binding. Our initial experiments do not support its role as an oligomerisation domain.
PloS one 2013;8;2;e57848
PUBMED: 23451277; PMC: 3581485; DOI: 10.1371/journal.pone.0057848
-
The 5q31 region in two African populations as a facet of natural selection by infectious diseases.
Unit of Disease and Diversity, Department of Molecular Biology, Institute of Endemic Diseases, University of Khartoum, Khartoum, Sudan. abjil04@yahoo.com
Cases of extreme natural selection could lead either to rapid fixation or extinction of alleles depending on the population structure and size. Such selection may also manifest as an excess of heterozygosity, with the locus concerned displaying such drastic features of allele change. We suspect the 5q31 region of chromosome 5 to mirror a situation of such extreme natural selection, particularly as the region encompasses genes of type 2 cytokines known to associate with a number of infectious and non-infectious diseases. We typed two sets of single nucleotide polymorphisms (SNPs) in two populations: an initial limited set of only 4 SNPs within the genes of IL-4, IL-13, IL-5 and IL-9 in 108 unrelated individuals and a replicating set of 14 SNPs in 924 individuals from the same populations with disregard to relatedness. The results suggest the 5q31 area to be under intense selective pressure as indicated by marked heterozygosity independent of Linkage Disequilibrium (LD); difference in heterozygosity, allele, and haplotype frequencies between generations; and departure from Hardy-Weinberg expectations (DHWE). The study area is endemic for several infectious diseases including malaria and visceral leishmaniasis (VL). Malaria caused by Plasmodium falciparum, however, occurs mostly with mild clinical symptoms in all ages, which makes it unlikely to account for these indices. The strong selection signals seem to emanate from recent outbreaks of VL which affected both populations to varying extents.
Genetika 2013;49;2;279-88
PUBMED: 23668094
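Note: the abstract above reports departure from Hardy-Weinberg expectations (DHWE) without naming the test used. As a minimal sketch only, one common way to quantify such a departure for a biallelic SNP is a chi-square statistic on genotype counts. The function name and the genotype counts below are hypothetical and are not taken from the study.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for departure from Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb are observed genotype counts at a biallelic locus.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    # Skip genotype classes with zero expectation (monomorphic locus)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical counts: one sample in equilibrium, one with excess heterozygotes
chi2_equilibrium = hwe_chi_square(25, 50, 25)
chi2_excess_het = hwe_chi_square(10, 80, 10)
print(chi2_equilibrium, chi2_excess_het)
```

A large statistic (compared against a chi-square distribution with one degree of freedom) indicates departure; in the second hypothetical sample the departure is driven by an excess of heterozygotes.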
-
Evaluation of the genetic overlap between osteoarthritis with body mass index and height using genome-wide association scan data.
Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK.
Objectives: Obesity as measured by body mass index (BMI) is one of the major risk factors for osteoarthritis. In addition, genetic overlap has been reported between osteoarthritis and normal adult height variation. We investigated whether this relationship is due to a shared genetic aetiology on a genome-wide scale. Methods: We compared genetic association summary statistics (effect size, p value) for BMI and height from the GIANT consortium genome-wide association study (GWAS) with genetic association summary statistics from the arcOGEN consortium osteoarthritis GWAS. Significance was evaluated by permutation. Replication of osteoarthritis association of the highlighted signals was investigated in an independent dataset. Phenotypic information of height and BMI was accounted for in a separate analysis using osteoarthritis-free controls. Results: We found significant overlap between osteoarthritis and height (p=3.3×10(-5) for signals with p≤0.05) when the GIANT and arcOGEN GWAS were compared. For signals with p≤0.001 we found 17 shared signals between osteoarthritis and height and four between osteoarthritis and BMI. However, only one of the height or BMI signals that had shown evidence of association with osteoarthritis in the arcOGEN GWAS was also associated with osteoarthritis in the independent dataset: rs12149832, within the FTO gene (combined p=2.3×10(-5)). As expected, this signal was attenuated when we adjusted for BMI. Conclusions: We found a significant excess of shared signals between both osteoarthritis and height and osteoarthritis and BMI, suggestive of a common genetic aetiology. However, only one signal showed association with osteoarthritis when followed up in a new dataset.
Annals of the rheumatic diseases 2013;72;6;935-41
PUBMED: 22956599; PMC: 3664369; DOI: 10.1136/annrheumdis-2012-202081
-
A meta-analysis of genome-wide association studies identifies novel variants associated with osteoarthritis of the hip.
Department of Hygiene and Epidemiology, University of Ioannina Medical School, Ioannina, Greece.
Objectives: Osteoarthritis (OA) is the most common form of arthritis with a clear genetic component. To identify novel loci associated with hip OA we performed a meta-analysis of genome-wide association studies (GWAS) on European subjects.
Methods: We performed a two-stage meta-analysis on more than 78 000 participants. In stage 1, we synthesised data from eight GWAS whereas data from 10 centres were used for 'in silico' or 'de novo' replication. Besides the main analysis, a sex-stratified analysis was performed to detect possible sex-specific signals. Meta-analysis was performed using inverse-variance fixed effects models. A random effects approach was also used.
Results: We accumulated 11 277 cases of radiographic and symptomatic hip OA. We prioritised eight single nucleotide polymorphisms (SNPs) for follow-up in the discovery stage (4349 OA cases); five from the combined analysis, two male specific and one female specific. One locus, at 20q13, represented by rs6094710 (minor allele frequency (MAF) 4%) near the NCOA3 (nuclear receptor coactivator 3) gene, reached genome-wide significance level with p=7.9×10(-9) and OR=1.28 (95% CI 1.18 to 1.39) in the combined analysis of discovery (p=5.6×10(-8)) and follow-up studies (p=7.3×10(-4)). We showed that this gene is expressed in articular cartilage and its expression was significantly reduced in OA-affected cartilage. Moreover, two loci remained suggestively associated: rs5009270 at 7q31 (MAF 30%, p=9.9×10(-7), OR=1.10) and rs3757837 at 7p13 (MAF 6%, p=2.2×10(-6), OR=1.27 in male specific analysis).
Conclusions: Novel genetic loci for hip OA were found in this meta-analysis of GWAS.
Annals of the rheumatic diseases 2013
PUBMED: 23989986; DOI: 10.1136/annrheumdis-2012-203114
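Note: the inverse-variance fixed effects model named in the Methods above pools per-study estimates by weighting each with the reciprocal of its variance. The following is a minimal sketch of that standard calculation, not the authors' pipeline. The function name and the per-study log odds ratios and standard errors are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Pool per-study effect estimates using inverse-variance weights.

    betas: per-study effects (e.g. log odds ratios)
    ses:   their standard errors
    Returns the pooled effect and its standard error.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


# Hypothetical log odds ratios and standard errors from three studies
betas = [0.20, 0.30, 0.25]
ses = [0.10, 0.10, 0.10]
pooled_beta, pooled_se = fixed_effect_meta(betas, ses)
z = pooled_beta / pooled_se
print(f"pooled OR = {math.exp(pooled_beta):.2f}, z = {z:.2f}")
```

Studies with smaller standard errors dominate the pooled estimate. A random effects approach, also mentioned in the abstract, would add a between-study variance term to each weight.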
-
The DOT1L rs12982744 polymorphism is associated with osteoarthritis of the hip with genome-wide statistical significance in males.
Department of Hygiene and Epidemiology, University of Ioannina Medical School, University Campus, Ioannina, Greece.
Annals of the rheumatic diseases 2013
PUBMED: 23505243; DOI: 10.1136/annrheumdis-2012-203182
-
The Role of Adiposity in Cardiometabolic Traits: A Mendelian Randomization Analysis.
Molecular Epidemiology and Science for Life Laboratory, Department of Medical Sciences, Uppsala University, Uppsala, Sweden ; Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.
Background: The association between adiposity and cardiometabolic traits is well known from epidemiological studies. Whilst the causal relationship is clear for some of these traits, for others it is not. We aimed to determine whether adiposity is causally related to various cardiometabolic traits using the Mendelian randomization approach. We used the adiposity-associated variant rs9939609 at the FTO locus as an instrumental variable (IV) for body mass index (BMI) in a Mendelian randomization design. Thirty-six population-based studies of individuals of European descent contributed to the analyses. Age- and sex-adjusted regression models were fitted to test for association between (i) rs9939609 and BMI (n = 198,502), (ii) rs9939609 and 24 traits, and (iii) BMI and 24 traits. The causal effect of BMI on the outcome measures was quantified by IV estimators. The estimators were compared to the BMI-trait associations derived from the same individuals. In the IV analysis, we demonstrated novel evidence for a causal relationship between adiposity and incident heart failure (hazard ratio, 1.19 per BMI-unit increase; 95% CI, 1.03-1.39) and replicated earlier reports of a causal association with type 2 diabetes, metabolic syndrome, dyslipidemia, and hypertension (odds ratio for IV estimator, 1.1-1.4; all p<0.05). For quantitative traits, our results provide novel evidence for a causal effect of adiposity on the liver enzymes alanine aminotransferase and gamma-glutamyl transferase and confirm previous reports of a causal effect of adiposity on systolic and diastolic blood pressure, fasting insulin, 2-h post-load glucose from the oral glucose tolerance test, C-reactive protein, triglycerides, and high-density lipoprotein cholesterol levels (all p<0.05). The estimated causal effects were in agreement with traditional observational measures in all instances except for type 2 diabetes, where the causal estimate was larger than the observational estimate (p = 0.001). 
Conclusions: We provide novel evidence for a causal relationship between adiposity and heart failure as well as between adiposity and increased liver enzymes.
PLoS medicine 2013;10;6;e1001474
PUBMED: 23824655; DOI: 10.1371/journal.pmed.1001474
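Note: the abstract above quantifies the causal effect of BMI with instrumental variable (IV) estimators but does not say which estimator was used. As a minimal sketch under that caveat, the simplest single-variant form is the Wald ratio: the variant-outcome association divided by the variant-exposure association. The function name and all numbers below are hypothetical and are not results from the study.

```python
def wald_ratio(beta_zx, beta_zy, se_zy):
    """Wald ratio IV estimate of the effect of exposure X on outcome Y.

    beta_zx: per-allele effect of the variant on the exposure (e.g. BMI)
    beta_zy: per-allele effect of the variant on the outcome
    se_zy:   standard error of beta_zy
    The standard error is a first-order approximation that ignores
    uncertainty in beta_zx.
    """
    if beta_zx == 0:
        raise ValueError("instrument has no effect on the exposure")
    estimate = beta_zy / beta_zx
    se = abs(se_zy / beta_zx)
    return estimate, se


# Hypothetical per-allele effects: 0.4 BMI units on the exposure and
# 0.02 units (SE 0.01) on a quantitative outcome
iv_estimate, iv_se = wald_ratio(0.4, 0.02, 0.01)
print(f"IV estimate per BMI unit: {iv_estimate:.3f} (SE {iv_se:.3f})")
```

The IV estimate can then be set against the directly observed BMI-trait association, which is the comparison the abstract describes.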
-
ImmunoChip Study Implicates Antigen Presentation to T Cells in Narcolepsy.
Center for Sleep Sciences and Medicine, Stanford University, Palo Alto, California, United States of America.
Recent advances in the identification of susceptibility genes and environmental exposures provide broad support for a post-infectious autoimmune basis for narcolepsy/hypocretin (orexin) deficiency. We genotyped loci associated with other autoimmune and inflammatory diseases in 1,886 individuals with hypocretin-deficient narcolepsy and 10,421 controls, all of European ancestry, using a custom genotyping array (ImmunoChip). Three loci located outside the Human Leukocyte Antigen (HLA) region on chromosome 6 were significantly associated with disease risk. In addition to a strong signal in the T cell receptor alpha (TRA@), variants in two additional narcolepsy loci, Cathepsin H (CTSH) and Tumor necrosis factor (ligand) superfamily member 4 (TNFSF4, also called OX40L), attained genome-wide significance. These findings underline the importance of antigen presentation by HLA Class II to T cells in the pathophysiology of this autoimmune disease.
PLoS genetics 2013;9;2;e1003270
PUBMED: 23459209; PMC: 3573113; DOI: 10.1371/journal.pgen.1003270
-
Reprogramming by cell fusion: boosted by tets.
Epigenetics Programme, The Babraham Institute, Cambridge CB22 3AT, UK.
Pluripotent cells, when fused with somatic cells, have the dominant ability to reprogram the somatic genome. Work by Piccolo et al. (2013) shows that the Tet1 and Tet2 hydroxylases are important for DNA methylation reprogramming of pluripotency genes and parental imprints.
Molecular cell 2013;49;6;1017-8
PUBMED: 23541036; DOI: 10.1016/j.molcel.2013.03.014
-
FGF Signaling Inhibition in ESCs Drives Rapid Genome-wide Demethylation to the Epigenetic Ground State of Pluripotency.
Epigenetics Programme, The Babraham Institute, Cambridge, CB22 3AT, UK. Electronic address: gabriella.ficz@babraham.ac.uk.
Genome-wide erasure of DNA methylation takes place in primordial germ cells (PGCs) and early embryos and is linked with pluripotency. Inhibition of Erk1/2 and Gsk3β signaling in mouse embryonic stem cells (ESCs) by small-molecule inhibitors (called 2i) has recently been shown to induce hypomethylation. We show by whole-genome bisulphite sequencing that 2i induces rapid and genome-wide demethylation on a scale and pattern similar to that in migratory PGCs and early embryos. Major satellites, intracisternal A particles (IAPs), and imprinted genes remain relatively resistant to erasure. Demethylation involves oxidation of 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC), impaired maintenance of 5mC and 5hmC, and repression of the de novo methyltransferases (Dnmt3a and Dnmt3b) and Dnmt3L. We identify a Prdm14- and Nanog-binding cis-acting regulatory region in Dnmt3b that is highly responsive to signaling. These insights provide a framework for understanding how signaling pathways regulate reprogramming to an epigenetic ground state of pluripotency.
Cell stem cell 2013
PUBMED: 23850245; DOI: 10.1016/j.stem.2013.06.004
-
Global Analysis of the Sporulation Pathway of Clostridium difficile.
Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont, United States of America ; Program in Cellular, Molecular & Biomedical Sciences, University of Vermont, Burlington, Vermont, United States of America.
The Gram-positive, spore-forming pathogen Clostridium difficile is the leading definable cause of healthcare-associated diarrhea worldwide. C. difficile infections are difficult to treat because of their frequent recurrence, which can cause life-threatening complications such as pseudomembranous colitis. The spores of C. difficile are responsible for these high rates of recurrence, since they are the major transmissive form of the organism and resistant to antibiotics and many disinfectants. Despite the importance of spores to the pathogenesis of C. difficile, little is known about their composition or formation. Based on studies in Bacillus subtilis and other Clostridium spp., the sigma factors σ(F), σ(E), σ(G), and σ(K) are predicted to control the transcription of genes required for sporulation, although their specific functions vary depending on the organism. In order to determine the roles of σ(F), σ(E), σ(G), and σ(K) in regulating C. difficile sporulation, we generated loss-of-function mutations in genes encoding these sporulation sigma factors and performed RNA-Sequencing to identify specific sigma factor-dependent genes. This analysis identified 224 genes whose expression was collectively activated by sporulation sigma factors: 183 were σ(F)-dependent, 169 were σ(E)-dependent, 34 were σ(G)-dependent, and 31 were σ(K)-dependent. In contrast with B. subtilis, C. difficile σ(E) was dispensable for σ(G) activation, σ(G) was dispensable for σ(K) activation, and σ(F) was required for post-translationally activating σ(G). Collectively, these results provide the first genome-wide transcriptional analysis of genes induced by specific sporulation sigma factors in the Clostridia and highlight that diverse mechanisms regulate sporulation sigma factor activity in the Firmicutes.
PLoS genetics 2013;9;8;e1003660
PUBMED: 23950727; PMC: 3738446; DOI: 10.1371/journal.pgen.1003660
-
EMu: probabilistic inference of mutational processes and their localization in the cancer genome.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, CB10 1SA, Hinxton, Cambridge, UK. vm5@sanger.ac.uk.
The spectrum of mutations discovered in cancer genomes can be explained by the activity of a few elementary mutational processes. We present a novel probabilistic method, EMu, to infer the mutational signatures of these processes from a collection of sequenced tumors. EMu naturally incorporates the tumor-specific opportunity for different mutation types according to sequence composition. Applying EMu to breast cancer data, we derive detailed maps of the activity of each process, both genome-wide and within specific local regions of the genome. Our work provides new opportunities to study the mutational processes underlying cancer development. EMu is available at http://www.sanger.ac.uk/resources/software/emu/.
Genome biology 2013;14;4;R39
PUBMED: 23628380; DOI: 10.1186/gb-2013-14-4-r39
-
Ensembl 2013.
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton Cambridge CB10 1SD, UK. flicek@ebi.ac.uk
The Ensembl project (http://www.ensembl.org) provides genome information for sequenced chordate genomes with a particular focus on human, mouse, zebrafish and rat. Our resources include evidence-based gene sets for all supported species; large-scale whole genome multiple species alignments across vertebrates and clade-specific alignments for eutherian mammals, primates, birds and fish; variation data resources for 17 species and regulation annotations based on ENCODE and other data sets. Ensembl data are accessible through the genome browser at http://www.ensembl.org and through other tools and programmatic interfaces.
Funded by: Biotechnology and Biological Sciences Research Council: BB/I025506/1; NHGRI NIH HHS: U01HG004695, U41HG006104, U54HG004563; Wellcome Trust: WT062023, WT079643
Nucleic acids research 2013;41;Database issue;D48-55
PUBMED: 23203987; PMC: 3531136; DOI: 10.1093/nar/gks1236
-
Spindle checkpoint deficiency is tolerated by murine epidermal cells but not hair follicle stem cells.
Mouse Genomics, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, United Kingdom.
The spindle assembly checkpoint (SAC) ensures correct chromosome segregation during mitosis by preventing aneuploidy, an event that is detrimental to the fitness and survival of normal cells but oncogenic in tumor cells. Deletion of SAC genes is incompatible with early mouse development, and RNAi-mediated depletion of SAC components in cultured cells results in rapid death. Here we describe the use of a conditional KO of mouse Mad2, an essential component of the SAC signaling cascade, as a means to selectively induce chromosome instability and aneuploidy in the epidermis of the skin. We observe that SAC inactivation is tolerated by interfollicular epidermal cells but results in depletion of hair follicle bulge stem cells. Eventually, a histologically normal epidermis develops within ∼1 mo after birth, albeit without any hair. Mad2-deficient cells in this epidermis exhibited abnormal transcription of metabolic genes, consistent with aneuploid cell state. Hair follicle bulge stem cells were completely absent, despite the continued presence of rudimentary hair follicles. These data demonstrate that different cell lineages within a single tissue respond differently to chromosome instability: some proliferating cell lineages can survive, but stem cells are highly sensitive.
Proceedings of the National Academy of Sciences of the United States of America 2013
PUBMED: 23382243; DOI: 10.1073/pnas.1217388110
-
Genome Sequence of Klebsiella pneumoniae Ecl8, a Reference Strain for Targeted Genetic Manipulation.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.
We report the genome sequence of Klebsiella pneumoniae subsp. pneumoniae Ecl8, a spontaneous streptomycin-resistant mutant of strain ECL4, derived from NCIB 418. K. pneumoniae Ecl8 has been shown to be genetically tractable for targeted gene deletion strategies and so provides a platform for in-depth analyses of this species.
Genome announcements 2013;1;1
PUBMED: 23405357; DOI: 10.1128/genomeA.00027-12
-
Global Analysis of Apicomplexan Protein S-Acyl Transferases Reveals an Enzyme Essential for Invasion.
Department of Microbiology and Molecular Medicine, CMU, University of Geneva, Rue Michel-Servet 1, CH-1211, Geneva 4, Switzerland.
The advent of techniques to study palmitoylation on a whole proteome scale has revealed that it is an important reversible modification that plays a role in regulating multiple biological processes. Palmitoylation can control the affinity of a protein for lipid membranes, which allows it to impact protein trafficking, stability, folding, signalling and interactions. The publication of the palmitome of the schizont stage of Plasmodium falciparum implicated a role for palmitoylation in host cell invasion, protein export and organelle biogenesis. However, nothing is known so far about the repertoire of protein S-acyl transferases (PATs) that catalyse this modification in Apicomplexa. We undertook a comprehensive analysis of the repertoire of Asp-His-His-Cys cysteine-rich domain (DHHC-CRD) PAT family in Toxoplasma gondii and Plasmodium berghei by assessing their localization and essentiality. Unlike functional redundancies reported in other eukaryotes, some apicomplexan-specific DHHCs are essential for parasite growth, and several are targeted to organelles unique to this phylum. Of particular interest is DHHC7, which localizes to rhoptry organelles in all parasites tested, including the major human pathogen P. falciparum. TgDHHC7 interferes with the localization of the rhoptry palmitoylated protein TgARO and affects the apical positioning of the rhoptry organelles. This PAT has a major impact on T. gondii host cell invasion, but not on the parasite's ability to egress.
Traffic (Copenhagen, Denmark) 2013
PUBMED: 23638681; DOI: 10.1111/tra.12081
-
Clonal Expansion Analysis of Transposon Insertions by High-Throughput Sequencing Identifies Candidate Cancer Genes in a PiggyBac Mutagenesis Screen.
Department of Neuroscience, Department of Developmental and Regenerative Biology, Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America.
Somatic transposon mutagenesis in mice is an efficient strategy to investigate the genetic mechanisms of tumorigenesis. The identification of tumor driving transposon insertions traditionally requires the generation of large tumor cohorts to obtain information about common insertion sites. Tumor driving insertions are also characterized by their clonal expansion in tumor tissue, a phenomenon that is facilitated by the slow and evolving transformation process of transposon mutagenesis. We describe here an improved approach for the detection of tumor driving insertions that assesses the clonal expansion of insertions by quantifying the relative proportion of sequence reads obtained in individual tumors. To this end, we have developed a protocol for insertion site sequencing that utilizes acoustic shearing of tumor DNA and Illumina sequencing. We analyzed various solid tumors generated by PiggyBac mutagenesis and for each tumor >10(6) reads corresponding to >10(4) insertion sites were obtained. In each tumor, 9 to 25 insertions stood out by their enriched sequence read frequencies when compared to frequencies obtained from tail DNA controls. These enriched insertions are potential clonally expanded tumor driving insertions, and thus identify candidate cancer genes. The candidate cancer genes of our study comprised many established cancer genes, but also novel candidate genes such as Mastermind-like1 (Mamld1) and Diacylglycerolkinase delta (Dgkd). We show that clonal expansion analysis by high-throughput sequencing is a robust approach for the identification of candidate cancer genes in insertional mutagenesis screens on the level of individual tumors.
PloS one 2013;8;8;e72338
PUBMED: 23940809; DOI: 10.1371/journal.pone.0072338
-
A CpG Mutational Hotspot in a ONECUT Binding Site Accounts for the Prevalent Variant of Hemophilia B Leyden.
School of Biotechnology and Biomolecular Sciences, University of New South Wales, Kensington NSW 2052, Australia.
Hemophilia B, or the "royal disease," arises from mutations in coagulation factor IX (F9). Mutations within the F9 promoter are associated with a remarkable hemophilia B subtype, termed hemophilia B Leyden, in which symptoms ameliorate after puberty. Mutations at the -5/-6 site (nucleotides -5 and -6 relative to the transcription start site, designated +1) account for the majority of Leyden cases and have been postulated to disrupt the binding of a transcriptional activator, the identity of which has remained elusive for more than 20 years. Here, we show that ONECUT transcription factors (ONECUT1 and ONECUT2) bind to the -5/-6 site. The various hemophilia B Leyden mutations that have been reported in this site inhibit ONECUT binding to varying degrees, which correlate well with their associated clinical severities. In addition, expression of F9 is crucially dependent on ONECUT factors in vivo, and as such, mice deficient in ONECUT1, ONECUT2, or both exhibit depleted levels of F9. Taken together, our findings establish ONECUT transcription factors as the missing hemophilia B Leyden regulators that operate through the -5/-6 site.
American journal of human genetics 2013;92;3;460-7
PUBMED: 23472758; PMC: 3591849; DOI: 10.1016/j.ajhg.2013.02.003
-
Global properties and functional complexity of human gene regulatory variation.
Wellcome Trust Sanger Institute, Cambridge, United Kingdom. dg13@sanger.ac.uk
Identification and functional interpretation of gene regulatory variants is a major focus of modern genomics. The application of genetic mapping to molecular and cellular traits has enabled the detection of regulatory variation on genome-wide scales and revealed an enormous diversity of regulatory architecture in humans and other species. In this review I summarise the insights gained and questions raised by a decade of genetic mapping of gene expression variation. I discuss recent extensions of this approach using alternative molecular phenotypes that have revealed some of the biological mechanisms that drive gene expression variation between individuals. Finally, I highlight outstanding problems and future directions for development.
Funded by: Wellcome Trust: 098051
PLoS genetics 2013;9;5;e1003501
PUBMED: 23737752; PMC: 3667745; DOI: 10.1371/journal.pgen.1003501
-
An elephantine viral problem.
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
This month's Genome Watch highlights how deep sequencing was used to generate the first full genomes of herpesviruses associated with a fatal disease in elephants.
Nature reviews. Microbiology 2013
PUBMED: 23832239; DOI: 10.1038/nrmicro3075
-
Restriction of V3 region sequence divergence in the HIV-1 envelope gene during antiretroviral treatment in a cohort of recent seroconverters.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Background: Dynamic changes in Human Immunodeficiency Virus 1 (HIV-1) sequence diversity and divergence are associated with immune control during primary infection and progression to AIDS. Consensus sequencing or single genome amplification sequencing of the HIV-1 envelope (env) gene, in particular the variable (V) regions, is used as a marker for HIV-1 genome diversity, but population diversity is only minimally, or semi-quantitatively sampled using these methods.
Results: Here we use second generation deep sequencing to determine inter-and intra-patient sequence heterogeneity and to quantify minor variants in a cohort of individuals either receiving or not receiving antiretroviral treatment following seroconversion; the SPARTAC trial. We show, through a cross-sectional study of sequence diversity of the env V3 in 30 antiretroviral-naive patients during primary infection that considerable population structure diversity exists, with some individuals exhibiting highly constrained plasma virus diversity. Diversity was independent of clinical markers (viral load, time from seroconversion, CD4 cell count) of infection. Serial sampling over 60 weeks of non-treated individuals that define three initially different diversity profiles showed that complex patterns of continuing HIV-1 sequence diversification and divergence could be readily detected. Evidence for minor sequence turnover, emergence of new variants and re-emergence of archived variants could be inferred from this analysis. Analysis of viral divergence over the same time period in patients who received short (12 weeks, ART12) or long course antiretroviral therapy (48 weeks, ART48) and a non-treated control group revealed that ART48 successfully suppressed viral divergence while ART12 did not have a significant effect.
Conclusions: Deep sequencing is a sensitive and reliable method for investigating the diversity of the env V3 as an important component of HIV-1 genome diversity. Detailed insights into the complex early intra-patient dynamics of env V3 diversity and divergence were explored in antiretroviral-naïve recent seroconverters. Long course antiretroviral therapy, initiated soon after seroconversion and administered for 48 weeks, restricts HIV-1 divergence significantly. The effect of ART12 and ART48 on clinical markers of HIV infection and progression is currently being investigated in the SPARTAC trial.
Funded by: Wellcome Trust
Retrovirology 2013;10;8
PUBMED: 23331949; PMC: 3605130; DOI: 10.1186/1742-4690-10-8
-
Reprogramming to Pluripotency Using Designer TALE Transcription Factors Targeting Enhancers
Stem Cell Reports 2013;1;2;183–197
DOI: 10.1016/j.stemcr.2013.06.002; URL: http://www.sciencedirect.com/scien...ce/article/pii/S2213671113000441
-
Genome-wide haplotype analysis of cis expression quantitative trait loci in monocytes.
INSERM, UMR_S 937, Pierre and Marie Curie University (UPMC, Paris 6), Paris, France ; ICAN Institute for Cardiometabolism and Nutrition, Pierre and Marie Curie University (UPMC, Paris 6), Paris, France.
In order to assess whether gene expression variability could be influenced by several SNPs acting in cis, either through additive or more complex haplotype effects, a systematic genome-wide search for cis haplotype expression quantitative trait loci (eQTL) was conducted in a sample of 758 individuals, part of the Cardiogenics Transcriptomic Study, for which genome-wide monocyte expression and GWAS data were available. 19,805 RNA probes were assessed for cis haplotypic regulation through investigation of ∼2.1×10^9 haplotypic combinations. 2,650 probes demonstrated haplotypic p-values >10^4-fold smaller than the best single SNP p-value. Replication of significant haplotype effects was tested for 412 probes for which SNPs (or proxies) that defined the detected haplotypes were available in the Gutenberg Health Study composed of 1,374 individuals. At the Bonferroni correction level of 1.2×10^-4 (∼0.05/412), 193 haplotypic signals replicated. 1000G imputation was then conducted, and 105 haplotypic signals still remained more informative than imputed SNPs. In-depth analysis of these 105 cis eQTL revealed that at 76 loci genetic associations were compatible with additive effects of several SNPs, while for the 29 remaining regions data could be compatible with a more complex haplotypic pattern. As 24 of the 105 cis eQTL have previously been reported to be disease-associated loci, this work highlights the need for conducting haplotype-based and 1000G imputed cis eQTL analysis before commencing functional studies at disease-associated loci.
PLoS genetics 2013;9;1;e1003240
PUBMED: 23382694; PMC: 3561129; DOI: 10.1371/journal.pgen.1003240
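As a quick check of the replication threshold quoted in the abstract above, the Bonferroni level is the family-wise error rate divided by the number of tests (0.05/412). The sketch below is illustrative only: the function names and the example p-values are hypothetical and are not taken from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def passes_threshold(p_values, threshold):
    """Return the p-values that fall strictly below the corrected threshold."""
    return [p for p in p_values if p < threshold]


# 412 probes were carried forward to replication according to the abstract.
threshold = bonferroni_threshold(0.05, 412)
print(f"{threshold:.2e}")  # 1.21e-04, quoted in the abstract as 1.2×10^-4

# Hypothetical p-values, for illustration only (not study results).
example_p_values = [3.0e-6, 1.0e-4, 2.0e-4, 0.03]
print(passes_threshold(example_p_values, threshold))
```

Only the first two hypothetical p-values fall below the corrected level; the other two would not count as replicated under this criterion.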
-
Clonal analyses reveal associations of JAK2V617F homozygosity with hematologic features, age and gender in polycythemia vera and essential thrombocythemia.
Subclones homozygous for JAK2V617F are more common and larger in patients with polycythemia vera compared to essential thrombocythemia, but their role in determining phenotype remains unclear. We genotyped 4564 erythroid colonies from 59 patients with polycythemia vera or essential thrombocythemia to investigate whether the proportion of JAK2V617F-homozygous precursors, compared to heterozygous precursors, is associated with clinical or demographic features. In polycythemia vera, a higher proportion of homozygous-mutant precursors was associated with more extreme blood counts at diagnosis, consistent with a causal role for homozygosity in polycythemia vera pathogenesis. Larger numbers of homozygous-mutant colonies were associated with older age, and with male gender in polycythemia vera but female gender in essential thrombocythemia. These results suggest that age promotes development or expansion of homozygous-mutant clones and that gender modulates the phenotypic consequences of JAK2V617F homozygosity, thus providing a potential explanation for the long-standing observations of a preponderance of men with polycythemia vera but of women with essential thrombocythemia.
Haematologica 2013;98;5;718-21
PUBMED: 23633544; PMC: 3640115; DOI: 10.3324/haematol.2012.079129
-
Mutations in C10orf11, Encoding a Melanocyte-Differentiation Gene, Cause Autosomal-Recessive Albinism.
Applied Human Molecular Genetics, Kennedy Center, Copenhagen University Hospital, Rigshospitalet, DK-2100 Copenhagen, Denmark; Department of Cellular and Molecular Medicine, University of Copenhagen, DK-2200 Copenhagen, Denmark. Electronic address: karen.groenskov@regionh.dk.
Autosomal-recessive albinism is a hypopigmentation disorder with a broad phenotypic range. A substantial fraction of individuals with albinism remain genetically unresolved, and it has been hypothesized that more genes are to be identified. By using homozygosity mapping of an inbred Faroese family, we identified a 3.5 Mb homozygous region (10q22.2-q22.3) on chromosome 10. The region contains five protein-coding genes, and sequencing of one of these, C10orf11, revealed a nonsense mutation that segregated with the disease and showed a recessive inheritance pattern. Investigation of additional albinism-affected individuals from the Faroe Islands revealed that five out of eight unrelated affected persons had the nonsense mutation in C10orf11. Screening of a cohort of autosomal-recessive-albinism-affected individuals residing in Denmark showed a homozygous 1 bp duplication in C10orf11 in an individual originating from Lithuania. Immunohistochemistry showed localization of C10orf11 in melanoblasts and melanocytes in human fetal tissue, but no localization was seen in retinal pigment epithelial cells. Knockdown of the zebrafish (Danio rerio) homolog with the use of morpholinos resulted in substantially decreased pigmentation and a reduction of the apparent number of pigmented melanocytes. The morphant phenotype was rescued by wild-type C10orf11, but not by mutant C10orf11. In conclusion, we have identified a melanocyte-differentiation gene, C10orf11, which when mutated causes autosomal-recessive albinism in humans.
American journal of human genetics 2013
PUBMED: 23395477; DOI: 10.1016/j.ajhg.2013.01.006
-
Massively parallel sequencing reveals the complex structure of an irradiated human chromosome on a mouse background in the Tc1 model of Down syndrome.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.
Down syndrome (DS) is caused by trisomy of chromosome 21 (Hsa21) and presents a complex phenotype that arises from abnormal dosage of genes on this chromosome. However, the individual dosage-sensitive genes underlying each phenotype remain largely unknown. To help dissect genotype-phenotype correlations in this complex syndrome, the first fully transchromosomic mouse model, the Tc1 mouse, which carries a copy of human chromosome 21, was produced in 2005. The Tc1 strain is trisomic for the majority of genes that cause phenotypes associated with DS, and this freely available mouse strain has become widely used to study DS, the effects of gene dosage abnormalities, and the effect on the basic biology of cells when a mouse carries a freely segregating human chromosome. Tc1 mice were created by a process that included irradiation microcell-mediated chromosome transfer of Hsa21 into recipient mouse embryonic stem cells. Here, the combination of next generation sequencing, array-CGH and fluorescence in situ hybridization technologies has enabled us to identify unsuspected rearrangements of Hsa21 in this mouse model, revealing one deletion, six duplications and more than 25 de novo structural rearrangements. Our study is not only essential for informing functional studies of the Tc1 mouse but also (1) presents for the first time a detailed sequence analysis of the effects of gamma radiation on an entire human chromosome, which gives some mechanistic insight into the effects of radiation damage on DNA, and (2) overcomes specific technical difficulties of assaying a human chromosome on a mouse background where highly conserved sequences may confound the analysis. Sequence data generated in this study are deposited in the ENA database, Study Accession number: ERP000439.
PloS one 2013;8;4;e60482
PUBMED: 23596509; PMC: 3626651; DOI: 10.1371/journal.pone.0060482
-
Gene-centric meta-analyses of 108 912 individuals confirm known body mass index loci and reveal three novel signals.
List of authors is given in the Full Author List Section of Appendix.
Recent genetic association studies have made progress in uncovering components of the genetic architecture of the body mass index (BMI). We used the ITMAT-Broad-Candidate Gene Association Resource (CARe) (IBC) array comprising up to 49 320 single nucleotide polymorphisms (SNPs) across ∼2100 metabolic and cardiovascular-related loci to genotype up to 108 912 individuals of European ancestry (EA), African-Americans, Hispanics and East Asians, from 46 studies, to provide additional insight into SNPs underpinning BMI. We used a five-phase study design: Phase I focused on meta-analysis of EA studies providing individual level genotype data; Phase II performed a replication of cohorts providing summary level EA data; Phase III meta-analyzed results from the first two phases; associated SNPs from Phase III were used for replication in Phase IV; finally in Phase V, a multi-ethnic meta-analysis of all samples from four ethnicities was performed. At an array-wide significance (P < 2.40E-06), we identify novel BMI associations in loci translocase of outer mitochondrial membrane 40 homolog (yeast) - apolipoprotein E - apolipoprotein C-I (TOMM40-APOE-APOC1) (rs2075650, P = 2.95E-10), sterol regulatory element binding transcription factor 2 (SREBF2, rs5996074, P = 9.43E-07) and neurotrophic tyrosine kinase, receptor, type 2 [NTRK2, a brain-derived neurotrophic factor (BDNF) receptor gene, rs1211166, P = 1.04E-06] in the Phase IV meta-analysis. Of 10 loci with previous evidence for BMI association represented on the IBC array, eight were replicated, with the remaining two showing nominal significance. Conditional analyses revealed two independent BMI-associated signals in BDNF and melanocortin 4 receptor (MC4R) regions. Of the 11 array-wide significant SNPs, three are associated with gene expression levels in both primary B-cells and monocytes; with rs4788099 in SH2B adaptor protein 1 (SH2B1) notably being associated with the expression of multiple genes in cis. 
These multi-ethnic meta-analyses expand our knowledge of BMI genetics.
Funded by: NIA NIH HHS: R37 AG011099
Human molecular genetics 2013;22;1;184-201
PUBMED: 23001569; PMC: 3522401; DOI: 10.1093/hmg/dds396
-
Genome-wide diversity in the Levant reveals recent structuring by culture.
Institut de Biologia Evolutiva (CSIC-UPF), Departament de Ciències de la Salut i de la Vida, Universitat Pompeu Fabra, Barcelona, Spain.
The Levant is a region in the Near East with an impressive record of continuous human existence and major cultural developments since the Paleolithic period. Genetic and archeological studies present solid evidence placing the Middle East and the Arabian Peninsula as the first stepping-stone outside Africa. There is, however, little understanding of demographic changes in the Middle East, particularly the Levant, after the first Out-of-Africa expansion and how the Levantine peoples relate genetically to each other and to their neighbors. In this study we analyze more than 500,000 genome-wide SNPs in 1,341 new samples from the Levant and compare them to samples from 48 populations worldwide. Our results show recent genetic stratifications in the Levant are driven by the religious affiliations of the populations within the region. Cultural changes within the last two millennia appear to have facilitated/maintained admixture between culturally similar populations from the Levant, Arabian Peninsula, and Africa. The same cultural changes seem to have resulted in genetic isolation of other groups by limiting admixture with culturally different neighboring populations. Consequently, Levant populations today fall into two main groups: one sharing more genetic characteristics with modern-day Europeans and Central Asians, and the other with closer genetic affinities to other Middle Easterners and Africans. Finally, we identify a putative Levantine ancestral component that diverged from other Middle Easterners ∼23,700-15,500 years ago during the last glacial period, and diverged from Europeans ∼15,900-9,100 years ago between the last glacial warming and the start of the Neolithic.
Funded by: PEPFAR: 098051; Wellcome Trust
PLoS genetics 2013;9;2;e1003316
PUBMED: 23468648; PMC: 3585000; DOI: 10.1371/journal.pgen.1003316
-
The Role of Salt Bridges, Charge Density, and Subunit Flexibility in Determining Disassembly Routes of Protein Complexes.
Physical and Theoretical Chemistry Laboratory, Department of Chemistry, University of Oxford, Oxford OX1 3QZ, UK.
Mass spectrometry can be used to characterize multiprotein complexes, defining their subunit stoichiometry and composition following solution disruption and collision-induced dissociation (CID). While CID of protein complexes in the gas phase typically results in the dissociation of unfolded subunits, a second atypical route is possible wherein compact subunits or subcomplexes are ejected without unfolding. Because tertiary structure and subunit interactions may be retained, this is the preferred route for structural investigations. How can we influence which pathway is adopted? By studying properties of a series of homomeric and heteromeric protein complexes and varying their overall charge in solution, we found that low subunit flexibility, higher charge densities, fewer salt bridges, and smaller interfaces are likely to be involved in promoting dissociation routes without unfolding. Manipulating the charge on a protein complex therefore enables us to direct dissociation through structurally informative pathways that mimic those followed in solution.
Structure (London, England : 1993) 2013
PUBMED: 23850452; DOI: 10.1016/j.str.2013.06.004
-
Fine mapping of type 1 diabetes regions Idd9.1 and Idd9.2 reveals genetic complexity.
Department of Immunology and Microbial Sciences, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA.
Nonobese diabetic (NOD) mice congenic for C57BL/10 (B10)-derived genes in the Idd9 region of chromosome 4 are highly protected from type 1 diabetes (T1D). Idd9 has been divided into three protective subregions (Idd9.1, 9.2, and 9.3), each of which partially prevents disease. In this study we have fine-mapped the Idd9.1 and Idd9.2 regions, revealing further genetic complexity with at least two additional subregions contributing to protection from T1D. Using the NOD sequence from bacterial artificial chromosome clones of the Idd9.1 and Idd9.2 regions as well as whole-genome sequence data recently made available, sequence polymorphisms within the regions highlight a high degree of polymorphism between the NOD and B10 strains in the Idd9 regions. Among numerous candidate genes are several with immunological importance. The Idd9.1 region has been separated into Idd9.1 and Idd9.4, with Lck remaining a candidate gene within Idd9.1. One of the Idd9.2 regions contains the candidate genes Masp2 (encoding mannan-binding lectin serine peptidase 2) and Mtor (encoding mammalian target of rapamycin). From mRNA expression analyses, we have also identified several other differentially expressed candidate genes within the Idd9.1 and Idd9.2 regions. These findings highlight that multiple, relatively small genetic effects combine and interact to produce significant changes in immune tolerance and diabetes onset.
Mammalian genome : official journal of the International Mammalian Genome Society 2013
PUBMED: 23934554; DOI: 10.1007/s00335-013-9466-y
-
Whole-genome sequencing for analysis of an outbreak of meticillin-resistant Staphylococcus aureus: a descriptive study.
Wellcome Trust Sanger Institute, Cambridge, UK.
Background: The emergence of meticillin-resistant Staphylococcus aureus (MRSA) that can persist in the community and replace existing hospital-adapted lineages of MRSA means that it is necessary to understand transmission dynamics in terms of hospitals and the community as one entity. We assessed the use of whole-genome sequencing to enhance detection of MRSA transmission between these settings.
Methods: We studied a putative MRSA outbreak on a special care baby unit (SCBU) at a National Health Service Foundation Trust in Cambridge, UK. We used whole-genome sequencing to validate and expand findings from an infection-control team who assessed the outbreak through conventional analysis of epidemiological data and antibiogram profiles. We sequenced isolates from all colonised patients in the SCBU, and sequenced MRSA isolates from patients in the hospital or community with the same antibiotic susceptibility profile as the outbreak strain.
Findings: The hospital infection-control team identified 12 infants colonised with MRSA in a 6 month period in 2011, who were suspected of being linked, but a persistent outbreak could not be confirmed with conventional methods. With whole-genome sequencing, we identified 26 related cases of MRSA carriage, and showed transmission occurred within the SCBU, between mothers on a postnatal ward, and in the community. The outbreak MRSA type was a new sequence type (ST) 2371, which is closely related to ST22, but contains genes encoding Panton-Valentine leucocidin. Whole-genome sequencing data were used to propose and confirm that MRSA carriage by a staff member had allowed the outbreak to persist during periods without known infection on the SCBU and after a deep clean.
Interpretation: Whole-genome sequencing holds great promise for rapid, accurate, and comprehensive identification of bacterial transmission pathways in hospital and community settings, with concomitant reductions in infections, morbidity, and costs.
Funding: UK Clinical Research Collaboration Translational Infection Research Initiative, Wellcome Trust, Health Protection Agency, and the National Institute for Health Research Cambridge Biomedical Research Centre.
Funded by: Biotechnology and Biological Sciences Research Council; Chief Scientist Office; Department of Health; Medical Research Council: G1000803; Wellcome Trust: 098051
The Lancet infectious diseases 2013;13;2;130-6
PUBMED: 23158674; PMC: 3556525; DOI: 10.1016/S1473-3099(12)70268-2
-
Read and assembly metrics inconsequential for clinical utility of whole-genome sequencing in mapping outbreaks.
Nature biotechnology 2013;31;7;592-4
PUBMED: 23839141; DOI: 10.1038/nbt.2616
-
Diagnostic pathway for the investigation of thrombocytosis.
Department of Haematology, Guy's and St Thomas' Hospitals NHS Foundation Trust, London, UK.
British journal of haematology 2013
PUBMED: 23480550; DOI: 10.1111/bjh.12283
-
Whole genome sequencing identifies zoonotic transmission of MRSA isolates with the novel mecA homologue mecC.
Department of Veterinary Medicine, University of Cambridge, Cambridge, UK.
Several methicillin-resistant Staphylococcus aureus (MRSA) lineages that carry a novel mecA homologue (mecC) have recently been described in livestock and humans. In Denmark, two independent human cases of mecC-MRSA infection have been linked to a livestock reservoir. We investigated the molecular epidemiology of the associated MRSA isolates using whole genome sequencing (WGS). Single nucleotide polymorphisms (SNP) were defined and compared to a reference genome to place the isolates into a phylogenetic context. Phylogenetic analysis revealed two distinct farm-specific clusters comprising isolates from the human case and their own livestock, whereas human and animal isolates from the same farm only differed by a small number of SNPs, which supports the likelihood of zoonotic transmission. Further analyses identified a number of genes and mutations that may be associated with host interaction and virulence. This study demonstrates that mecC-MRSA ST130 isolates are capable of transmission between animals and humans, and underscores the potential of WGS in epidemiological investigations and source tracking of bacterial infections. →See accompanying article http://dx.doi.org/10.1002/emmm.201302622.
EMBO molecular medicine 2013;5;4;509-15
PUBMED: 23526809; DOI: 10.1002/emmm.201202413
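The comparison described in the abstract above rests on counting SNP differences between isolates once their sequences are placed against a common reference. A minimal sketch of that counting step follows, assuming the sequences are already aligned to the same length; the function name and the toy sequences are hypothetical, and this is not the pipeline used in the study.

```python
VALID_BASES = frozenset("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where both sequences carry a called base and differ.

    Positions with an ambiguous or missing call (anything other than A, C, G, T)
    in either sequence are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    distance = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in VALID_BASES and base_b in VALID_BASES and base_a != base_b:
            distance += 1
    return distance


# Toy aligned sequences, for illustration only.
human_isolate = "ACGTACGTAC"
animal_isolate = "ACGTTCGTAN"
print(snp_distance(human_isolate, animal_isolate))  # 1 (the final N is skipped)
```

Skipping uncalled positions is a design choice made here for simplicity; other analyses may treat missing data differently.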
-
A Staphylococcus xylosus Isolate with a New mecC Allotype.
University of Cambridge, Department of Veterinary Medicine, Cambridge, United Kingdom.
Recently, a novel variant of mecA known as mecC (mecA(LGA251)) was identified in Staphylococcus aureus isolates from both humans and animals. In this study, we identified a Staphylococcus xylosus isolate that harbors a new allotype of the mecC gene, mecC1. Whole-genome sequencing revealed that mecC1 forms part of a class E mec complex (mecI-mecR1-mecC1-blaZ) located at the orfX locus as part of a likely staphylococcal cassette chromosome mec element (SCCmec) remnant, which also contains a number of other genes present on the type XI SCCmec.
Antimicrobial agents and chemotherapy 2013;57;3;1524-8
PUBMED: 23274660; DOI: 10.1128/AAC.01882-12
-
Description and Nomenclature of Neisseria meningitidis Capsule Locus.
Pathogenic Neisseria meningitidis isolates contain a polysaccharide capsule that is the main virulence determinant for this bacterium. Thirteen capsular polysaccharides have been described, and nuclear magnetic resonance spectroscopy has enabled determination of the structure of capsular polysaccharides responsible for serogroup specificity. Molecular mechanisms involved in N. meningitidis capsule biosynthesis have also been identified, and genes involved in this process and in cell surface translocation are clustered at a single chromosomal locus termed cps. The use of multiple names for some of the genes involved in capsule synthesis, combined with the need for rapid diagnosis of serogroups commonly associated with invasive meningococcal disease, prompted a requirement for a consistent approach to the nomenclature of capsule genes. In this report, a comprehensive description of all N. meningitidis serogroups is provided, along with a proposed nomenclature, which was presented at the 2012 XVIIIth International Pathogenic Neisseria Conference.
Emerging infectious diseases 2013;19;4;566-73
PUBMED: 23628376; DOI: 10.3201/eid1904.111799
-
VS-5584, a Novel and Highly Selective PI3K/mTOR Kinase Inhibitor for the Treatment of Cancer.
Corresponding Author: Stefan Hart, S*BIO Pte Ltd., 1 Science Park Road, #05-09 The Capricorn, Singapore 117528, Singapore. stefan.sbio@gmail.com.
Dysregulation of the PI3K/mTOR pathway, either through amplifications, deletions, or as a direct result of mutations, has been closely linked to the development and progression of a wide range of cancers. Moreover, this pathway activation is a poor prognostic marker for many tumor types and confers resistance to various cancer therapies. Here, we describe VS-5584, a novel, low-molecular weight compound with equivalent potent activity against mTOR (IC50 = 37 nmol/L) and all class I phosphoinositide 3-kinase (PI3K) isoforms (IC50: PI3Kα = 16 nmol/L; PI3Kβ = 68 nmol/L; PI3Kγ = 25 nmol/L; PI3Kδ = 42 nmol/L), without relevant activity on 400 lipid and protein kinases. VS-5584 shows robust modulation of cellular PI3K/mTOR pathways, inhibiting phosphorylation of substrates downstream of PI3K and mTORC1/2. A large human cancer cell line panel screen (436 lines) revealed broad antiproliferative sensitivity and that cells harboring mutations in PIK3CA are generally more sensitive toward VS-5584 treatment. VS-5584 exhibits favorable pharmacokinetic properties after oral dosing in mice and is well tolerated. VS-5584 induces long-lasting and dose-dependent inhibition of PI3K/mTOR signaling in tumor tissue, leading to tumor growth inhibition in various rapalog-sensitive and -resistant human xenograft models. Furthermore, VS-5584 is synergistic with an EGF receptor inhibitor in a gastric tumor model. The unique selectivity profile and favorable pharmacologic and pharmaceutical properties of VS-5584 and its efficacy in a wide range of human tumor models support further investigations of VS-5584 in clinical trials.
Molecular cancer therapeutics 2013;12;2;151-61
PUBMED: 23270925; PMC: 3588144; DOI: 10.1158/1535-7163.MCT-12-0466
-
Identification of the zebrafish maternal and paternal transcriptomes.
The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.
Transcription is an essential component of basic cellular and developmental processes. However, early embryonic development occurs in the absence of transcription and instead relies upon maternal mRNAs and proteins deposited in the egg during oocyte maturation. Although the early zebrafish embryo is competent to transcribe exogenous DNA, factors present in the embryo maintain genomic DNA in a state that is incompatible with transcription. The cell cycles of the early embryo titrate out these factors, leading to zygotic transcription initiation, presumably in response to a change in genomic DNA chromatin structure to a state that supports transcription. To understand the molecular mechanisms controlling this maternal to zygotic transition, it is important to distinguish between the maternal and zygotic transcriptomes during this period. Here we use exome sequencing and RNA-seq to achieve such discrimination and in doing so have identified the first zygotic genes to be expressed in the embryo. Our work revealed different profiles of maternal mRNA post-transcriptional regulation prior to zygotic transcription initiation. Finally, we demonstrate that maternal mRNAs are required for different modes of zygotic transcription initiation, which is not simply dependent on the titration of factors that maintain genomic DNA in a transcriptionally incompetent state.
Development (Cambridge, England) 2013;140;13;2703-10
PUBMED: 23720042; DOI: 10.1242/dev.095091
-
A blood pressure genetic risk score is a significant predictor of incident cardiovascular events in 32 669 individuals.
Center for Human Genetic Research, Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge St, CPZN 5.242, Boston, MA 02114.
Recent genome-wide association studies have identified genetic variants associated with blood pressure (BP). We investigated whether genetic risk scores (GRSs) constructed of these variants would predict incident cardiovascular disease (CVD) events. We genotyped 32 common single nucleotide polymorphisms in several Finnish cohorts, with up to 32 669 individuals after exclusion of prevalent CVD cases. The median follow-up was 9.8 years, during which 2295 incident CVD events occurred. We created GRSs separately for systolic BP and diastolic BP by multiplying the risk allele count of each single nucleotide polymorphism by the effect size estimated in published genome-wide association studies. We performed Cox regression analyses with and without adjustment for clinical factors, including BP at baseline in each cohort. The results were combined by inverse variance-weighted fixed-effects meta-analysis. The GRSs were strongly associated with systolic BP and diastolic BP, and baseline hypertension (all P<10^-62). Hazard ratios comparing the highest quintiles of systolic BP and diastolic BP GRSs with the lowest quintiles after adjustment for age, age squared, and sex were 1.25 (1.07-1.46; P=0.006) and 1.23 (1.05-1.43; P=0.01), respectively, for incident coronary heart disease; 1.24 (1.01-1.53; P=0.04) and 1.35 (1.09-1.66; P=0.005), respectively, for incident stroke; and 1.23 (1.08-1.40; P=2×10^-6) and 1.26 (1.11-1.44; P=5×10^-4), respectively, for composite CVD. In conclusion, BP findings from genome-wide association studies are strongly replicated. GRSs comprising bona fide BP-single nucleotide polymorphisms predicted CVD risk, consistent with a lifelong effect on BP of these variants collectively.
Funded by: NHLBI NIH HHS: R01 HL098283
Hypertension 2013;61;5;987-94
PUBMED: 23509078; PMC: 3648219; DOI: 10.1161/HYPERTENSIONAHA.111.00649
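The two calculations named in the abstract above, a weighted genetic risk score and an inverse variance-weighted fixed-effects meta-analysis, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, allele counts, effect sizes, and per-cohort estimates are all hypothetical, and the study's actual software and data are not described in this listing.

```python
import math


def genetic_risk_score(risk_allele_counts, effect_sizes):
    """Sum of risk allele counts (0, 1 or 2 per SNP) weighted by published effect sizes."""
    if len(risk_allele_counts) != len(effect_sizes):
        raise ValueError("need one effect size per SNP")
    return sum(count * beta for count, beta in zip(risk_allele_counts, effect_sizes))


def fixed_effects_meta(estimates, std_errors):
    """Inverse variance-weighted pooled estimate and its standard error."""
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need matching, non-empty estimates and standard errors")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * est for w, est in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical individual: three SNPs with made-up effect sizes (mm Hg per allele).
score = genetic_risk_score([0, 1, 2], [0.5, 1.0, 0.25])
print(score)  # 1.5

# Hypothetical per-cohort log hazard ratios and standard errors.
pooled_log_hr, pooled_se = fixed_effects_meta([0.20, 0.40], [0.10, 0.10])
print(round(pooled_log_hr, 3), round(pooled_se, 4))
```

With equal standard errors the pooled estimate is the plain average of the cohort estimates; cohorts with smaller standard errors would otherwise receive proportionally more weight.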
-
Mcl-1 and FBW7 control a dominant survival pathway underlying HDAC and Bcl-2 inhibitor synergy in squamous cell carcinoma.
Massachusetts General Hospital Cancer Center and Harvard Medical School, Boston, Massachusetts 02114, USA.
Effective targeted therapeutics for squamous cell carcinoma (SCC) are lacking. Here, we uncover Mcl-1 as a dominant and tissue-specific survival factor in SCC, providing a roadmap for a new therapeutic approach. Treatment with the histone deacetylase (HDAC) inhibitor vorinostat regulates Bcl-2 family member expression to disable the Mcl-1 axis and thereby induce apoptosis in SCC cells. Although Mcl-1 dominance renders SCC cells resistant to the BH3-mimetic ABT-737, vorinostat primes them for sensitivity to ABT-737 by shuttling Bim from Mcl-1 to Bcl-2/Bcl-xl, resulting in dramatic synergy for this combination and sustained tumor regression in vivo. Moreover, somatic FBW7 mutation in SCC is associated with stabilized Mcl-1 and high Bim levels, resulting in a poor response to standard chemotherapy but a robust response to HDAC inhibitors and enhanced synergy with the combination vorinostat/ABT-737. Collectively, our findings provide a biochemical rationale and predictive markers for the application of this therapeutic combination in SCC.
Funded by: NCI NIH HHS: BC093523; NIDCR NIH HHS: NIH KO8 DE-020139, R01 DE015945; Wellcome Trust: 086357
Cancer discovery 2013;3;3;324-37
PUBMED: 23274910; PMC: 3595349; DOI: 10.1158/2159-8290.CD-12-0417
-
Emergence and global spread of epidemic healthcare-associated Clostridium difficile.
Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.
Epidemic C. difficile (027/BI/NAP1) has rapidly emerged in the past decade as the leading cause of antibiotic-associated diarrhea worldwide. However, the key events in evolutionary history leading to its emergence and the subsequent patterns of global spread remain unknown. Here, we define the global population structure of C. difficile 027/BI/NAP1 using whole-genome sequencing and phylogenetic analysis. We show that two distinct epidemic lineages, FQR1 and FQR2, not one as previously thought, emerged in North America within a relatively short period after acquiring the same fluoroquinolone resistance-conferring mutation and a highly related conjugative transposon. The two epidemic lineages showed distinct patterns of global spread, and the FQR2 lineage spread more widely, leading to healthcare-associated outbreaks in the UK, continental Europe and Australia. Our analysis identifies key genetic changes linked to the rapid transcontinental dissemination of epidemic C. difficile 027/BI/NAP1 and highlights the routes by which it spreads through the global healthcare system.
Funded by: Medical Research Council: 93614, G0901743(93614); Wellcome Trust: 086418, 093869, 098051
Nature genetics 2013;45;1;109-13
PUBMED: 23222960; PMC: 3605770; DOI: 10.1038/ng.2478
-
A genome-wide association study of depressive symptoms.
Research Centre O3, Department of Psychiatry, Erasmus MC, Rotterdam, The Netherlands; Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands.
Background: Depression is a heritable trait that exists on a continuum of varying severity and duration. Yet, the search for genetic variants associated with depression has had few successes. We exploit the entire continuum of depression to find common variants for depressive symptoms.
Methods: In this genome-wide association study, we combined the results of 17 population-based studies assessing depressive symptoms with the Center for Epidemiological Studies Depression Scale. Replication of the independent top hits (p<1×10^-5) was performed in five studies assessing depressive symptoms with other instruments. In addition, we performed a combined meta-analysis of all 22 discovery and replication studies.
Results: The discovery sample comprised 34,549 individuals (mean age of 66.5) and no loci reached genome-wide significance (lowest p = 1.05×10^-7). Seven independent single nucleotide polymorphisms were considered for replication. In the replication set (n = 16,709), we found suggestive association of one single nucleotide polymorphism with depressive symptoms (rs161645, 5q21, p = 9.19×10^-3). This 5q21 region reached genome-wide significance (p = 4.78×10^-8) in the overall meta-analysis combining discovery and replication studies (n = 51,258).
Conclusions: The results suggest that only a large sample comprising more than 50,000 subjects may be sufficiently powered to detect genes for depressive symptoms.
Biological psychiatry 2013;73;7;667-78
PUBMED: 23290196; DOI: 10.1016/j.biopsych.2012.09.033
-
Aberrant 3' oligoadenylation of spliceosomal U6 small nuclear RNA in poikiloderma with neutropenia.
MRC Laboratory of Molecular Biology, Cambridge, UK.
The recessive disorder poikiloderma with neutropenia (PN) is caused by mutations in the C16orf57 gene that encodes the highly conserved USB1 protein. Here, we present the 1.1 Å resolution crystal structure of human USB1, defining it as a member of the LigT-like superfamily of 2H phosphoesterases. We show that human USB1 is a distributive 3'-5' exoribonuclease that posttranscriptionally removes uridine and adenosine nucleosides from the 3' end of spliceosomal U6 small nuclear RNA (snRNA), directly catalyzing terminal 2', 3' cyclic phosphate formation. USB1 measures the appropriate length of the U6 oligo(U) tail by reading the position of a key adenine nucleotide (A102) and pausing 5 uridine residues downstream. We show that the 3' ends of U6 snRNA in PN patient lymphoblasts are elongated and unexpectedly carry nontemplated 3' oligo(A) tails that are characteristic of nuclear RNA surveillance targets. Thus, our study reveals a novel quality control pathway in which posttranscriptional 3'-end processing by USB1 protects U6 snRNA from targeting and destruction by the nuclear exosome. Our data implicate aberrant oligoadenylation of U6 snRNA in the pathogenesis of the leukemia predisposition disorder PN.
Funded by: Medical Research Council: U105161083
Blood 2013;121;6;1028-38
PUBMED: 23190533; DOI: 10.1182/blood-2012-10-461491
-
Dense genotyping of immune-related disease regions identifies 14 new susceptibility loci for juvenile idiopathic arthritis.
[1] Arthritis Research UK Epidemiology Unit, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK. [2] National Institute for Health Research Manchester Musculoskeletal Biomedical Research Unit, Central Manchester University Hospitals National Health Service Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK. [3].
We used the Immunochip array to analyze 2,816 individuals with juvenile idiopathic arthritis (JIA), comprising the most common subtypes (oligoarticular and rheumatoid factor-negative polyarticular JIA), and 13,056 controls. We confirmed association of 3 known JIA risk loci (the human leukocyte antigen (HLA) region, PTPN22 and PTPN2) and identified 14 loci reaching genome-wide significance (P < 5 × 10(-8)) for the first time. Eleven additional new regions showed suggestive evidence of association with JIA (P < 1 × 10(-6)). Dense mapping of loci along with bioinformatics analysis refined the associations to one gene in each of eight regions, highlighting crucial pathways, including the interleukin (IL)-2 pathway, in JIA disease pathogenesis. The entire Immunochip content, the HLA region and the top 27 loci (P < 1 × 10(-6)) explain an estimated 18, 13 and 6% of the risk of JIA, respectively. In summary, this is the largest collection of JIA cases investigated so far and provides new insight into the genetic basis of this childhood autoimmune disease.
Funded by: NIDDK NIH HHS: U01 DK062418
Nature genetics 2013;45;6;664-9
PUBMED: 23603761; PMC: 3673707; DOI: 10.1038/ng.2614
-
A genomic portrait of the emergence, evolution, and global spread of a methicillin-resistant Staphylococcus aureus pandemic.
The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
The widespread use of antibiotics in association with high-density clinical care has driven the emergence of drug-resistant bacteria that are adapted to thrive in hospitalized patients. Of particular concern are globally disseminated methicillin-resistant Staphylococcus aureus (MRSA) clones that cause outbreaks and epidemics associated with health care. The most rapidly spreading and tenacious health-care-associated clone in Europe currently is EMRSA-15, which was first detected in the UK in the early 1990s and subsequently spread throughout Europe and beyond. Using phylogenomic methods to analyze the genome sequences for 193 S. aureus isolates, we were able to show that the current pandemic population of EMRSA-15 descends from a health-care-associated MRSA epidemic that spread throughout England in the 1980s, which had itself previously emerged from a primarily community-associated methicillin-sensitive population. The emergence of fluoroquinolone resistance in this EMRSA-15 subclone in the English Midlands during the mid-1980s appears to have played a key role in triggering pandemic spread, and occurred shortly after the first clinical trials of this drug. Genome-based coalescence analysis estimated that the population of this subclone over the last 20 yr has grown four times faster than its progenitor. Using comparative genomic analysis we identified the molecular genetic basis of 99.8% of the antimicrobial resistance phenotypes of the isolates, highlighting the potential of pathogen genome sequencing as a diagnostic tool. We document the genetic changes associated with adaptation to the hospital environment and with increasing drug resistance over time, and how MRSA evolution likely has been influenced by country-specific drug use regimens.
Funded by: Biotechnology and Biological Sciences Research Council; PHS HHS: 2 RO1I457838-12; Wellcome Trust: 089472, 098051
Genome research 2013;23;4;653-64
PUBMED: 23299977; PMC: 3613582; DOI: 10.1101/gr.147710.112
-
Arginine Catabolic Mobile Element in Methicillin-Resistant Staphylococcus aureus (MRSA) Clonal Group ST239-MRSA-III Isolates in Singapore: Implications for PCR-Based Screening Tests.
Department of Medicine, National University Health System, Singapore, Singapore.
Antimicrobial agents and chemotherapy 2013;57;3;1563-4
PUBMED: 23318798; DOI: 10.1128/AAC.02518-12
-
WikiGWA: an open platform for collecting and using genome-wide association results.
Department of Human Genetics, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.
The number of discovered genetic variants from genome-wide association (GWA) studies (GWAS) has been growing rapidly. Centralized efforts such as the National Human Genome Research Institute's GWAS catalog provide regular updates and a convenient interface for quick lookup. However, the catalog entries are manually curated and rely on data from published articles. Other tools such as SNPedia (http://www.snpedia.com) collect published results regarding functional consequences of genetic variations. Here, we propose an approach that allows individual investigators to share their GWA results through an open platform. Unlike the GWAS catalog or SNPedia, wikiGWA collects first-hand GWAS results and on a much larger scale. Investigators are not only able to post a much larger amount of results, but also post results from unpublished studies, which could alleviate publication bias and facilitate identification of weak signals. Our interface allows for flexible and fast queries, and the query results are formatted to work seamlessly with the LocusZoom program for visualization and annotation. We here describe wikiGWA, made publicly available at http://www.wikiGWA.org.
Funded by: NHGRI NIH HHS: R01 HG006292, R01 HG006703
European journal of human genetics : EJHG 2013;21;4;471-3
PUBMED: 22929026; PMC: 3598322; DOI: 10.1038/ejhg.2012.187
-
The duck genome and transcriptome provide insight into an avian influenza virus reservoir species.
[1] State Key Laboratory for Agrobiotechnology, China Agricultural University, Beijing, China. [2] The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK. [3].
The duck (Anas platyrhynchos) is one of the principal natural hosts of influenza A viruses. We present the duck genome sequence and perform deep transcriptome analyses to investigate immune-related genes. Our data indicate that the duck possesses a contractive immune gene repertoire, as in chicken and zebra finch, and this repertoire has been shaped through lineage-specific duplications. We identify genes that are responsive to influenza A viruses using the lung transcriptomes of control ducks and ones that were infected with either a highly pathogenic (A/duck/Hubei/49/05) or a weakly pathogenic (A/goose/Hubei/65/05) H5N1 virus. Further, we show how the duck's defense mechanisms against influenza infection have been optimized through the diversification of its β-defensin and butyrophilin-like repertoires. These analyses, in combination with the genomic and transcriptomic data, provide a resource for characterizing the interaction between host and influenza viruses.
Nature genetics 2013
PUBMED: 23749191; DOI: 10.1038/ng.2657
-
Olfaction and olfactory-mediated behaviour in psychiatric disease models.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
Rats and mice are the most widely used species for modelling psychiatric disease. Assessment of these rodent models typically involves the analysis of aberrant behaviour with behavioural interactions often being manipulated to generate the model. Rodents rely heavily on their excellent sense of smell and almost all their social interactions have a strong olfactory component. Therefore, experimental paradigms that exploit these olfactory-mediated behaviours are among the most robust available and are highly prevalent in psychiatric disease research. These include tests of aggression and maternal instinct, foraging, olfactory memory and habituation and the establishment of social hierarchies. An appreciation of the way that rodents regulate these behaviours in an ethological context can assist experimenters to generate better data from their models and to avoid common pitfalls. We describe some of the more commonly used behavioural paradigms from a rodent olfactory perspective and discuss their application in existing models of psychiatric disease. We introduce the four olfactory subsystems that integrate to mediate the behavioural responses and the types of sensory cue that promote them and discuss their control and practical implementation to improve experimental outcomes. In addition, because smell is critical for normal behaviour in rodents and yet olfactory dysfunction is often associated with neuropsychiatric disease, we introduce some tests for olfactory function that can be applied to rodent models of psychiatric disorders as part of behavioural analysis.
Cell and tissue research 2013
PUBMED: 23604803; DOI: 10.1007/s00441-013-1617-7
-
Novel Loci Associated with Increased Risk of Sudden Cardiac Death in the Context of Coronary Artery Disease.
The Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States of America.
Background: Recent genome-wide association studies (GWAS) have identified novel loci associated with sudden cardiac death (SCD). Despite this progress, identified DNA variants account for a relatively small portion of overall SCD risk, suggesting that additional loci contributing to SCD susceptibility await discovery. The objective of this study was to identify novel DNA variation associated with SCD in the context of coronary artery disease (CAD). Using the MetaboChip custom array we conducted a case-control association analysis of 119,117 SNPs in 948 SCD cases (with underlying CAD) from the Oregon Sudden Unexpected Death Study (Oregon-SUDS) and 3,050 controls with CAD from the Wellcome Trust Case-Control Consortium (WTCCC). Two newly identified loci were significantly associated with increased risk of SCD after correction for multiple comparisons at: rs6730157 in the RAB3GAP1 gene on chromosome 2 (P = 4.93×10(-12), OR = 1.60) and rs2077316 in the ZNF365 gene on chromosome 10 (P = 3.64×10(-8), OR = 2.41). Conclusions: Our findings suggest that RAB3GAP1 and ZNF365 are relevant candidate genes for SCD and will contribute to the mechanistic understanding of SCD susceptibility.
PloS one 2013;8;4;e59905
PUBMED: 23593153; DOI: 10.1371/journal.pone.0059905
-
Negligible impact of rare autoimmune-locus coding-region variants on missing heritability.
Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK.
Genome-wide association studies (GWAS) have identified common variants of modest-effect size at hundreds of loci for common autoimmune diseases; however, a substantial fraction of heritability remains unexplained, to which rare variants may contribute. To discover rare variants and test them for association with a phenotype, most studies re-sequence a small initial sample size and then genotype the discovered variants in a larger sample set. This approach fails to analyse a large fraction of the rare variants present in the entire sample set. Here we perform simultaneous amplicon-sequencing-based variant discovery and genotyping for coding exons of 25 GWAS risk genes in 41,911 UK residents of white European origin, comprising 24,892 subjects with six autoimmune disease phenotypes and 17,019 controls, and show that rare coding-region variants at known loci have a negligible role in common autoimmune disease susceptibility. These results do not support the rare-variant synthetic genome-wide-association hypothesis (in which unobserved rare causal variants lead to association detected at common tag variants). Many known autoimmune disease risk loci contain multiple, independently associated, common and low-frequency variants, and so genes at these loci are a priori stronger candidates for harbouring rare coding-region variants than other genes. Our data indicate that the missing heritability for common autoimmune diseases may not be attributable to the rare coding-region variant portion of the allelic spectrum, but perhaps, as others have proposed, may be a result of many common-variant loci of weak effect.
Nature 2013
PUBMED: 23698362; DOI: 10.1038/nature12170
-
REAPR: a universal tool for genome assembly evaluation.
Parasite Genomics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK. tdo@sanger.ac.uk.
Methods to reliably assess the accuracy of genome sequence data are lacking. Currently completeness is only described qualitatively and mis-assemblies are overlooked. Here we present REAPR, a tool that precisely identifies errors in genome assemblies without the need for a reference sequence. We have validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrate that 86% and 82% of the human and mouse reference genomes are error-free, respectively. When applied to an ongoing genome project, REAPR provides corrected assembly statistics allowing the quantitative comparison of multiple assemblies. REAPR is available at http://www.sanger.ac.uk/resources/software/reapr/.
Genome biology 2013;14;5;R47
PUBMED: 23710727; DOI: 10.1186/gb-2013-14-5-r47
-
The genomic basis of vomeronasal-mediated behaviour.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
The vomeronasal organ (VNO) is a chemosensory subsystem found in the nose of most mammals. It is principally tasked with detecting pheromones and other chemical signals that initiate innate behavioural responses. The VNO expresses subfamilies of vomeronasal receptors (VRs) in a cell-specific manner: each sensory neuron expresses just one or two receptors and silences all the other receptor genes. VR genes vary greatly in number within mammalian genomes, from no functional genes in some primates to many hundreds in rodents. They bind semiochemicals, some of which are also encoded in gene families that are coexpanded in species with correspondingly large VR repertoires. Protein and peptide cues that activate the VNO tend to be expressed in exocrine tissues in sexually dimorphic, and sometimes individually variable, patterns. Few chemical ligand-VR-behaviour relationships have been fully elucidated to date, largely due to technical difficulties in working with large, homologous gene families with high sequence identity. However, analysis of mouse lines with mutations in genes involved in ligand-VR signal transduction has revealed that the VNO mediates a range of social behaviours, including male-male and maternal aggression, sexual attraction, lordosis, and selective pregnancy termination, as well as interspecific responses such as avoidance and defensive behaviours. The unusual logic of VR expression now offers an opportunity to map the specific neural circuits that drive these behaviours.
Mammalian genome : official journal of the International Mammalian Genome Society 2013
PUBMED: 23884334; DOI: 10.1007/s00335-013-9463-1
-
Astroglial IFITM3 mediates neuronal impairments following neonatal immune challenge in mice.
Department of Neuropsychopharmacology and Hospital Pharmacy, Nagoya University Graduate School of Medicine, Nagoya, Japan; Department of Chemical Pharmacology, Graduate School of Pharmaceutical Sciences, Meijo University, Nagoya, Japan.
Interferon-induced transmembrane protein 3 (IFITM3) plays a crucial role in the antiviral responses of Type I interferons (IFNs). The role of IFITM3 in the central nervous system (CNS) is, however, largely unknown, despite the fact that its expression is increased in the brains of patients with neurologic and neuropsychiatric diseases. Here, we show the role of IFITM3 in long-lasting neuronal impairments in mice following polyriboinosinic-polyribocytidylic acid (polyI:C, a synthetic double-stranded RNA)-induced immune challenge during the early stages of development. We found that the induction of IFITM3 expression in the brain of mice treated with polyI:C was observed only in astrocytes. Cultured astrocytes were activated by polyI:C treatment, leading to an increase in the mRNA levels of inflammatory cytokines as well as Ifitm3. When cultured neurons were treated with the conditioned medium of polyI:C-treated astrocytes (polyI:C-ACM), neurite development was impaired. These polyI:C-ACM-induced neurodevelopmental abnormalities were alleviated by ifitm3(-/-) astrocyte-conditioned medium. Furthermore, decreases of MAP2 expression, spine density, and dendrite complexity in the frontal cortex as well as memory impairment were evident in polyI:C-treated wild-type mice, but such neuronal impairments were not observed in ifitm3(-/-) mice. We also found that IFITM3 proteins were localized to the early endosomes of astrocytes following polyI:C treatment and reduced endocytic activity. These findings suggest that the induction of IFITM3 expression in astrocytes by the activation of the innate immune system during the early stages of development has non-cell autonomous effects that affect subsequent neurodevelopment, leading to neuropathological impairments and brain dysfunction, by impairing endocytosis in astrocytes.
Glia 2013
PUBMED: 23382131; DOI: 10.1002/glia.22461
-
The role of high-throughput technologies in clinical cancer genomics.
Department of Hematology/Oncology, Cambridge University NHS Hospitals Foundation Trust, Cambridge, CB2 0QQ, UK.
Cancer is a genetic disease driven by both heritable and somatic alterations in DNA, which underpin not only oncogenesis but also progression and eventual metastasis. The major impetus for elucidating the nature and function of somatic mutations in cancer genomes is the potential for the development of effective targeted anticancer therapies. Over the last decade, high-throughput technologies have allowed us unprecedented access to a host of cancer genomes, leading to an influx of new information about their pathobiology. The challenge now is to integrate such emerging information into clinical practice to achieve tangible benefits for cancer patients. This review examines the roles array-based comparative genomic hybridization and next-generation sequencing are playing in furthering our understanding of both hematological and solid-organ tumors. Furthermore, the authors discuss the current challenges in translating the role of these technologies from bench to bedside.
Expert review of molecular diagnostics 2013;13;2;167-81
PUBMED: 23477557; DOI: 10.1586/erm.13.1
-
Inferring genome-wide recombination landscapes from advanced intercross lines: application to yeast crosses.
Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom.
Accurate estimates of recombination rates are of great importance for understanding evolution. In an experimental genetic cross, recombination breaks apart and rejoins genetic material, such that the genomes of the resulting isolates are comprised of distinct blocks of differing parental origin. We here describe a method exploiting this fact to infer genome-wide recombination profiles from sequenced isolates from an advanced intercross line (AIL). We verified the accuracy of the method against simulated data. Next, we sequenced 192 isolates from a twelve-generation cross between West African and North American yeast Saccharomyces cerevisiae strains and inferred the underlying recombination landscape at a fine genomic resolution (mean segregating site distance 0.22 kb). Comparison was made with landscapes inferred for a similar cross between four yeast strains, and with a previous single-generation, intra-strain cross (Mancera et al., Nature 2008). Moderate congruence was identified between landscapes (correlation 0.58-0.77 at 5 kb resolution), albeit with variance between mean genome-wide recombination rates. The multiple generations of mating undergone in the AILs gave more precise inference of recombination rates than could be achieved from a single-generation cross, in particular in identifying recombination cold-spots. The recombination landscapes we describe have particular utility; both AILs are part of a resource to study complex yeast traits (see e.g. Parts et al., Genome Res 2011). Our results will enable future applications of this resource to take better account of local linkage structure heterogeneities. Our method has general applicability to other crossing experiments, including a variety of experimental designs.
PloS one 2013;8;5;e62266
PUBMED: 23658715; PMC: 3642125; DOI: 10.1371/journal.pone.0062266
-
Computational approaches to identify functional genetic variants in cancer genomes.
[1] Research Unit on Biomedical Informatics, University Pompeu Fabra, Barcelona, Spain. [2].
The International Cancer Genome Consortium (ICGC) aims to catalog genomic abnormalities in tumors from 50 different cancer types. Genome sequencing reveals hundreds to thousands of somatic mutations in each tumor but only a minority of these drive tumor progression. We present the result of discussions within the ICGC on how to address the challenge of identifying mutations that contribute to oncogenesis, tumor maintenance or response to therapy, and recommend computational techniques to annotate somatic variants and predict their impact on cancer phenotype.
Nature methods 2013;10;8;723-9
PUBMED: 23900255; DOI: 10.1038/nmeth.2562
-
A Cell-surface Phylome for African Trypanosomes.
Pathogen Genomics Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, England, United Kingdom ; Department of Infection Biology, Institute of Infection and Global Health, University of Liverpool, Liverpool, England, United Kingdom.
The cell surface of Trypanosoma brucei, like many protistan blood parasites, is crucial for mediating host-parasite interactions and is instrumental to the initiation, maintenance and severity of infection. Previous comparisons with the related trypanosomatid parasites T. cruzi and Leishmania major suggest that the cell-surface proteome of T. brucei is largely taxon-specific. Here we compare genes predicted to encode cell surface proteins of T. brucei with those from two related African trypanosomes, T. congolense and T. vivax. We created a cell surface phylome (CSP) by estimating phylogenies for 79 gene families with putative surface functions to understand the more recent evolution of African trypanosome surface architecture. Our findings demonstrate that the transferrin receptor genes essential for bloodstream survival in T. brucei are conserved in T. congolense but absent from T. vivax and include an expanded gene family of insect stage-specific surface glycoproteins that includes many currently uncharacterized genes. We also identify species-specific features and innovations and confirm that these include most expression site-associated genes (ESAGs) in T. brucei, which are absent from T. congolense and T. vivax. The CSP presents the first global picture of the origins and dynamics of cell surface architecture in African trypanosomes, representing the principal differences in genomic repertoire between African trypanosome species and provides a basis from which to explore the developmental and pathological differences in surface architectures. All data can be accessed at: http://www.genedb.org/Page/trypanosoma_surface_phylome.
PLoS neglected tropical diseases 2013;7;3;e2121
PUBMED: 23556014; PMC: 3605285; DOI: 10.1371/journal.pntd.0002121
-
iAnn: An Event Sharing Platform for the Life Sciences.
European Bioinformatics Institute, Hinxton, UK, National Center for Biotechnology-CSIC, Madrid, Spain, Theragen BioInstitute, South Korea, SIB Swiss Institute of Bioinformatics, Genève, Switzerland, NNF Center for Protein Research, Copenhagen, Denmark, Ontario Institute for Cancer Research, Toronto, Canada, European Molecular Biology Laboratory, Heidelberg, Germany, Cancer Research Center (IBMCC-CSIC), Salamanca, Spain, Netherlands Bioinformatics Centre, Nijmegen, Netherlands, University College Dublin, Dublin, Ireland, Instituto Gulbenkian de Ciência, Oeiras, Portugal, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland, University of Cambridge, Cambridge, UK, CSC-Scientific Computing Ltd., Espoo, Finland, WILEY-VCH Verlag, Weinheim, Germany, Wellcome Trust Sanger Institute, Hinxton, UK, Universidad Complutense, Madrid, Spain, Dept. Clinical Laboratory Sciences, IIDMM, University of Cape Town, South Africa, Academis Training, Berlin, Germany, The Genome Analysis Centre, Norwich, UK, Luxembourg Center for Systems Biomedicine, University of Luxembourg, Luxemburg, Sapienza University, Rome, Italy, Max Planck Institute for Biology of Ageing, Cologne, Germany, Itico, Fulbourn, Cambridge, UK, The University of Manchester, Manchester, UK.
SUMMARY: We present iAnn, an open source community-driven platform for dissemination of life science events, such as courses, conferences and workshops. iAnn allows automatic visualisation and integration of customised event reports. A central repository lies at the core of the platform: curators add submitted events, and these are subsequently accessed via Web services. Thus, once an iAnn widget is incorporated into a website, it permanently shows timely, relevant information as if it were native to the remote site. At the same time, announcements submitted to the repository are automatically disseminated to all portals that query the system. To facilitate the visualisation of announcements, iAnn provides powerful filtering options and views, integrated in Google Maps and Google Calendar. All iAnn widgets are freely available. AVAILABILITY: http://iann.pro/iannviewer CONTACT: manuel.corpas@tgac.ac.uk.
Bioinformatics (Oxford, England) 2013
PUBMED: 23742982; DOI: 10.1093/bioinformatics/btt306
-
The CD225 Domain of IFITM3 Is Required for both IFITM Protein Association and Inhibition of Influenza A Virus and Dengue Virus Replication.
Department of Microbiology and Physiological Systems, University of Massachusetts Medical School, Worcester, Massachusetts, USA.
The interferon-induced transmembrane protein 3 (IFITM3) gene is an interferon-stimulated gene that inhibits the replication of multiple pathogenic viruses in vitro and in vivo. IFITM3 is a member of a large protein superfamily, whose members share a functionally undefined area of high amino acid conservation, the CD225 domain. We performed mutational analyses of IFITM3 and identified multiple residues within the CD225 domain, consisting of the first intramembrane domain (intramembrane domain 1 [IM1]) and a conserved intracellular loop (CIL), that are required for restriction of both influenza A virus (IAV) and dengue virus (DENV) infection in vitro. Two phenylalanines within IM1 (F75 and F78) also mediate a physical association between IFITM proteins, and the loss of this interaction decreases IFITM3-mediated restriction. By extension, similar IM1-mediated associations may contribute to the functions of additional members of the CD225 domain family. IFITM3's distal N-terminal domain is also needed for full antiviral activity, including a tyrosine (Y20), whose alteration results in mislocalization of a portion of IFITM3 to the cell periphery and surface. Comparative analyses demonstrate that similar molecular determinants are needed for IFITM3's restriction of both IAV and DENV. However, a portion of the CIL including Y99 and R87 is preferentially needed for inhibition of the orthomyxovirus. Several IFITM3 proteins engineered with rare single-nucleotide polymorphisms demonstrated reduced expression or mislocalization, and these events were associated with enhanced viral replication in vitro, suggesting that possessing such alleles may impact an individual's risk for viral infection. On the basis of this and other data, we propose a model for IFITM3-mediated restriction.
Funded by: NIAID NIH HHS: R01 AI091786
Journal of virology 2013;87;14;7837-52
PUBMED: 23658454; PMC: 3700195; DOI: 10.1128/JVI.00481-13
-
Presynaptic maturation in auditory hair cells requires a critical period of sensory-independent spiking activity.
Department of Biomedical Science, University of Sheffield, Sheffield S10 2TN, United Kingdom.
The development of neural circuits relies on spontaneous electrical activity that occurs during immature stages of development. In the developing mammalian auditory system, spontaneous calcium action potentials are generated by inner hair cells (IHCs), which form the primary sensory synapse. It remains unknown whether this electrical activity is required for the functional maturation of the auditory system. We found that sensory-independent electrical activity controls synaptic maturation in IHCs. We used a mouse model in which the potassium channel SK2 is normally overexpressed, but can be modulated in vivo using doxycycline. SK2 overexpression affected the frequency and duration of spontaneous action potentials, which prevented the development of the Ca(2+)-sensitivity of vesicle fusion at IHC ribbon synapses, without affecting their morphology or general cell development. By manipulating the in vivo expression of SK2 channels, we identified the "critical period" during which spiking activity influences IHC synaptic maturation. Here we provide direct evidence that IHC development depends upon a specific temporal pattern of calcium spikes before sound-driven neuronal activity.
Proceedings of the National Academy of Sciences of the United States of America 2013;110;21;8720-5
PUBMED: 23650376; DOI: 10.1073/pnas.1219578110
-
A sequence variant associated with sortilin-1 (SORT1) on 1p13.3 is independently associated with abdominal aortic aneurysm.
Abdominal aortic aneurysm (AAA) is a common human disease with a high estimated heritability (0.7); however, only a small number of associated genetic loci have been reported to date. In contrast, over 100 loci have now been reproducibly associated with either blood lipid profile and/or coronary artery disease (CAD) (both risk factors for AAA) in large-scale meta-analyses. This study employed a staged design to investigate whether the loci for these two phenotypes are also associated with AAA. Validated CAD and dyslipidaemia loci underwent screening using the Otago AAA genome-wide association data set. Putative associations underwent staged secondary validation in 10 additional cohorts. A novel association between the SORT1 (1p13.3) locus and AAA was identified. The rs599839 G allele, which has been previously associated with both dyslipidaemia and CAD, reached genome-wide significance in 11 combined independent cohorts (meta-analysis with 7048 AAA cases and 75 976 controls: G allele OR 0.81, 95% CI 0.76-0.85, P = 7.2 × 10(-14)). Modelling for confounding interactions of concurrent dyslipidaemia, heart disease and other risk factors suggested that this marker is an independent predictor of AAA susceptibility. In conclusion, a genetic marker associated with cardiovascular risk factors, and in particular concurrent vascular disease, appeared to independently contribute to susceptibility for AAA. Given the potential genetic overlap between risk factor and disease phenotypes, the use of well-characterized case-control cohorts allowing for modelling of cardiovascular disease risk confounders will be an important component in the future discovery of genetic markers for conditions such as AAA.
Funded by: NHLBI NIH HHS: R01 HL064310
Human molecular genetics 2013;22;14;2941-7
PUBMED: 23535823; PMC: 3690970; DOI: 10.1093/hmg/ddt141
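The odds ratio and confidence interval quoted in the abstract above (G allele OR 0.81, 95% CI 0.76-0.85) can be sanity-checked against the reported P value with a short calculation. The sketch below is not the authors' analysis: it assumes a normal approximation on the log-odds scale and a symmetric 95% interval, and the function name `z_and_p_from_or` is hypothetical.

```python
import math


def z_and_p_from_or(or_point, ci_low, ci_high, z_crit=1.96):
    """Back out the standard error, z-score and two-sided p-value from an
    odds ratio and its 95% CI (assumes normality of log(OR))."""
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(or_point) / se
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values taken from the abstract; everything else is an assumption.
se, z, p = z_and_p_from_or(0.81, 0.76, 0.85)
print(f"SE(log OR) = {se:.4f}, z = {z:.2f}, p = {p:.1e}")
```

Because the published interval limits are rounded to two decimals, the recovered p-value should only be expected to land within a small factor of the reported P = 7.2 × 10(-14), not to reproduce it exactly.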
-
Whole-genome sequencing for rapid susceptibility testing of M. tuberculosis.
Funded by: Biotechnology and Biological Sciences Research Council; Chief Scientist Office: G1000803; Department of Health; Medical Research Council; Wellcome Trust: WT098051
The New England journal of medicine 2013;369;3;290-2
PUBMED: 23863072; DOI: 10.1056/NEJMc1215305
-
Consequences of whiB7 (Rv3197A) Mutations in Beijing Genotype Isolates of the Mycobacterium tuberculosis Complex.
Public Health England, Cambridge, United Kingdom.
Antimicrobial agents and chemotherapy 2013;57;7;3461
PUBMED: 23761426; DOI: 10.1128/AAC.00626-13
-
Genome-wide association analyses identify 18 new loci associated with serum urate concentrations.
[1] Renal Division, Freiburg University Hospital, Freiburg, Germany. [2] Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA. [3].
Elevated serum urate concentrations can cause gout, a prevalent and painful inflammatory arthritis. By combining data from >140,000 individuals of European ancestry within the Global Urate Genetics Consortium (GUGC), we identified and replicated 28 genome-wide significant loci in association with serum urate concentrations (18 new regions in or near TRIM46, INHBB, SFMBT1, TMEM171, VEGFA, BAZ1B, PRKAG2, STC1, HNF4G, A1CF, ATXN2, UBE2Q2, IGF1R, NFAT5, MAF, HLF, ACVR1B-ACVRL1 and B3GNT4). Associations for many of the loci were of similar magnitude in individuals of non-European ancestry. We further characterized these loci for associations with gout, transcript expression and the fractional excretion of urate. Network analyses implicate the inhibins-activins signaling pathways and glucose metabolism in systemic urate control. New candidate genes for serum urate concentration highlight the importance of metabolic control of urate production and excretion, which may have implications for the treatment and prevention of gout.
Nature genetics 2013;45;2;145-54
PUBMED: 23263486; DOI: 10.1038/ng.2500
-
KAT5 tyrosine phosphorylation couples chromatin sensing to ATM signalling.
The Gurdon Institute and Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK.
The detection of DNA lesions within chromatin represents a critical step in cellular responses to DNA damage. However, the regulatory mechanisms that couple chromatin sensing to DNA-damage signalling in mammalian cells are not well understood. Here we show that tyrosine phosphorylation of the protein acetyltransferase KAT5 (also known as TIP60) increases after DNA damage in a manner that promotes KAT5 binding to the histone mark H3K9me3. This triggers KAT5-mediated acetylation of the ATM kinase, promoting DNA-damage-checkpoint activation and cell survival. We also establish that chromatin alterations can themselves enhance KAT5 tyrosine phosphorylation and ATM-dependent signalling, and identify the proto-oncogene c-Abl as a mediator of this modification. These findings define KAT5 tyrosine phosphorylation as a key event in the sensing of genomic and chromatin perturbations, and highlight a key role for c-Abl in such processes.
Funded by: Cancer Research UK: C6/A11224; Wellcome Trust: WT092096
Nature 2013;498;7452;70-4
PUBMED: 23708966; DOI: 10.1038/nature12201
-
Unusual features in organisation of capsular polysaccharide-related genes of C. jejuni strain X.
School of Life Sciences, Kingston University, Faculty of Science, Engineering and Computing, Penrhyn Road, Kingston-upon Thames, KT1 2EE, UK. Electronic address: a.karlyshev@kingston.ac.uk.
PCR probing of the genome of Campylobacter jejuni strain X using conserved capsular polysaccharide (CPS)-related genes allowed elucidation of a complete sequence of the respective gene cluster (cps). This is the largest known Campylobacter cps cluster (38 kb excluding flanking kps regions), which includes a number of genes not detected in other Campylobacter strains. Sequence analysis suggests genetic rearrangements both within and outside the cps gene cluster, a mechanism which may be responsible for mosaic organisation of sugar transferase-related genes leading to structural variability of the capsular polysaccharide (CPS).
Gene 2013
PUBMED: 23562723; DOI: 10.1016/j.gene.2013.03.087
-
Human melioidosis, Malawi, 2011.
A case of human melioidosis caused by a novel sequence type of Burkholderia pseudomallei occurred in a child in Malawi, southern Africa. A literature review showed that human cases reported from the continent have been increasing.
Emerging infectious diseases 2013;19;6;981-4
PUBMED: 23735189; DOI: 10.3201/eid1906.120717
-
Activation of the B Cell Antigen Receptor Triggers Reactivation of Latent Kaposi's Sarcoma-Associated Herpesvirus in B Cells.
Institute of Virology, Hanover Medical School, Hanover, Germany.
Kaposi's sarcoma-associated herpesvirus (KSHV) is an oncogenic herpesvirus and the cause of Kaposi's sarcoma, primary effusion lymphoma (PEL) and multicentric Castleman's disease. Latently infected B cells are the main reservoir of this virus in vivo, but the nature of the stimuli that lead to its reactivation in B cells is only partially understood. We established stable BJAB cell lines harboring latent KSHV by cell-free infection with recombinant virus carrying a puromycin resistance marker. Our latently infected B cell lines, termed BrK.219, can be reactivated by triggering the B cell receptor (BCR) with antibodies to surface IgM, a stimulus imitating antigen recognition. Using this B cell model system we studied the mechanisms that mediate the reactivation of KSHV in B cells following the stimulation of the BCR and could identify phosphatidylinositol 3-kinase (PI3K) and X-box binding protein 1 (XBP-1) as proteins that play an important role in the BCR-mediated reactivation of latent KSHV.
Journal of virology 2013;87;14;8004-16
PUBMED: 23678173; PMC: 3700181; DOI: 10.1128/JVI.00506-13
-
RetroSeq: transposable element discovery from next-generation sequencing data.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK. tk2@sanger.ac.uk
Unlabelled: A significant proportion of eukaryote genomes consist of transposable element (TE)-derived sequence. These elements are known to have the capacity to modulate gene function and genome evolution. We have developed RetroSeq for detecting non-reference TE insertions from Illumina paired-end whole-genome sequencing data. We evaluate RetroSeq on a human trio from the 1000 Genomes Project, showing that it produces highly accurate TE calls.
Availability: RetroSeq is open-source and available from https://github.com/tk2/RetroSeq.
Funded by: Cancer Research UK; Medical Research Council; Wellcome Trust
Bioinformatics (Oxford, England) 2013;29;3;389-90
PUBMED: 23233656; PMC: 3562067; DOI: 10.1093/bioinformatics/bts697
-
Different Patterns of Epstein-Barr Virus Latency in Endemic Burkitt Lymphoma (BL) Lead to Distinct Variants within the BL-Associated Gene Expression Signature.
School of Cancer Sciences, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom.
Epstein-Barr virus (EBV) is present in all cases of endemic Burkitt lymphoma (BL) but in few European/North American sporadic BLs. Gene expression arrays of sporadic tumors have defined a consensus BL profile within which tumors are classifiable as "molecular BL" (mBL). Where endemic BLs fall relative to this profile remains unclear, since they not only carry EBV but also display one of two different forms of virus latency. Here, we use early-passage BL cell lines from different tumors, and BL subclones from a single tumor, to compare EBV-negative cells with EBV-positive cells displaying either classical latency I EBV infection (where EBNA1 is the only EBV antigen expressed from the wild-type EBV genome) or Wp-restricted latency (where an EBNA2 gene-deleted virus genome broadens antigen expression to include the EBNA3A, -3B, and -3C proteins and BHRF1). Expression arrays show that both types of endemic BL fall within the mBL classification. However, while EBV-negative and latency I BLs show overlapping profiles, Wp-restricted BLs form a distinct subgroup, characterized by a detectable downregulation of the germinal center (GC)-associated marker Bcl6 and upregulation of genes marking early plasmacytoid differentiation, notably IRF4 and BLIMP1. Importantly, these same changes can be induced in EBV-negative or latency I BL cells by infection with an EBNA2-knockout virus. Thus, we infer that the distinct gene profile of Wp-restricted BLs does not reflect differences in the identity of the tumor progenitor cell per se but differences imposed on a common progenitor by broadened EBV gene expression.
Journal of virology 2013;87;5;2882-94
PUBMED: 23269792; PMC: 3571367; DOI: 10.1128/JVI.03003-12
-
A systematic genome-wide analysis of zebrafish protein-coding gene function.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Since the publication of the human reference genome, the identities of specific genes associated with human diseases are being discovered at a rapid rate. A central problem is that the biological activity of these genes is often unclear. Detailed investigations in model vertebrate organisms, typically mice, have been essential for understanding the activities of many orthologues of these disease-associated genes. Although gene-targeting approaches and phenotype analysis have led to a detailed understanding of nearly 6,000 protein-coding genes, this number falls considerably short of the more than 22,000 mouse protein-coding genes. Similarly, in zebrafish genetics, one-by-one gene studies using positional cloning, insertional mutagenesis, antisense morpholino oligonucleotides, targeted re-sequencing, and zinc finger and TAL endonucleases have made substantial contributions to our understanding of the biological activity of vertebrate genes, but again the number of genes studied falls well short of the more than 26,000 zebrafish protein-coding genes. Importantly, for both mice and zebrafish, none of these strategies are particularly suited to the rapid generation of knockouts in thousands of genes and the assessment of their biological activity. Here we describe an active project that aims to identify and phenotype the disruptive mutations in every zebrafish protein-coding gene, using a well-annotated zebrafish reference genome sequence, high-throughput sequencing and efficient chemical mutagenesis. So far we have identified potentially disruptive mutations in more than 38% of all known zebrafish protein-coding genes. We have developed a multi-allelic phenotyping scheme to efficiently assess the effects of each allele during embryogenesis and have analysed the phenotypic consequences of over 1,000 alleles. All mutant alleles and data are available to the community and our phenotyping scheme is adaptable to phenotypic analysis beyond embryogenesis.
Funded by: Medical Research Council: G0777791; NHGRI NIH HHS: 5R01HG00481; Wellcome Trust: 098051
Nature 2013;496;7446;494-7
PUBMED: 23594742; PMC: 3743023; DOI: 10.1038/nature11992
-
A genome-wide analysis of populations from European Russia reveals a new pole of genetic diversity in northern Europe.
Department of Molecular Bases of Human Genetics, Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia.
Several studies have examined the fine-scale structure of human genetic variation in Europe. However, the European sets analyzed represent mainly northern, western, central, and southern Europe. Here, we report an analysis of approximately 166,000 single nucleotide polymorphisms in populations from eastern (northeastern) Europe: four Russian populations from European Russia, and three populations from the northernmost Finno-Ugric ethnicities (Veps and two contrast groups of Komi people). These were compared with several reference European samples, including Finns, Estonians, Latvians, Poles, Czechs, Germans, and Italians. The results obtained demonstrated genetic heterogeneity of populations living in the region studied. Russians from the central part of European Russia (Tver, Murom, and Kursk) exhibited similarities with populations from central-eastern Europe, and were distant from the Russian sample from northern Russia (Mezen district, Archangelsk region). Komi samples, especially Izhemski Komi, were significantly different from all other populations studied. These can be considered as a second pole of genetic diversity in northern Europe (in addition to the pole occupied by Finns), as they had a distinct ancestry component. Russians from Mezen and the Finnic-speaking Veps were positioned between the two poles, but differed from each other in the proportions of Komi and Finnic ancestries. In general, our data provide a more complete genetic map of Europe, accounting for the diversity in its most eastern (northeastern) populations.
PloS one 2013;8;3;e58552
PUBMED: 23505534; PMC: 3591355; DOI: 10.1371/journal.pone.0058552
-
Genome and Transcriptome Adaptation Accompanying Emergence of the Definitive Type 2 Host-Restricted Salmonella enterica Serovar Typhimurium Pathovar.
The Wellcome Trust Sanger Institute, the Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.
ABSTRACT Salmonella enterica serovar Typhimurium definitive type 2 (DT2) is host restricted to Columba livia (rock or feral pigeon) but is also closely related to S. Typhimurium isolates that circulate in livestock and cause a zoonosis characterized by gastroenteritis in humans. DT2 isolates formed a distinct phylogenetic cluster within S. Typhimurium based on whole-genome-sequence polymorphisms. Comparative genome analysis of DT2 94-213 and S. Typhimurium SL1344, DT104, and D23580 identified few differences in gene content with the exception of variations within prophages. However, DT2 94-213 harbored 22 pseudogenes that were intact in other closely related S. Typhimurium strains. We report a novel in silico approach to identify single amino acid substitutions in proteins that have a high probability of a functional impact. One polymorphism identified using this method, a single-residue deletion in the Tar protein, abrogated chemotaxis to aspartate in vitro. DT2 94-213 also exhibited an altered transcriptional profile in response to culture at 42°C compared to that of SL1344. Such differentially regulated genes included a number involved in flagellum biosynthesis and motility. IMPORTANCE Whereas Salmonella enterica serovar Typhimurium can infect a wide range of animal species, some variants within this serovar exhibit a more limited host range and altered disease potential. Phylogenetic analysis based on whole-genome sequences can identify lineages associated with specific virulence traits, including host adaptation. This study represents one of the first to link pathogen-specific genetic signatures, including coding capacity, genome degradation, and transcriptional responses to host adaptation within a Salmonella serovar. We performed comparative genome analysis of reference and pigeon-adapted definitive type 2 (DT2) S. Typhimurium isolates alongside phenotypic and transcriptome analyses, to identify genetic signatures linked to host adaptation within the DT2 lineage.
mBio 2013;4;5
PUBMED: 23982073; DOI: 10.1128/mBio.00565-13
-
Analysis of Tumor Heterogeneity and Cancer Gene Networks Using Deep Sequencing of MMTV-Induced Mouse Mammary Tumors.
Division of Molecular Pathology, The Netherlands Cancer Institute, Amsterdam, The Netherlands; Division of Molecular Carcinogenesis, The Netherlands Cancer Institute, Amsterdam, The Netherlands.
Cancer develops through a multistep process in which normal cells progress to malignant tumors via the evolution of their genomes as a result of the acquisition of mutations in cancer driver genes. The number, identity and mode of action of cancer driver genes, and how they contribute to tumor evolution, are largely unknown. This study deployed the Mouse Mammary Tumor Virus (MMTV) as an insertional mutagen to find both the driver genes and the networks in which they function. Using deep insertion site sequencing we identified around 31000 retroviral integration sites in 604 MMTV-induced mammary tumors from mice with mammary gland-specific deletion of Trp53, Pten heterozygous knockout mice, or wildtype strains. We identified 18 known common integration sites (CISs) and 12 previously unknown CISs marking new candidate cancer genes. Members of the Wnt, Fgf, Fgfr, Rspo and Pdgfr gene families were commonly mutated in a mutually exclusive fashion. The sequence data we generated also yielded information on the clonality of insertions in individual tumors, allowing us to develop a data-driven model of MMTV-induced tumor development. Insertional mutations near Wnt and Fgf genes mark the earliest "initiating" events in MMTV-induced tumorigenesis, whereas Fgfr genes are targeted later during tumor progression. Our data show that insertional mutagenesis can be used to discover the mutational networks, the timing of mutations, and the genes that initiate and drive tumor evolution.
PloS one 2013;8;5;e62113
PUBMED: 23690930; DOI: 10.1371/journal.pone.0062113
-
Current application and future perspectives of molecular typing methods to study Clostridium difficile infections.
Section Experimental Microbiology, Department of Medical Microbiology, Center of Infectious Diseases, Leiden University Medical Center, Leiden, Netherlands.
Euro surveillance : bulletin européen sur les maladies transmissibles = European communicable disease bulletin 2013;18;4
PUBMED: 23369393
-
Tracking chromosome evolution in southern African gerbils using flow-sorted chromosome paints.
Evolutionary Genomics Group, Department of Botany and Zoology, University of Stellenbosch, Stellenbosch, South Africa.
Desmodillus and Gerbilliscus (formerly Tatera) comprise a monophyletic group of gerbils (subfamily Gerbillinae) which last shared an ancestor approximately 8 million years ago; diploid chromosome number variation among the species ranges from 2n = 36 to 2n = 50. In an attempt to shed more light on chromosome evolution and speciation in these rodents, we compared the karyotypes of 7 species, representing 3 genera, based on homology data revealed by chromosome painting with probes derived from flow-sorted chromosomes of the hairy-footed gerbil, Gerbillurus paeba (2n = 36). The fluorescent in situ hybridization data revealed remarkable genome conservation: these species share a high proportion of conserved chromosomes, and differences are due to 10 Robertsonian (Rb) rearrangements (3 autapomorphies, 3 synapomorphies and 4 hemiplasies/homoplasies). Our data suggest that chromosome evolution in Desmodillus occurred at a rate of ∼1.25 rearrangements per million years (Myr), and that the rate among Gerbilliscus over a time period spanning 8 Myr is also ∼1.25 rearrangements/Myr. The recently diverged Gerbillurus (G. tytonis and G. paeba) share an identical karyotype, while Gerbilliscus kempi, G. afra and G. leucogaster differ by 6 Rb rearrangements (a rate of ∼1 rearrangement/Myr). Thus, our data suggest a very slow rate of chromosomal evolution in Southern African gerbils.
Cytogenetic and genome research 2013;139;4;267-75
PUBMED: 23652816; DOI: 10.1159/000350696
-
A P3G generic access agreement for population genomic studies.
Centre of Genomics and Policy, McGill University, Montreal, Quebec, Canada.
Nature biotechnology 2013;31;5;384-5
PUBMED: 23657386; DOI: 10.1038/nbt.2567
-
Host responses to melioidosis and tuberculosis are both dominated by interferon-mediated signaling.
Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom; Department of Medicine, University of Cambridge, Cambridge, United Kingdom; Mahidol-Oxford Tropical Medicine Research Unit, Mahidol University, Bangkok, Thailand; Department of Infection and Tropical Diseases, Birmingham Heartlands Hospital, Birmingham, United Kingdom.
Melioidosis (Burkholderia pseudomallei infection) is a common cause of community-acquired sepsis in Northeast Thailand and northern Australia. B. pseudomallei is a soil saprophyte endemic to Southeast Asia and northern Australia. The clinical presentation of melioidosis may mimic tuberculosis (both cause chronic suppurative lesions unresponsive to conventional antibiotics and both commonly affect the lungs). The two diseases have overlapping risk profiles (e.g., diabetes, corticosteroid use), and both B. pseudomallei and Mycobacterium tuberculosis are intracellular pathogens. There are however important differences: the majority of melioidosis cases are acute, not chronic, and present with severe sepsis and a mortality rate that approaches 50% despite appropriate antimicrobial therapy. By contrast, tuberculosis is characteristically a chronic illness with mortality <2% with appropriate antimicrobial chemotherapy. We examined the gene expression profiles of total peripheral leukocytes in two cohorts of patients, one with acute melioidosis (30 patients and 30 controls) and another with tuberculosis (20 patients and 24 controls). Interferon-mediated responses dominate the host response to both infections, and both type 1 and type 2 interferon responses are important. An 86-gene signature previously thought to be specific for tuberculosis is also found in melioidosis. We conclude that the host responses to melioidosis and to tuberculosis are similar: both are dominated by interferon-signalling pathways and this similarity means gene expression signatures from whole blood do not distinguish between these two diseases.
PloS one 2013;8;1;e54961
PUBMED: 23383015; DOI: 10.1371/journal.pone.0054961
-
Chromatin Accessibility Data Sets Show Bias Due to Sequence Specificity of the DNase I Enzyme.
Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom.
Background: DNase I is an enzyme which cuts duplex DNA at a rate that depends strongly upon its chromatin environment. In combination with high-throughput sequencing (HTS) technology, it can be used to infer genome-wide landscapes of open chromatin regions. Using this technology, systematic identification of hundreds of thousands of DNase I hypersensitive sites (DHS) per cell type has been possible, and this in turn has helped to precisely delineate genomic regulatory compartments. However, to date there has been relatively little investigation into possible biases affecting this data. Results: We report a significant degree of sequence preference spanning sites cut by DNase I in a number of published data sets. The two major protocols in current use each show a different pattern, but for a given protocol the pattern of sequence specificity seems to be quite consistent. The patterns are substantially different from biases seen in other types of HTS data sets, and in some cases the most constrained position lies outside the sequenced fragment, implying that this constraint must relate to the digestion process rather than events occurring during library preparation or sequencing. Conclusions: DNase I is a sequence-specific enzyme, with a specificity that may depend on experimental conditions. This sequence specificity is not taken into account by existing pipelines for identifying open chromatin regions. Care must be taken when interpreting DNase I results, especially when looking at the precise locations of the reads. Future studies may be able to improve the sensitivity and precision of chromatin state measurement by compensating for sequence bias.
PloS one 2013;8;7;e69853
PUBMED: 23922824; DOI: 10.1371/journal.pone.0069853
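The kind of sequence preference described in the abstract above is typically visualised by tabulating base frequencies at each offset around the cut sites. The sketch below is not the authors' pipeline: the function name `position_base_frequencies` is hypothetical, and the input windows are made-up toy sequences standing in for genomic windows extracted around mapped DNase I cut positions.

```python
from collections import Counter


def position_base_frequencies(windows):
    """Return per-position base frequencies across equal-length sequence
    windows, each assumed to be centred on one cut site."""
    if not windows:
        return []
    width = len(windows[0])
    if any(len(w) != width for w in windows):
        raise ValueError("all windows must have the same length")
    freqs = []
    for i in range(width):
        # Count the base observed at offset i in every window.
        counts = Counter(w[i].upper() for w in windows)
        total = sum(counts.values())
        freqs.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return freqs


# Toy windows (invented for illustration, not real DNase I data).
toy_windows = ["ACTTAG", "GCTTAC", "TCTAGG", "ACTTCA"]
freqs = position_base_frequencies(toy_windows)
for offset, freq in enumerate(freqs):
    print(offset, freq)
```

In unbiased data each offset would approach the background base composition; offsets where one base dominates, as at offsets 1 and 2 in the toy input, are the sort of constrained positions the abstract refers to.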
-
Criteria for inference of chromothripsis in cancer genomes.
Genome Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany. Electronic address: jan.korbel@embl.de.
Chromothripsis scars the genome when localized chromosome shattering and repair occurs in a one-off catastrophe. Outcomes of this process are detectable as massive DNA rearrangements affecting one or a few chromosomes. Although recent findings suggest a crucial role of chromothripsis in cancer development, the reproducible inference of this process remains challenging, requiring that cataclysmic one-off rearrangements be distinguished from localized lesions that occur progressively. We describe conceptual criteria for the inference of chromothripsis, based on ruling out the alternative hypothesis that stepwise rearrangements occurred. Robust means of inference may facilitate in-depth studies on the impact of, and the mechanisms underlying, chromothripsis.
Cell 2013;152;6;1226-36
PUBMED: 23498933; DOI: 10.1016/j.cell.2013.02.023
-
The genome and transcriptome of Haemonchus contortus, a key model parasite for drug and vaccine discovery.
Background: The small ruminant parasite Haemonchus contortus is the most widely used parasitic nematode in drug discovery, vaccine development and anthelmintic resistance research. Its remarkable propensity to develop resistance threatens the viability of the sheep industry in many regions of the world and provides a cautionary example of the effect of mass drug administration to control parasitic nematodes. Its phylogenetic position makes it particularly well placed for comparison with the free-living nematode Caenorhabditis elegans and the most economically important parasites of livestock and humans.
Results: Here we report the detailed analysis of a draft genome assembly and extensive transcriptomic dataset for H. contortus. This represents the first genome to be published for a strongylid nematode and the most extensive transcriptomic dataset for any parasitic nematode reported to date. We show a general pattern of conservation of genome structure and gene content between H. contortus and C. elegans, but also a dramatic expansion of important parasite gene families. We identify genes involved in parasite-specific pathways such as blood feeding, neurological function, and drug metabolism. In particular, we describe complete gene repertoires for known drug target families, providing the most comprehensive understanding yet of the action of several important anthelmintics. Also, we identify a set of genes enriched in the parasitic stages of the lifecycle and the parasite gut that provide a rich source of vaccine and drug target candidates.
Conclusions: The H. contortus genome and transcriptome provide an essential platform for postgenomic research in this and other important strongylid parasites.
Genome biology 2013;14;8;R88
PUBMED: 23985316; DOI: 10.1186/gb-2013-14-8-r88
-
Cerebral organoids model human brain development and microcephaly.
Institute of Molecular Biotechnology of the Austrian Academy of Science (IMBA), Vienna 1030, Austria.
The complexity of the human brain has made it difficult to study many brain disorders in model organisms, highlighting the need for an in vitro model of human brain development. Here we have developed a human pluripotent stem cell-derived three-dimensional organoid culture system, termed cerebral organoids, that develop various discrete, although interdependent, brain regions. These include a cerebral cortex containing progenitor populations that organize and produce mature cortical neuron subtypes. Furthermore, cerebral organoids are shown to recapitulate features of human cortical development, namely characteristic progenitor zone organization with abundant outer radial glial stem cells. Finally, we use RNA interference and patient-specific induced pluripotent stem cells to model microcephaly, a disorder that has been difficult to recapitulate in mice. We demonstrate premature neuronal differentiation in patient organoids, a defect that could help to explain the disease phenotype. Together, these data show that three-dimensional organoids can recapitulate development and disease even in this most complex human tissue.
Nature 2013
PUBMED: 23995685; DOI: 10.1038/nature12517
-
Expression of recombinant ITGA2 and CD109 for the detection of human platelet antigen (HPA)-5 and -15 alloantibodies.
Cell Surface Signalling Laboratory, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.
British journal of haematology 2013
PUBMED: 23406260; DOI: 10.1111/bjh.12252
-
Intestinal colonization resistance.
Bacterial Pathogenesis Laboratory, Wellcome Trust Sanger Institute, Hinxton, UK. tl2@sanger.ac.uk
Dense, complex microbial communities, collectively termed the microbiota, occupy a diverse array of niches along the length of the mammalian intestinal tract. During health and in the absence of antibiotic exposure the microbiota can effectively inhibit colonization and overgrowth by invading microbes such as pathogens. This phenomenon is called 'colonization resistance' and is associated with a stable and diverse microbiota in tandem with a controlled lack of inflammation, and involves specific interactions between the mucosal immune system and the microbiota. Here we overview the microbial ecology of the healthy mammalian intestinal tract and highlight the microbe-microbe and microbe-host interactions that promote colonization resistance. Emerging themes highlight immunological (T helper type 17/regulatory T-cell balance), microbiota (diverse and abundant) and metabolic (short-chain fatty acid) signatures of intestinal health and colonization resistance. Intestinal pathogens use specific virulence factors or exploit antibiotic use to subvert colonization resistance for their own benefit by triggering inflammation to disrupt the harmony of the intestinal ecosystem. A holistic view that incorporates immunological and microbiological facets of the intestinal ecosystem should facilitate the development of immunomodulatory and microbe-modulatory therapies that promote intestinal homeostasis and colonization resistance.
Funded by: Medical Research Council: 93614; Wellcome Trust: 076964, 098051
Immunology 2013;138;1;1-11
PUBMED: 23240815; PMC: 3533696; DOI: 10.1111/j.1365-2567.2012.03616.x
-
Richness of human gut microbiome correlates with metabolic markers.
INRA, Institut National de la Recherche Agronomique, US1367 Metagenopolis, 78350 Jouy en Josas, France.
We are facing a global metabolic health crisis provoked by an obesity epidemic. Here we report the human gut microbial composition in a population sample of 123 non-obese and 169 obese Danish individuals. We find two groups of individuals that differ by the number of gut microbial genes and thus gut bacterial richness. They contain known and previously unknown bacterial species at different proportions; individuals with a low bacterial richness (23% of the population) are characterized by more marked overall adiposity, insulin resistance and dyslipidaemia and a more pronounced inflammatory phenotype when compared with high bacterial richness individuals. The obese individuals among the lower bacterial richness group also gain more weight over time. Only a few bacterial species are sufficient to distinguish between individuals with high and low bacterial richness, and even between lean and obese participants. Our classifications based on variation in the gut microbiome identify subsets of individuals in the general white adult population who may be at increased risk of progressing to adiposity-associated co-morbidities.
Nature 2013;500;7464;541-6
PUBMED: 23985870; DOI: 10.1038/nature12506
-
Human SNP Links Differential Outcomes in Inflammatory and Infectious Disease to a FOXO3-Regulated Pathway.
Cambridge Institute for Medical Research, University of Cambridge, Cambridge Biomedical Campus, Cambridge CB2 0XY, UK; Department of Medicine, University of Cambridge School of Clinical Medicine, Addenbrooke's Hospital, Cambridge CB2 0QQ, UK.
The clinical course and eventual outcome, or prognosis, of complex diseases varies enormously between affected individuals. This variability critically determines the impact a disease has on a patient's life but is very poorly understood. Here, we exploit existing genome-wide association study data to gain insight into the role of genetics in prognosis. We identify a noncoding polymorphism in FOXO3A (rs12212067: T > G) at which the minor (G) allele, despite not being associated with disease susceptibility, is associated with a milder course of Crohn's disease and rheumatoid arthritis and with increased risk of severe malaria. Minor allele carriage is shown to limit inflammatory responses in monocytes via a FOXO3-driven pathway, which through TGFβ1 reduces production of proinflammatory cytokines, including TNFα, and increases production of anti-inflammatory cytokines, including IL-10. Thus, we uncover a shared genetic contribution to prognosis in distinct diseases that operates via a FOXO3-driven pathway modulating inflammatory responses.
Cell 2013
PUBMED: 24035192; DOI: 10.1016/j.cell.2013.08.034
-
Common variants in the HLA-DRB1-HLA-DQA1 HLA class II region are associated with susceptibility to visceral leishmaniasis.
[1] Cambridge Institute for Medical Research, University of Cambridge School of Clinical Medicine, Addenbrooke's Hospital, Cambridge, UK. [2].
To identify susceptibility loci for visceral leishmaniasis, we undertook genome-wide association studies in two populations: 989 cases and 1,089 controls from India and 357 cases in 308 Brazilian families (1,970 individuals). The HLA-DRB1-HLA-DQA1 locus was the only region to show strong evidence of association in both populations. Replication at this region was undertaken in a second Indian population comprising 941 cases and 990 controls, and combined analysis across the three cohorts for rs9271858 at this locus showed P(combined) = 2.76 × 10(-17) and odds ratio (OR) = 1.41, 95% confidence interval (CI) = 1.30-1.52. A conditional analysis provided evidence for multiple associations within the HLA-DRB1-HLA-DQA1 region, and a model in which risk differed between three groups of haplotypes better explained the signal and was significant in the Indian discovery and replication cohorts. In conclusion, the HLA-DRB1-HLA-DQA1 HLA class II region contributes to visceral leishmaniasis susceptibility in India and Brazil, suggesting shared genetic risk factors for visceral leishmaniasis that cross the epidemiological divides of geography and parasite species.
Nature genetics 2013;45;2;208-13
PUBMED: 23291585; DOI: 10.1038/ng.2518
-
The piggyBac transposon displays local and distant reintegration preferences and can cause mutations at non-canonical integration sites.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK, CB10 1SA.
The DNA transposon piggyBac is widely used as a tool in mammalian experimental systems for transgenesis, mutagenesis and genome engineering. We have characterised genome-wide insertion site preferences of piggyBac by sequencing a large set of integration sites arising from transposition from two separate genomic loci and a plasmid donor in mouse embryonic stem cells. We found that piggyBac preferentially integrates locally to the excision site when mobilised from a chromosomal location, and identified other non-local regions of the genome with elevated insertion frequencies. PiggyBac insertions were associated with expressed genes and markers of open chromatin structure, and were excluded from heterochromatin. At the nucleotide level, piggyBac prefers to insert into TA-rich regions within a broader GC-rich context. We also found that piggyBac can insert into sites other than its known TTAA insertion site at low frequency (2%). Such insertions introduce mismatches that are repaired with signatures of host cell mismatch repair pathways. Transposons could be mobilised from plasmids with the observed mismatches, indicating that piggyBac could generate point mutations in the genome.
Molecular and cellular biology 2013
PUBMED: 23358416; DOI: 10.1128/MCB.00670-12
-
Non dominant-negative KCNJ2 gene mutations leading to Andersen-Tawil syndrome with an isolated cardiac phenotype.
Institut für Physiologie und Pathophysiologie, Vegetative Physiologie, Philipps-University Marburg, Deutschhausstraße 1-2, 35037, Marburg, Germany.
Andersen-Tawil syndrome (ATS) is characterized by dysmorphic features, periodic paralyses and abnormal ventricular repolarization. After genotyping a large set of patients with congenital long-QT syndrome, we identified two novel, heterozygous KCNJ2 mutations (p.N318S, p.W322C) located in the C-terminus of the Kir2.1 subunit. These mutations have a different localization than classical ATS mutations which are mostly located at a potential interaction face with the slide helix or at the interface between the C-termini. Mutation carriers were without the key features of ATS, causing an isolated cardiac phenotype. While the N318S mutants regularly reached the plasma membrane, W322C mutants primarily resided in late endosomes. Co-expression of N318S or W322C with wild-type Kir2.1 reduced current amplitudes only by 20-25 %. This mild loss-of-function for the heteromeric channels resulted from defective channel trafficking (W322C) or gating (N318S). Strikingly, and in contrast to the majority of ATS mutations, neither mutant caused a dominant-negative suppression of wild-type Kir2.1, Kir2.2 and Kir2.3 currents. Thus, a mild reduction of native Kir2.x currents by non dominant-negative mutants may cause ATS with an isolated cardiac phenotype.
Basic research in cardiology 2013;108;3;353
PUBMED: 23644778; DOI: 10.1007/s00395-013-0353-1
-
The future role of genetic screening to detect newborns at risk of childhood-onset hearing loss.
National Institute for Health Research (NIHR) Horizon Scanning Centre, School of Health and Population Sciences, University of Birmingham, Birmingham, UK.
Objective: To explore the future potential of genetic screening to detect newborns at risk of childhood-onset hearing loss. Design: An expert-led discussion of current and future developments in genetic technology and the knowledge base of genetic hearing loss to determine the viability of genetic screening and the implications for screening policy. Results and Discussion: Despite increasing pressure to adopt genetic technologies, a major barrier for genetic screening in hearing loss is the uncertain clinical significance of the identified mutations and their interactions. Only when a reliable estimate of the future risk of hearing loss can be made at a reasonable cost will genetic screening become viable. Given the speed of technological advancement this may be within the next 10 years. Decision-makers should start to consider how genetic screening could augment current screening programmes as well as the associated data processing and storage requirements. Conclusion: In the interim, we suggest that decision-makers consider the benefits of (1) genetically testing all newborns and children with hearing loss, to determine aetiology and to increase knowledge of the genetic causes of hearing loss, and (2) screening pregnant women for the m.1555A>G mutation to reduce the risk of aminoglycoside antibiotic-associated hearing loss.
International journal of audiology 2013;52;2;124-33
PUBMED: 23131088; PMC: 3545543; DOI: 10.3109/14992027.2012.733424
-
Dense genotyping of immune-related disease regions identifies nine new risk loci for primary sclerosing cholangitis.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
Primary sclerosing cholangitis (PSC) is a severe liver disease of unknown etiology leading to fibrotic destruction of the bile ducts and ultimately to the need for liver transplantation. We compared 3,789 PSC cases of European ancestry to 25,079 population controls across 130,422 SNPs genotyped using the Immunochip. We identified 12 genome-wide significant associations outside the human leukocyte antigen (HLA) complex, 9 of which were new, increasing the number of known PSC risk loci to 16. Despite comorbidity with inflammatory bowel disease (IBD) in 72% of the cases, 6 of the 12 loci showed significantly stronger association with PSC than with IBD, suggesting overlapping yet distinct genetic architectures for these two diseases. We incorporated association statistics from 7 diseases clinically occurring with PSC in the analysis and found suggestive evidence for 33 additional pleiotropic PSC risk loci. Together with network analyses, these findings add to the genetic risk map of PSC and expand on the relationship between PSC and other immune-mediated diseases.
Nature genetics 2013;45;6;670-5
PUBMED: 23603763; PMC: 3667736; DOI: 10.1038/ng.2616
-
Fine Mapping of the Pond Snail Left-Right Asymmetry (Chirality) Locus Using RAD-Seq and Fibre-FISH.
School of Biology, University of Nottingham, University Park, Nottingham, United Kingdom ; Department of Plant Sciences, University of Cambridge, Cambridge, United Kingdom.
The left-right asymmetry of snails, including the direction of shell coiling, is determined by the delayed effect of a maternal gene on the chiral twist that takes place during early embryonic cell divisions. Yet, despite being a well-established classical problem, the identity of the gene and the means by which left-right asymmetry is established in snails remain unknown. We here demonstrate the power of new genomic approaches for identification of the chirality gene, "D". First, heterozygous (Dd) pond snails Lymnaea stagnalis were self-fertilised or backcrossed, and the genotype of more than six thousand offspring inferred, either dextral (DD/Dd) or sinistral (dd). Then, twenty of the offspring were used for Restriction-site-Associated DNA Sequencing (RAD-Seq) to identify anonymous molecular markers that are linked to the chirality locus. A local genetic map was constructed by genotyping three flanking markers in over three thousand snails. The three markers lie either side of the chirality locus, with one very tightly linked (<0.1 cM). Finally, bacterial artificial chromosomes (BACs) were isolated that contained the three loci. Fluorescent in situ hybridization (FISH) of pachytene cells showed that the three BACs tightly cluster on the same bivalent chromosome. Fibre-FISH identified a region of greater than ∼0.4 Mb between two BAC clone markers that must contain D. This work therefore establishes the resources for molecular identification of the chirality gene and the variation that underpins sinistral and dextral coiling. More generally, the results also show that combining genomic technologies, such as RAD-Seq and high resolution FISH, is a robust approach for mapping key loci in non-model systems.
PloS one 2013;8;8;e71067
PUBMED: 23951082; PMC: 3741322; DOI: 10.1371/journal.pone.0071067
-
Detecting and Characterizing Genomic Signatures of Positive Selection in Global Populations.
NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore 117456, Singapore; Saw Swee Hock School of Public Health, National University of Singapore, Singapore 117597, Singapore.
Natural selection is a significant force that shapes the architecture of the human genome and introduces diversity across global populations. The question of whether advantageous mutations have arisen in the human genome as a result of single or multiple mutation events remains unanswered except for the fact that there exist a handful of genes such as those that confer lactase persistence, affect skin pigmentation, or cause sickle cell anemia. We have developed a long-range-haplotype method for identifying genomic signatures of positive selection to complement existing methods, such as the integrated haplotype score (iHS) or cross-population extended haplotype homozygosity (XP-EHH), for locating signals across the entire allele frequency spectrum. Our method also locates the founder haplotypes that carry the advantageous variants and infers their corresponding population frequencies. This presents an opportunity to systematically interrogate the whole human genome to determine whether a selection signal shared across different populations is the consequence of a single mutation process followed subsequently by gene flow between populations or of convergent evolution due to the occurrence of multiple independent mutation events either at the same variant or within the same gene. The application of our method to data from 14 populations across the world revealed that positive-selection events tend to cluster in populations of the same ancestry. Comparing the founder haplotypes for events that are present across different populations revealed that convergent evolution is a rare occurrence and that the majority of shared signals stem from the same evolutionary event.
American journal of human genetics 2013
PUBMED: 23731540; DOI: 10.1016/j.ajhg.2013.04.021
-
Epigenetic conservation at gene regulatory elements revealed by non-methylated DNA profiling in seven vertebrates.
Department of Biochemistry, University of Oxford, Oxford, United Kingdom; Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, United Kingdom.
Two-thirds of gene promoters in mammals are associated with regions of non-methylated DNA, called CpG islands (CGIs), which counteract the repressive effects of DNA methylation on chromatin. In cold-blooded vertebrates, computational CGI predictions often reside away from gene promoters, suggesting a major divergence in gene promoter architecture across vertebrates. By experimentally identifying non-methylated DNA in the genomes of seven diverse vertebrates, we instead reveal that non-methylated islands (NMIs) of DNA are a central feature of vertebrate gene promoters. Furthermore, NMIs are present at orthologous genes across vast evolutionary distances, revealing a surprising level of conservation in this epigenetic feature. By profiling NMIs in different tissues and developmental stages we uncover a unifying set of features that are central to the function of NMIs in vertebrates. Together these findings demonstrate an ancient logic for NMI usage at gene promoters and reveal an unprecedented level of epigenetic conservation across vertebrate evolution.
eLife 2013;2;e00348
PUBMED: 23467541; PMC: 3583005; DOI: 10.7554/eLife.00348
-
Human Spermatogenic Failure Purges Deleterious Mutation Load from the Autosomes and Both Sex Chromosomes, including the Gene DMRT1.
Institute of Molecular Pathology and Immunology of the University of Porto (IPATIMUP), Porto, Portugal.
Gonadal failure, along with early pregnancy loss and perinatal death, may be an important filter that limits the propagation of harmful mutations in the human population. We hypothesized that men with spermatogenic impairment, a disease with unknown genetic architecture and a common cause of male infertility, are enriched for rare deleterious mutations compared to men with normal spermatogenesis. After assaying genomewide SNPs and CNVs in 323 Caucasian men with idiopathic spermatogenic impairment and more than 1,100 controls, we estimate that each rare autosomal deletion detected in our study multiplicatively changes a man's risk of disease by 10% (OR 1.10 [1.04-1.16], p<2×10(-3)), rare X-linked CNVs by 29% (OR 1.29 [1.11-1.50], p<1×10(-3)), and rare Y-linked duplications by 88% (OR 1.88 [1.13-3.13], p<0.03). By contrasting the properties of our case-specific CNVs with those of CNV callsets from cases of autism, schizophrenia, bipolar disorder, and intellectual disability, we propose that the CNV burden in spermatogenic impairment is distinct from the burden of large, dominant mutations described for neurodevelopmental disorders. We identified two patients with deletions of DMRT1, a gene on chromosome 9p24.3 orthologous to the putative sex determination locus of the avian ZW chromosome system. In an independent sample of Han Chinese men, we identified 3 more DMRT1 deletions in 979 cases of idiopathic azoospermia and none in 1,734 controls, and found none in an additional 4,519 controls from public databases. The combined results indicate that DMRT1 loss-of-function mutations are a risk factor and potential genetic cause of human spermatogenic failure (frequency of 0.38% in 1,306 cases and 0% in 7,754 controls, p = 6.2×10(-5)). Our study identifies other recurrent CNVs as potential causes of idiopathic azoospermia and generates hypotheses for directing future studies on the genetic basis of male infertility and IVF outcomes.
PLoS genetics 2013;9;3;e1003349
PUBMED: 23555275; PMC: 3605256; DOI: 10.1371/journal.pgen.1003349
-
Genome-wide association study on detailed profiles of smoking behavior and nicotine dependence in a twin sample.
Department of Public Health, Hjelt Institute, University of Helsinki, Helsinki, Finland.
Smoking is a major risk factor for several somatic diseases and is also emerging as a causal factor for neuropsychiatric disorders. Genome-wide association (GWA) and candidate gene studies for smoking behavior and nicotine dependence (ND) have disclosed too few predisposing variants to account for the high estimated heritability. Previous large-scale GWA studies have had very limited phenotypic definitions of relevance to smoking-related behavior, which has likely impeded the discovery of genetic effects. We performed GWA analyses on 1114 adult twins ascertained for ever smoking from the population-based Finnish Twin Cohort study. The availability of 17 smoking-related phenotypes allowed us to comprehensively portray the dimensions of smoking behavior, clustered into the domains of smoking initiation, amount smoked and ND. Our results highlight a locus on 16p12.3, with several single-nucleotide polymorphisms (SNPs) in the vicinity of CLEC19A showing association (P<1 × 10(-6)) with smoking quantity. Interestingly, CLEC19A is located close to a previously reported attention-deficit hyperactivity disorder (ADHD) linkage locus and an evident link between ADHD and smoking has been established. Intriguing preliminary association (P<1 × 10(-5)) was detected between DSM-IV (Diagnostic and Statistical Manual of Mental Disorders, 4th edition) ND diagnosis and several SNPs in ERBB4, coding for a Neuregulin receptor, on 2q33. The association between ERBB4 and DSM-IV ND diagnosis was replicated in an independent Australian sample. Recently, a significant increase in ErbB4 and Neuregulin 3 (Nrg3) expression was revealed following chronic nicotine exposure and withdrawal in mice and an association between NRG3 SNPs and smoking cessation success was detected in a clinical trial. ERBB4 has previously been associated with schizophrenia; further, it is located within an established schizophrenia linkage locus and within a linkage locus for a smoker phenotype identified in this sample. In conclusion, we disclose novel tentative evidence for the involvement of ERBB4 in ND, suggesting the involvement of the Neuregulin/ErbB signalling pathway in addictions and providing a plausible link between the high co-morbidity of schizophrenia and ND. Molecular Psychiatry advance online publication, 11 June 2013; doi:10.1038/mp.2013.72.
Molecular psychiatry 2013
PUBMED: 23752247; DOI: 10.1038/mp.2013.72
-
Generation of a Tn5 transposon library in Haemophilus parasuis and analysis by transposon-directed insertion-site sequencing (TraDIS).
Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK. Electronic address: sl470@cam.ac.uk.
Haemophilus parasuis is an important respiratory tract pathogen of swine and the etiological agent of Glässer's disease. The molecular pathogenesis of H. parasuis is not well studied, mainly due to the lack of efficient tools for genetic manipulation of this bacterium. In this study we describe a Tn5-based random mutagenesis method for use in H. parasuis. A novel chloramphenicol-resistant Tn5 transposome was electroporated into the virulent H. parasuis serovar 5 strain 29755. High transposition efficiency of Tn5, up to 10(4) transformants/μg of transposon DNA, was obtained by modification of the Tn5 DNA in the H. parasuis strain HS071 and establishment of optimal electrotransformation conditions, and a library of approximately 10,500 mutants was constructed. Analysis of the library using transposon-directed insertion-site sequencing (TraDIS) revealed that the insertion of Tn5 was evenly distributed throughout the genome. 10,001 individual mutants were identified, with 1561 genes being disrupted (69.4% of the genome). This newly-developed, efficient mutagenesis approach will be a powerful tool for genetic manipulation of H. parasuis in order to study its physiology and pathogenesis.
Veterinary microbiology 2013
PUBMED: 23928120; DOI: 10.1016/j.vetmic.2013.07.008
-
Genome-wide association study in a Chinese population identifies a susceptibility locus for type 2 diabetes at 7q32 near PAX4.
Department of Medicine and Therapeutics, Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, New Territories, Hong Kong, SAR, People's Republic of China, rcwma@cuhk.edu.hk.
AIMS/HYPOTHESIS: Most genetic variants identified for type 2 diabetes have been discovered in European populations. We performed genome-wide association studies (GWAS) in a Chinese population with the aim of identifying novel variants for type 2 diabetes in Asians. METHODS: We performed a meta-analysis of three GWAS comprising 684 patients with type 2 diabetes and 955 controls of Southern Han Chinese descent. We followed up the top signals in two independent Southern Han Chinese cohorts (totalling 10,383 cases and 6,974 controls), and performed in silico replication in multiple populations. RESULTS: We identified CDKN2A/B and four novel type 2 diabetes association signals with p < 1 × 10(-5) from the meta-analysis. Thirteen variants within these four loci were followed up in two independent Chinese cohorts, and rs10229583 at 7q32 was found to be associated with type 2 diabetes in a combined analysis of 11,067 cases and 7,929 controls (p meta = 2.6 × 10(-8); OR [95% CI] 1.18 [1.11, 1.25]). In silico replication revealed consistent associations across multiethnic groups, including five East Asian populations (p meta = 2.3 × 10(-10)) and a population of European descent (p = 8.6 × 10(-3)). The rs10229583 risk variant was associated with elevated fasting plasma glucose, impaired beta cell function in controls, and an earlier age at diagnosis for the cases. The novel variant lies within an islet-selective cluster of open regulatory elements. There was significant heterogeneity of effect between Han Chinese and individuals of European descent, Malaysians and Indians. CONCLUSIONS/INTERPRETATION: Our study identifies rs10229583 near PAX4 as a novel locus for type 2 diabetes in Chinese and other populations and provides new insights into the pathogenesis of type 2 diabetes.
Diabetologia 2013
PUBMED: 23532257; DOI: 10.1007/s00125-013-2874-4
-
Characterization of Vibrio cholerae Bacteriophages Isolated from the Environmental Waters of the Lake Victoria Region of Kenya.
School of Biological Sciences, University of Nairobi, Nairobi, Kenya, nyamburagichuhi@gmail.com.
Over the last decade, cholera outbreaks have become common in some parts of Kenya. The most recent cholera outbreak occurred in the Coastal and Lake Victoria regions between January 2009 and May 2010, when a total of 11,769 cases and 274 deaths were reported by the Ministry of Public Health and Sanitation. The objective of this study is to isolate Vibrio cholerae bacteriophages from the environmental waters of the Lake Victoria region of Kenya with potential for use as a biocontrol for cholera outbreaks. Water samples from wells, ponds, sewage effluent, boreholes, rivers, and lakes of the Lake Victoria region of Kenya were enriched for 48 h at 37 °C in broth containing an environmental strain of V. cholerae. Bacteriophages were isolated from 5 out of the 42 environmental water samples taken. Isolated phages produced tiny, round, and clear plaques, suggesting that these phages were lytic to V. cholerae. Transmission electron microscope examination revealed that all the nine phages belonged to the family Myoviridae, with typical icosahedral heads, long contractile tails, and fibers. Heads had an average diameter of 88.3 nm, and tails an average length and width of 84.9 and 16.1 nm, respectively. Vibriophages isolated from the Lake Victoria region of Kenya have been characterized, and the isolated phages may have potential for use as antibacterial agents to control pathogenic V. cholerae bacteria in water reservoirs.
Current microbiology 2013
PUBMED: 23982202; DOI: 10.1007/s00284-013-0447-x
-
A follow-up linkage study of Finnish pre-eclampsia families identifies a new fetal susceptibility locus on chromosome 18.
[1] Department of Medical Genetics, Haartman Institute, University of Helsinki, Helsinki, Finland [2] Research Programs Unit, Women's Health, University of Helsinki, Helsinki, Finland.
Pre-eclampsia is a common vascular disorder of pregnancy. It originates in the placenta and targets the maternal endothelium. According to epidemiological research, >50% of the liability to this disorder can be accounted for by genetic factors. Both maternal and fetal genes contribute to the risk, but especially the fetal genetic risk profile is still poorly understood. We have previously detected linkage signals in multiplex Finnish families on chromosomes 2p25, 4q32, and 9p13 using maternal phenotypes. We performed a linkage analysis using updated maternal phenotypes and an unprecedented linkage analysis using fetal phenotypes. Markers genotyped were available from 237 individuals in 15 Finnish families, including 72 affected mothers and 49 affected fetuses. The MERLIN software was used for sample and marker quality control and linkage analysis. The results were compared against the original ones obtained by using the GENEHUNTER 2.1 software. The previous identification of the maternal susceptibility locus to a genetic location at 21.70 cM near marker D2S168 on chromosome 2 was confirmed by using both maternal and fetal phenotypes (maternal non-parametric linkage (NPL) score 3.79, P=0.00008, LOD (logarithm (base 10) of odds)=2.20 and fetal NPL score 2.95, P=0.002, LOD=1.71). As a novel finding, we present a suggestive linkage to chromosome 18 at 86.80 cM near marker D18S64 (NPL score 2.51, P=0.006, LOD=1.20) using the fetal phenotype. We propose that chromosome 18 may harbor a new fetal susceptibility locus for pre-eclampsia. European Journal of Human Genetics advance online publication, 6 February 2013; doi:10.1038/ejhg.2013.6.
European journal of human genetics : EJHG 2013
PUBMED: 23386034; DOI: 10.1038/ejhg.2013.6
-
The agr Locus Regulates Virulence and Colonization Genes in Clostridium difficile 027.
Department of Pathogen Molecular Biology, London School of Hygiene and Tropical Medicine, University of London, London, United Kingdom.
The transcriptional regulator AgrA, a member of the LytTR family of proteins, plays a key role in controlling gene expression in some Gram-positive pathogens, including Staphylococcus aureus and Enterococcus faecalis. AgrA is encoded by the agrACDB global regulatory locus, and orthologues are found within the genome of most Clostridium difficile isolates, including the epidemic lineage 027/BI/NAP1. Comparative RNA sequencing of the wild type and otherwise isogenic agrA null mutant derivatives of C. difficile R20291 revealed a network of approximately 75 differentially regulated transcripts at late exponential growth phase, including many genes associated with flagellar assembly and function, such as the major structural subunit, FliC. Other differentially regulated genes include several involved in bis-(3'-5')-cyclic dimeric GMP (c-di-GMP) synthesis and toxin A expression. C. difficile 027 R20291 agrA mutant derivatives were poorly flagellated and exhibited reduced levels of colonization and relapses in the murine infection model. Thus, the agr locus likely plays a contributory role in the fitness and virulence potential of C. difficile strains in the 027/BI/NAP1 lineage.
Journal of bacteriology 2013;195;16;3672-81
PUBMED: 23772065; DOI: 10.1128/JB.00473-13
-
Distinguishable Epidemics of Multidrug-Resistant Salmonella Typhimurium DT104 in Different Hosts.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK.
The global epidemic of multidrug-resistant Salmonella Typhimurium DT104 provides an important example, both in terms of the agent and its resistance, of a widely disseminated zoonotic pathogen. Here, with an unprecedented national collection of isolates collected contemporaneously from humans and animals and including a sample of internationally derived isolates, we have used whole-genome sequencing to dissect the phylogenetic associations of the bacterium and its antimicrobial resistance genes through the course of an epidemic. Contrary to current tenets supporting a single homogeneous epidemic, we demonstrate that the bacterium and its resistance genes were largely maintained within animal and human populations separately and that there was limited transmission, in either direction. We also show considerable variation in the resistance profiles, in contrast to the largely stable bacterial core genome, which emphasizes the critical importance of integrated genotypic data sets in understanding the ecology of bacterial zoonoses and antimicrobial resistance.
Science (New York, N.Y.) 2013
PUBMED: 24030491; DOI: 10.1126/science.1240578
-
The Evolutionary Path to Extraintestinal Pathogenic, Drug-Resistant Escherichia coli Is Marked by Drastic Reduction in Detectable Recombination within the Core Genome.
Pathogen Research Group, Nottingham Trent University, United Kingdom.
Escherichia coli is a highly diverse group of pathogens ranging from commensal of the intestinal tract, through to intestinal pathogen, and extraintestinal pathogen. Here, we present data on the population diversity of E. coli, using Bayesian analysis to identify 13 distinct clusters within the population from multilocus sequence typing data, which map onto a whole-genome-derived phylogeny based on 62 genome sequences. Bayesian analysis of recombination within the core genome identified reduction in detectable core genome recombination as one moves from the commensals, through the intestinal pathogens down to the multidrug-resistant extraintestinal pathogenic clone E. coli ST131. Our data show that the emergence of a multidrug-resistant, extraintestinal pathogenic lineage of E. coli is marked by substantial reduction in detectable core genome recombination, resulting in a lineage which is phylogenetically distinct and sexually isolated in terms of core genome recombination.
Genome biology and evolution 2013;5;4;699-710
PUBMED: 23493634; PMC: 3641635; DOI: 10.1093/gbe/evt038
-
The sex-specific associations of the aromatase gene with Alzheimer's disease and its interaction with IL10 in the Epistasis Project.
Human Genetics Research, Queens Medical Centre, School of Molecular Medical Sciences, University of Nottingham, Nottingham, UK.
Epistasis between interleukin-10 (IL10) and aromatase gene polymorphisms has previously been reported to modify the risk of Alzheimer's disease (AD). However, although the main effects of aromatase variants suggest a sex-specific effect in AD, there has been insufficient power to detect sex-specific epistasis between these genes to date. Here we used the cohort of 1757 AD patients and 6294 controls in the Epistasis Project. We replicated the previously reported main effects of aromatase polymorphisms in AD risk in women, for example, adjusted odds ratio of disease for rs1065778 GG=1.22 (95% confidence interval: 1.01-1.48, P=0.03). We also confirmed a reported epistatic interaction between IL10 rs1800896 and aromatase (CYP19A1) rs1062033, again only in women: adjusted synergy factor=1.94 (1.16-3.25, 0.01). Aromatase, a rate-limiting enzyme in the synthesis of estrogens, is expressed in AD-relevant brain regions, and is downregulated during the disease. IL-10 is an anti-inflammatory cytokine. Given that estrogens have neuroprotective and anti-inflammatory activities and regulate microglial cytokine production, epistasis is biologically plausible. Diminishing serum estrogen in postmenopausal women, coupled with suboptimal brain estrogen synthesis, may contribute to the inflammatory state that is a pathological hallmark of AD. European Journal of Human Genetics advance online publication, 5 June 2013; doi:10.1038/ejhg.2013.116.
European journal of human genetics : EJHG 2013
PUBMED: 23736221; DOI: 10.1038/ejhg.2013.116
-
Machine learning prediction of cancer cell sensitivity to drugs based on genomic and chemical properties.
European Bioinformatics Institute, Wellcome Trust Genome Campus-Cambridge, Cambridge, United Kingdom.
Predicting the response of a specific cancer to a therapy is a major goal in modern oncology that should ultimately lead to a personalised treatment. High-throughput screenings of potentially active compounds against a panel of genomically heterogeneous cancer cell lines have unveiled multiple relationships between genomic alterations and drug responses. Various computational approaches have been proposed to predict sensitivity based on genomic features, while others have used the chemical properties of the drugs to ascertain their effect. In an effort to integrate these complementary approaches, we developed machine learning models to predict the response of cancer cell lines to drug treatment, quantified through IC50 values, based on both the genomic features of the cell lines and the chemical properties of the considered drugs. Models predicted IC50 values in an 8-fold cross-validation and an independent blind test with coefficient of determination R(2) of 0.72 and 0.64 respectively. Furthermore, models were able to predict with comparable accuracy (R(2) of 0.61) IC50s of cell lines from a tissue not used in the training stage. Our in silico models can be used to optimise the experimental design of drug-cell screenings by estimating a large proportion of missing IC50 values rather than experimentally measuring them. The implications of our results go beyond virtual drug screening design: potentially thousands of drugs could be probed in silico to systematically test their potential efficacy as anti-tumour agents based on their structure, thus providing a computational framework to identify new drug repositioning opportunities and, ultimately, to support personalised medicine by linking the genomic traits of patients to drug sensitivity.
PloS one 2013;8;4;e61318
PUBMED: 23646105; PMC: 3640019; DOI: 10.1371/journal.pone.0061318
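The evaluation described in the abstract above (held-out predictions scored with the coefficient of determination in an 8-fold cross-validation) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not name the model type, so a one-feature least-squares line stands in for it, and the function names, the unshuffled fold assignment, and the choice to pool held-out predictions before scoring are all assumptions made for this example.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def kfold_indices(n_samples, k):
    """Yield (train, test) index lists for k folds (no shuffling, for clarity)."""
    for fold in range(k):
        test = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test


def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns (slope, intercept).

    Hypothetical stand-in for whatever model is actually being evaluated.
    """
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_validated_r2(xs, ys, k=8):
    """Predict each sample from a model trained without it, then score once."""
    predictions = [0.0] * len(xs)
    for train, test in kfold_indices(len(xs), k):
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in test:
            predictions[i] = slope * xs[i] + intercept
    return r_squared(ys, predictions)
```

Because every prediction comes from a model that never saw the corresponding sample, the resulting R(2) estimates performance on unseen drug and cell-line pairs rather than goodness of fit on the training data.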
-
Metabolomic markers reveal novel pathways of ageing and early development in human populations.
Department of Twin Research & Genetic Epidemiology, King's College London, London, UK, Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München, Neuherberg, Germany, Institute of Genetic Epidemiology, Helmholtz Zentrum München, Neuherberg, Germany, Institute of Epidemiology I, Helmholtz Zentrum München, Neuherberg, Germany, Pfizer Research Laboratories, Groton, CT, USA, Worldwide R&D, Pfizer Inc., Cambridge, MA, USA, School of Medicine and Pharmacology, University of Western Australia, Crawley, WA, Australia, Department of Endocrinology and Diabetes, Sir Charles Gairdner Hospital, Nedlands, WA, Australia, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK, Metabolon Inc., 617 Davis Drive, Durham, NC 27713, USA; Department of Physiology and Biophysics, Weill Cornell Medical College in Qatar, Education City, Qatar Foundation, Doha, State of Qatar and Academic Rheumatology, University of Nottingham, Nottingham City Hospital, Nottingham, UK.
Background: Human ageing is a complex, multifactorial process and early developmental factors affect health outcomes in old age.
Methods: Metabolomic profiling on fasting blood was carried out in 6055 individuals from the UK. Stepwise regression was performed to identify a panel of independent metabolites which could be used as a surrogate for age. We also investigated the association with birthweight overall and within identical discordant twins and with genome-wide methylation levels.
Results: We identified a panel of 22 metabolites which combined are strongly correlated with age (R(2) = 59%) and with age-related clinical traits independently of age. One particular metabolite, C-glycosyl tryptophan (C-glyTrp), correlated strongly with age (beta = 0.03, SE = 0.001, P = 7.0 × 10(-157)) and lung function (FEV1 beta = -0.04, SE = 0.008, P = 1.8 × 10(-8) adjusted for age and confounders) and was replicated in an independent population (n = 887). C-glyTrp was also associated with bone mineral density (beta = -0.01, SE = 0.002, P = 1.9 × 10(-6)) and birthweight (beta = -0.06, SE = 0.01, P = 2.5 × 10(-9)). The difference in C-glyTrp levels explained 9.4% of the variance in the difference in birthweight between monozygotic twins. An epigenome-wide association study in 172 individuals identified three CpG-sites, associated with levels of C-glyTrp (P < 2 × 10(-6)). We replicated one CpG site in the promoter of the WDR85 gene in an independent sample of 350 individuals (beta = -0.20, SE = 0.04, P = 2.9 × 10(-8)). WDR85 is a regulator of translation elongation factor 2, essential for protein synthesis in eukaryotes.
Conclusions: Our data illustrate how metabolomic profiling linked with epigenetic studies can identify some key molecular mechanisms potentially determined in early development that produce long-term physiological changes influencing human health and ageing.
International journal of epidemiology 2013
PUBMED: 23838602; DOI: 10.1093/ije/dyt094
-
Quantifying single nucleotide variant detection sensitivity in exome sequencing.
MRC Human Genetics Unit, MRC Institute for Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, UK. alison.meynert@igmm.ed.ac.uk.
Background: The targeted capture and sequencing of genomic regions has rapidly demonstrated its utility in genetic studies. Inherent in this technology is considerable heterogeneity of target coverage and this is expected to systematically impact our sensitivity to detect genuine polymorphisms. To fully interpret the polymorphisms identified in a genetic study it is often essential to both detect polymorphisms and to understand where and with what probability real polymorphisms may have been missed.
Results: Using down-sampling of 30 deeply sequenced exomes and a set of gold-standard single nucleotide variant (SNV) genotype calls for each sample, we developed an empirical model relating the read depth at a polymorphic site to the probability of calling the correct genotype at that site. We find that measured sensitivity in SNV detection is substantially worse than that predicted from the naive expectation of sampling from a binomial. This calibrated model allows us to produce single nucleotide resolution SNV sensitivity estimates which can be merged to give summary sensitivity measures for any arbitrary partition of the target sequences (nucleotide, exon, gene, pathway, exome). These metrics are directly comparable between platforms and can be combined between samples to give "power estimates" for an entire study. We estimate a local read depth of 13X is required to detect the alleles and genotype of a heterozygous SNV 95% of the time, but only 3X for a homozygous SNV. At a mean on-target read depth of 20X, commonly used for rare disease exome sequencing studies, we predict 5-15% of heterozygous and 1-4% of homozygous SNVs in the targeted regions will be missed.
Conclusions: Non-reference alleles in the heterozygote state have a high chance of being missed when commonly applied read coverage thresholds are used despite the widely held assumption that there is good polymorphism detection at these coverage levels. Such alleles are likely to be of functional importance in population based studies of rare diseases, somatic mutations in cancer and explaining the "missing heritability" of quantitative traits.
BMC bioinformatics 2013;14;195
PUBMED: 23773188; PMC: 3695811; DOI: 10.1186/1471-2105-14-195
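The "naive expectation of sampling from a binomial" that the abstract above compares against can be sketched as follows. This is an illustrative baseline only, assuming that each read at a heterozygous site carries the alternate allele with probability 0.5 and that a call needs some minimum number of alternate-allele reads; the function name and the default threshold of two reads are hypothetical, not taken from the paper. It is not the authors' calibrated empirical model, which by their account gives substantially lower sensitivity than a binomial baseline of this kind.

```python
from math import comb


def naive_het_sensitivity(depth, min_alt_reads=2, alt_fraction=0.5):
    """Naive probability of seeing enough alternate-allele reads.

    Models the alternate read count as Binomial(depth, alt_fraction) and
    returns P(count >= min_alt_reads). min_alt_reads is a hypothetical
    calling threshold chosen for illustration.
    """
    p_missed = sum(
        comb(depth, k) * alt_fraction ** k * (1.0 - alt_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_missed


# Tabulate the naive baseline over a range of local read depths.
for depth in (3, 5, 8, 13, 20):
    print(depth, round(naive_het_sensitivity(depth), 4))
```

Comparing a curve like this with sensitivity measured against gold-standard genotypes is one way to see how far real variant calling falls short of the idealised sampling model.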
-
Empirical research on the ethics of genomic research.
Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK. am33@sanger.ac.uk
Funded by: Wellcome Trust
American journal of medical genetics. Part A 2013;161A;8;2099-101
PUBMED: 23813698; DOI: 10.1002/ajmg.a.36067
-
Multiple populations of artemisinin-resistant Plasmodium falciparum in Cambodia.
Medical Research Council MRC Centre for Genomics and Global Health, University of Oxford, Oxford, UK.
We describe an analysis of genome variation in 825 P. falciparum samples from Asia and Africa that identifies an unusual pattern of parasite population structure at the epicenter of artemisinin resistance in western Cambodia. Within this relatively small geographic area, we have discovered several distinct but apparently sympatric parasite subpopulations with extremely high levels of genetic differentiation. Of particular interest are three subpopulations, all associated with clinical resistance to artemisinin, which have skewed allele frequency spectra and high levels of haplotype homozygosity, indicative of founder effects and recent population expansion. We provide a catalog of SNPs that show high levels of differentiation in the artemisinin-resistant subpopulations, including codon variants in transporter proteins and DNA mismatch repair proteins. These data provide a population-level genetic framework for investigating the biological origins of artemisinin resistance and for defining molecular markers to assist in its elimination.
Funded by: Howard Hughes Medical Institute: 55005502; Medical Research Council: G19/9; Wellcome Trust: 090532/Z/09/Z, 090770/Z/09/Z, 098051, G0600718
Nature genetics 2013;45;6;648-55
PUBMED: 23624527; DOI: 10.1038/ng.2624
-
The challenge of increasing Pfam coverage of the human proteome.
EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
It is a worthy goal to completely characterize all human proteins in terms of their domains. Here, using the Pfam database, we asked how far we have progressed in this endeavour. Ninety per cent of proteins in the human proteome matched at least one of 5494 manually curated Pfam-A families. In contrast, human residue coverage by Pfam-A families was <45%, with 9418 automatically generated Pfam-B families adding a further 10%. Even after excluding predicted signal peptide regions and short regions (<50 consecutive residues) unlikely to harbour new families, for ∼38% of the human protein residues, there was no information in Pfam about conservation and evolutionary relationship with other protein regions. This uncovered portion of the human proteome was found to be distributed over almost 25 000 distinct protein regions. Comparison with proteins in the UniProtKB database suggested that the human regions that exhibited similarity to thousands of other sequences were often either divergent elements or N- or C-terminal extensions of existing families. Thirty-four per cent of regions, on the other hand, matched fewer than 100 sequences in UniProtKB. Most of these did not appear to share any relationship with existing Pfam-A families, suggesting that thousands of new families would need to be generated to cover them. Also, these latter regions were particularly rich in amino acid compositional bias such as the one associated with intrinsic disorder. This could represent a significant obstacle toward their inclusion into new Pfam families. Based on these observations, a major focus for increasing Pfam coverage of the human proteome will be to improve the definition of existing families. New families will also be built, prioritizing those that have been experimentally functionally characterized. Database URL: http://pfam.sanger.ac.uk/
Database : the journal of biological databases and curation 2013;2013;bat023
PUBMED: 23603847; PMC: 3630804; DOI: 10.1093/database/bat023
-
Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions.
EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK and HHMI Janelia Farm Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA.
Detection of protein homology via sequence similarity has important applications in biology, from protein structure and function prediction to reconstruction of phylogenies. Although current methods for aligning protein sequences are powerful, challenges remain, including problems with homologous overextension of alignments and with regions under convergent evolution. Here, we test the ability of the profile hidden Markov model method HMMER3 to correctly assign homologous sequences to >13 000 manually curated families from the Pfam database. We identify problem families using protein regions that match two or more Pfam families not currently annotated as related in Pfam. We find that HMMER3 E-value estimates seem to be less accurate for families that feature periodic patterns of compositional bias, such as the ones typically observed in coiled-coils. These results support the continued use of manually curated inclusion thresholds in the Pfam database, especially on the subset of families that have been identified as problematic in experiments such as these. They also highlight the need for developing new methods that can correct for this particular type of compositional bias.
Nucleic acids research 2013
PUBMED: 23598997; DOI: 10.1093/nar/gkt263
-
Nuclear Wave1 is required for reprogramming transcription in oocytes and for normal development.
Wellcome Trust/Cancer Research UK Gurdon Institute, The Henry Wellcome Building of Cancer and Developmental Biology, Cambridge, UK. k.miyamoto@gurdon.cam.ac.uk
Eggs and oocytes have a remarkable ability to induce transcription of sperm after normal fertilization and in somatic nuclei after somatic cell nuclear transfer. This ability of eggs and oocytes is essential for normal development. Nuclear actin and actin-binding proteins have been shown to contribute to transcription, although their mode of action is elusive. Here, we find that Xenopus Wave1, previously characterized as a protein involved in actin cytoskeleton organization, is present in the oocyte nucleus and is required for efficient transcriptional reprogramming. Moreover, Wave1 knockdown in embryos results in abnormal development and defective hox gene activation. Nuclear Wave1 binds by its WHD domain to active transcription components, and this binding contributes to the action of RNA polymerase II. We identify Wave1 as a maternal reprogramming factor that also has a necessary role in gene activation in development.
Funded by: Medical Research Council: G1001690/1; Wellcome Trust: 101050/Z/13/Z, WT077187, WT089613
Science (New York, N.Y.) 2013;341;6149;1002-5
PUBMED: 23990560; DOI: 10.1126/science.1240376
-
Deciphering the Mechanisms of Developmental Disorders (DMDD): a new programme for phenotyping embryonic lethal mice.
MRC National Institute for Medical Research, London, NW7 1AA, UK.
International efforts to test gene function in the mouse by the systematic knockout of each gene are creating many lines in which embryonic development is compromised. These homozygous lethal mutants represent a potential treasure trove for the biomedical community. Developmental biologists could exploit them in their studies of tissue differentiation and organogenesis; for clinical researchers they offer a powerful resource for investigating the origins of developmental diseases that affect newborns. Here, we outline a new programme of research in the UK aiming to kick-start research with embryonic lethal mouse lines. The 'Deciphering the Mechanisms of Developmental Disorders' (DMDD) programme has the ambitious goal of identifying all embryonic lethal knockout lines made in the UK over the next 5 years, and will use a combination of comprehensive imaging and transcriptomics to identify abnormalities in embryo structure and development. All data will be made freely available, enabling individual researchers to identify lines relevant to their research. The DMDD programme will coordinate its work with similar international efforts through the umbrella of the International Mouse Phenotyping Consortium [see accompanying Special Article (Adams et al., 2013)] and, together, these programmes will provide a novel database for embryonic development, linking gene identity with molecular profiles and morphology phenotypes.
Disease models & mechanisms 2013;6;3;562-6
PUBMED: 23519034; DOI: 10.1242/dmm.011957
-
MiR-210 Is Induced by Oct-2, Regulates B Cells, and Inhibits Autoantibody Production.
Department of Medicine, Cambridge Institute for Medical Research, University of Cambridge School of Clinical Medicine, Addenbrooke's Hospital, Cambridge CB2 0XY, United Kingdom.
MicroRNAs (MiRs) are small, noncoding RNAs that regulate gene expression posttranscriptionally. In this study, we show that MiR-210 is induced by Oct-2, a key transcriptional mediator of B cell activation. Germline deletion of MiR-210 results in the development of autoantibodies from 5 mo of age. Overexpression of MiR-210 in vivo resulted in cell autonomous expansion of the B1 lineage and impaired fitness of B2 cells. Mice overexpressing MiR-210 exhibited impaired class-switched Ab responses, a finding confirmed in wild-type B cells transfected with a MiR-210 mimic. In vitro studies demonstrated defects in cellular proliferation and cell cycle entry, which were consistent with the transcriptomic analysis demonstrating downregulation of genes involved in cellular proliferation and B cell activation. These findings indicate that Oct-2 induction of MiR-210 provides a novel inhibitory mechanism for the control of B cells and autoantibody production.
Journal of immunology (Baltimore, Md. : 1950) 2013;191;6;3037-48
PUBMED: 23960236; DOI: 10.4049/jimmunol.1301289
-
The origin, evolution, and functional impact of short insertion-deletion variants identified in 179 human genomes.
Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, 1211, Switzerland.
Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional effects lags behind that of other types of variants. Using population-scale sequencing, we have identified a high-quality set of 1.6 million indels from 179 individuals representing three diverse human populations. We show that rates of indel mutagenesis are highly heterogeneous, with 43%-48% of indels occurring in 4.03% of the genome, whereas in the remaining 96% their prevalence is 16 times lower than SNPs. Polymerase slippage can explain upwards of three-fourths of all indels, with the remainder being mostly simple deletions in complex sequence. However, insertions do occur and are significantly associated with pseudo-palindromic sequence features compatible with the fork stalling and template switching (FoSTeS) mechanism more commonly associated with large structural variations. We introduce a quantitative model of polymerase slippage, which enables us to identify indel-hypermutagenic protein-coding genes, some of which are associated with recurrent mutations leading to disease. Accounting for mutational rate heterogeneity due to sequence context, we find that indels across functional sequence are generally subject to stronger purifying selection than SNPs. We find that indel length modulates selection strength, and that indels affecting multiple functionally constrained nucleotides undergo stronger purifying selection. We further find that indels are enriched in associations with gene expression and find evidence for a contribution of nonsense-mediated decay. Finally, we show that indels can be integrated in existing genome-wide association studies (GWAS); although we do not find direct evidence that potentially causal protein-coding indels are enriched with associations to known disease-associated SNPs, our findings suggest that the causal variant underlying some of these associations may be indels.
Genome research 2013;23;5;749-61
PUBMED: 23478400; PMC: 3638132; DOI: 10.1101/gr.148718.112
-
Independent specialization of the human and mouse X chromosomes for the male germ line.
Whitehead Institute, Cambridge, Massachusetts, USA.
We compared the human and mouse X chromosomes to systematically test Ohno's law, which states that the gene content of X chromosomes is conserved across placental mammals. First, we improved the accuracy of the human X-chromosome reference sequence through single-haplotype sequencing of ampliconic regions. The new sequence closed gaps in the reference sequence, corrected previously misassembled regions and identified new palindromic amplicons. Our subsequent analysis led us to conclude that the evolution of human and mouse X chromosomes was bimodal. In accord with Ohno's law, 94-95% of X-linked single-copy genes are shared by humans and mice; most are expressed in both sexes. Notably, most X-ampliconic genes are exceptions to Ohno's law: only 31% of human and 22% of mouse X-ampliconic genes had orthologs in the other species. X-ampliconic genes are expressed predominantly in testicular germ cells, and many were independently acquired since divergence from the common ancestor of humans and mice, specializing portions of their X chromosomes for sperm production.
Nature genetics 2013
PUBMED: 23872635; DOI: 10.1038/ng.2705
-
Reciprocal Duplication of the Williams-Beuren Syndrome Deletion on Chromosome 7q11.23 Is Associated with Schizophrenia.
Department of Epidemiology (JGM, AFD), Rollins School of Public Health, Emory University; and Department of Human Genetics (JGM, DJC, STW), Emory University School of Medicine, Atlanta, Georgia. Electronic address: jmulle@emory.edu.
Background: Several copy number variants (CNVs) have been implicated as susceptibility factors for schizophrenia (SZ). Some of these same CNVs also increase risk for autism spectrum disorders, suggesting an etiologic overlap between these conditions. Recently, de novo duplications of a region on chromosome 7q11.23 were associated with autism spectrum disorders. The reciprocal deletion of this region causes Williams-Beuren syndrome.
Methods: We assayed an Ashkenazi Jewish cohort of 554 SZ cases and 1014 controls for genome-wide CNV. An excess of large rare and de novo CNVs was observed, including a 1.4 Mb duplication on chromosome 7q11.23 identified in two unrelated patients. To test whether this 7q11.23 duplication is also associated with SZ, we obtained data for 14,387 SZ cases and 28,139 controls from seven additional studies with high-resolution genome-wide CNV detection. We performed a meta-analysis, correcting for study population of origin, to assess whether the duplication is associated with SZ.
Results: We found duplications at 7q11.23 in 11 of 14,387 SZ cases with only 1 in 28,139 control subjects (unadjusted odds ratio 21.52, 95% confidence interval: 3.13-922.6, p value 5.5 × 10(-5); adjusted odds ratio 10.8, 95% confidence interval: 1.46-79.62, p value .007). Of three SZ duplication carriers with detailed retrospective data, all showed social anxiety and language delay premorbid to SZ onset, consistent with both human studies and animal models of the 7q11.23 duplication.
Conclusions: We have identified a new CNV associated with SZ. Reciprocal duplication of the Williams-Beuren syndrome deletion at chromosome 7q11.23 confers an approximately tenfold increase in risk for SZ.
Biological psychiatry 2013
PUBMED: 23871472; DOI: 10.1016/j.biopsych.2013.05.040
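The unadjusted odds ratio quoted in the Results above can be approximated directly from the reported counts (11 carriers among 14,387 cases, 1 carrier among 28,139 controls). The sketch below uses the simple cross-product estimator; the function name is made up for this example, and the abstract does not say which estimator the authors used.

```python
def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Cross-product odds ratio from a 2x2 table of carriers vs non-carriers."""
    a = case_carriers                     # cases with the duplication
    b = case_total - case_carriers        # cases without it
    c = control_carriers                  # controls with the duplication
    d = control_total - control_carriers  # controls without it
    return (a * d) / (b * c)


# Counts as reported in the abstract.
print(round(odds_ratio(11, 14387, 1, 28139), 2))
```

The cross-product value is about 21.53, very close to the reported 21.52; a small gap of this size is what one might expect if the published figure came from a different estimator, such as a conditional maximum likelihood estimate, though that is an assumption here.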
-
A powerful molecular synergy between mutant Nucleophosmin and Flt3-ITD drives acute myeloid leukemia in mice.
The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
Leukemia 2013;27;9;1917-20
PUBMED: 23478666; PMC: 3768110; DOI: 10.1038/leu.2013.77
-
Evolution of equine influenza virus in vaccinated horses.
Medical Research Council-University of Glasgow Centre for Virus Research, Institute of Infection, Inflammation and Immunity, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, United Kingdom.
Influenza A viruses are characterized by their ability to evade host immunity, even in vaccinated individuals. To determine how prior immunity shapes viral diversity in vivo, we studied the intra- and interhost evolution of equine influenza virus in vaccinated horses. Although the level and structure of genetic diversity were similar to those in naïve horses, intrahost bottlenecks may be more stringent in vaccinated animals, and mutations shared among horses often fall close to putative antigenic sites.
Journal of virology 2013;87;8;4768-71
PUBMED: 23388708; PMC: 3624384; DOI: 10.1128/JVI.03379-12
-
Cardiometabolic risk in a rural Ugandan population.
Corresponding author: Georgina A.V. Murphy, gm7@sanger.ac.uk.
Diabetes care 2013;36;9;e143
PUBMED: 23970722; DOI: 10.2337/dc13-0739
-
Meta-analysis investigating associations between healthy diet and fasting glucose and insulin levels and modification by loci associated with glucose homeostasis in data from 15 cohorts.
Whether loci that influence fasting glucose (FG) and fasting insulin (FI) levels, as identified by genome-wide association studies, modify associations of diet with FG or FI is unknown. We utilized data from 15 US and European cohort studies comprising 51,289 persons without diabetes to test whether genotype and diet interact to influence FG or FI concentration. We constructed a diet score using study-specific quartile rankings for intakes of whole grains, fish, fruits, vegetables, and nuts/seeds (favorable) and red/processed meats, sweets, sugared beverages, and fried potatoes (unfavorable). We used linear regression within studies, followed by inverse-variance-weighted meta-analysis, to quantify 1) associations of diet score with FG and FI levels and 2) interactions of diet score with 16 FG-associated loci and 2 FI-associated loci. Diet score (per unit increase) was inversely associated with FG (β = -0.004 mmol/L, 95% confidence interval: -0.005, -0.003) and FI (β = -0.008 ln-pmol/L, 95% confidence interval: -0.009, -0.007) levels after adjustment for demographic factors, lifestyle, and body mass index. Genotype variation at the studied loci did not modify these associations. Healthier diets were associated with lower FG and FI concentrations regardless of genotype at previously replicated FG- and FI-associated loci. Studies focusing on genomic regions that do not yield highly statistically significant associations from main-effect genome-wide association studies may be more fruitful in identifying diet-gene interactions.
American journal of epidemiology 2013;177;2;103-15
PUBMED: 23255780; DOI: 10.1093/aje/kws297
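The pooling step named in the abstract above (per-study linear regression followed by inverse-variance-weighted meta-analysis) can be sketched as follows. This is a minimal fixed-effect version; the abstract does not state whether a fixed- or random-effects model was used, and the function name and the example inputs are hypothetical.

```python
from math import sqrt


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect inverse-variance-weighted pooling of per-study estimates.

    Each study is weighted by 1 / SE^2, so more precise studies count more.
    Returns (pooled_beta, pooled_standard_error).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Made-up per-study slopes and standard errors, for illustration only.
beta, se = inverse_variance_meta([-0.005, -0.003, -0.004],
                                 [0.002, 0.001, 0.0015])
lower, upper = beta - 1.96 * se, beta + 1.96 * se  # approximate 95% interval
print(beta, lower, upper)
```

Weighting by the inverse of the variance gives the minimum-variance combined estimate when the studies are assumed to share one underlying effect, which is why large cohorts dominate the pooled result.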
-
The relative timing of mutations in a breast cancer genome.
Hutchison/MRC Research Centre and Department of Pathology, University of Cambridge, Cambridge, United Kingdom.
Many tumors have highly rearranged genomes, but a major unknown is the relative importance and timing of genome rearrangements compared to sequence-level mutation. Chromosome instability might arise early, be a late event contributing little to cancer development, or happen as a single catastrophic event. Another unknown is which of the point mutations and rearrangements are selected. To address these questions we show, using the breast cancer cell line HCC1187 as a model, that we can reconstruct the likely history of a breast cancer genome. We assembled probably the most complete map to date of a cancer genome, by combining molecular cytogenetic analysis with sequence data. In particular, we assigned most sequence-level mutations to individual chromosomes by sequencing of flow sorted chromosomes. The parent of origin of each chromosome was assigned from SNP arrays. We were then able to classify most of the mutations as earlier or later according to whether they occurred before or after a landmark event in the evolution of the genome, endoreduplication (duplication of its entire genome). Genome rearrangements and sequence-level mutations were fairly evenly divided earlier and later, suggesting that genetic instability was relatively constant throughout the life of this tumor, and chromosome instability was not a late event. Mutations that caused chromosome instability would be in the earlier set. Strikingly, the great majority of inactivating mutations and in-frame gene fusions happened earlier. The non-random timing of some of the mutations may be evidence that they were selected.
PloS one 2013;8;6;e64991
PUBMED: 23762276; PMC: 3677865; DOI: 10.1371/journal.pone.0064991
-
Comparative genomics in Chlamydomonas and Plasmodium identifies an ancient nuclear envelope protein family essential for sexual reproduction in protists, fungi, plants, and vertebrates.
Department of Cell Biology, University of Texas Southwestern Medical School, Dallas, Texas 75390, USA.
Fertilization is a crucial yet poorly characterized event in eukaryotes. Our previous discovery that the broadly conserved protein HAP2 (GCS1) functioned in gamete membrane fusion in the unicellular green alga Chlamydomonas and the malaria pathogen Plasmodium led us to exploit the rare biological phenomenon of isogamy in Chlamydomonas in a comparative transcriptomics strategy to uncover additional conserved sexual reproduction genes. All previously identified Chlamydomonas fertilization-essential genes fell into related clusters based on their expression patterns. Out of several conserved genes in a minus gamete cluster, we focused on Cre06.g280600, an ortholog of the fertilization-related Arabidopsis GEX1. Gene disruption, cell biological, and immunolocalization studies show that CrGEX1 functions in nuclear fusion in Chlamydomonas. Moreover, CrGEX1 and its Plasmodium ortholog, PBANKA_113980, are essential for production of viable meiotic progeny in both organisms and thus for mosquito transmission of malaria. Remarkably, we discovered that the genes are members of a large, previously unrecognized family whose first-characterized member, KAR5, is essential for nuclear fusion during yeast sexual reproduction. Our comparative transcriptomics approach provides a new resource for studying sexual development and demonstrates that exploiting the data can lead to the discovery of novel biology that is conserved across distant taxa.
Genes & development 2013;27;10;1198-215
PUBMED: 23699412; DOI: 10.1101/gad.212746.112
-
Binding of nucleoid-associated protein Fis to DNA is regulated by DNA breathing dynamics.
Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America.
Physicochemical properties of DNA, such as shape, affect protein-DNA recognition. However, the properties of DNA that are most relevant for predicting the binding sites of particular transcription factors (TFs) or classes of TFs have yet to be fully understood. Here, using a model that accurately captures the melting behavior and breathing dynamics (spontaneous local openings of the double helix) of double-stranded DNA, we simulated the dynamics of known binding sites of the TF and nucleoid-associated protein Fis in Escherichia coli. Our study involves simulations of breathing dynamics, analysis of large published in vitro and genomic datasets, and targeted experimental tests of our predictions. Our simulation results and available in vitro binding data indicate a strong correlation between DNA breathing dynamics and Fis binding. Indeed, we can define an average DNA breathing profile that is characteristic of Fis binding sites. This profile is significantly enriched among the identified in vivo E. coli Fis binding sites. To test our understanding of how Fis binding is influenced by DNA breathing dynamics, we designed base-pair substitutions, mismatch, and methylation modifications of DNA regions that are known to interact (or not interact) with Fis. The goal in each case was to make the local DNA breathing dynamics either closer to or farther from the breathing profile characteristic of a strong Fis binding site. For the modified DNA segments, we found that Fis-DNA binding, as assessed by gel-shift assay, changed in accordance with our expectations. We conclude that Fis binding is associated with DNA breathing dynamics, which in turn may be regulated by various nucleotide modifications.
PLoS computational biology 2013;9;1;e1002881
PUBMED: 23341768; PMC: 3547798; DOI: 10.1371/journal.pcbi.1002881
-
Chlamydia trachomatis clinical isolates identified as tetracycline resistant do not exhibit resistance in vitro: whole-genome sequencing reveals a mutation in porB but no evidence for tetracycline resistance genes.
Faculty of Medicine, CES Academic Unit, Level C, South Block, University of Southampton, Southampton General Hospital, Tremona Road, Southampton, UK.
Chlamydia trachomatis is the most common bacterial sexually transmitted infection worldwide and the leading cause of preventable blindness in developing countries. Tetracycline is commonly the drug of choice for treating C. trachomatis infections, but cases of antibiotic resistance in clinical isolates have previously been reported. Here, we used antibiotic resistance assays and whole-genome sequencing to interrogate the hypothesis that two clinical isolates (IU824 and IU888) have acquired mechanisms of antibiotic resistance. Immunofluorescence staining was used to identify C. trachomatis inclusions in cell cultures grown in the presence of tetracycline; however, only antibiotic-free control cultures yielded the strong fluorescence associated with the presence of chlamydial inclusions. Infectivity was lost upon passage of harvested cultures grown in the presence of tetracycline into antibiotic-free medium, so we conclude that these isolates were phenotypically sensitive to tetracycline. Comparisons of the genome and plasmid sequences for the two isolates with tetracycline-sensitive strains did not identify regions of low sequence identity that could accommodate horizontally acquired resistance genes, and the tetracycline binding region of the 16S rRNA gene was identical to that of the sensitive control strains. The porB gene of strain IU824, however, was found to contain a premature stop codon not previously identified, which is noteworthy but unlikely to be related to tetracycline resistance. In conclusion, we found no evidence of tetracycline resistance in the two strains investigated, and it seems most likely that the small, aberrant inclusions previously identified resulted from the high chlamydial load used in the original antibiotic resistance assays.
Microbiology (Reading, England) 2013;159;Pt 4;748-56
PUBMED: 23378575; DOI: 10.1099/mic.0.065391-0
-
Mutations in BICD2 Cause Dominant Congenital Spinal Muscular Atrophy and Hereditary Spastic Paraplegia.
Institute for Neuroscience and Muscle Research, Children's Hospital at Westmead, Westmead, Sydney, NSW 2145, Australia; Discipline of Paediatrics and Child Health, Faculty of Medicine, The University of Sydney, Sydney, NSW 2006, Australia.
Dominant congenital spinal muscular atrophy (DCSMA) is a disorder of developing anterior horn cells and shows lower-limb predominance and clinical overlap with hereditary spastic paraplegia (HSP), a lower-limb-predominant disorder of corticospinal motor neurons. We have identified four mutations in bicaudal D homolog 2 (Drosophila) (BICD2) in six kindreds affected by DCSMA, DCSMA with upper motor neuron features, or HSP. BICD2 encodes BICD2, a key adaptor protein that interacts with the dynein-dynactin motor complex, which facilitates trafficking of cellular cargos that are critical to motor neuron development and maintenance. We demonstrate that mutations resulting in amino acid substitutions in two binding regions of BICD2 increase its binding affinity for the cytoplasmic dynein-dynactin complex, which might result in the perturbation of BICD2-dynein-dynactin-mediated trafficking, and impair neurite outgrowth. These findings provide insight into the mechanism underlying both the static and the slowly progressive clinical features and the motor neuron pathology that characterize BICD2-associated diseases, and underscore the importance of the dynein-dynactin transport pathway in the development and survival of both lower and upper motor neurons.
American journal of human genetics 2013
PUBMED: 23664120; DOI: 10.1016/j.ajhg.2013.04.018
-
Getting ready for the Human Phenome Project: the 2012 forum of the Human Variome Project.
Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, Minnesota.
A forum of the Human Variome Project (HVP) was held as a satellite to the 2012 Annual Meeting of the American Society of Human Genetics in San Francisco, California. The theme of this meeting was "Getting Ready for the Human Phenome Project." Understanding the genetic contribution to both rare single-gene "Mendelian" disorders and more complex common diseases will require integration of research efforts among many fields and better defined phenotypes. The HVP is dedicated to bringing together researchers and research populations throughout the world to provide the resources to investigate the impact of genetic variation on disease. To this end, there needs to be a greater sharing of phenotype and genotype data. For this to occur, many databases that currently exist will need to become interoperable to allow for the combining of cohorts with similar phenotypes to increase statistical power for studies attempting to identify novel disease genes or causative genetic variants. Improved systems and tools that enhance the collection of phenotype data from clinicians are urgently needed. This meeting begins the HVP's effort toward this important goal.
Human mutation 2013;34;4;661-6
PUBMED: 23401191; DOI: 10.1002/humu.22293
-
Efficient depletion of host DNA contamination in malaria clinical sequencing.
Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom. Samuel.oyola@sanger.ac.uk
The cost of whole-genome sequencing (WGS) is decreasing rapidly as next-generation sequencing technology continues to advance, and the prospect of making WGS available for public health applications is becoming a reality. So far, a number of studies have demonstrated the use of WGS as an epidemiological tool for typing and controlling outbreaks of microbial pathogens. Success of these applications is hugely dependent on efficient generation of clean genetic material that is free from host DNA contamination for rapid preparation of sequencing libraries. The presence of large amounts of host DNA severely affects the efficiency of characterizing pathogens using WGS and is therefore a serious impediment to clinical and epidemiological sequencing for health care and public health applications. We have developed a simple enzymatic treatment method that takes advantage of the methylation of human DNA to selectively deplete host contamination from clinical samples prior to sequencing. Using malaria clinical samples with over 80% human host DNA contamination, we show that the enzymatic treatment enriches Plasmodium falciparum DNA up to ∼9-fold and generates high-quality, nonbiased sequence reads covering >98% of 86,158 catalogued typeable single-nucleotide polymorphism loci.
Funded by: Wellcome Trust: 079355/Z/06/Z
Journal of clinical microbiology 2013;51;3;745-51
PUBMED: 23224084; PMC: 3592063; DOI: 10.1128/JCM.02507-12
-
Probing the brain of comorbidity.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Inherited sleep disorders might provide insights for migraine pathophysiology (Brennan et al., this issue).
Science translational medicine 2013;5;183;183fs15
PUBMED: 23636091; DOI: 10.1126/scitranslmed.3006229
-
Tailoring the models of transcription.
The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge CB10 1SA, UK. ap9@sanger.ac.uk.
Molecular biology is a rapidly evolving field that has led to the development of increasingly sophisticated technologies to improve our capacity to study cellular processes in much finer detail. Transcription is the first step in protein expression and the major point of regulation of the components that determine the characteristics, fate and functions of cells. The study of transcriptional regulation has been greatly facilitated by the development of reporter genes and transcription factor expression vectors, which have become versatile tools for manipulating promoters, as well as transcription factors in order to examine their function. The understanding of promoter complexity and transcription factor structure offers an insight into the mechanisms of transcriptional control and their impact on cell behaviour. This review focuses on some of the many applications of molecular cut-and-paste tools for the manipulation of promoters and transcription factors leading to the understanding of crucial aspects of transcriptional regulation.
International journal of molecular sciences 2013;14;4;7583-97
PUBMED: 23567272; DOI: 10.3390/ijms14047583
-
Advances in osteoarthritis genetics.
Department of Human Genetics, Wellcome Trust Sanger Institute, Cambridgeshire, UK.
Osteoarthritis (OA), the most common form of arthritis, is a highly debilitating disease of the joints and can lead to severe pain and disability. There is no cure for OA. Current treatments often fail to alleviate its symptoms leading to an increased demand for joint replacement surgery. Previous epidemiological and genetic research has established that OA is a multifactorial disease with both environmental and genetic components. Over the past 6 years, a candidate gene study and several genome-wide association scans (GWAS) in populations of Asian and European descent have collectively established 15 loci associated with knee or hip OA that have been replicated with genome-wide significance, shedding some light on the aetiogenesis of the disease. All OA associated variants to date are common in frequency and appear to confer moderate to small effect sizes. Some of the associated variants are found within or near genes with clear roles in OA pathogenesis, whereas others point to unsuspected, less characterised pathways. These studies have also provided further evidence in support of the existence of ethnic, sex, and joint specific effects in OA and have highlighted the importance of expanded and more homogeneous phenotype definitions in genetic studies of OA.
Journal of medical genetics 2013
PUBMED: 23868913; DOI: 10.1136/jmedgenet-2013-101754
-
The effect of FTO variation on increased osteoarthritis risk is mediated through body mass index: a mendelian randomisation study.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK.
Objective: Variation in the fat mass and obesity-associated (FTO) gene influences susceptibility to obesity. A variant in the FTO gene has been implicated in genetic risk to osteoarthritis (OA). We examined the role of the FTO polymorphism rs8044769 in risk of knee and hip OA in cases and controls incorporating body mass index (BMI) information. Methods: 5409 knee OA patients, 4355 hip OA patients and up to 5362 healthy controls from 7 independent cohorts from the UK and Australia were genotyped for rs8044769. The association of the FTO variant with OA was investigated in case/control analyses with and without BMI adjustment and in analyses matched for BMI category. A mendelian randomisation approach was employed using the FTO variant as the instrumental variable to evaluate the role of overweight on OA. Results: In the meta-analysis of all overweight (BMI≥25) samples versus normal-weight controls irrespective of OA status the association of rs8044769 with overweight is highly significant (OR[CIs] for allele G=1.14 [1.08 to 1.19], p=7.5×10(-7)). A significant association with knee OA is present in the analysis without BMI adjustment (OR[CIs]=1.08[1.02 to 1.14], p=0.009) but the signal fully attenuates after BMI adjustment (OR[CIs]=0.99[0.93 to 1.05], p=0.666). We observe no evidence for association in the BMI-matched meta-analyses. Using mendelian randomisation approaches we confirm the causal role of overweight on OA. Conclusions: Our data highlight the contribution of genetic risk to overweight in defining risk to OA but the association is exclusively mediated by the effect on BMI. This is consistent with what is known of the biology of the FTO gene and supports the causative role of high BMI in OA.
Annals of the rheumatic diseases 2013
PUBMED: 23921993; DOI: 10.1136/annrheumdis-2013-203772
-
In search of low frequency and rare variants affecting complex traits.
Wellcome Trust Sanger Institute, Hinxton, UK.
The allelic architecture of complex traits is likely to be underpinned by a combination of multiple common-frequency and rare variants. Targeted genotyping arrays and next generation sequencing technologies at the whole genome and whole exome scales are increasingly employed to access sequence variation across the full minor allele frequency spectrum. Different study design strategies that make use of diverse technologies, imputation and sample selection approaches are an active target of development and evaluation efforts. Initial insights into the contribution of rare variants in common diseases and medically-relevant quantitative traits point to low-frequency and rare alleles acting either independently or in aggregate and in several cases alongside common variants. Studies conducted in population isolates have been successful in detecting rare variant associations with complex phenotypes. Statistical methodologies that enable the joint analysis of rare variants across regions of the genome continue to evolve with current efforts focusing on incorporating information such as functional annotation, and on the meta-analysis of these burden tests. In addition, population stratification, defining genome-wide statistical significance thresholds and the design of appropriate replication experiments constitute important considerations for the powerful analysis and interpretation of rare variant association studies. Progress in addressing these emerging challenges and the accrual of sufficiently large data sets are poised to help the field of complex trait genetics enter a promising era of discovery.
Human molecular genetics 2013
PUBMED: 23922232; DOI: 10.1093/hmg/ddt376
-
Clinical and biological implications of driver mutations in myelodysplastic syndromes.
Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom.
Myelodysplastic syndromes (MDS) are a heterogeneous group of chronic hematological malignancies characterized by dysplasia, ineffective hematopoiesis and a variable risk of progression to acute myeloid leukemia. Sequencing of MDS genomes has identified mutations in genes implicated in RNA splicing, DNA modification, chromatin regulation and cell signaling. We sequenced 111 genes across 738 patients with MDS or closely related neoplasms (including CMML and MDS-MPN) to explore the role of acquired mutations in MDS biology and clinical phenotype. 78% of patients had one or more oncogenic mutations. We identify complex patterns of pairwise association between genes, indicative of epistatic interactions involving components of the spliceosome machinery and epigenetic modifiers. Coupled with inferences on subclonal mutations, these data suggest a hypothesis of genetic 'predestination', in which early driver mutations, typically affecting genes involved in RNA splicing, dictate future trajectories of disease evolution with distinct clinical phenotypes. Driver mutations had equivalent prognostic significance whether clonal or subclonal, and leukemia-free survival deteriorated steadily as numbers of driver mutations increased. Thus, analysis of oncogenic mutations in large, well-characterized cohorts of patients illustrates the interconnections between the cancer genome and disease biology, with considerable potential for clinical application.
Blood 2013
PUBMED: 24030381; DOI: 10.1182/blood-2013-08-518886
-
A member of the Plasmodium falciparum PHIST family binds to the erythrocyte cytoskeleton component band 4.1.
William C Gorgas Center for Geographic Medicine, Division of Infectious Diseases, Department of Medicine, University of Alabama at Birmingham, 845 19th St, South, Birmingham, AL, 35294-2170, USA. julian.rayner@sanger.ac.uk.
Background: Plasmodium falciparum parasites export more than 400 proteins into the cytosol of their host erythrocytes. These exported proteins catalyse the formation of knobs on the erythrocyte plasma membrane and an overall increase in erythrocyte rigidity, presumably by modulating the endogenous erythrocyte cytoskeleton. In uninfected erythrocytes, Band 4.1 (4.1R) plays a key role in regulating erythrocyte shape by interacting with multiple proteins through the three lobes of its cloverleaf-shaped N-terminal domain. In P. falciparum-infected erythrocytes, the C-lobe of 4.1R interacts with the P. falciparum protein mature parasite-infected erythrocyte surface antigen (MESA), but it is not currently known whether other P. falciparum proteins bind to other lobes of the 4.1R N-terminal domain. Methods: In order to identify novel 4.1R interacting proteins, a yeast two-hybrid screen was performed with a fragment of 4.1R containing both the N- and α-lobes. Positive interactions were confirmed and investigated using site-directed mutagenesis, and antibodies were raised against the interacting partner to characterise its expression and distribution in P. falciparum infected erythrocytes. Results: Yeast two-hybrid screening identified a positive interaction between the 4.1R N- and α-lobes and PF3D7_0402000. PF3D7_0402000 is a member of a large family of exported proteins that share a domain of unknown function, the PHIST domain. Domain mapping and site-directed mutagenesis established that it is the PHIST domain of PF3D7_0402000 that interacts with 4.1R. Native PF3D7_0402000 is localized at the parasitophorous vacuole membrane (PVM), and colocalizes with a subpopulation of 4.1R. Discussion: The function of the majority of P. falciparum exported proteins, including most members of the PHIST family, is unknown, and in only a handful of cases has a direct interaction between P. falciparum-exported proteins and components of the erythrocyte cytoskeleton been established.
The interaction between 4.1R and PF3D7_0402000, and localization of PF3D7_0402000 with a sub-population of 4.1R at the PVM could indicate a role in modulating PVM structure. Further investigation into the mechanisms for 4.1R recruitment is needed. Conclusion: PF3D7_0402000 was identified as a new binding partner for the major erythrocyte cytoskeletal protein, 4.1R. This interaction is consistent with a growing body of literature that suggests the PHIST family members function by interacting directly with erythrocyte proteins.
Malaria journal 2013;12;160
PUBMED: 23663475; PMC: 3658886; DOI: 10.1186/1475-2875-12-160
-
What has high-throughput sequencing ever done for us?
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
This month's Genome Watch looks back over the past 10 years and highlights how the incredible advances in sequencing technologies have transformed research into microbial genomes.
Nature reviews. Microbiology 2013;11;10;664-5
PUBMED: 23979431; DOI: 10.1038/nrmicro3112
-
Proteomic and Genetic Analyses Demonstrate that Plasmodium berghei Blood Stages Export a Large and Diverse Repertoire of Proteins.
Biomedical Primate Research Centre, 2288 GJ Rijswijk, The Netherlands.
Malaria parasites actively remodel the infected red blood cell (irbc) by exporting proteins into the host cell cytoplasm. The human parasite Plasmodium falciparum exports particularly large numbers of proteins, including proteins that establish a vesicular network allowing the trafficking of proteins onto the surface of irbcs that are responsible for tissue sequestration. Like P. falciparum, the rodent parasite P. berghei ANKA sequesters via irbc interactions with the host receptor CD36. We have applied proteomic, genomic, and reverse-genetic approaches to identify P. berghei proteins potentially involved in the transport of proteins to the irbc surface. A comparative proteomics analysis of P. berghei non-sequestering and sequestering parasites was used to determine changes in the irbc membrane associated with sequestration. Subsequent tagging experiments identified 13 proteins (Plasmodium export element (PEXEL)-positive as well as PEXEL-negative) that are exported into the irbc cytoplasm and have distinct localization patterns: a dispersed and/or patchy distribution, a punctate vesicle-like pattern in the cytoplasm, or a distinct location at the irbc membrane. Members of the PEXEL-negative BIR and PEXEL-positive Pb-fam-3 show a dispersed localization in the irbc cytoplasm, but not at the irbc surface. Two of the identified exported proteins are transported to the irbc membrane and were named erythrocyte membrane associated proteins. EMAP1 is a member of the PEXEL-negative Pb-fam-1 family, and EMAP2 is a PEXEL-positive protein encoded by a single copy gene; neither protein plays a direct role in sequestration. Our observations clearly indicate that P. berghei traffics a diverse range of proteins to different cellular locations via mechanisms that are analogous to those employed by P. falciparum. This information can be exploited to generate transgenic humanized rodent P. berghei parasites expressing chimeric P. berghei/P. falciparum proteins on the surface of rodent irbc, thereby opening new avenues for in vivo screening of adjunct therapies that block sequestration.
Molecular & cellular proteomics : MCP 2013;12;2;426-48
PUBMED: 23197789; PMC: 3567864; DOI: 10.1074/mcp.M112.021238
-
Incidence and Characterisation of Methicillin-Resistant Staphylococcus aureus (MRSA) from Nasal Colonisation in Participants Attending a Cattle Veterinary Conference in the UK.
Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge, United Kingdom.
We sought to determine the prevalence of nasal colonisation with methicillin-resistant Staphylococcus aureus among cattle veterinarians in the UK. There was particular interest in examining the frequency of colonisation with MRSA harbouring mecC, as strains with this mecA homologue were originally identified in bovine milk and may represent a zoonotic risk to those in contact with dairy livestock. Three hundred and seven delegates at the British Cattle Veterinarian Association (BCVA) Congress 2011 in Southport, UK were screened for nasal colonisation with MRSA. Isolates were characterised by whole genome sequencing and antimicrobial susceptibility testing. Eight out of three hundred and seven delegates (2.6%) were positive for nasal colonisation with MRSA. All strains were positive for mecA and none possessed mecC. The time since a delegate's last visit to a farm was significantly shorter in the MRSA-positive group than in MRSA-negative counterparts. BCVA delegates have an increased risk of MRSA colonisation compared to the general population but their frequency of colonisation is lower than that reported from other types of veterinarian conference, and from that seen in human healthcare workers. The results indicate that recent visitation to a farm is a risk factor for MRSA colonisation and that mecC-MRSA are rare among BCVA delegates (<1% based on sample size). Contact with livestock, including dairy cattle, may still be a risk factor for human colonisation with mecC-MRSA but occurs at a rate below the lower limit of detection available in this study.
PloS one 2013;8;7;e68463
PUBMED: 23869220; PMC: 3711812; DOI: 10.1371/journal.pone.0068463
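The two proportions quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the abstract does not say how the "<1% based on sample size" bound was derived, so the exact one-sided binomial bound for zero observed events is an assumption here, and the function names are hypothetical.

```python
def prevalence(positives: int, n: int) -> float:
    """Observed proportion of positive samples."""
    return positives / n


def upper_bound_zero_events(n: int, confidence: float = 0.95) -> float:
    """Largest true prevalence p still compatible with seeing 0 positives in n
    samples at the given confidence, i.e. the p solving (1 - p)**n = 1 - confidence.
    Assumption: this is one standard way to bound a rate when nothing is observed;
    the paper may have used a different calculation."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


mrsa = prevalence(8, 307)            # 8 MRSA-positive delegates out of 307
mecc_bound = upper_bound_zero_events(307)  # no mecC-MRSA found among 307

print(f"MRSA prevalence: {mrsa:.1%}")
print(f"mecC-MRSA upper bound: {mecc_bound:.2%}")
```

Under this assumption the bound for zero mecC-positive isolates among 307 delegates falls just below 1%, which is consistent with the figure stated in the abstract.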
-
A sequence-based variation map of zebrafish.
CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India.
Zebrafish (Danio rerio) is a popular vertebrate model organism largely deployed using outbred laboratory animals. The nonisogenic nature of the zebrafish as a model system offers the opportunity to understand natural variations and their effect in modulating phenotype. In an effort to better characterize the range of natural variation in this model system and to complement the zebrafish reference genome project, the whole genome sequence of a wild zebrafish at 39-fold genome coverage was determined. Comparative analysis with the zebrafish reference genome revealed approximately 5.2 million single nucleotide variations and over 1.6 million insertion-deletion variations. This dataset thus represents a new catalog of genetic variations in the zebrafish genome. Further analysis revealed selective enrichment for variations in genes involved in immune function and response to the environment, suggesting genome-level adaptations to environmental niches. We also show that human disease gene orthologs in the sequenced wild zebrafish genome show a lower ratio of nonsynonymous to synonymous single nucleotide variations.
Zebrafish 2013;10;1;15-20
PUBMED: 23590399; PMC: 3629779; DOI: 10.1089/zeb.2012.0848
-
Maps of open chromatin highlight cell type-restricted patterns of regulatory sequence variation at hematological trait loci.
Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom. d.paul@ucl.ac.uk
Nearly three-quarters of the 143 genetic signals associated with platelet and erythrocyte phenotypes identified by meta-analyses of genome-wide association (GWA) studies are located at non-protein-coding regions. Here, we assessed the role of candidate regulatory variants associated with cell type-restricted, closely related hematological quantitative traits in biologically relevant hematopoietic cell types. We used formaldehyde-assisted isolation of regulatory elements followed by next-generation sequencing (FAIRE-seq) to map regions of open chromatin in three primary human blood cells of the myeloid lineage. In the precursors of platelets and erythrocytes, as well as in monocytes, we found that open chromatin signatures reflect the corresponding hematopoietic lineages of the studied cell types and associate with the cell type-specific gene expression patterns. Dependent on their signal strength, open chromatin regions showed correlation with promoter and enhancer histone marks, distance to the transcription start site, and ontology classes of nearby genes. Cell type-restricted regions of open chromatin were enriched in sequence variants associated with hematological indices. The majority (63.6%) of such candidate functional variants at platelet quantitative trait loci (QTLs) coincided with binding sites of five transcription factors key in regulating megakaryopoiesis. We experimentally tested 13 candidate regulatory variants at 10 platelet QTLs and found that 10 (76.9%) affected protein binding, suggesting that this is a frequent mechanism by which regulatory variants influence quantitative trait levels. Our findings demonstrate that combining large-scale GWA data with open chromatin profiles of relevant cell types can be a powerful means of dissecting the genetic architecture of closely related quantitative traits.
Funded by: British Heart Foundation: RG/09/12/28096; Wellcome Trust: 098051
Genome research 2013;23;7;1130-41
PUBMED: 23570689; PMC: 3698506; DOI: 10.1101/gr.155127.113
-
Meander: visually exploring the structural variome using space-filling curves.
Department of Electrical Engineering (ESAT/SCD), University of Leuven, Kasteelpark Arenberg 10, Box 2446, 3001 Leuven, Belgium, iMinds Future Health Department, University of Leuven, Kasteelpark Arenberg 10, Box 2446, 3001 Leuven, Belgium, Division of Basic Sciences, University of Crete, Medical School, Heraklion, 71110 Crete, Greece, Laboratory of Reproductive Genomics, Department of Human Genetics, University of Leuven, Herestraat 49, 3000 Leuven, Belgium and Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton - Cambridge, CB10 1SA, UK.
The introduction of next generation sequencing methods in genome studies has made it possible to shift research from a gene-centric approach to a genome wide view. Although methods and tools to detect single nucleotide polymorphisms are becoming more mature, methods to identify and visualize structural variation (SV) are still in their infancy. Most genome browsers can only compare a given sequence to a reference genome; therefore, direct comparison of multiple individuals still remains a challenge. Therefore, the implementation of efficient approaches to explore and visualize SVs and directly compare two or more individuals is desirable. In this article, we present a visualization approach that uses space-filling Hilbert curves to explore SVs based on both read-depth and paired-end information. An interactive open-source Java application, called Meander, implements the proposed methodology, and its functionality is demonstrated using two cases. With Meander, users can explore variations at different levels of resolution and simultaneously compare up to four different individuals against a common reference. The application was developed using Java version 1.6 and Processing.org and can be run on any platform. It can be found at http://homes.esat.kuleuven.be/~bioiuser/meander.
Nucleic acids research 2013
PUBMED: 23605045; DOI: 10.1093/nar/gkt254
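The idea behind a space-filling Hilbert curve layout can be sketched briefly: a one-dimensional position along the genome is mapped to a cell of a square grid so that neighbouring genomic bins stay neighbours in the image. The Python sketch below uses the standard index-to-coordinate conversion for a Hilbert curve; it is not taken from Meander (which is written in Java), and the function name and the toy read-depth values are hypothetical.

```python
def hilbert_d2xy(order: int, d: int) -> tuple[int, int]:
    """Map index d (0 <= d < 4**order) along a Hilbert curve to (x, y)
    on a 2**order by 2**order grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate or flip the quadrant so sub-curves join end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def layout(values: list[float], order: int) -> list[list[float]]:
    """Place a 1D series (e.g. per-bin read depth) on a Hilbert-curve grid.
    Unfilled cells are left as 0.0."""
    n = 1 << order
    grid = [[0.0] * n for _ in range(n)]
    for d, value in enumerate(values[: n * n]):
        x, y = hilbert_d2xy(order, d)
        grid[y][x] = value
    return grid


# Toy example: 16 genomic bins with one region of doubled read depth.
depths = [1.0] * 6 + [2.0] * 4 + [1.0] * 6
for row in layout(depths, order=2):
    print(row)
```

Because consecutive indices always land in adjacent cells, a contiguous run of aberrant bins appears as a compact patch in the grid rather than a thin line, which is the property that makes this layout useful for scanning a whole chromosome at once.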
-
Genetic variants associated with warfarin dose in African-American individuals: a genome-wide association study.
Section of Genetic Medicine, Department of Medicine, University of Chicago, IL, USA.
BACKGROUND: VKORC1 and CYP2C9 are important contributors to warfarin dose variability, but explain less variability for individuals of African descent than for those of European or Asian descent. We aimed to identify additional variants contributing to warfarin dose requirements in African Americans. METHODS: We did a genome-wide association study of discovery and replication cohorts. Samples from African-American adults (aged ≥18 years) who were taking a stable maintenance dose of warfarin were obtained at International Warfarin Pharmacogenetics Consortium (IWPC) sites and the University of Alabama at Birmingham (Birmingham, AL, USA). Patients enrolled at IWPC sites but who were not used for discovery made up the independent replication cohort. All participants were genotyped. We did a stepwise conditional analysis, conditioning first for VKORC1 -1639G→A, followed by the composite genotype of CYP2C9*2 and CYP2C9*3. We prespecified a genome-wide significance threshold of p<5×10(-8) in the discovery cohort and p<0·0038 in the replication cohort. FINDINGS: The discovery cohort contained 533 participants and the replication cohort 432 participants. After the prespecified conditioning in the discovery cohort, we identified an association between a novel single nucleotide polymorphism in the CYP2C cluster on chromosome 10 (rs12777823) and warfarin dose requirement that reached genome-wide significance (p=1·51×10(-8)). This association was confirmed in the replication cohort (p=5·04×10(-5)); analysis of the two cohorts together produced a p value of 4·5×10(-12). Individuals heterozygous for the rs12777823 A allele need a dose reduction of 6·92 mg/week and those homozygous 9·34 mg/week. Regression analysis showed that the inclusion of rs12777823 significantly improves warfarin dose variability explained by the IWPC dosing algorithm (21% relative improvement). 
INTERPRETATION: A novel CYP2C single nucleotide polymorphism exerts a clinically relevant effect on warfarin dose in African Americans, independent of CYP2C9*2 and CYP2C9*3. Incorporation of this variant into pharmacogenetic dosing algorithms could improve warfarin dose prediction in this population. FUNDING: National Institutes of Health, American Heart Association, Howard Hughes Medical Institute, Wisconsin Network for Health Research, and the Wellcome Trust.
Lancet 2013
PUBMED: 23755828; DOI: 10.1016/S0140-6736(13)60681-9
-
Computational proteomics pitfalls and challenges: HavanaBioinfo 2012 Workshop report.
Center for Genetic Engineering and Biotechnology, Ave 31 e/158 y 190, Cubanacán, Playa, Havana, Cuba; European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK. Electronic address: yasset.perez@biocomp.cigb.edu.cu.
The workshop "Bioinformatics for Biotechnology Applications (HavanaBioinfo 2012)", held December 8-11, 2012 in Havana, aimed at exploring new bioinformatics tools and approaches for large-scale proteomics, genomics and chemoinformatics. Major conclusions of the workshop include the following: (i) development of new applications and bioinformatics tools for proteomic repository analysis is crucial; current proteomic repositories contain enough data (spectra/identifications) that can be used to increase the annotations in protein databases and to generate new tools for protein identification; (ii) spectral libraries, de novo sequencing and database search tools should be combined to increase the number of protein identifications; (iii) protein probabilities and FDR are not yet sufficiently mature; (iv) computational proteomics software needs to become more intuitive; and at the same time appropriate education and training should be provided to help in the efficient exchange of knowledge between mass spectrometrists and experimental biologists and bioinformaticians in order to increase their bioinformatics background, especially statistics knowledge.
Journal of proteomics 2013
PUBMED: 23376229; DOI: 10.1016/j.jprot.2013.01.019
-
Automatic event detection within thrombus formation based on integer programming
Lecture Notes in Computer Science 2013;7766;215-24
DOI: 10.1007/978-3-642-36620-8_21; URL: http://link.springer.com/chapter.../10.1007%2F978-3-642-36620-8_21
-
Recombination-mediated genetic engineering of Plasmodium berghei DNA.
Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.
DNA of Plasmodium berghei is difficult to manipulate in Escherichia coli by conventional restriction and ligation methods due to its high content of adenine and thymine (AT) nucleotides. This limits our ability to clone large genes and to generate complex vectors for modifying the parasite genome. We here describe a protocol for using lambda Red recombinase to modify inserts of a P. berghei genomic DNA library constructed in a linear, low-copy, phage-derived vector. The method uses primer extensions of 50 bp, which provide sufficient homology for an antibiotic resistance marker to recombine efficiently with a P. berghei genomic DNA insert in E. coli. In a subsequent in vitro Gateway reaction the bacterial marker is replaced with a cassette for selection in P. berghei. The insert is then released and used for transfection. The basic techniques we describe here can be adapted to generate highly efficient vectors for gene deletion, tagging, targeted mutagenesis, or genetic complementation with larger genomic regions.
Methods in molecular biology (Clifton, N.J.) 2013;923;127-38
PUBMED: 22990774; DOI: 10.1007/978-1-62703-026-7_8
-
Identification of Salmonella enterica Serovar Typhi Genotypes by Use of Rapid Multiplex Ligation-Dependent Probe Amplification.
The Hospital for Tropical Diseases, Wellcome Trust Major Overseas Programme, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam.
Salmonella enterica serovar Typhi, the causative agent of typhoid fever, is highly clonal and genetically conserved, making isolate subtyping difficult. We describe a standardized multiplex ligation-dependent probe amplification (MLPA) genotyping scheme targeting 11 key phylogenetic markers of the S. Typhi genome. The MLPA method demonstrated 90% concordance with single nucleotide polymorphism (SNP) typing, the gold standard for S. Typhi genotyping, and had the ability to identify isolates of the H58 haplotype, which is associated with resistance to multiple antimicrobials. Additionally, the assay permitted the detection of fluoroquinolone resistance-associated mutations in the DNA gyrase-encoding gene gyrA and the topoisomerase gene parC with a sensitivity of 100%. The MLPA methodology is simple and reliable, providing phylogenetically and phenotypically relevant genotyping information. This MLPA scheme offers a more-sensitive and interpretable alternative to the nonphylogenetic subgrouping methodologies that are currently used in reference and research laboratories in areas where typhoid is endemic.
Journal of clinical microbiology 2013;51;9;2950-8
PUBMED: 23824765; DOI: 10.1128/JCM.01010-13
-
A genome-wide mutagenesis screen identifies multiple genes contributing to Vi capsular expression in Salmonella Typhi.
The Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
A transposon-based, genome-wide mutagenesis screen exploiting the killing activity of a lytic ViII bacteriophage was used to identify Salmonella Typhi genes that contribute to Vi polysaccharide capsule expression. Genes enriched in the screen included those within the viaB locus (tviABCDE, vexABCDE) as well as oxyR, barA/sirA and yrfF, which have not previously been associated with Vi expression. The role of these genes in Vi expression was confirmed by constructing defined null mutant derivatives of S. Typhi and these were negative for Vi expression as determined by agglutination assays with Vi-specific sera or susceptibility to Vi-targeting bacteriophage. Transcriptome analysis confirmed a reduction in expression from the viaB locus in these S. Typhi mutant derivatives and defined regulatory networks associated with Vi expression.
Journal of bacteriology 2013
PUBMED: 23316043; DOI: 10.1128/JB.01632-12
-
Genome Wide Association Analysis of a Founder Population Identified TAF3 as a Gene for MCHC in Humans.
Division of Genetics and Cell Biology, San Raffaele Research Institute and Vita Salute University, Milano, Italy.
Red blood cell traits are highly heritable, but their genetics are poorly defined. Only 5-10% of the total observed variance is explained by the genetic loci found to date, suggesting that additional loci should be sought using approaches other than large meta-analyses. A GWAS (Genome Wide Association Study) for red blood cell traits in a founder population cohort from Northern Italy identified a new locus for mean corpuscular hemoglobin concentration (MCHC) in the TAF3 gene. The association was replicated in two cohorts (rs1887582, P = 4.25E-09). TAF3 encodes a transcription cofactor that participates in the core promoter recognition complex, and is involved in zebrafish and mouse erythropoiesis. We show here that TAF3 is required for transcription of the SPTA1 gene, encoding alpha spectrin, one of the proteins that link the plasma membrane to the actin cytoskeleton. Mutations in SPTA1 are responsible for hereditary spherocytosis, a monogenic disorder of MCHC, as well as for the normal MCHC level. Based on our results, we propose that TAF3 is required for normal erythropoiesis in humans and that it might have a role in controlling the ratio between hemoglobin (Hb) and cell volume and in the dynamics of RBC maturation in healthy individuals. Finally, TAF3 represents a potential candidate or a modifier gene for disorders of the red cell membrane.
PloS one 2013;8;7;e69206
PUBMED: 23935956; PMC: 3729833; DOI: 10.1371/journal.pone.0069206
-
NDUFA4 Mutations Underlie Dysfunction of a Cytochrome c Oxidase Subunit Linked to Human Neurological Disease.
MRC Centre for Neuromuscular Diseases, UCL Institute of Neurology and National Hospital for Neurology and Neurosurgery, Queen Square, London WC1N 3BG, UK.
The molecular basis of cytochrome c oxidase (COX, complex IV) deficiency remains genetically undetermined in many cases. Homozygosity mapping and whole-exome sequencing were performed in a consanguineous pedigree with isolated COX deficiency linked to a Leigh syndrome neurological phenotype. Unexpectedly, affected individuals harbored homozygous splice donor site mutations in NDUFA4, a gene previously assigned to encode a mitochondrial respiratory chain complex I (NADH:ubiquinone oxidoreductase) subunit. Western blot analysis of denaturing gels and immunocytochemistry revealed undetectable steady-state NDUFA4 protein levels, indicating that the mutation causes a loss-of-function effect in the homozygous state. Analysis of one- and two-dimensional blue-native polyacrylamide gels confirmed an interaction between NDUFA4 and the COX enzyme complex in control muscle, whereas the COX enzyme complex without NDUFA4 was detectable with no abnormal subassemblies in patient muscle. These observations support recent work in cell lines suggesting that NDUFA4 is an additional COX subunit and demonstrate that NDUFA4 mutations cause human disease. Our findings support reassignment of the NDUFA4 protein to complex IV and suggest that patients with unexplained COX deficiency should be screened for NDUFA4 mutations.
Cell reports 2013
PUBMED: 23746447; DOI: 10.1016/j.celrep.2013.05.005
-
High-fat feeding rapidly induces obesity and lipid derangements in C57BL/6N mice.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.
C57BL/6N (B6N) is becoming the standard background for genetic manipulation of the mouse genome. The B6N, whose genome is very closely related to the reference C57BL/6J genome, is versatile in a wide range of phenotyping and experimental settings and large repositories of B6N ES cells have been developed. Here, we present a series of studies showing the baseline characteristics of B6N fed a high-fat diet (HFD) for up to 12 weeks. We show that HFD-fed B6N mice show increased weight gain, fat mass, and hypercholesterolemia compared to control diet-fed mice. In addition, HFD-fed B6N mice display a rapid onset of lipid accumulation in the liver with both macro- and microvacuolation, which became more severe with increasing duration of HFD. Our results suggest that the B6N mouse strain is a versatile background for studying diet-induced metabolic syndrome and may also represent a model for early nonalcoholic fatty liver disease.
Mammalian genome : official journal of the International Mammalian Genome Society 2013
PUBMED: 23712496; DOI: 10.1007/s00335-013-9456-0
-
Genome-wide mutational signatures of aristolochic acid and its application as a screening tool.
NCCS-VARI Translational Research Laboratory, Division of Medical Sciences, National Cancer Centre Singapore, 11 Hospital Drive, Singapore 169610, Singapore.
Aristolochic acid (AA), a natural product of Aristolochia plants found in herbal remedies and health supplements, is a group 1 carcinogen that can cause nephrotoxicity and upper urinary tract urothelial cell carcinoma (UTUC). Whole-genome and exome analysis of nine AA-associated UTUCs revealed a strikingly high somatic mutation rate (150 mutations/Mb), exceeding smoking-associated lung cancer (8 mutations/Mb) and ultraviolet radiation-associated melanoma (111 mutations/Mb). The AA-UTUC mutational signature was characterized by A:T to T:A transversions at the sequence motif A[C|T]AGG, located primarily on nontranscribed strands. AA-induced mutations were also significantly enriched at splice sites, suggesting a role for splice-site mutations in UTUC pathogenesis. RNA sequencing of AA-UTUC confirmed a general up-regulation of nonsense-mediated decay machinery components and aberrant splicing events associated with splice-site mutations. We observed a high frequency of somatic mutations in chromatin modifiers, particularly KDM6A, in AA-UTUC, demonstrated the sufficiency of AA to induce renal dysplasia in mice, and reproduced the AA mutational signature in experimentally treated human renal tubular cells. Finally, exploring other malignancies that were not known to be associated with AA, we screened 93 hepatocellular carcinoma genomes/exomes and identified AA-like mutational signatures in 11. Our study highlights an unusual genome-wide AA mutational signature and the potential use of mutation signatures as "molecular fingerprints" for interrogating high-throughput cancer genome data to infer previous carcinogen exposures.
Science translational medicine 2013;5;197;197ra101
PUBMED: 23926199; DOI: 10.1126/scitranslmed.3006086
-
A meta-analysis of thyroid-related traits reveals novel Loci and gender-specific differences in the regulation of thyroid function.
Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche, c/o Cittadella Universitaria di Monserrato, Monserrato, Cagliari, Italy ; Dipartimento di Scienze Biomediche, Università di Sassari, Sassari, Italy.
Thyroid hormone is essential for normal metabolism and development, and overt abnormalities in thyroid function lead to common endocrine disorders affecting approximately 10% of individuals over their life span. In addition, even mild alterations in thyroid function are associated with weight changes, atrial fibrillation, osteoporosis, and psychiatric disorders. To identify novel variants underlying thyroid function, we performed a large meta-analysis of genome-wide association studies for serum levels of the highly heritable thyroid function markers TSH and FT4, in up to 26,420 and 17,520 euthyroid subjects, respectively. Here we report 26 independent associations, including several novel loci for TSH (PDE10A, VEGFA, IGFBP5, NFIA, SOX9, PRDM11, FGF7, INSR, ABO, MIR1179, NRG1, MBIP, ITPK1, SASH1, GLIS3) and FT4 (LHX3, FOXE1, AADAT, NETO1/FBXO15, LPCAT2/CAPNS2). Notably, only limited overlap was detected between TSH and FT4 associated signals, in spite of the feedback regulation of their circulating levels by the hypothalamic-pituitary-thyroid axis. Five of the reported loci (PDE8B, PDE10A, MAF/LOC440389, NETO1/FBXO15, and LPCAT2/CAPNS2) show strong gender-specific differences, which offer clues for the known sexual dimorphism in thyroid function and related pathologies. Importantly, the TSH-associated loci contribute not only to variation within the normal range, but also to TSH values outside the reference range, suggesting that they may be involved in thyroid dysfunction. Overall, our findings explain, respectively, 5.64% and 2.30% of total TSH and FT4 trait variance, and they improve the current knowledge of the regulation of hypothalamic-pituitary-thyroid axis function and the consequences of genetic variation for hypo- or hyperthyroidism.
PLoS genetics 2013;9;2;e1003266
PUBMED: 23408906; PMC: 3567175; DOI: 10.1371/journal.pgen.1003266
-
Comparative study of transcriptome profiles of mechanical- and skin-transformed Schistosoma mansoni schistosomula.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom.
Schistosome infection begins with the penetration of cercariae through healthy unbroken host skin. This process leads to the transformation of the free-living larvae into obligate parasites called schistosomula. This irreversible transformation, which occurs in as little as two hours, involves casting the cercaria tail and complete remodelling of the surface membrane. At this stage, parasites are vulnerable to host immune attack and oxidative stress. Consequently, the mechanisms by which the parasite recognises and swiftly adapts to the human host are still the subject of many studies, especially in the context of development of intervention strategies against schistosomiasis infection. Because obtaining enough material from in vivo infections is not always feasible for such studies, the transformation process is often mimicked in the laboratory by application of shear pressure to a cercarial sample resulting in mechanically transformed (MT) schistosomula. These parasites share remarkable morphological and biochemical similarity to the naturally transformed counterparts and have been considered a good proxy for parasites undergoing natural infection. Relying on this equivalency, MT schistosomula have been used almost exclusively in high-throughput studies of gene expression, identification of drug targets and identification of effective drugs against schistosomes. However, the transcriptional equivalency between skin-transformed (ST) and MT schistosomula has never been proven. In our approach to compare these two types of schistosomula preparations and to explore differences in gene expression triggered by the presence of a skin barrier, we performed RNA-seq transcriptome profiling of ST and MT schistosomula at 24 hours post transformation. We report that these two very distinct schistosomula preparations differ only in the expression of 38 genes (out of ∼11,000), providing convincing evidence to resolve the skin vs. mechanical long-lasting controversy.
Funded by: Wellcome Trust: WT 083931/Z/07/Z, WT 098051
PLoS neglected tropical diseases 2013;7;3;e2091
PUBMED: 23516644; PMC: 3597483; DOI: 10.1371/journal.pntd.0002091
-
Targeting MYCN in Neuroblastoma by BET Bromodomain Inhibition.
Departments of Pediatric Oncology and Medical Oncology, Dana-Farber Cancer Institute; Boston Children's Hospital; Department of Medicine, Harvard Medical School; Bioinformatics Graduate Program, Boston University, Boston; The Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge; Massachusetts General Hospital Cancer Center, Harvard Medical School, Charlestown, Massachusetts; Department of Pediatrics, Helen Diller Family Comprehensive Cancer Center; Departments of Neurology and Neurosurgery, Brain Tumor Research Center, University of California, San Francisco, San Francisco, California; and Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom.
Bromodomain inhibition comprises a promising therapeutic strategy in cancer, particularly for hematologic malignancies. To date, however, genomic biomarkers to direct clinical translation have been lacking. We conducted a cell-based screen of genetically defined cancer cell lines using a prototypical inhibitor of BET bromodomains. Integration of genetic features with chemosensitivity data revealed a robust correlation between MYCN amplification and sensitivity to bromodomain inhibition. We characterized the mechanistic and translational significance of this finding in neuroblastoma, a childhood cancer with frequent amplification of MYCN. Genome-wide expression analysis showed downregulation of the MYCN transcriptional program accompanied by suppression of MYCN transcription. Functionally, bromodomain-mediated inhibition of MYCN impaired growth and induced apoptosis in neuroblastoma. BRD4 knockdown phenocopied these effects, establishing BET bromodomains as transcriptional regulators of MYCN. BET inhibition conferred a significant survival advantage in 3 in vivo neuroblastoma models, providing a compelling rationale for developing BET bromodomain inhibitors in patients with neuroblastoma.
Funded by: NCI NIH HHS: P01 CA081403, R01 CA102321
Cancer discovery 2013
PUBMED: 23430699; DOI: 10.1158/2159-8290.CD-12-0418
-
SpoIVA and SipL Are Clostridium difficile Spore Morphogenetic Proteins.
Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont, USA.
Clostridium difficile is a major nosocomial pathogen whose infections are difficult to treat because of their frequent recurrence. The spores of C. difficile are responsible for these clinical features, as they resist common disinfectants and antibiotic treatment. Although spores are the major transmissive form of C. difficile, little is known about their composition or morphogenesis. Spore morphogenesis has been well characterized for Bacillus sp., but Bacillus sp. spore coat proteins are poorly conserved in Clostridium sp. Of the known spore morphogenetic proteins in Bacillus subtilis, SpoIVA is one of the most highly conserved in the Bacilli and the Clostridia. Using genetic analyses, we demonstrate that SpoIVA is required for proper spore morphogenesis in C. difficile. In particular, a spoIVA mutant exhibits defects in spore coat localization but not cortex formation. Our study also identifies SipL, a previously uncharacterized protein found in proteomic studies of C. difficile spores, as another critical spore morphogenetic protein, since a sipL mutant phenocopies a spoIVA mutant. Biochemical analyses and mutational analyses indicate that SpoIVA and SipL directly interact. This interaction depends on the Walker A ATP binding motif of SpoIVA and the LysM domain of SipL. Collectively, these results provide the first insights into spore morphogenesis in C. difficile.
Funded by: NIGMS NIH HHS: R00 GM092934
Journal of bacteriology 2013;195;6;1214-25
PUBMED: 23292781; PMC: 3592010; DOI: 10.1128/JB.02181-12
-
A genetic progression model of Braf(V600E)-induced intestinal tumorigenesis reveals targets for therapeutic intervention.
Department of Medicine II, Klinikum Rechts der Isar, Technische Universität München, 81675, München, Germany. roland.rad@lrz.tum.de
We show that BRAF(V600E) initiates an alternative pathway to colorectal cancer (CRC), which progresses through a hyperplasia/adenoma/carcinoma sequence. This pathway underlies significant subsets of CRCs with distinctive pathomorphologic/genetic/epidemiologic/clinical characteristics. Genetic and functional analyses in mice revealed a series of stage-specific molecular alterations driving different phases of tumor evolution and uncovered mechanisms underlying this stage specificity. We further demonstrate dose-dependent effects of oncogenic signaling, with physiologic Braf(V600E) expression being sufficient for hyperplasia induction, but later stage intensified Mapk-signaling driving both tumor progression and activation of intrinsic tumor suppression. Such phenomena explain, for example, the inability of p53 to restrain tumor initiation as well as its importance in invasiveness control, and the late stage specificity of its somatic mutation. Finally, systematic drug screening revealed sensitivity of this CRC subtype to targeted therapeutics, including Mek or combinatorial PI3K/Braf inhibition.
Funded by: Wellcome Trust
Cancer cell 2013;24;1;15-29
PUBMED: 23845441; PMC: 3706745; DOI: 10.1016/j.ccr.2013.05.014
-
Dnmt2-dependent methylomes lack defined DNA methylation patterns.
Division of Epigenetics, DKFZ-ZMBH Alliance, German Cancer Research Center, 69120 Heidelberg, Germany.
Several organisms have retained methyltransferase 2 (Dnmt2) as their only candidate DNA methyltransferase gene. However, information about Dnmt2-dependent methylation patterns has been limited to a few isolated loci and the results have been discussed controversially. In addition, recent studies have shown that Dnmt2 functions as a tRNA methyltransferase, which raised the possibility that Dnmt2-only genomes might be unmethylated. We have now used whole-genome bisulfite sequencing to analyze the methylomes of Dnmt2-only organisms at single-base resolution. Our results show that the genomes of Schistosoma mansoni and Drosophila melanogaster lack detectable DNA methylation patterns. Residual unconverted cytosine residues shared many attributes with bisulfite deamination artifacts and were observed at comparable levels in Dnmt2-deficient flies. Furthermore, genetically modified Dnmt2-only mouse embryonic stem cells lost the DNA methylation patterns found in wild-type cells. Our results thus uncover fundamental differences among animal methylomes and suggest that DNA methylation is dispensable for a considerable number of eukaryotic organisms.
Proceedings of the National Academy of Sciences of the United States of America 2013;110;21;8627-31
PUBMED: 23641003; DOI: 10.1073/pnas.1306723110
-
Rare variants in single-minded 1 (SIM1) are associated with severe obesity.
Single-minded 1 (SIM1) is a basic helix-loop-helix transcription factor involved in the development and function of the paraventricular nucleus of the hypothalamus. Obesity has been reported in Sim1 haploinsufficient mice and in a patient with a balanced translocation disrupting SIM1. We sequenced the coding region of SIM1 in 2,100 patients with severe, early onset obesity and in 1,680 controls. Thirteen different heterozygous variants in SIM1 were identified in 28 unrelated severely obese patients. Nine of the 13 variants significantly reduced the ability of SIM1 to activate a SIM1-responsive reporter gene when studied in stably transfected cells coexpressing the heterodimeric partners of SIM1 (ARNT or ARNT2). SIM1 variants with reduced activity cosegregated with obesity in extended family studies with variable penetrance. We studied the phenotype of patients carrying variants that exhibited reduced activity in vitro. Variant carriers exhibited increased ad libitum food intake at a test meal, normal basal metabolic rate, and evidence of autonomic dysfunction. Eleven of the 13 probands had evidence of a neurobehavioral phenotype. The phenotypic similarities between patients with SIM1 deficiency and melanocortin 4 receptor (MC4R) deficiency suggest that some of the effects of SIM1 deficiency on energy homeostasis are mediated by altered melanocortin signaling.
The Journal of clinical investigation 2013
PUBMED: 23778139; PMC: 3696558; DOI: 10.1172/JCI68016
-
DeNovoGear: de novo indel and point mutation discovery and phasing.
Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, USA.
We present DeNovoGear software for analyzing de novo mutations from familial and somatic tissue sequencing data. DeNovoGear uses likelihood-based error modeling to reduce the false positive rate of mutation discovery in exome analysis and fragment information to identify the parental origin of germ-line mutations. We used DeNovoGear on human whole-genome sequencing data to produce a set of predicted de novo insertion and/or deletion (indel) mutations with a 95% validation rate.
Nature methods 2013
PUBMED: 23975140; DOI: 10.1038/nmeth.2611
-
Sex-stratified Genome-wide Association Studies Including 270,000 Individuals Show Sexual Dimorphism in Genetic Loci for Anthropometric Traits.
Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom ; Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom.
Given the anthropometric differences between men and women and previous evidence of sex-difference in genetic effects, we conducted a genome-wide search for sexually dimorphic associations with height, weight, body mass index, waist circumference, hip circumference, and waist-to-hip-ratio (133,723 individuals) and took forward 348 SNPs into follow-up (additional 137,052 individuals) in a total of 94 studies. Seven loci displayed significant sex-difference (FDR<5%), including four previously established (near GRB14/COBLL1, LYPLAL1/SLC30A10, VEGFA, ADAMTS9) and three novel anthropometric trait loci (near MAP3K1, HSD17B4, PPARG), all of which were genome-wide significant in women (P<5×10^-8), but not in men. Sex-differences were apparent only for waist phenotypes, not for height, weight, BMI, or hip circumference. Moreover, we found no evidence for genetic effects with opposite directions in men versus women. The PPARG locus is of specific interest due to its role in diabetes genetics and therapy. Our results demonstrate the value of sex-specific GWAS to unravel the sexually dimorphic genetic underpinning of complex traits.
PLoS genetics 2013;9;6;e1003500
PUBMED: 23754948; PMC: 3674993; DOI: 10.1371/journal.pgen.1003500
-
Cake: a bioinformatics pipeline for the integrated analysis of somatic variants in cancer genomes.
Experimental Cancer Genetics, Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1HH.
Summary: We have developed Cake, a bioinformatics software pipeline that integrates four publicly available somatic variant-calling algorithms to identify single nucleotide variants with higher sensitivity and accuracy than any one algorithm alone. Cake can be run on a high-performance computer cluster or used as a standalone application. Availability: Cake is open-source and is available from http://cakesomatic.sourceforge.net/ Contact: da1@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
Bioinformatics (Oxford, England) 2013
PUBMED: 23803469; DOI: 10.1093/bioinformatics/btt371
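To illustrate the general idea of combining several callers, the sketch below keeps a single nucleotide variant only when a chosen minimum number of callers report it. This is an assumption made for illustration only: the abstract does not say how Cake merges its four algorithms, and the caller names, the `min_callers` threshold and the tuple representation of a variant are all hypothetical.

```python
from collections import Counter

# Hypothetical variant key: (chromosome, 1-based position, ref allele, alt allele).
Variant = tuple[str, int, str, str]


def consensus_variants(
    calls: dict[str, set[Variant]], min_callers: int = 2
) -> list[Variant]:
    """Return the variants reported by at least `min_callers` callers, sorted."""
    # Count how many callers support each distinct variant.
    support = Counter(v for caller_set in calls.values() for v in caller_set)
    return sorted(v for v, n in support.items() if n >= min_callers)


if __name__ == "__main__":
    # Made-up calls from four placeholder callers.
    example_calls = {
        "caller_a": {("1", 100, "A", "T"), ("1", 200, "G", "C")},
        "caller_b": {("1", 100, "A", "T")},
        "caller_c": {("1", 100, "A", "T"), ("2", 5, "C", "T")},
        "caller_d": {("2", 5, "C", "T")},
    }
    for variant in consensus_variants(example_calls, min_callers=2):
        print(variant)
```

Raising `min_callers` trades sensitivity for specificity; a real pipeline would also need to normalise variant representations across callers before counting.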
-
Combined sequence-based and genetic mapping analysis of complex traits in outbred rats.
Wellcome Trust Centre for Human Genetics, Oxford, UK.
Genetic mapping on fully sequenced individuals is transforming understanding of the relationship between molecular variation and variation in complex traits. Here we report a combined sequence and genetic mapping analysis in outbred rats that maps 355 quantitative trait loci for 122 phenotypes. We identify 35 causal genes involved in 31 phenotypes, implicating new genes in models of anxiety, heart disease and multiple sclerosis. The relationship between sequence and genetic variation is unexpectedly complex: at approximately 40% of quantitative trait loci, a single sequence variant cannot account for the phenotypic effect. Using comparable sequence and mapping data from mice, we show that the extent and spatial pattern of variation in inbred rats differ substantially from those of inbred mice and that the genetic variants in orthologous genes rarely contribute to the same phenotype in both species.
Nature genetics 2013;45;7;767-75
PUBMED: 23708188; DOI: 10.1038/ng.2644
-
Identification and prioritization of novel uncharacterized peptidases for biochemical characterization.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Genome sequencing projects are generating enormous amounts of biological data that require analysis, which in turn identifies genes and proteins that require characterization. Enzymes that act on proteins are especially difficult to characterize because of the time required to distinguish one from another. This is particularly true of peptidases, the enzymes that activate, inactivate and degrade proteins. This article aims to identify clusters of sequences each of which represents the species variants of a single putative peptidase that is widely distributed and thus merits biochemical characterization. The MEROPS database maintains large collections of sequences, references, substrate cleavage positions and inhibitor interactions of peptidases and their homologues. MEROPS also maintains a hierarchical classification of peptidase homologues, in which sequences are clustered as species variants of a single peptidase; homologous sequences are assembled into a family; and families are clustered into a clan. For each family, an alignment and a phylogenetic tree are generated. By assigning an identifier to a peptidase that has been biochemically characterized from a particular species (called a holotype), the identifier can be automatically extended to sequences from other species that cluster with the holotype. This permits transference of annotation from the holotype to other members of the cluster. By extending this concept to all peptidase homologues (including those of unknown function that have not been characterized) from model organisms representing all the major divisions of cellular life, clusters of sequences representing putative peptidases can also be identified. The 42 most widely distributed of these putative peptidases have been identified and discussed here and are prioritized as ideal candidates for biochemical characterization. Database URL: http://merops.sanger.ac.uk.
Database : the journal of biological databases and curation 2013;2013;bat022
PUBMED: 23584835; PMC: 3625958; DOI: 10.1093/database/bat022
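The holotype mechanism described in this abstract can be pictured with a small data-structure sketch: a cluster holds one characterized sequence (the holotype) and its species variants, and the holotype's identifier is copied to every member that lacks one. The class names, fields and the identifier string below are hypothetical and are not taken from the MEROPS schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Sequence:
    accession: str                     # hypothetical accession string
    species: str
    identifier: Optional[str] = None   # peptidase identifier, if assigned


@dataclass
class Cluster:
    holotype: Sequence                 # the biochemically characterized sequence
    members: list[Sequence] = field(default_factory=list)

    def propagate_identifier(self) -> int:
        """Copy the holotype identifier to unannotated members.

        Returns the number of members that were updated.
        """
        if self.holotype.identifier is None:
            raise ValueError("holotype has no identifier to propagate")
        updated = 0
        for seq in self.members:
            if seq.identifier is None:
                seq.identifier = self.holotype.identifier
                updated += 1
        return updated


if __name__ == "__main__":
    holo = Sequence("ACC0001", "Homo sapiens", identifier="EXAMPLE.001")
    cluster = Cluster(holo, [Sequence("ACC0002", "Mus musculus"),
                             Sequence("ACC0003", "Danio rerio")])
    print(cluster.propagate_identifier(), "sequences annotated")
```

In this sketch, how a sequence comes to belong to a cluster (the alignment and tree step) is left out entirely; only the transfer of annotation is shown.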
-
Genes involved in host-parasite interactions can be revealed by their correlated expression.
Parasite genomics group, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK. ar11@sanger.ac.uk
Molecular interactions between a parasite and its host are key to the ability of the parasite to enter the host and persist. Our understanding of the genes and proteins involved in these interactions is limited. To better understand these processes it would be advantageous to have a range of methods to predict pairs of genes involved in such interactions. Correlated gene expression profiles can be used to identify molecular interactions within a species. Here we have extended the concept to different species, showing that genes with correlated expression are more likely to encode proteins, which directly or indirectly participate in host-parasite interaction. We go on to examine our predictions of molecular interactions between the malaria parasite and both its mammalian host and insect vector. Our approach could be applied to study any interaction between species, for example, between a host and its parasites or pathogens, but also symbiotic and commensal pairings.
Funded by: Wellcome Trust: 098051
Nucleic acids research 2013;41;3;1508-18
PUBMED: 23275547; PMC: 3561955; DOI: 10.1093/nar/gks1340
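The notion of correlated expression across two species can be sketched as follows: given expression profiles measured over the same matched samples, compute a correlation coefficient for each host-parasite gene pair and keep the strongly correlated ones. The use of Pearson correlation, the `threshold` value and the gene names are assumptions made for this example; the abstract does not specify the statistic or any cut-off.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def correlated_pairs(
    host: dict[str, list[float]],
    parasite: dict[str, list[float]],
    threshold: float = 0.9,
) -> list[tuple[str, str, float]]:
    """Return (host gene, parasite gene, r) with |r| >= threshold, strongest first."""
    pairs = []
    for host_gene, host_profile in host.items():
        for parasite_gene, parasite_profile in parasite.items():
            r = pearson(host_profile, parasite_profile)
            if abs(r) >= threshold:
                pairs.append((host_gene, parasite_gene, r))
    return sorted(pairs, key=lambda t: -abs(t[2]))


if __name__ == "__main__":
    # Made-up expression values over four matched samples.
    host = {"hostA": [1.0, 2.0, 3.0, 4.0]}
    parasite = {"parX": [2.0, 4.0, 6.0, 8.0], "parY": [1.0, -1.0, 1.0, -1.0]}
    print(correlated_pairs(host, parasite))
```

Comparing every host gene with every parasite gene scales with the product of the two gene counts, so a genome-scale analysis would need multiple-testing control, which this sketch omits.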
-
Secretory meningiomas are defined by combined KLF4 K409Q and TRAF7 mutations.
Department of Neuropathology, Institute of Pathology, Ruprecht-Karls-University Heidelberg, Im Neuenheimer Feld 224, 69120, Heidelberg, Germany.
Meningiomas are among the most frequent intracranial tumors. The secretory variant of meningioma is characterized by glandular differentiation, formation of intracellular lumina and pseudopsammoma bodies, expression of a distinct pattern of cytokeratins and clinically by pronounced perifocal brain edema. Here we describe whole-exome sequencing analysis of DNA from 16 secretory meningiomas and corresponding constitutional tissues. All secretory meningiomas invariably harbored a mutation in both KLF4 and TRAF7. Validation in an independent cohort of 14 secretory meningiomas by Sanger sequencing or derived cleaved amplified polymorphic sequence (dCAPS) assay detected the same pattern, with KLF4 mutations observed in a total of 30/30 and TRAF7 mutations in 29/30 of these tumors. All KLF4 mutations were identical, affected codon 409 and resulted in a lysine to glutamine exchange (K409Q). KLF4 mutations were not found in 89 non-secretory meningiomas, 267 other intracranial tumors including gliomas, glioneuronal tumors, pituitary adenomas and metastases, 59 peripheral nerve sheath tumors and 52 pancreatic tumors. TRAF7 mutations were restricted to the WD40 domains. While KLF4 mutations were exclusively seen in secretory meningiomas, TRAF7 mutations were also observed in 7/89 (8%) of non-secretory meningiomas. KLF4 and TRAF7 mutations were mutually exclusive with NF2 mutations. In conclusion, our findings suggest an essential contribution of combined KLF4 K409Q and TRAF7 mutations in the genesis of secretory meningioma and demonstrate a role for TRAF7 alterations in other non-NF2 meningiomas.
Acta neuropathologica 2013
PUBMED: 23404370; DOI: 10.1007/s00401-013-1093-x
-
Rapid Bacterial Whole-Genome Sequencing to Enhance Diagnostic and Public Health Microbiology.
Wellcome Trust Sanger Institute, Hinxton, England.
IMPORTANCE The latest generation of benchtop DNA sequencing platforms can provide an accurate whole-genome sequence (WGS) for a broad range of bacteria in less than a day. These could be used to more effectively contain the spread of multidrug-resistant pathogens. OBJECTIVE To compare WGS with standard clinical microbiology practice for the investigation of nosocomial outbreaks caused by multidrug-resistant bacteria, the identification of genetic determinants of antimicrobial resistance, and typing of other clinically important pathogens. DESIGN, SETTING, AND PARTICIPANTS A laboratory-based study of hospital inpatients with a range of bacterial infections at Cambridge University Hospitals NHS Foundation Trust, a secondary and tertiary referral center in England, comparing WGS with standard diagnostic microbiology using stored bacterial isolates and clinical information. MAIN OUTCOMES AND MEASURES Specimens were taken and processed as part of routine clinical care, and cultured isolates stored and referred for additional reference laboratory testing as necessary. Isolates underwent DNA extraction and library preparation prior to sequencing on the Illumina MiSeq platform. Bioinformatic analyses were performed by persons blinded to the clinical, epidemiologic, and antimicrobial susceptibility data. RESULTS We investigated 2 putative nosocomial outbreaks, one caused by vancomycin-resistant Enterococcus faecium and the other by carbapenem-resistant Enterobacter cloacae; WGS accurately discriminated between outbreak and nonoutbreak isolates and was superior to conventional typing methods. We compared WGS with standard methods for the identification of the mechanism of carbapenem resistance in a range of gram-negative bacteria (Acinetobacter baumannii, E cloacae, Escherichia coli, and Klebsiella pneumoniae). 
This demonstrated concordance between phenotypic and genotypic results, and the ability to determine whether resistance was attributable to the presence of carbapenemases or other resistance mechanisms. Whole-genome sequencing was used to recapitulate reference laboratory typing of clinical isolates of Neisseria meningitidis and to provide extended phylogenetic analyses of these. CONCLUSIONS AND RELEVANCE The speed, accuracy, and depth of information provided by WGS platforms to confirm or refute outbreaks in hospitals and the community, and to accurately define transmission of multidrug-resistant and other organisms, represents an important advance.
JAMA internal medicine 2013
PUBMED: 23857503; DOI: 10.1001/jamainternmed.2013.7734
-
A pilot study of rapid whole-genome sequencing for the investigation of a Legionella outbreak.
The Wellcome Trust Sanger Institute, Hinxton, UK.
Objectives: Epidemiological investigations of Legionnaires' disease outbreaks rely on the rapid identification and typing of clinical and environmental Legionella isolates in order to identify and control the source of infection. Rapid bacterial whole-genome sequencing (WGS) is an emerging technology that has the potential to rapidly discriminate outbreak from non-outbreak isolates in a clinically relevant time frame. Methods: We performed a pilot study to determine the feasibility of using bacterial WGS to differentiate outbreak from non-outbreak isolates collected during an outbreak of Legionnaires' disease. Seven Legionella isolates (three clinical and four environmental) were obtained from the reference laboratory and sequenced using the Illumina MiSeq platform at Addenbrooke's Hospital, Cambridge. Bioinformatic analysis was performed blinded to the epidemiological data at the Wellcome Trust Sanger Institute. Results: We were able to distinguish outbreak from non-outbreak isolates using bacterial WGS, and to confirm the probable environmental source. Our analysis also highlighted constraints, which were the small number of Legionella pneumophila isolates available for sequencing, and the limited number of published genomes for comparison. Conclusions: We have demonstrated the feasibility of using rapid WGS to investigate an outbreak of Legionnaires' disease. Future work includes building larger genomic databases of L pneumophila from both clinical and environmental sources, developing automated data interpretation software, and conducting a cost-benefit analysis of WGS versus current typing methods.
BMJ open 2013;3;1
PUBMED: 23306006; DOI: 10.1136/bmjopen-2012-002175
-
GWAS of 126,559 Individuals Identifies Genetic Variants Associated with Educational Attainment.
Department of Applied Economics, Erasmus School of Economics, Erasmus University Rotterdam, 3000 DR Rotterdam, The Netherlands.
A genome-wide association study of educational attainment was conducted in a discovery sample of 101,069 individuals and a replication sample of 25,490. Three independent SNPs are genome-wide significant (rs9320913, rs11584700, rs4851266), and all three replicate. Estimated effect sizes are small (R(2) ≈ 0.02%), approximately 1 month of schooling per allele. A linear polygenic score from all measured SNPs accounts for ≈ 2% of the variance in both educational attainment and cognitive function. Genes in the region of the loci have previously been associated with health, cognitive, and central nervous system phenotypes, and bioinformatics analyses suggest the involvement of the anterior caudate nucleus. These findings provide promising candidate SNPs for follow-up work, and our effect size estimates can anchor power analyses in social-science genetics.
Science (New York, N.Y.) 2013
PUBMED: 23722424; DOI: 10.1126/science.1235488
-
μ-Opioid Receptor Gene (OPRM1) Polymorphism A118G: Lack of Association in Finnish Populations with Alcohol Dependence or Alcohol Consumption.
Ministry of Social Affairs and Health, Department of Occupational Safety and Health, PO Box 33, FI-00023 Government, Finland.
Aims: The molecular epidemiological studies on the association of the opioid receptor µ-1 (OPRM1) polymorphism A118G (Asn40Asp, rs1799971) and alcohol use disorders have given conflicting results. The aim of this study was to test the possible association of A118G polymorphism and alcohol use disorders and alcohol consumption in three large cohort-based study samples.
Methods: The association between the OPRM1 A118G (Asn40Asp, rs1799971) polymorphism and alcohol use disorders and alcohol consumption was analyzed using three different population-based samples: (a) a Finnish cohort study, Health 2000, with 503 participants having a DSM-IV diagnosis for alcohol dependence and/or alcohol abuse and 506 age- and sex-matched controls; (b) a Finnish cohort study, FINRISK (n = 2360) and (c) the Helsinki Birth Cohort Study (n = 1384). The latter two populations lacked diagnosis-based phenotypes, but included detailed information on alcohol consumption.
Results: We found no statistically significant differences in genotypic or allelic distribution between controls and subjects with alcohol dependence or abuse diagnoses. Likewise no significant effects were observed between the A118G genotype and alcohol consumption.
Conclusion: These results suggest that A118G (Asn40Asp) polymorphism may not have a major effect on the development of alcohol use disorders at least in the Finnish population.
Alcohol and alcoholism (Oxford, Oxfordshire) 2013;48;5;519-25
PUBMED: 23729673; DOI: 10.1093/alcalc/agt050
-
Identification of nine new susceptibility loci for testicular cancer, including variants near DAZL and PRDM14.
Division of Genetics and Epidemiology, Institute of Cancer Research, Sutton, UK.
Testicular germ cell tumor (TGCT) is the most common cancer in young men and is notable for its high familial risks. So far, six loci associated with TGCT have been reported. From genome-wide association study (GWAS) analysis of 307,291 SNPs in 986 TGCT cases and 4,946 controls, we selected for follow-up 694 SNPs, which we genotyped in a further 1,064 TGCT cases and 10,082 controls from the UK. We identified SNPs at nine new loci (1q22, 1q24.1, 3p24.3, 4q24, 5q31.1, 8q13.3, 16q12.1, 17q22 and 21q22.3) showing association with TGCT (P < 5 × 10(-8)), which together account for an additional 4-6% of the familial risk of TGCT. The loci include genes plausibly related to TGCT development. PRDM14, at 8q13.3, is essential for early germ cell specification, and DAZL, at 3p24.3, is required for the regulation of germ cell development. Furthermore, PITX1, at 5q31.1, regulates TERT expression and is the third TGCT-associated locus implicated in telomerase regulation.
Nature genetics 2013;45;6;686-9
PUBMED: 23666240; PMC: 3680037; DOI: 10.1038/ng.2635
-
Mosaic PPM1D mutations are associated with predisposition to breast and ovarian cancer.
Division of Genetics & Epidemiology, The Institute of Cancer Research, Sutton SM2 5NG, UK.
Improved sequencing technologies offer unprecedented opportunities for investigating the role of rare genetic variation in common disease. However, there are considerable challenges with respect to study design, data analysis and replication. Using pooled next-generation sequencing of 507 genes implicated in the repair of DNA in 1,150 samples, an analytical strategy focused on protein-truncating variants (PTVs) and a large-scale sequencing case-control replication experiment in 13,642 individuals, here we show that rare PTVs in the p53-inducible protein phosphatase PPM1D are associated with predisposition to breast cancer and ovarian cancer. PPM1D PTV mutations were present in 25 out of 7,781 cases versus 1 out of 5,861 controls (P = 1.12 × 10(-5)), including 18 mutations in 6,912 individuals with breast cancer (P = 2.42 × 10(-4)) and 12 mutations in 1,121 individuals with ovarian cancer (P = 3.10 × 10(-9)). Notably, all of the identified PPM1D PTVs were mosaic in lymphocyte DNA and clustered within a 370-base-pair region in the final exon of the gene, carboxy-terminal to the phosphatase catalytic domain. Functional studies demonstrate that the mutations result in enhanced suppression of p53 in response to ionizing radiation exposure, suggesting that the mutant alleles encode hyperactive PPM1D isoforms. Thus, although the mutations cause premature protein truncation, they do not result in the simple loss-of-function effect typically associated with this class of variant, but instead probably have a gain-of-function effect. Our results have implications for the detection and management of breast and ovarian cancer risk. More generally, these data provide new insights into the role of rare and of mosaic genetic variants in common conditions, and the use of sequencing in their identification.
Funded by: Cancer Research UK: C12292/A11174; Medical Research Council: G0000934, G0900747 91070; Wellcome Trust: 068545/Z/02, 090532/Z/09/Z, 091157
Nature 2013;493;7432;406-10
PUBMED: 23242139; DOI: 10.1038/nature11725
-
Evolution of GluN2A/B cytoplasmic domains diversified vertebrate synaptic plasticity and behavior.
Genes to Cognition Programme, Wellcome Trust Sanger Institute, Cambridge, UK.
Two genome duplications early in the vertebrate lineage expanded gene families, including GluN2 subunits of the NMDA receptor. Diversification between the four mammalian GluN2 proteins occurred primarily at their intracellular C-terminal domains (CTDs). To identify shared ancestral functions and diversified subunit-specific functions, we exchanged the exons encoding the GluN2A (also known as Grin2a) and GluN2B (also known as Grin2b) CTDs in two knock-in mice and analyzed the mice's biochemistry, synaptic physiology, and multiple learned and innate behaviors. The eight behaviors were genetically separated into four groups, including one group comprising three types of learning linked to conserved GluN2A/B regions. In contrast, the remaining five behaviors exhibited subunit-specific regulation. GluN2A/B CTD diversification conferred differential binding to cytoplasmic MAGUK proteins and differential forms of long-term potentiation. These data indicate that vertebrate behavior and synaptic signaling acquired increased complexity from the duplication and diversification of ancestral GluN2 genes.
Funded by: Medical Research Council; NIMH NIH HHS: R01 MH060919; Wellcome Trust
Nature neuroscience 2013;16;1;25-32
PUBMED: 23201971; DOI: 10.1038/nn.3277
-
Molecular Characterization of Mutant Mouse Strains Generated from the EUCOMM/KOMP-CSD ES Cell Resource.
The Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK, er1@sanger.ac.uk.
The Sanger Mouse Genetics Project generates knockout mouse strains using the EUCOMM/KOMP-CSD embryonic stem (ES) cell collection and characterizes the consequences of the mutations using a high-throughput primary phenotyping screen. Upon achieving germline transmission, new strains are subject to a panel of quality control (QC) PCR- and qPCR-based assays to confirm the correct targeting, cassette structure, and the presence of the 3' LoxP site (required for the potential conditionality of the allele). We report that over 86 % of the 731 strains studied showed the correct targeting and cassette structure, of which 97 % retained the 3' LoxP site. We discuss the characteristics of the lines that failed QC and postulate that the majority of these may be due to mixed ES cell populations which were not detectable with the original screening techniques employed when creating the ES cell resource.
Mammalian genome : official journal of the International Mammalian Genome Society 2013
PUBMED: 23912999; DOI: 10.1007/s00335-013-9467-x
-
Genomic analysis of a novel spontaneous albino C57BL/6N mouse strain.
The Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, United Kingdom.
We report an albino C57BL/6N mouse strain carrying a spontaneous mutation in the Tyrosinase gene (C57BL/6N-Tyr(cWTSI)). Deep whole genome sequencing of founder mice revealed very little divergence from C57BL/6NJ and C57BL/6N (Taconic). This co-isogenic strain will be of great utility for the International Mouse Phenotyping Consortium (IMPC), which uses the EUCOMM/KOMP targeted C57BL/6N ES cell resource, and other investigators wishing to work on a defined C57BL/6N background.
Genesis (New York, N.Y. : 2000) 2013
PUBMED: 23620107; DOI: 10.1002/dvg.22398
-
Characterization and comparative analysis of the complete Haemonchus contortus β-tubulin gene family and implications for benzimidazole resistance in strongylid nematodes.
Institute of Infection, Immunity and Inflammation, College of Medical, Veterinary and Life Sciences, University of Glasgow, 464 Bearsden Road, Glasgow, Scotland G61 1QH, UK; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Parasitic nematode β-tubulin genes are of particular interest because they are the targets of benzimidazole drugs. However, in spite of this, the full β-tubulin gene family has not been characterized for any parasitic nematode to date. Haemonchus contortus is the parasite species for which we understand benzimidazole resistance the best and its close phylogenetic relationship with Caenorhabditis elegans potentially allows inferences of gene function by comparative analysis. Consequently, we have characterized the full β-tubulin gene family in H. contortus. Further to the previously identified Hco-tbb-iso-1 and Hco-tbb-iso-2 genes, we have characterized two additional family members designated Hco-tbb-iso-3 and Hco-tbb-iso-4. We show that Hco-tbb-iso-1 is not a one-to-one orthologue with Cel-ben-1, the only β-tubulin gene in C. elegans that is a benzimidazole drug target. Instead, both Hco-tbb-iso-1 and Hco-tbb-iso-2 have a complex evolutionary relationship with three C. elegans β-tubulin genes: Cel-ben-1, Cel-tbb-1 and Cel-tbb-2. Furthermore, we show that both Hco-tbb-iso-1 and Hco-tbb-iso-2 are highly expressed in adult worms; in contrast, Hco-tbb-iso-3 and Hco-tbb-iso-4 are expressed only at very low levels and are orthologous to the Cel-mec-7 and Cel-tbb-4 genes, respectively, suggesting that they have specialized functional roles. Indeed, we have found that the expression pattern of Hco-tbb-iso-3 in H. contortus is identical to that of Cel-mec-7 in C. elegans, being expressed in just six "touch receptor" mechano-sensory neurons. These results suggest that further investigation is warranted into the potential involvement of strongylid isotype-2 β-tubulin genes in mechanisms of benzimidazole resistance.
International journal for parasitology 2013;43;6;465-75
PUBMED: 23416426; DOI: 10.1016/j.ijpara.2012.12.011
-
Genome-wide association study identifies a novel locus contributing to type 2 diabetes susceptibility in Sikhs of Punjabi origin from India.
Center for Human Genetic Research and Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts.
We performed a genome-wide association study (GWAS) and a multistage meta-analysis of type 2 diabetes (T2D) in Punjabi Sikhs from India. Our discovery GWAS in 1,616 individuals (842 case subjects) was followed by in silico replication of the top 513 independent single nucleotide polymorphisms (SNPs) (P < 10(-3)) in Punjabi Sikhs (n = 2,819; 801 case subjects). We further replicated 66 SNPs (P < 10(-4)) through genotyping in a Punjabi Sikh sample (n = 2,894; 1,711 case subjects). On combined meta-analysis in Sikh populations (n = 7,329; 3,354 case subjects), we identified a novel locus in association with T2D at 13q12 represented by a directly genotyped intronic SNP (rs9552911, P = 1.82 × 10(-8)) in the SGCG gene. Next, we undertook in silico replication (stage 2b) of the top 513 signals (P < 10(-3)) in 29,157 non-Sikh South Asians (10,971 case subjects) and de novo genotyping of up to 31 top signals (P < 10(-4)) in 10,817 South Asians (5,157 case subjects) (stage 3b). In combined South Asian meta-analysis, we observed six suggestive associations (P < 10(-5) to < 10(-7)), including SNPs at HMG1L1/CTCFL, PLXNA4, SCAP, and chr5p11. Further evaluation of 31 top SNPs in 33,707 East Asians (16,746 case subjects) (stage 3c) and 47,117 Europeans (8,130 case subjects) (stage 3d), and joint meta-analysis of 128,127 individuals (44,358 case subjects) from 27 multiethnic studies, did not reveal any additional loci nor was there any evidence of replication for the new variant. Our findings provide new evidence on the presence of a population-specific signal in relation to T2D, which may provide additional insights into T2D pathogenesis.
Diabetes 2013;62;5;1746-55
PUBMED: 23300278; PMC: 3636649; DOI: 10.2337/db12-1077
-
A genome-wide survey of genetic variation in gorillas using reduced representation sequencing.
The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, United Kingdom.
All non-human great apes are endangered in the wild, and it is therefore important to gain an understanding of their demography and genetic diversity. Whole genome assembly projects have provided an invaluable foundation for understanding genetics in all four genera, but to date genetic studies of multiple individuals within great ape species have largely been confined to mitochondrial DNA and a small number of other loci. Here, we present a genome-wide survey of genetic variation in gorillas using a reduced representation sequencing approach, focusing on the two lowland subspecies. We identify 3,006,670 polymorphic sites in 14 individuals: 12 western lowland gorillas (Gorilla gorilla gorilla) and 2 eastern lowland gorillas (Gorilla beringei graueri). We find that the two species are genetically distinct, based on levels of heterozygosity and patterns of allele sharing. Focusing on the western lowland population, we observe evidence for population substructure, and a deficit of rare genetic variants suggesting a recent episode of population contraction. In western lowland gorillas, there is an elevation of variation towards telomeres and centromeres on the chromosomal scale. On a finer scale, we find substantial variation in genetic diversity, including a marked reduction close to the major histocompatibility locus, perhaps indicative of recent strong selection there. These findings suggest that despite their maintaining an overall level of genetic diversity equal to or greater than that of humans, population decline, perhaps associated with disease, has been a significant factor in recent and long-term pressures on wild gorilla populations.
Funded by: Wellcome Trust: 098051
PloS one 2013;8;6;e65066
PUBMED: 23750230; PMC: 3672199; DOI: 10.1371/journal.pone.0065066
-
On Your MARK, Get SET(D2), Go! H3K36me3 Primes DNA Mismatch Repair.
The Gurdon Institute and the Department of Biochemistry, University of Cambridge, Cambridge CB2 1QN, UK.
Trimethylation of histone H3 on Lys36 (H3K36me3) by SETD2 is linked to actively transcribed regions. Li et al. identify a novel role for H3K36me3 that facilitates DNA mismatch repair (MMR) in cells by targeting the MMR machinery to chromatin during the cell cycle, thereby explaining certain cases of MMR-defective cancers.
Cell 2013;153;3;513-5
PUBMED: 23622237; DOI: 10.1016/j.cell.2013.04.018
-
Exome sequencing identifies DYNC2H1 mutations as a common cause of asphyxiating thoracic dystrophy (Jeune syndrome) without major polydactyly, renal or retinal involvement.
Molecular Medicine Unit, Birth Defects Research Centre, University College London (UCL) Institute of Child Health, London, UK.
BACKGROUND: Jeune asphyxiating thoracic dystrophy (JATD) is a rare, often lethal, recessively inherited chondrodysplasia characterised by shortened ribs and long bones, sometimes accompanied by polydactyly, and renal, liver and retinal disease. Mutations in intraflagellar transport (IFT) genes cause JATD, including the IFT dynein-2 motor subunit gene DYNC2H1. Genetic heterogeneity and the large DYNC2H1 gene size have hindered JATD genetic diagnosis. AIMS AND METHODS: To determine the contribution to JATD, we screened DYNC2H1 in 71 JATD patients, combining SNP mapping, Sanger sequencing and exome sequencing. RESULTS AND CONCLUSIONS: We detected 34 DYNC2H1 mutations in 29/71 (41%) patients from 19/57 families (33%), showing it as a major cause of JATD especially in Northern European patients. This included 13 early protein termination mutations (nonsense/frameshift, deletion, splice site) but no patients carried these in combination, suggesting the human phenotype is at least partly hypomorphic. In addition, 21 missense mutations were distributed across DYNC2H1 and these showed some clustering to functional domains, especially the ATP motor domain. DYNC2H1 patients largely lacked significant extra-skeletal involvement, demonstrating an important genotype-phenotype correlation in JATD. Significant variability exists in the course and severity of the thoracic phenotype, both between affected siblings with identical DYNC2H1 alleles and among individuals with different alleles, which suggests the DYNC2H1 phenotype might be subject to modifier alleles, non-genetic or epigenetic factors. Assessment of fibroblasts from patients showed accumulation of anterograde IFT proteins in the ciliary tips, confirming defects similar to patients with other retrograde IFT machinery mutations, which may be of undervalued potential for diagnostic purposes.
Journal of medical genetics 2013
PUBMED: 23456818; DOI: 10.1136/jmedgenet-2012-101284
-
Combined NGS approaches identify mutations in the intraflagellar transport gene IFT140 in skeletal ciliopathies with early progressive kidney disease.
Molecular Medicine Unit, University College London Institute of Child Health, London, UK.
Ciliopathies are genetically heterogeneous disorders characterized by variable expressivity and overlaps between different disease entities. This is exemplified by the short rib-polydactyly syndromes, Jeune, Sensenbrenner, and Mainzer-Saldino chondrodysplasia syndromes. These three syndromes are frequently caused by mutations in intraflagellar transport (IFT) genes affecting the primary cilia, which play a crucial role in skeletal and chondral development. Here, we identified mutations in IFT140, an IFT complex A gene, in five Jeune asphyxiating thoracic dystrophy (JATD) and two Mainzer-Saldino syndrome (MSS) families, by screening a cohort of 66 JATD/MSS patients using whole exome sequencing and targeted resequencing of a customized ciliopathy gene panel. We also found an enrichment of rare IFT140 alleles in JATD compared with nonciliopathy diseases, implying putative modifier effects for certain alleles. IFT140 patients presented with mild chest narrowing, but all had end-stage renal failure under 13 years of age and retinal dystrophy when examined for ocular dysfunction. This is consistent with the severe cystic phenotype of Ift140 conditional knockout mice, and the higher level of Ift140 expression in kidney and retina compared with the skeleton at E15.5 in the mouse. IFT140 is therefore a major cause of cono-renal syndromes (JATD and MSS). The present study strengthens the rationale for IFT140 screening in skeletal ciliopathy spectrum patients that have kidney disease and/or retinal dystrophy.
Funded by: NIGMS NIH HHS: GM060992, R01 GM060992; Wellcome Trust: UK10K, WT091310
Human mutation 2013;34;5;714-24
PUBMED: 23418020; DOI: 10.1002/humu.22294
-
Mechanisms controlling the temporal degradation of Nek2A and Kif18A by the APC/C-Cdc20 complex.
The Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark.
The Anaphase Promoting Complex/Cyclosome (APC/C) in complex with its co-activator Cdc20 is responsible for targeting proteins for ubiquitin-mediated degradation during mitosis. The activity of APC/C-Cdc20 is inhibited during prometaphase by the Spindle Assembly Checkpoint (SAC), yet certain substrates escape this inhibition. Nek2A degradation during prometaphase depends on direct binding of Nek2A to the APC/C via a C-terminal MR dipeptide, but whether this motif alone is sufficient is not clear. Here, we identify Kif18A as a novel APC/C-Cdc20 substrate and show that Kif18A degradation depends on a C-terminal LR motif. However, in contrast to Nek2A, Kif18A is not degraded until anaphase, showing that additional mechanisms contribute to Nek2A degradation. We find that dimerization via the leucine zipper, in combination with the MR motif, is required for stable Nek2A binding to and ubiquitination by the APC/C. Nek2A and the mitotic checkpoint complex (MCC) have an overlap in APC/C subunit requirements for binding and we propose that Nek2A binds with high affinity to apo-APC/C and is degraded by the pool of Cdc20 that avoids inhibition by the SAC.
Funded by: Wellcome Trust: 079643/Z/06/Z
The EMBO journal 2013;32;2;303-14
PUBMED: 23288039; PMC: 3553385; DOI: 10.1038/emboj.2012.335
-
Conceptual links between DNA methylation reprogramming in the early embryo and primordial germ cells.
Epigenetics Programme, The Babraham Institute, Cambridge CB22 3AT, UK. Electronic address: stefanie.seisenberger@babraham.ac.uk.
DNA methylation is a carrier of important regulatory information that undergoes global reprogramming in the mammalian germ line, including pre-implantation embryos and primordial germ cells (PGCs). A flurry of recent studies have employed technical advances to generate global profiles of methylation and hydroxymethylation in these cells, unravelling the dynamics of methylation erasure at single locus resolution. Active demethylation in the zygote, involving extensive oxidation, is followed by passive loss over early cell divisions. Certain gamete-contributed methylation marks appear to have evolved non-canonical mechanisms for targeted maintenance of methylation in the face of these processes. These protected sequences include the imprinting control regions (ICRs) required for parental imprinting but also a surprising number of other regions. Such targeted maintenance mechanisms may also operate at certain sequences during early PGC migration when global passive demethylation occurs. In later gonadal PGCs, imprints must be reset and this may be achieved through the targeting of active mechanisms including oxidation. Thus, emerging evidence paints a complex picture whereby active and passive demethylation pathways operate synergistically and in parallel to ensure robust erasure in the early embryo and PGCs.
Current opinion in cell biology 2013
PUBMED: 23510682; DOI: 10.1016/j.ceb.2013.02.013
-
Reprogramming DNA methylation in the mammalian life cycle: building and breaking epigenetic barriers.
Epigenetics Programme, The Babraham Institute, Cambridge CB22 3AT, UK.
In mammalian development, epigenetic modifications, including DNA methylation patterns, play a crucial role in defining cell fate but also represent epigenetic barriers that restrict developmental potential. At two points in the life cycle, DNA methylation marks are reprogrammed on a global scale, concomitant with restoration of developmental potency. DNA methylation patterns are subsequently re-established with the commitment towards a distinct cell fate. This reprogramming of DNA methylation takes place firstly on fertilization in the zygote, and secondly in primordial germ cells (PGCs), which are the direct progenitors of sperm or oocyte. In each reprogramming window, a unique set of mechanisms regulates DNA methylation erasure and re-establishment. Recent advances have uncovered roles for the TET3 hydroxylase and passive demethylation, together with base excision repair (BER) and the elongator complex, in methylation erasure from the zygote. Deamination by AID, BER and passive demethylation have been implicated in reprogramming in PGCs, but the process in its entirety is still poorly understood. In this review, we discuss the dynamics of DNA methylation reprogramming in PGCs and the zygote, the mechanisms involved and the biological significance of these events. Advances in our understanding of such natural epigenetic reprogramming are beginning to aid enhancement of experimental reprogramming in which the role of potential mechanisms can be investigated in vitro. Conversely, insights into in vitro reprogramming techniques may aid our understanding of epigenetic reprogramming in the germline and supply important clues in reprogramming for therapies in regenerative medicine.
Philosophical transactions of the Royal Society of London. Series B, Biological sciences 2013;368;1609;20110330
PUBMED: 23166394; DOI: 10.1098/rstb.2011.0330
-
Playing the 'next-generation game'.
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. microbes@sanger.ac.uk.
Advances in single-molecule DNA sequencing are enabling research into the fine resolution of DNA structure, and rapid, direct sequencing of pathogen genomes.
Nature reviews. Microbiology 2013;11;2;74
PUBMED: 23321531; DOI: 10.1038/nrmicro2956
-
Whole-genome sequences of Chlamydia trachomatis directly from clinical samples without culture.
Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, United Kingdom. hss@sanger.ac.uk
The use of whole-genome sequencing as a tool for the study of infectious bacteria is of growing clinical interest. Chlamydia trachomatis is responsible for sexually transmitted infections and the blinding disease trachoma, which affect hundreds of millions of people worldwide. Recombination is widespread within the genome of C. trachomatis, thus whole-genome sequencing is necessary to understand the evolution, diversity, and epidemiology of this pathogen. Culture of C. trachomatis has, until now, been a prerequisite to obtain DNA for whole-genome sequencing; however, as C. trachomatis is an obligate intracellular pathogen, this procedure is technically demanding and time consuming. Discarded clinical samples represent a large resource for sequencing the genomes of pathogens, yet clinical swabs frequently contain very low levels of C. trachomatis DNA and large amounts of contaminating microbial and human DNA. To determine whether it is possible to obtain whole-genome sequences from bacteria without the need for culture, we have devised an approach that combines immunomagnetic separation (IMS) for targeted bacterial enrichment with multiple displacement amplification (MDA) for whole-genome amplification. Using IMS-MDA in conjunction with high-throughput multiplexed Illumina sequencing, we have produced the first whole bacterial genome sequences direct from clinical samples. We also show that this method can be used to generate genome data from nonviable archived samples. This method will prove a useful tool in answering questions relating to the biology of many difficult-to-culture or fastidious bacteria of clinical concern.
Funded by: Wellcome Trust: 098051
Genome research 2013;23;5;855-66
PUBMED: 23525359; PMC: 3638141; DOI: 10.1101/gr.150037.112
-
Genome Sequence of Chlamydia psittaci Strain 01DC12 Originating from Swine.
Pathogen Genomics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, United Kingdom.
Chlamydia psittaci is the etiological agent of psittacosis and is a zoonotic pathogen infecting birds and a variety of mammalian hosts. Here we report the genome sequence of the porcine strain 01DC12 which is representative of a novel clade of C. psittaci belonging to ompA genotype E.
Genome announcements 2013;1;1
PUBMED: 23405306; DOI: 10.1128/genomeA.00078-12
-
Evolution. Great apes and zoonoses.
Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, UK.
Science (New York, N.Y.) 2013;340;6130;284-6
PUBMED: 23599472; DOI: 10.1126/science.1236958
-
Historical Zoonoses and Other Changes in Host Tropism of Staphylococcus aureus, Identified by Phylogenetic Analysis of a Population Dataset.
MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, United Kingdom.
BACKGROUND: Staphylococcus aureus exhibits tropisms to many distinct animal hosts. While spillover events can occur wherever there is an interface between host species, changes in host tropism only occur with the establishment of sustained transmission in the new host species, leading to clonal expansion. Although the genomic variation underpinning adaptation in S. aureus genotypes infecting bovids and poultry has been well characterized, the frequency of switches from one host to another remains obscure. We sought to identify sustained switches in host tropism in the S. aureus population, both anthroponotic and zoonotic, and their distribution over the species phylogeny. METHODOLOGIES/RESULTS: We have used a sample of 3042 isolates, representing 696 distinct MLST genotypes, from a well-established database (www.mlst.net). Using an empirical parsimony approach (AdaptML) we have investigated the distribution of switches in host association between both human and non-human (henceforth referred to as animal) hosts. We reconstructed a credible description of past events in the form of a phylogenetic tree, the nodes and leaves of which are statistically associated with either human or animal habitats, estimated from extant host-association and the degree of sequence divergence between genotypes. We identified 15 likely historical switching events: 13 anthroponoses and two zoonoses. Importantly, we identified two human-associated clade candidates (CC25 and CC59) that have arisen from animal-associated ancestors; this demonstrates that a human-specific lineage can emerge from an animal host. We also highlight novel rabbit-associated genotypes arising from a human ancestor. CONCLUSIONS: S. aureus is an organism with the capacity to switch into and adapt to novel hosts, even after long periods of isolation in a single host species. Based on this evidence, animal-adapted S. aureus lineages exhibiting resistance to antibiotics must be considered a major threat to public health, as they can adapt to the human population.
PloS one 2013;8;5;e62369
PUBMED: 23667472; DOI: 10.1371/journal.pone.0062369
-
Progressive genome-wide introgression in agricultural Campylobacter coli.
Department of Zoology, The Tinbergen Building, University of Oxford, South Parks Road, Oxford, OX1 3PS, UK. s.k.sheppard@swansea.ac.uk
Hybridization between distantly related organisms can facilitate rapid adaptation to novel environments, but is potentially constrained by epistatic fitness interactions among cell components. The zoonotic pathogens Campylobacter coli and C. jejuni differ from each other by around 15% at the nucleotide level, corresponding to an average of nearly 40 amino acids per protein-coding gene. Using whole genome sequencing, we show that a single C. coli lineage, which has successfully colonized an agricultural niche, has been progressively accumulating C. jejuni DNA. Members of this lineage belong to two groups, the ST-828 and ST-1150 clonal complexes. The ST-1150 complex is less frequently isolated and has undergone a substantially greater amount of introgression leading to replacement of up to 23% of the C. coli core genome as well as import of novel DNA. By contrast, the more commonly isolated ST-828 complex bacteria have 10-11% introgressed DNA, and C. jejuni and nonagricultural C. coli lineages each have <2%. Thus, the C. coli that colonize agriculture, and consequently cause most human disease, have hybrid origin, but this cross-species exchange has so far not had a substantial impact on the gene pools of either C. jejuni or nonagricultural C. coli. These findings also indicate remarkable interchangeability of basic cellular machinery after a prolonged period of independent evolution.
Funded by: Biotechnology and Biological Sciences Research Council
Molecular ecology 2013;22;4;1051-64
PUBMED: 23279096; DOI: 10.1111/mec.12162
-
Genome-wide association study identifies vitamin B5 biosynthesis as a host specificity factor in Campylobacter.
Department of Zoology, University of Oxford, Oxford OX1 3PS, United Kingdom.
Genome-wide association studies have the potential to identify causal genetic factors underlying important phenotypes but have rarely been performed in bacteria. We present an association mapping method that takes into account the clonal population structure of bacteria and is applicable to both core and accessory genome variation. Campylobacter is a common cause of human gastroenteritis as a consequence of its proliferation in multiple farm animal species and its transmission via contaminated meat and poultry. We applied our association mapping method to identify the factors responsible for adaptation to cattle and chickens among 192 Campylobacter isolates from these and other host sources. Phylogenetic analysis implied frequent host switching but also showed that some lineages were strongly associated with particular hosts. A seven-gene region with a host association signal was found. Genes in this region were almost universally present in cattle but were frequently absent in isolates from chickens and wild birds. Three of the seven genes encoded vitamin B5 biosynthesis. We found that isolates from cattle were better able to grow in vitamin B5-depleted media and propose that this difference may be an adaptation to host diet.
Funded by: Biotechnology and Biological Sciences Research Council; Wellcome Trust
Proceedings of the National Academy of Sciences of the United States of America 2013;110;29;11923-7
PUBMED: 23818615; PMC: 3718156; DOI: 10.1073/pnas.1305559110
-
Stress-induced lipocalin-2 controls dendritic spine formation and neuronal activity in the amygdala.
University of Exeter Medical School, Exeter, United Kingdom.
Behavioural adaptation to psychological stress is dependent on neuronal plasticity, and dysfunction at this cellular level may underlie the pathogenesis of affective disorders such as depression and post-traumatic stress disorder. Taking advantage of a genome-wide microarray assay, we performed detailed studies of stress-affected transcripts in the amygdala, an area which forms part of the innate fear circuit in mammals. Having previously demonstrated the role of lipocalin-2 (Lcn-2) in promoting stress-induced changes in dendritic spine morphology/function and neuronal excitability in the mouse hippocampus, we show here that the Lcn-2 gene is one of the most highly upregulated transcripts detected by microarray analysis in the amygdala after acute restraint-induced psychological stress. This is associated with increased Lcn-2 protein synthesis, which is found on immunohistochemistry to be predominantly localised to neurons. Stress-naïve Lcn-2(-/-) mice show a higher spine density in the basolateral amygdala and a 2-fold higher neuronal firing rate compared to wild-type mice. Unlike their wild-type counterparts, Lcn-2(-/-) mice did not show an increase in dendritic spine density in response to stress but did show a distinct pattern of spine morphology. Thus, amygdala-specific neuronal responses to Lcn-2 may represent a mechanism for behavioural adaptation to psychological stress.
PloS one 2013;8;4;e61046
PUBMED: 23593384; PMC: 3621903; DOI: 10.1371/journal.pone.0061046
-
PhenoDigm: analyzing curated annotations to associate animal models with human diseases.
Mouse Informatics Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
The ultimate goal of studying model organisms is to translate what is learned into useful knowledge about normal human biology and disease to facilitate treatment and early screening for diseases. Recent advances in genomic technologies allow for rapid generation of models with a range of targeted genotypes as well as their characterization by high-throughput phenotyping. As an abundance of phenotype data become available, only systematic analysis will allow valid conclusions to be drawn from these data and transferred to human diseases. Owing to the volume of data, automated methods are preferable, allowing for a reliable analysis of the data and providing evidence about possible gene-disease associations. Here, we propose Phenotype comparisons for DIsease Genes and Models (PhenoDigm) as an automated method to provide evidence about gene-disease associations by analysing phenotype information. PhenoDigm integrates data from a variety of model organisms and, at the same time, uses several intermediate scoring methods to identify only strongly data-supported gene candidates for human genetic diseases. We show results of an automated evaluation as well as selected manually assessed examples that support the validity of PhenoDigm. Furthermore, we provide guidance on how to browse the data with PhenoDigm's web interface and illustrate its usefulness in supporting research. Database URL: http://www.sanger.ac.uk/resources/databases/phenodigm.
Database : the journal of biological databases and curation 2013;2013;bat025
PUBMED: 23660285; PMC: 3649640; DOI: 10.1093/database/bat025
-
Sequencing of the sea lamprey (Petromyzon marinus) genome provides insights into vertebrate evolution.
[1] Department of Biology, University of Kentucky, Lexington, Kentucky, USA. [2] Benaroya Research Institute at Virginia Mason, Seattle, Washington, USA.
Lampreys are representatives of an ancient vertebrate lineage that diverged from our own ∼500 million years ago. By virtue of this deeply shared ancestry, the sea lamprey (P. marinus) genome is uniquely poised to provide insight into the ancestry of vertebrate genomes and the underlying principles of vertebrate biology. Here, we present the first lamprey whole-genome sequence and assembly. We note challenges faced owing to its high content of repetitive elements and GC bases, as well as the absence of broad-scale sequence information from closely related species. Analyses of the assembly indicate that two whole-genome duplications likely occurred before the divergence of ancestral lamprey and gnathostome lineages. Moreover, the results help define key evolutionary events within vertebrate lineages, including the origin of myelin-associated proteins and the development of appendages. The lamprey genome provides an important resource for reconstructing vertebrate origins and the evolutionary events that have shaped the genomes of extant organisms.
Nature genetics 2013
PUBMED: 23435085; DOI: 10.1038/ng.2568
-
Sherlock Genomes - viral investigator.
Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. microbes@sanger.ac.uk.
This month's Genome Watch highlights how deep sequencing technologies have vastly reduced the time and prior knowledge needed to generate viral genomes.
Nature reviews. Microbiology 2013;11;3;150
PUBMED: 23411861; DOI: 10.1038/nrmicro2979
-
Genetic variants from lipid-related pathways and risk for incident myocardial infarction.
Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.
Background: Circulating lipid levels, as well as several familial lipid metabolism disorders, are strongly associated with initiation and progression of atherosclerosis and incidence of myocardial infarction (MI). Objectives: We hypothesized that genetic variants associated with circulating lipid levels would also be associated with MI incidence, and have tested this in three independent samples. Using age- and sex-adjusted additive genetic models, we analyzed 554 single nucleotide polymorphisms (SNPs) in 41 candidate gene regions proposed to be involved in lipid-related pathways potentially predisposing to incidence of MI in 2,602 participants of the Swedish Twin Register (STR; 57% women). All associations with nominal P<0.01 were further investigated in the Uppsala Longitudinal Study of Adult Men (ULSAM; N = 1,142). Results: In the present study, we report associations of lipid-related SNPs with incident MI in two community-based longitudinal studies with in silico replication in a meta-analysis of genome-wide association studies. Overall, there were 9 SNPs in STR with nominal P-value <0.01 that were successfully genotyped in ULSAM. rs4149313, located in ABCA1, was associated with MI incidence in both longitudinal study samples with nominal significance (hazard ratio, 1.36 and 1.40; P-value, 0.004 and 0.015 in STR and ULSAM, respectively). In silico replication supported the association of rs4149313 with coronary artery disease in an independent meta-analysis including 173,975 individuals of European descent from the CARDIoGRAMplusC4D consortium (odds ratio, 1.03; P-value, 0.048). Conclusions: rs4149313 is one of the few amino acid changing variants in ABCA1 known to associate with reduced cholesterol efflux. Our results are suggestive of a weak association between this variant and the development of atherosclerosis and MI.
PloS one 2013;8;3;e60454
PUBMED: 23555974; PMC: 3612051; DOI: 10.1371/journal.pone.0060454
-
Cooperativity and rapid evolution of cobound transcription factors in closely related mammals.
Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge CB2 0RE, UK.
To mechanistically characterize the microevolutionary processes active in altering transcription factor (TF) binding among closely related mammals, we compared the genome-wide binding of three tissue-specific TFs that control liver gene expression in six rodents. Despite an overall fast turnover of TF binding locations between species, we identified thousands of TF regions of highly constrained TF binding intensity. Although individual mutations in bound sequence motifs can influence TF binding, most binding differences occur in the absence of nearby sequence variations. Instead, combinatorial binding was found to be significant for genetic and evolutionary stability; cobound TFs tend to disappear in concert and were sensitive to genetic knockout of partner TFs. The large, qualitative differences in genomic regions bound between closely related mammals, when contrasted with the smaller, quantitative TF binding differences among Drosophila species, illustrate how genome structure and population genetics together shape regulatory evolution.
Cell 2013;154;3;530-40
PUBMED: 23911320; PMC: 3732390; DOI: 10.1016/j.cell.2013.07.007
-
So, you want to sequence a genome...
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK. ds4@sanger.ac.uk.
Genome biology 2013;14;7;128
PUBMED: 23906089; DOI: 10.1186/gb-2013-14-7-128
-
Whole exome sequencing of adenoid cystic carcinoma.
Adenoid cystic carcinoma (ACC) is a rare malignancy that can occur in multiple organ sites and is primarily found in the salivary gland. While the identification of recurrent fusions of the MYB-NFIB genes has begun to shed light on the molecular underpinnings, little else is known about the molecular genetics of this frequently fatal cancer. We have undertaken exome sequencing in a series of 24 ACC to further delineate the genetics of the disease. We identified multiple mutated genes that, combined, implicate chromatin deregulation in half of cases. Further, mutations were identified in known cancer genes, including PIK3CA, ATM, CDKN2A, SF3B1, SUFU, TSC1, and CYLD. Mutations in NOTCH1/2 were identified in 3 cases, and we identify the negative NOTCH signaling regulator, SPEN, as a new cancer gene in ACC with mutations in 5 cases. Finally, the identification of 3 likely activating mutations in the tyrosine kinase receptor FGFR2, analogous to those reported in ovarian and endometrial carcinoma, points to potential therapeutic avenues for a subset of cases.
The Journal of clinical investigation 2013
PUBMED: 23778141; DOI: 10.1172/JCI67201
-
The intermediate filament protein, vimentin, is a regulator of NOD2 activity.
Centre for Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, UK. craig.stevens@ed.ac.uk
Objective: Mutations in the nucleotide-binding oligomerisation domain-containing protein 2 (NOD2) gene remain the strongest genetic determinants for Crohn's disease (CD). Having previously identified vimentin as a novel NOD2-interacting protein, the authors aimed to investigate the regulatory effects of vimentin on NOD2 function and the association of variants in Vim with CD susceptibility.
Design: Coimmunoprecipitation, fluorescent microscopy and fractionation were used to confirm the interaction between NOD2 and vimentin. HEK293 cells stably expressing wild-type NOD2 or a NOD2 frameshift variant (L1007fs) and SW480 colonic epithelial cells were used alongside the vimentin inhibitor, withaferin A (WFA), to assess effects on NOD2 function using the nuclear factor-kappaB (NF-κB) reporter gene, green fluorescent protein-LC3-based autophagy, and bacterial gentamicin protection assays. International genome-wide association meta-analysis data were used to test for associations of single-nucleotide polymorphisms in Vim with CD susceptibility.
Results: The leucine-rich repeat domain of NOD2 contained the elements required for vimentin binding; CD-associated polymorphisms disrupted this interaction. NOD2 and vimentin colocalised at the cell plasma membrane, and cytosolic mislocalisation of the L1007fs and R702W variants correlated with an inability to interact with vimentin. Use of WFA demonstrated that vimentin was required for NOD2-dependent NF-κB activation and muramyl dipeptide-induced autophagy induction, and that NOD2 and vimentin regulated the invasion and survival properties of a CD-associated adherent-invasive Escherichia coli strain. Genetic analysis revealed an association signal across the haplotype block containing Vim.
Conclusion: Vimentin is an important regulator of NOD2 function and a potential novel therapeutic target in the treatment of CD. In addition, Vim is a candidate susceptibility gene for CD, supporting the functional data.
Funded by: Medical Research Council: G0800675, G0800759
Gut 2013;62;5;695-707
PUBMED: 22684479; DOI: 10.1136/gutjnl-2011-301775
-
Mutations in B3GALNT2 cause congenital muscular dystrophy and hypoglycosylation of α-dystroglycan.
Dubowitz Neuromuscular Centre, UCL Institute of Child Health, London, UK.
Mutations in several known or putative glycosyltransferases cause glycosylation defects in α-dystroglycan (α-DG), an integral component of the dystrophin glycoprotein complex. The hypoglycosylation reduces the ability of α-DG to bind laminin and other extracellular matrix ligands and is responsible for the pathogenesis of an inherited subset of muscular dystrophies known as the dystroglycanopathies. By exome and Sanger sequencing we identified two individuals affected by a dystroglycanopathy with mutations in β-1,3-N-acetylgalactosaminyltransferase 2 (B3GALNT2). B3GALNT2 transfers N-acetyl galactosamine (GalNAc) in a β-1,3 linkage to N-acetyl glucosamine (GlcNAc). A subsequent study of a separate cohort of individuals identified recessive mutations in four additional cases that were all affected by dystroglycanopathy with structural brain involvement. We show that functional dystroglycan glycosylation was reduced in the fibroblasts and muscle (when available) of these individuals via flow cytometry, immunoblotting, and immunocytochemistry. B3GALNT2 localized to the endoplasmic reticulum, and this localization was perturbed by some of the missense mutations identified. Moreover, knockdown of b3galnt2 in zebrafish recapitulated the human congenital muscular dystrophy phenotype with reduced motility, brain abnormalities, and disordered muscle fibers with evidence of damage to both the myosepta and the sarcolemma. Functional dystroglycan glycosylation was also reduced in the b3galnt2 knockdown zebrafish embryos. Together these results demonstrate a role for B3GALNT2 in the glycosylation of α-DG and show that B3GALNT2 mutations can cause dystroglycanopathy with muscle and brain involvement.
Funded by: Howard Hughes Medical Institute; Medical Research Council; NICHD NIH HHS: K99HD067379, P30HD19655; NIMH NIH HHS: RC2MH089952; NINDS NIH HHS: 1U54NS053672
American journal of human genetics 2013;92;3;354-65
PUBMED: 23453667; PMC: 3591840; DOI: 10.1016/j.ajhg.2013.01.016
-
The non-obese diabetic mouse sequence, annotation and variation resource: an aid for investigating type 1 diabetes.
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK. cas@sanger.ac.uk
Model organisms are becoming increasingly important for the study of complex diseases such as type 1 diabetes (T1D). The non-obese diabetic (NOD) mouse is an experimental model for T1D, having been bred to develop the disease spontaneously in a process that is similar to that in humans. Genetic analysis of the NOD mouse has identified around 50 disease loci, which have the nomenclature Idd for insulin-dependent diabetes, distributed across at least 11 different chromosomes. In total, 21 Idd regions across 6 chromosomes that are major contributors to T1D susceptibility or resistance were selected for finished sequencing and annotation at the Wellcome Trust Sanger Institute. Here we describe the generation of 40.4 mega base-pairs of finished sequence from 289 bacterial artificial chromosomes for the NOD mouse. Manual annotation has identified 738 genes in the diabetes sensitive NOD mouse and 765 genes in homologous regions of the diabetes resistant C57BL/6J reference mouse across 19 candidate Idd regions. This has allowed us to call variation consequences between homologous exonic sequences for all annotated regions in the two mouse strains. We demonstrate the importance of this resource further by illustrating the technical difficulties that regions of inter-strain structural variation between the NOD mouse and the C57BL/6J reference mouse can cause for current next generation sequencing and assembly techniques. Furthermore, we have established that the variation rate in the Idd regions is 2.3 times higher than the mean found for the whole genome assembly for the NOD/ShiLtJ genome, which we suggest reflects the fact that positive selection for functional variation in immune genes is beneficial in regard to host defence. In summary, we provide an important resource, which aids the analysis of potential causative genes involved in T1D susceptibility. Database URLs: http://www.sanger.ac.uk/resources/mouse/nod/; http://vega-previous.sanger.ac.uk/info/data/mouse_regions.html#Idd
Funded by: NIAID NIH HHS: AI 15416; Wellcome Trust: 096388, 100140
Database : the journal of biological databases and curation 2013;2013;bat032
PUBMED: 23729657; PMC: 3668384; DOI: 10.1093/database/bat032
-
Deletion of TOP3β, a component of FMRP-containing mRNPs, contributes to neurodevelopmental disorders.
[1] Department of Biochemistry, University of Würzburg, Würzburg, Germany. [2].
Implicating particular genes in the generation of complex brain and behavior phenotypes requires multiple lines of evidence. The rarity of most high-impact genetic variants typically precludes the possibility of accruing statistical evidence that they are associated with a given trait. We found that the enrichment of a rare chromosome 22q11.22 deletion in a recently expanded Northern Finnish sub-isolate enabled the detection of association between TOP3B and both schizophrenia and cognitive impairment. Biochemical analysis of TOP3β revealed that this topoisomerase was a component of cytosolic messenger ribonucleoproteins (mRNPs) and was catalytically active on RNA. The recruitment of TOP3β to mRNPs was independent of RNA cis-elements and was coupled to the co-recruitment of FMRP, the disease gene product in fragile X mental retardation syndrome. Our results indicate a previously unknown role for TOP3β in mRNA metabolism and suggest that it is involved in neurodevelopmental disorders.
Nature neuroscience 2013;16;9;1228-37
PUBMED: 23912948; DOI: 10.1038/nn.3484
-
Harnessing the genome: development of a hierarchical typing scheme for meticillin-resistant Staphylococcus aureus.
Department of Microbiology, Imperial College Healthcare NHS Trust, London, UK.
A major barrier to using genome sequencing in medical microbiology is the ability to interpret the data. New schemes that provide information about the importance of sequence variation in both clinical and public health settings are required. Meticillin-resistant Staphylococcus aureus (MRSA) is an important nosocomial pathogen that is being observed with increasing frequency in community settings. Better tools are needed to improve our understanding of its transmissibility and micro-epidemiology in order to develop effective interventions. Using DNA microarray technology we identified a set of 20 binary targets whose presence or absence could be determined by PCR, producing a PCR binary typing scheme (PCR-BT). This was combined with multi-locus sequence type-based, sequence nucleotide polymorphism typing to form a hierarchical typing scheme. When applied to a set of epidemiologically unrelated isolates, a high degree of concordance was observed with PFGE (98.8 %). The scheme was able to detect the presence or absence of an outbreak strain in eight out of nine outbreak investigations, demonstrating epidemiological concordance. PCR-BT was better than PFGE at distinguishing between outbreak strains, particularly where epidemic MRSA-15 was involved. The method developed here is a rapid, digital typing scheme for S. aureus for use in both micro- and macro-epidemiological investigations that has the advantage of being suitable for use in routine diagnostic laboratories. The targets are defined and therefore the types can be defined by any platform capable of detecting the sequences used, including whole genome sequencing.
Journal of medical microbiology 2013;62;Pt 1;36-45
PUBMED: 23002072; DOI: 10.1099/jmm.0.049957-0
-
Journeys into the genome of cancer cells.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK. mrs@sanger.ac.uk.
EMBO molecular medicine 2013
PUBMED: 23339072; DOI: 10.1002/emmm.201202388
-
Genomic and physiological variability within Group II (non-proteolytic) Clostridium botulinum.
Institute of Food Research (IFR), Norwich Research Park, Colney, Norwich NR4 7UA, UK. sandra.stringer@ifr.ac.uk.
Background: Clostridium botulinum is a group of four physiologically and phylogenetically distinct bacteria that produce botulinum neurotoxin. While studies have characterised variability between strains of Group I (proteolytic) C. botulinum, the genetic and physiological variability and relationships between strains within Group II (non-proteolytic) C. botulinum are not well understood. In this study the genome of Group II strain C. botulinum Eklund 17B (NRP) was sequenced and used to construct a whole genome DNA microarray. This was used in a comparative genomic indexing study to compare the relatedness of 43 strains of Group II C. botulinum (14 type B, 24 type E and 5 type F). These results were compared with characteristics determined from physiological tests. Results: Whole genome indexing showed that strains of Group II C. botulinum isolated from a wide variety of environments over more than 75 years clustered together, indicating the genetic background of Group II C. botulinum is stable. Further analysis showed that strains forming type B or type F toxin are closely related, with only toxin cluster gene targets being unique to either type. Strains producing type E toxin formed a separate subset. Carbohydrate fermentation tests supported the observation that type B and F strains form a separate subset to type E strains. All the type F strains and most of type B strains produced acid from amylopectin, amylose and glycogen whereas type E strains did not. However, these two subsets did not differ strongly in minimum growth temperature or maximum NaCl concentration for growth. No relationship was found between tellurite resistance and toxin type despite all the tested type B and type F strains carrying tehB, while the sequence was absent or diverged in all type E strains. Conclusions: Although Group II C. botulinum form a tight genetic group, genomic and physiological analysis indicates there are two distinct subsets within this group. All type B strains and type F strains are in one subset and all type E strains in the other.
BMC genomics 2013;14;333
PUBMED: 23679073; PMC: 3672017; DOI: 10.1186/1471-2164-14-333
-
Detecting low-affinity extracellular protein interactions using protein microarrays.
Cell Surface Signalling Laboratory, Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom.
Low-affinity extracellular protein interactions are critical for cellular recognition processes, but are not generally detected by methods that can be applied in a high-throughput manner. This unit describes a protein microarray platform that significantly improves the throughput of assays capable of detecting transient extracellular protein interactions. These methodological improvements now permit screening for novel extracellular receptor-ligand interactions on a genome-wide scale. Curr. Protoc. Protein Sci. 72:27.5.1-27.5.15. © 2013 by John Wiley & Sons, Inc.
Current protocols in protein science / editorial board, John E. Coligan ... [et al.] 2013;Chapter 27;Unit27.5
PUBMED: 23546623; DOI: 10.1002/0471140864.ps2705s72
-
Plasmodium falciparum-like parasites infecting wild apes in southern Cameroon do not represent a recurrent source of human malaria.
Departments of Medicine and Microbiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104.
Wild-living chimpanzees and gorillas harbor a multitude of Plasmodium species, including six of the subgenus Laverania, one of which served as the progenitor of Plasmodium falciparum. Despite the magnitude of this reservoir, it is unknown whether apes represent a source of human infections. Here, we used Plasmodium species-specific PCR, single-genome amplification, and 454 sequencing to screen humans from remote areas of southern Cameroon for ape Laverania infections. Among 1,402 blood samples, we found 1,000 to be Plasmodium mitochondrial DNA (mtDNA) positive, all of which contained human parasites as determined by sequencing and/or restriction enzyme digestion. To exclude low-abundance infections, we subjected 514 of these samples to 454 sequencing, targeting a region of the mtDNA genome that distinguishes ape from human Laverania species. Using algorithms specifically developed to differentiate rare Plasmodium variants from 454-sequencing error, we identified single and mixed-species infections with P. falciparum, Plasmodium malariae, and/or Plasmodium ovale. However, none of the human samples contained ape Laverania parasites, including the gorilla precursor of P. falciparum. To characterize further the diversity of P. falciparum in Cameroon, we used single-genome amplification to amplify 3.4-kb mtDNA fragments from 229 infected humans. Phylogenetic analysis identified 62 new variants, all of which clustered with extant P. falciparum, providing further evidence that P. falciparum emerged following a single gorilla-to-human transmission. Thus, unlike Plasmodium knowlesi-infected macaques in southeast Asia, African apes harboring Laverania parasites do not seem to serve as a recurrent source of human malaria, a finding of import to ongoing control and eradication measures.
Proceedings of the National Academy of Sciences of the United States of America 2013;110;17;7020-5
PUBMED: 23569255; PMC: 3637760; DOI: 10.1073/pnas.1305201110
-
Rapid whole-genome sequencing for investigation of a suspected tuberculosis outbreak.
Department of Medicine, University of Cambridge, Cambridge, United Kingdom.
Two Southeast Asian students attending the same school in the United Kingdom presented with pulmonary tuberculosis. An epidemiological investigation failed to link the two cases, and drug resistance profiles of the Mycobacterium tuberculosis isolates were discrepant. Whole-genome sequencing of the isolates found them to be genetically identical, suggesting a missed transmission event.
Journal of clinical microbiology 2013;51;2;611-4
PUBMED: 23175259; PMC: 3553910; DOI: 10.1128/JCM.02279-12
-
The ancestor of extant Japanese fancy mice contributed to the mosaic genomes of classical inbred strains.
Mammalian Genetics Laboratory, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan.
Commonly used classical inbred mouse strains have mosaic genomes with sequences from different subspecific origins. Their genomes are derived predominantly from the Western European subspecies Mus musculus domesticus, with the remaining sequences derived mostly from the Japanese subspecies Mus musculus molossinus. However, it remains unknown how this intersubspecific genome introgression occurred during the establishment of classical inbred strains. In this study, we resequenced the genomes of two M. m. molossinus-derived inbred strains, MSM/Ms and JF1/Ms. MSM/Ms originated from Japanese wild mice, and the ancestry of JF1/Ms was originally found in Europe and then transferred to Japan. We compared the characteristics of these sequences to those of the C57BL/6J reference sequence and the recent data sets from the resequencing of 17 inbred strains in the Mouse Genome Project (MGP), and the results unequivocally show that genome introgression from M. m. molossinus into M. m. domesticus provided the primary framework for the mosaic genomes of classical inbred strains. Furthermore, the genomes of C57BL/6J and other classical inbred strains have long consecutive segments with extremely high similarity (>99.998%) to the JF1/Ms strain. In the early 20th century, Japanese waltzing mice with a morphological phenotype resembling that of JF1/Ms mice were often crossed with European fancy mice for early studies of "Mendelism," which suggests that the ancestor of the extant JF1/Ms strain provided the origin of the M. m. molossinus genome in classical inbred strains and largely contributed to its intersubspecific genome diversity.
Genome research 2013;23;8;1329-38
PUBMED: 23604024; PMC: 3730106; DOI: 10.1101/gr.156497.113
-
Genome-wide meta-analysis of observational studies shows common genetic variants associated with macronutrient intake.
Translational Gerontology Branch, National Institute on Aging, Baltimore, MD 21225, USA. tanakato@mail.nih.gov
Background: Macronutrient intake varies substantially between individuals, and there is evidence that this variation is partly accounted for by genetic variants. Objective: The objective of the study was to identify common genetic variants that are associated with macronutrient intake. Design: We performed 2-stage genome-wide association (GWA) meta-analysis of macronutrient intake in populations of European descent. Macronutrients were assessed by using food-frequency questionnaires and analyzed as percentages of total energy consumption from total fat, protein, and carbohydrate. From the discovery GWA (n = 38,360), 35 independent loci associated with macronutrient intake at P < 5 × 10(-6) were identified and taken forward to replication in 3 additional cohorts (n = 33,533) from the DietGen Consortium. For one locus, fat mass obesity-associated protein (FTO), cohorts with Illumina MetaboChip genotype data (n = 7724) provided additional replication data. Results: A variant in the chromosome 19 locus (rs838145) was associated with higher carbohydrate (β ± SE: 0.25 ± 0.04%; P = 1.68 × 10(-8)) and lower fat (β ± SE: -0.21 ± 0.04%; P = 1.57 × 10(-9)) consumption. A candidate gene in this region, fibroblast growth factor 21 (FGF21), encodes a fibroblast growth factor involved in glucose and lipid metabolism. The variants in this locus were associated with circulating FGF21 protein concentrations (P < 0.05) but not mRNA concentrations in blood or brain. The body mass index (BMI)-increasing allele of the FTO variant (rs1421085) was associated with higher protein intake (β ± SE: 0.10 ± 0.02%; P = 9.96 × 10(-10)), independent of BMI (after adjustment for BMI, β ± SE: 0.08 ± 0.02%; P = 3.15 × 10(-7)). Conclusion: Our results indicate that variants in genes involved in nutrient metabolism and obesity are associated with macronutrient consumption in humans. 
Trials related to this study were registered at clinicaltrials.gov as NCT00005131 (Atherosclerosis Risk in Communities), NCT00005133 (Cardiovascular Health Study), NCT00005136 (Family Heart Study), NCT00005121 (Framingham Heart Study), NCT00083369 (Genetic and Environmental Determinants of Triglycerides), NCT01331512 (InCHIANTI Study), and NCT00005487 (Multi-Ethnic Study of Atherosclerosis).
Funded by: Canadian Institutes of Health Research; Cancer Research UK; Medical Research Council; Wellcome Trust
The American journal of clinical nutrition 2013;97;6;1395-402
PUBMED: 23636237; PMC: 3652928; DOI: 10.3945/ajcn.112.052183
-
Frequent mutation of the major cartilage collagen gene COL2A1 in chondrosarcoma.
Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
Chondrosarcoma is a heterogeneous collection of malignant bone tumors and is the second most common primary malignancy of bone after osteosarcoma. Recent work has identified frequent, recurrent mutations in IDH1 or IDH2 in nearly half of central chondrosarcomas. However, there has been little systematic genomic analysis of this tumor type, and, thus, the contribution of other genes is unclear. Here we report comprehensive genomic analyses of 49 individuals with chondrosarcoma (cases). We identified hypermutability of the major cartilage collagen gene COL2A1, with insertions, deletions and rearrangements identified in 37% of cases. The patterns of mutation were consistent with selection for variants likely to impair normal collagen biosynthesis. In addition, we identified mutations in IDH1 or IDH2 (59%), TP53 (20%), the RB1 pathway (33%) and Hedgehog signaling (18%).
Funded by: Wellcome Trust: 077012/Z/05/Z, 093867, WT088340MA
Nature genetics 2013;45;8;923-6
PUBMED: 23770606; PMC: 3743157; DOI: 10.1038/ng.2668
-
DNA deaminases induce break-associated mutation showers with implication of APOBEC3B and 3A in breast cancer kataegis.
Protein and Nucleic Acid Chemistry Division, Medical Research Council Laboratory of Molecular Biology, Cambridge, United Kingdom.
Breast cancer genomes have revealed a novel form of mutation showers (kataegis) in which multiple same-strand substitutions at C:G pairs spaced one to several hundred nucleotides apart are clustered over kilobase-sized regions, often associated with sites of DNA rearrangement. We show kataegis can result from AID/APOBEC-catalysed cytidine deamination in the vicinity of DNA breaks, likely through action on single-stranded DNA exposed during resection. Cancer-like kataegis can be recapitulated by expression of AID/APOBEC family deaminases in yeast where it largely depends on uracil excision, which generates an abasic site for strand breakage. Localized kataegis can also be nucleated by an I-SceI-induced break. Genome-wide patterns of APOBEC3-catalyzed deamination in yeast reveal APOBEC3B and 3A as the deaminases whose mutational signatures are most similar to those of breast cancer kataegic mutations. Together with expression and functional assays, the results implicate APOBEC3B/A in breast cancer hypermutation and give insight into the mechanism of kataegis.
eLife 2013;2;e00534
PUBMED: 23599896; PMC: 3628087; DOI: 10.7554/eLife.00534
-
Genetic risk prediction and a 2-stage risk screening strategy for coronary heart disease.
From the Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland (E.T., A.P., S.R.); National Institute for Health and Welfare, Helsinki, Finland (E.T., A.S.H., V.S., S.R.); Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom (A.P., S.R.); Broad Institute, Cambridge, MA (A.P.); and Hjelt Institute, University of Helsinki, Helsinki, Finland (S.R.).
Objective: Genome-wide association studies have identified several genetic variants associated with coronary heart disease (CHD). The aim of this study was to evaluate the genetic risk discrimination and reclassification and apply the results for a 2-stage population risk screening strategy for CHD. We genotyped 28 genetic variants in 24 124 participants in 4 Finnish population-based, prospective cohorts (recruitment years 1992-2002). We constructed a multilocus genetic risk score and evaluated its association with incident cardiovascular disease events. During the median follow-up time of 12 years (interquartile range 8.75-15.25 years), we observed 1093 CHD, 1552 cardiovascular disease, and 731 acute coronary syndrome events. Adding genetic information to conventional risk factors and family history improved risk discrimination of CHD (C-index 0.856 versus 0.851; P=0.0002) and other end points (cardiovascular disease: C-index 0.840 versus 0.837, P=0.0004; acute coronary syndrome: C-index 0.859 versus 0.855, P=0.001). In a standard population of 100 000 individuals, additional genetic screening of subjects at intermediate risk for CHD would reclassify 2144 subjects (12%) into high-risk category. Statin allocation for these subjects is estimated to prevent 135 CHD cases over 14 years. Similar results were obtained by external validation, where the effects were estimated from a training data set and applied for a test data set. Conclusions: Genetic risk score improves risk prediction of CHD and helps to identify individuals at high risk for the first CHD event. Genetic screening for individuals at intermediate cardiovascular risk could help to prevent future cases through better targeting of statins.
Arteriosclerosis, thrombosis, and vascular biology 2013;33;9;2261-6
PUBMED: 23599444; DOI: 10.1161/ATVBAHA.112.301120
-
Community-associated MRSA from the Indian subcontinent.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK; Menzies School of Health Research, Darwin, NT, Australia. Electronic address: steven.tong@menzies.edu.au.
The Lancet infectious diseases 2013;13;9;734-5
PUBMED: 23969208; DOI: 10.1016/S1473-3099(13)70231-7
-
Extensive Diversity of Streptococcus pyogenes in a Remote Human Population Reflects Global-Scale Transmission Rather than Localised Diversification.
Menzies School of Health Research, Division of Global and Tropical Health, Casuarina, Northern Territory, Australia.
The Indigenous population of the Northern Territory of Australia (NT) suffers from a very high burden of Streptococcus pyogenes disease, including cardiac and renal sequelae. The aim of this study was to determine if S. pyogenes isolated from this population represent NT endemic strains, or conversely reflect strains with global distribution. emm sequence typing data were used to select 460 S. pyogenes isolates representing NT S. pyogenes diversity from 1987-2008. These isolates were genotyped using either multilocus sequence typing (MLST) or a high resolution melting-based MLST surrogate (Minim typing). These data were combined with MLST data from other studies on NT S. pyogenes to yield a set of 731 MLST or Minim typed isolates for analysis. goeBURST analysis of MLST allelic profiles and neighbour-joining trees of the MLST allele sequences revealed that a large proportion of the known global S. pyogenes MLST-defined diversity has now been found in the NT. Specifically, fully sequence typed NT isolates encompass 19% of known S. pyogenes STs and 43% of known S. pyogenes MLST alleles. These analyses provided no evidence for major NT-endemic strains, with many STs and MLST alleles shared between the NT and the rest of the world. The relationship between the number of known Minim types, and the probability that a Minim type identified in a calendar year would be novel was determined. This revealed that Minim types typically persist in the NT for >1 year, and indicated that the majority of NT Minim types have been identified. This study revealed that many diverse S. pyogenes strains exhibit global scale mobility that extends to isolated populations. The burden of S. pyogenes disease in the NT is unlikely to be due to the nature of NT S. pyogenes strains, but is rather a function of social and living conditions.
PloS one 2013;8;9;e73851
PUBMED: 24066079; DOI: 10.1371/journal.pone.0073851
-
The genomes of four tapeworm species reveal adaptations to parasitism.
Parasite Genomics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Tapeworms (Cestoda) cause neglected diseases that can be fatal and are difficult to treat, owing to inefficient drugs. Here we present an analysis of tapeworm genome sequences using the human-infective species Echinococcus multilocularis, E. granulosus, Taenia solium and the laboratory model Hymenolepis microstoma as examples. The 115- to 141-megabase genomes offer insights into the evolution of parasitism. Synteny is maintained with distantly related blood flukes but we find extreme losses of genes and pathways that are ubiquitous in other animals, including 34 homeobox families and several determinants of stem cell fate. Tapeworms have specialized detoxification pathways, metabolism that is finely tuned to rely on nutrients scavenged from their hosts, and species-specific expansions of non-canonical heat shock proteins and families of known antigens. We identify new potential drug targets, including some on which existing pharmaceuticals may act. The genomes provide a rich resource to underpin the development of urgently needed treatments and control.
Funded by: Biotechnology and Biological Sciences Research Council: BBG0038151; Canadian Institutes of Health Research: MOP#84556; FIC NIH HHS: TW008588; Wellcome Trust: 098051
Nature 2013;496;7443;57-63
PUBMED: 23485966; DOI: 10.1038/nature12031
-
Molecular Analysis of an Outbreak of Lethal Postpartum Sepsis Caused by Streptococcus pyogenes.
Infectious Diseases & Immunity, Imperial College London, Hammersmith Hospital, London, United Kingdom.
Sepsis is now the leading direct cause of maternal death in the United Kingdom, and Streptococcus pyogenes is the leading pathogen. We combined conventional and genomic analyses to define the duration and scale of a lethal outbreak. Two postpartum deaths caused by S. pyogenes occurred within 24 h; one was characterized by bacteremia and shock and the other by hemorrhagic pneumonia. The women gave birth within minutes of each other in the same maternity unit 2 days earlier. Seven additional infections in health care and household contacts were subsequently detected and treated. All cluster-associated S. pyogenes isolates were genotype emm1 and were initially indistinguishable from other United Kingdom emm1 isolates. Sequencing of the virulence gene sic revealed that all outbreak isolates had the same unique sic type. Genome sequencing confirmed that the cluster was caused by a unique S. pyogenes clone. Transmission between patients occurred on a single day and was associated with casual contact only. A single isolate from one patient demonstrated a sequence change in sic consistent with longer infection duration. Transmission to health care workers was traced to single clinical contacts with index cases. The last case was detected 18 days after the first case. Following enhanced surveillance, the outbreak isolate was not detected again. Mutations in bacterial regulatory genes played no detectable role in this outbreak, illustrating the intrinsic ability of emm1 S. pyogenes to spread while retaining virulence. This fast-moving outbreak highlights the potential of S. pyogenes to cause a range of diseases in the puerperium with rapid transmission, underlining the importance of immediate recognition and response by clinical infection and occupational health teams.
Journal of clinical microbiology 2013;51;7;2089-95
PUBMED: 23616448; PMC: 3697669; DOI: 10.1128/JCM.00679-13
-
Public Health Value of Next-Generation DNA Sequencing of Enterohemorrhagic Escherichia coli Isolates from an Outbreak.
Health Protection Agency, London, United Kingdom.
In 2009, an outbreak of enterohemorrhagic Escherichia coli (EHEC) on an open farm infected 93 persons, and approximately 22% of these individuals developed hemolytic-uremic syndrome (HUS). Genome sequencing was used to investigate outbreak-derived animal and human EHEC isolates. Phylogeny based on the whole-genome sequence was used to place outbreak isolates in the context of the overall E. coli species and the O157:H7 sequence type 11 (ST11) subgroup. Four informative single nucleotide polymorphisms (SNPs) were identified and used to design an assay to type 122 other outbreak isolates. The SNP phylogeny demonstrated that the outbreak strain was from a lineage distinct from previously reported O157:H7 ST11 EHEC and was not a member of the hypervirulent clade 8. The strain harbored determinants for two Stx2 verotoxins and other putative virulence factors. When linked to the epidemiological information, the sequence data indicate that gross contamination of a single outbreak strain occurred across the farm prior to the first clinical report of HUS. The most likely explanation for these results is that a single successful strain of EHEC spread from a single introduction through the farm by clonal expansion and that contamination of the environment (including the possible colonization of several animals) led ultimately to human cases.
Journal of clinical microbiology 2013;51;1;232-7
PUBMED: 23135946; DOI: 10.1128/JCM.01696-12
-
Histone methyltransferase MLL3 contributes to genome-scale circadian transcription.
Department of Clinical Neurosciences, University of Cambridge Metabolic Research Laboratories, Institute of Metabolic Science, National Institute for Health Research Cambridge Biomedical Research Centre, Addenbrooke's Hospital, University of Cambridge, Cambridge CB2 0QQ, United Kingdom.
Daily cyclical expression of thousands of genes in tissues such as the liver is orchestrated by the molecular circadian clock, the disruption of which is implicated in metabolic disorders and cancer. Although we understand much about the circadian transcription factors that can switch gene expression on and off, it is still unclear how global changes in rhythmic transcription are controlled at the genomic level. Here, we demonstrate circadian modification of an activating histone mark at a significant proportion of gene loci that undergo daily transcription, implicating widespread epigenetic modification as a key node regulated by the clockwork. Furthermore, we identify the histone-remodelling enzyme mixed lineage leukemia (MLL)3 as a clock-controlled factor that is able to directly and indirectly modulate over a hundred epigenetically targeted circadian "output" genes in the liver. Importantly, catalytic inactivation of the histone methyltransferase activity of MLL3 also severely compromises the oscillation of "core" clock gene promoters, including Bmal1, mCry1, mPer2, and Rev-erbα, suggesting that rhythmic histone methylation is vital for robust transcriptional oscillator function. This highlights a pathway by which the clockwork exerts genome-wide control over transcription, which is critical for sustaining temporal programming of tissue physiology.
Proceedings of the National Academy of Sciences of the United States of America 2013;110;4;1554-9
PUBMED: 23297224; DOI: 10.1073/pnas.1214168110
-
The miRNA Profile of Human Pancreatic Islets and Beta-Cells and Relationship to Type 2 Diabetes Pathogenesis.
Oxford Centre for Diabetes, Endocrinology & Metabolism, University of Oxford, Oxford, United Kingdom; Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom.
Recent advances in the understanding of the genetics of type 2 diabetes (T2D) susceptibility have focused attention on the regulation of transcriptional activity within the pancreatic beta-cell. MicroRNAs (miRNAs) represent an important component of regulatory control, and have proven roles in the development of human disease and control of glucose homeostasis. We set out to establish the miRNA profile of human pancreatic islets and of enriched beta-cell populations, and to explore their potential involvement in T2D susceptibility. We used Illumina small RNA sequencing to profile the miRNA fraction in three preparations each of primary human islets and of enriched beta-cells generated by fluorescence-activated cell sorting. In total, 366 miRNAs were found to be expressed (i.e. >100 cumulative reads) in islets and 346 in beta-cells; of the total of 384 unique miRNAs, 328 were shared. A comparison of the islet-cell miRNA profile with those of 15 other human tissues identified 40 miRNAs predominantly expressed (i.e. >50% of all reads seen across the tissues) in islets. Several highly-expressed islet miRNAs, such as miR-375, have established roles in the regulation of islet function, but others (e.g. miR-27b-3p, miR-192-5p) have not previously been described in the context of islet biology. As a first step towards exploring the role of islet-expressed miRNAs and their predicted mRNA targets in T2D pathogenesis, we looked at published T2D association signals across these sites. We found evidence that predicted mRNA targets of islet-expressed miRNAs were globally enriched for signals of T2D association (p-values <0.01, q-values <0.1). At six loci with genome-wide evidence for T2D association (AP3S2, KCNK16, NOTCH2, SLC30A8, VPS26A, and WFS1) predicted mRNA target sites for islet-expressed miRNAs overlapped potentially causal variants.
In conclusion, we have described the miRNA profile of human islets and beta-cells and provide evidence linking islet miRNAs to T2D pathogenesis.
PloS one 2013;8;1;e55272
PUBMED: 23372846; PMC: 3555946; DOI: 10.1371/journal.pone.0055272
-
Preimplantation genetic diagnosis guided by single-cell genomics.
Laboratory of Reproductive Genomics, Department of Human Genetics, KU Leuven, Leuven 3000, Belgium. Thierry.Voet@med.kuleuven.be.
Preimplantation genetic diagnosis (PGD) aims to help couples with heritable genetic disorders to avoid the birth of diseased offspring or the recurrence of loss of conception. Following in vitro fertilization, one or a few cells are biopsied from each human preimplantation embryo for genetic testing, allowing diagnosis and selection of healthy embryos for uterine transfer. Although classical methods, including single-cell PCR and fluorescent in situ hybridization, enable PGD for many genetic disorders, they have limitations. They often require family-specific designs and can be labor intensive, resulting in long waiting lists. Furthermore, certain types of genetic anomalies are not easy to diagnose using these classical approaches, and healthy offspring carrying the parental mutant allele(s) can result. Recently, state-of-the-art methods for single-cell genomics have flourished, which may overcome the limitations associated with classical PGD, and these underpin the development of generic assays for PGD that enable selection of embryos not only for the familial genetic disorder in question, but also for various other genetic aberrations and traits at once. Here, we discuss the latest single-cell genomics methodologies based on DNA microarrays, single-nucleotide polymorphism arrays or next-generation sequence analysis. We focus on their strengths, their validation status, their weaknesses and the challenges for implementing them in PGD.
Genome medicine 2013;5;8;71
PUBMED: 23998893; DOI: 10.1186/gm475
-
The Molecular Genetic Architecture of Self-Employment.
Department of Applied Economics, Erasmus School of Economics, Erasmus University Rotterdam, Rotterdam, The Netherlands; Department of Epidemiology, Erasmus Medical Center, Rotterdam, The Netherlands.
Economic variables such as income, education, and occupation are known to affect mortality and morbidity, such as cardiovascular disease, and have also been shown to be partly heritable. However, very little is known about which genes influence economic variables, although these genes may have both a direct and an indirect effect on health. We report results from the first large-scale collaboration that studies the molecular genetic architecture of an economic variable, entrepreneurship, that was operationalized using self-employment, a widely-available proxy. Our results suggest that common SNPs when considered jointly explain about half of the narrow-sense heritability of self-employment estimated in twin data (σg²/σP² = 25%, h² = 55%). However, a meta-analysis of genome-wide association studies across sixteen studies comprising 50,627 participants did not identify genome-wide significant SNPs. 58 SNPs with p<10(-5) were tested in a replication sample (n = 3,271), but none replicated. Furthermore, a gene-based test shows that none of the genes that were previously suggested in the literature to influence entrepreneurship reveal significant associations. Finally, SNP-based genetic scores that use results from the meta-analysis capture less than 0.2% of the variance in self-employment in an independent sample (p≥0.039). Our results are consistent with a highly polygenic molecular genetic architecture of self-employment, with many genetic variants of small effect. Although self-employment is a multi-faceted, heavily environmentally influenced, and biologically distal trait, our results are similar to those for other genetically complex and biologically more proximate outcomes, such as height, intelligence, personality, and several diseases.
PloS one 2013;8;4;e60542
PUBMED: 23593239; DOI: 10.1371/journal.pone.0060542
-
Cancer of mice and men: Old twists and new tails.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK.
In this review we set out to celebrate the contribution that mouse models of human cancer have made to our understanding of the fundamental mechanisms driving tumourigenesis. We take the opportunity to look forward to how the mouse will be used to model cancer and the tools and technologies that will be applied, and indulge in looking back at the key advances the mouse has made possible.
The Journal of pathology 2013
PUBMED: 23436574; DOI: 10.1002/path.4184
-
Jdp2 downregulates Trp53 transcription to promote leukaemogenesis in the context of Trp53 heterozygosity.
Wellcome Trust Sanger Institute, Cambridge, UK.
We performed a genetic screen in mice to identify candidate genes that are associated with leukaemogenesis in the context of Trp53 heterozygosity. To do this we generated Trp53 heterozygous mice carrying the T2/Onc transposon and SB11 transposase alleles to allow transposon-mediated insertional mutagenesis to occur. From the resulting leukaemias/lymphomas that developed in these mice, we identified nine loci that are potentially associated with tumour formation in the context of Trp53 heterozygosity, including AB041803 and the Jun dimerization protein 2 (Jdp2). We show that Jdp2 transcriptionally regulates the Trp53 promoter, via an atypical AP-1 site, and that Jdp2 expression negatively regulates Trp53 expression levels. This study is the first to identify a genetic mechanism for tumour formation in the context of Trp53 heterozygosity.
Funded by: Cancer Research UK; Wellcome Trust
Oncogene 2013;32;3;397-402
PUBMED: 22370638; PMC: 3550594; DOI: 10.1038/onc.2012.56
-
Beyond the Sympathetic Tone: The New Brown Fat Activators.
Departament de Bioquimica i Biologia Molecular, Institute of Biomedicine (IBUB), University of Barcelona, and CIBER Fisiopatologia de la Obesidad y Nutrición, Av Diagonal 643, 08028 Barcelona, Catalonia, Spain. Electronic address: fvillarroya@ub.edu.
If we could avoid the side effects associated with global sympathetic activation, activating brown adipose tissue to increase thermogenesis would be a safe way to lose weight. The discovery of adrenergic-independent brown fat activators opens the prospect of developing this alternative way to efficiently and safely induce negative energy balance.
Cell metabolism 2013
PUBMED: 23583169; DOI: 10.1016/j.cmet.2013.02.020
-
Expression of Cellulosome Components and Type IV Pili within the Extracellular Proteome of Ruminococcus flavefaciens 007.
Chair for Microbiology and Microbial Biotechnology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia; Rowett Institute of Nutrition and Health, University of Aberdeen, Aberdeen, United Kingdom.
Background: Ruminococcus flavefaciens is an important fibre-degrading bacterium found in the mammalian gut. Cellulolytic strains from the bovine rumen have been shown to produce complex cellulosome structures that are associated with the cell surface. R. flavefaciens 007 is a highly cellulolytic strain whose ability to degrade dewaxed cotton, but not Avicel cellulose, was lost following initial isolation in the variant 007S. The ability was recovered after serial subculture to give the cotton-degrading strain 007C. This has allowed us to investigate the factors required for degradation of this particularly recalcitrant form of cellulose. Methodology/Principal Findings: The major proteins associated with the bacterial cell surface and with the culture supernatant were analyzed for R. flavefaciens 007S and 007C grown with cellobiose, xylan or Avicel cellulose as energy sources. Identification of the proteins was enabled by a draft genome sequence obtained for 007C. Among supernatant proteins a cellulosomal GH48 hydrolase, a rubrerythrin-like protein and a protein with type IV pili N-terminal domain were the most strongly up-regulated in 007C cultures grown on Avicel compared with cellobiose. Strain 007S also showed substrate-related changes, but supernatant expression of the Pil protein and rubrerythrin in particular were markedly lower in 007S than in 007C during growth on Avicel. Conclusions/Significance: This study provides new information on the extracellular proteome of R. flavefaciens and its regulation in response to different growth substrates. Furthermore it suggests that the cotton cellulose non-degrading strain (007S) has altered regulation of multiple proteins that may be required for breakdown of cotton cellulose. One of these, the type IV pilus, was previously shown to play a role in adhesion to cellulose in R. albus, and a related pilin protein was identified here for the first time as a major extracellular protein in R. flavefaciens.
PloS one 2013;8;6;e65333
PUBMED: 23750253; PMC: 3672088; DOI: 10.1371/journal.pone.0065333
-
Single-cell paired-end genome sequencing reveals structural variation per cell cycle.
Department of Human Genetics, KU Leuven, Leuven, 3000, Belgium, Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK, Department of Human Genetics, VIB and KU Leuven, Leuven, 3000, Belgium, Sequencing R&D, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK, Leuven University Fertility Center, University Hospitals Leuven, Gasthuisberg, Leuven, 3000, Belgium, Department of Electrical Engineering, KU Leuven, Leuven, 3000, Belgium, Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas, TX 77230-1429, USA, Department of Haematology, Addenbrooke's Hospital, Cambridge CB2 0QQ, UK and Department of Haematology, University of Cambridge, Cambridge CB2 2XY, UK.
The nature and pace of genome mutation is largely unknown. Because standard methods sequence DNA from populations of cells, the genetic composition of individual cells is lost, de novo mutations in cells are concealed within the bulk signal and per cell cycle mutation rates and mechanisms remain elusive. Although single-cell genome analyses could resolve these problems, such analyses are error-prone because of whole-genome amplification (WGA) artefacts and are limited in the types of DNA mutation that can be discerned. We developed methods for paired-end sequence analysis of single-cell WGA products that enable (i) detecting multiple classes of DNA mutation, (ii) distinguishing DNA copy number changes from allelic WGA-amplification artefacts by the discovery of matching aberrantly mapping read pairs among the surfeit of paired-end WGA and mapping artefacts and (iii) delineating the break points and architecture of structural variants. By applying the methods, we capture DNA copy number changes acquired over one cell cycle in breast cancer cells and in blastomeres derived from a human zygote after in vitro fertilization. Furthermore, we were able to discover and fine-map a heritable inter-chromosomal rearrangement t(1;16)(p36;p12) by sequencing a single blastomere. The methods will expedite applications in basic genome research and provide a stepping stone to novel approaches for clinical genetic diagnosis.
Nucleic acids research 2013
PUBMED: 23630320; DOI: 10.1093/nar/gkt345
-
Deep-sea striving.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Nature reviews. Microbiology 2013;11;6;364
PUBMED: 23669892; DOI: 10.1038/nrmicro3037
-
Microbiology. Fighting obesity with bacteria.
Pathogen Genomics Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK.
Science (New York, N.Y.) 2013;341;6150;1069-70
PUBMED: 24009379; DOI: 10.1126/science.1243787
-
Whole-genome sequencing to delineate Mycobacterium tuberculosis outbreaks: a retrospective observational study.
Nuffield Department of Medicine, John Radcliffe Hospital, University of Oxford, Oxford, UK. timothy.walker@ndm.ox.ac.uk
Background: Tuberculosis incidence in the UK has risen in the past decade. Disease control depends on epidemiological data, which can be difficult to obtain. Whole-genome sequencing can detect microevolution within Mycobacterium tuberculosis strains. We aimed to estimate the genetic diversity of related M tuberculosis strains in the UK Midlands and to investigate how this measurement might be used to investigate community outbreaks.
Methods: In a retrospective observational study, we used Illumina technology to sequence M tuberculosis genomes from an archive of frozen cultures. We characterised isolates into four groups: cross-sectional, longitudinal, household, and community. We measured pairwise nucleotide differences within hosts and between hosts in household outbreaks and estimated the rate of change in DNA sequences. We used the findings to interpret network diagrams constructed from 11 community clusters derived from mycobacterial interspersed repetitive-unit-variable-number tandem-repeat data.
Findings: We sequenced 390 separate isolates from 254 patients, including representatives from all five major lineages of M tuberculosis. The estimated rate of change in DNA sequences was 0.5 single nucleotide polymorphisms (SNPs) per genome per year (95% CI 0.3-0.7) in longitudinal isolates from 30 individuals and 25 families. Divergence is rarely higher than five SNPs in 3 years. 109 (96%) of 114 paired isolates from individuals and households differed by five or fewer SNPs. More than five SNPs separated isolates from none of 69 epidemiologically linked patients, two (15%) of 13 possibly linked patients, and 13 (17%) of 75 epidemiologically unlinked patients (three-way comparison exact p<0.0001). Genetic trees and clinical and epidemiological data suggest that super-spreaders were present in two community clusters.
Interpretation: Whole-genome sequencing can delineate outbreaks of tuberculosis and allows inference about direction of transmission between cases. The technique could identify super-spreaders and predict the existence of undiagnosed cases, potentially leading to early treatment of infectious patients and their contacts.
Funding: Medical Research Council, Wellcome Trust, National Institute for Health Research, and the Health Protection Agency.
Funded by: Biotechnology and Biological Sciences Research Council; Department of Health: G0800778; Medical Research Council; Wellcome Trust: 087646/Z/08/Z, 098051
The Lancet infectious diseases 2013;13;2;137-46
PUBMED: 23158499; PMC: 3556524; DOI: 10.1016/S1473-3099(12)70277-3
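The two figures quoted in the tuberculosis abstract above (about 0.5 SNPs per genome per year, and five or fewer SNPs between most paired isolates) can be turned into a rough screening rule. The following is a minimal sketch under stated assumptions, not the study's pipeline: the function names are hypothetical, the sequences are assumed to be pre-aligned and of equal length, and the five-SNP cut-off is used only as an illustrative default.

```python
# Figures quoted in the abstract; all names below are illustrative assumptions.
SNP_RATE_PER_GENOME_PER_YEAR = 0.5  # estimated rate of change in DNA sequences
LINKED_SNP_THRESHOLD = 5            # most paired isolates differed by <= 5 SNPs


def expected_snps(years: float, rate: float = SNP_RATE_PER_GENOME_PER_YEAR) -> float:
    """Expected number of SNPs accumulated along one lineage over `years`."""
    return rate * years


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count differing positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def plausibly_linked(distance: int, threshold: int = LINKED_SNP_THRESHOLD) -> bool:
    """Flag a pair as compatible with recent transmission (hypothetical rule)."""
    return distance <= threshold
```

A pair flagged by `plausibly_linked` would still need epidemiological confirmation; the abstract reports that some unlinked patients also fell within the threshold.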
-
A biochemical analysis of the Plasmodium falciparum Erythrocyte Binding Antigen-175 (EBA175) - Glycophorin-A interaction: implications for vaccine design.
Wellcome Trust Sanger Institute, United Kingdom.
PfEBA175 has an important role in the invasion of human erythrocytes by Plasmodium falciparum, and is therefore considered a high priority blood-stage malaria vaccine candidate. PfEBA175 mediates adhesion to erythrocytes through binding of the Duffy-binding-like (DBL) domains in its extracellular domain to Neu5Acα2-3Galactose displayed on the O-linked glycans of Glycophorin-A (GYPA). Because of the difficulties in expressing active full-length P. falciparum proteins in a recombinant form, previous analyses of the PfEBA175-GYPA interaction have largely focused on the DBL domains alone, and therefore have not been performed in the context of the native protein sequence. Here, we express the entire ectodomain of PfEBA175 (PfEBA175 FL) in soluble form, allowing us to compare the biochemical and immunological properties with a fragment containing only the tandem DBL domains (Region II - PfEBA175 RII). Recombinant PfEBA175 FL bound human erythrocytes in a trypsin and neuraminidase-sensitive manner and recognised Neu5Acα2-3Galactose-containing glycans, confirming its biochemical activity. A quantitative binding analysis showed that PfEBA175 FL interacted with native GYPA with a KD ~0.26 µM and is capable of self-association. By comparison, the RII fragment alone bound GYPA with a lower affinity demonstrating that regions outside of the DBL domains are important for interactions with GYPA; antibodies directed to these other regions also contributed to the inhibition of parasite invasion. These data demonstrate the importance of PfEBA175 regions other than the DBL domains in the interaction with GYPA and merit their inclusion in an EBA175-based vaccine.
The Journal of biological chemistry 2013
PUBMED: 24043627; DOI: 10.1074/jbc.M113.484840
-
Genetic basis of Y-linked hearing impairment.
Department of Otolaryngology, Head and Neck Surgery, Chinese PLA Institute of Otolaryngology, Chinese PLA General Hospital, Beijing, China.
A single Mendelian trait has been mapped to the human Y chromosome: Y-linked hearing impairment. The molecular basis of this disorder is unknown. Here, we report the detailed characterization of the DFNY1 Y chromosome and its comparison with a closely related Y chromosome from an unaffected branch of the family. The DFNY1 chromosome carries a complex rearrangement, including duplication of several noncontiguous segments of the Y chromosome and insertion of ∼160 kb of DNA from chromosome 1, in the pericentric region of Yp. This segment of chromosome 1 is derived entirely from within a known hearing impairment locus, DFNA49. We suggest that a third copy of one or more genes from the shared segment of chromosome 1 might be responsible for the hearing-loss phenotype.
Funded by: Wellcome Trust: 098051
American journal of human genetics 2013;92;2;301-6
PUBMED: 23352258; PMC: 3567277; DOI: 10.1016/j.ajhg.2012.12.015
-
The draft genomes of soft-shell turtle and green sea turtle yield insights into the development and evolution of the turtle-specific body plan.
[1] BGI-Shenzhen, Shenzhen, China. [2].
The unique anatomical features of turtles have raised unanswered questions about the origin of their unique body plan. We generated and analyzed draft genomes of the soft-shell turtle (Pelodiscus sinensis) and the green sea turtle (Chelonia mydas); our results indicated the close relationship of the turtles to the bird-crocodilian lineage, from which they split ∼267.9-248.3 million years ago (Upper Permian to Triassic). We also found extensive expansion of olfactory receptor genes in these turtles. Embryonic gene expression analysis identified an hourglass-like divergence of turtle and chicken embryogenesis, with maximal conservation around the vertebrate phylotypic period, rather than at later stages that show the amniote-common pattern. Wnt5a expression was found in the growth zone of the dorsal shell, supporting the possible co-option of limb-associated Wnt signaling in the acquisition of this turtle-specific novelty. Our results suggest that turtle evolution was accompanied by an unexpectedly conservative vertebrate phylotypic period, followed by turtle-specific repatterning of development to yield the novel structure of the shell.
Nature genetics 2013
PUBMED: 23624526; DOI: 10.1038/ng.2615
-
Viral population analysis and minority-variant detection using short read next-generation sequencing.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
RNA viruses within infected individuals exist as a population of evolutionary-related variants. Owing to evolutionary change affecting the constitution of this population, the frequency and/or occurrence of individual viral variants can show marked or subtle fluctuations. Since the development of massively parallel sequencing platforms, such viral populations can now be investigated to unprecedented resolution. A critical problem with such analyses is the presence of sequencing-related errors that obscure the identification of true biological variants present at low frequency. Here, we report the development and assessment of the Quality Assessment of Short Read (QUASR) Pipeline (http://sourceforge.net/projects/quasr) specific for virus genome short read analysis that minimizes sequencing errors from multiple deep-sequencing platforms, and enables post-mapping analysis of the minority variants within the viral population. QUASR significantly reduces the error-related noise in deep-sequencing datasets, resulting in increased mapping accuracy and reduction of erroneous mutations. Using QUASR, we have determined influenza virus genome dynamics in sequential samples from an in vitro evolution of 2009 pandemic H1N1 (A/H1N1/09) influenza from samples sequenced on both the Roche 454 GSFLX and Illumina GAIIx platforms. Importantly, concordance between the 454 and Illumina sequencing allowed unambiguous minority-variant detection and accurate determination of virus population turnover in vitro.
Philosophical transactions of the Royal Society of London. Series B, Biological sciences 2013;368;1614;20120205
PUBMED: 23382427; DOI: 10.1098/rstb.2012.0205
-
Clinically relevant mutant DNA gyrase alters supercoiling, changes the transcriptome, and confers multidrug resistance.
Antimicrobial Agents Research Group, School of Immunity and Infection, Institute of Microbiology and Infection, The University of Birmingham, Edgbaston, Birmingham, United Kingdom.
Bacterial DNA is maintained in a supercoiled state controlled by the action of topoisomerases. Alterations in supercoiling affect fundamental cellular processes, including transcription. Here, we show that substitution at position 87 of GyrA of Salmonella influences sensitivity to antibiotics, including nonquinolone drugs, alters global supercoiling, and results in an altered transcriptome with increased expression of stress response pathways. Decreased susceptibility to multiple antibiotics seen with a GyrA Asp87Gly mutant was not a result of increased efflux activity or reduced reactive-oxygen production. These data show that a frequently observed and clinically relevant substitution within GyrA results in altered expression of numerous genes, including those important in bacterial survival of stress, suggesting that GyrA mutants may have a selective advantage under specific conditions. Our findings help contextualize the high rate of quinolone resistance in pathogenic strains of bacteria and may partly explain why such mutant strains are evolutionarily successful. Importance: Fluoroquinolones are a powerful group of antibiotics that target bacterial enzymes involved in helping bacteria maintain the conformation of their chromosome. Mutations in the target enzymes allow bacteria to become resistant to these antibiotics, and fluoroquinolone resistance is common. We show here that these mutations also provide protection against a broad range of other antimicrobials by triggering a defensive stress response in the cell. This work suggests that fluoroquinolone resistance mutations may be beneficial under a range of conditions.
Funded by: Biotechnology and Biological Sciences Research Council: BB/G012016/1; Medical Research Council: G0501415, G0807977
mBio 2013;4;4
PUBMED: 23882012; PMC: 3735185; DOI: 10.1128/mBio.00273-13
-
A calibrated human Y-chromosomal phylogeny based on resequencing.
The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom.
We have identified variants present in high-coverage complete sequences of 36 diverse human Y chromosomes from Africa, Europe, South Asia, East Asia, and the Americas, representing eight major haplogroups. After restricting our analysis to 8.97 Mb of the unique male-specific Y sequence, we identified 6662 high-confidence variants, including single-nucleotide polymorphisms (SNPs), multi-nucleotide polymorphisms (MNPs), and indels. We constructed phylogenetic trees using these variants, or subsets of them, and recapitulated the known structure of the tree. Assuming a male mutation rate of 1 × 10^-9 per base pair per year, the time depth of the tree (haplogroups A3-R) was ~101,000-115,000 yr, and the lineages found outside Africa dated to 57,000-74,000 yr, both as expected. In addition, we dated a striking Paleolithic male lineage expansion to 41,000-52,000 yr ago and the node representing the major European Y lineage, R1b, to 4000-13,000 yr ago, supporting a Neolithic origin for these modern European Y chromosomes. In all, we provide a nearly 10-fold increase in the number of Y markers with phylogenetic information, and novel historical insights derived from placing them on a calibrated phylogenetic tree.
Funded by: Wellcome Trust: 098051
Genome research 2013;23;2;388-95
PUBMED: 23038768; PMC: 3561879; DOI: 10.1101/gr.143198.112
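The calibration quoted in the abstract above (a mutation rate of 1 × 10^-9 per base pair per year applied to 8.97 Mb of sequence) implies a simple strict-clock conversion between mutation counts and years. The sketch below only illustrates that arithmetic; it is not the dating method used in the paper, and the function names are hypothetical.

```python
# Values quoted in the abstract; the strict-clock model is an assumption.
MUTATION_RATE = 1e-9   # mutations per base pair per year
CALLABLE_BP = 8.97e6   # unique male-specific Y sequence analysed, in base pairs


def years_per_mutation(rate: float = MUTATION_RATE, bp: float = CALLABLE_BP) -> float:
    """Average waiting time between mutations along a single lineage."""
    return 1.0 / (rate * bp)


def node_age_years(mutations_to_tip: float,
                   rate: float = MUTATION_RATE,
                   bp: float = CALLABLE_BP) -> float:
    """Age of a node given the mean number of mutations from node to tips."""
    return mutations_to_tip / (rate * bp)
```

Under these assumptions one mutation accumulates roughly every 111 years per lineage, so a node separated from present-day tips by about 900 mutations (a made-up count, not a figure from the paper) would date to around 100,000 years.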
-
A comparison of Y-chromosomal lineage dating using either resequencing or Y-SNP plus Y-STR genotyping.
The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.
We have compared phylogenies and time estimates for Y-chromosomal lineages based on resequencing ∼9Mb of DNA and applying the program GENETREE to similar analyses based on the more standard approach of genotyping 26 Y-SNPs plus 21 Y-STRs and applying the programs NETWORK and BATWING. We find that deep phylogenetic structure is not adequately reconstructed after Y-SNP plus Y-STR genotyping, and that times estimated using observed Y-STR mutation rates are several-fold too recent. In contrast, an evolutionary mutation rate gives times that are more similar to the resequencing data. In principle, systematic comparisons of this kind can in future studies be used to identify the combinations of Y-SNP and Y-STR markers, and time estimation methodologies, that correspond best to resequencing data.
Forensic science international. Genetics 2013
PUBMED: 23768990; DOI: 10.1016/j.fsigen.2013.03.014
-
Systematic identification of trans eQTLs as putative drivers of known disease associations.
[1] Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands. [2].
Identifying the downstream effects of disease-associated SNPs is challenging. To help overcome this problem, we performed expression quantitative trait locus (eQTL) meta-analysis in non-transformed peripheral blood samples from 5,311 individuals with replication in 2,775 individuals. We identified and replicated trans eQTLs for 233 SNPs (reflecting 103 independent loci) that were previously associated with complex traits at genome-wide significance. Some of these SNPs affect multiple genes in trans that are known to be altered in individuals with disease: rs4917014, previously associated with systemic lupus erythematosus (SLE), altered gene expression of C1QB and five type I interferon response genes, both hallmarks of SLE. DeepSAGE RNA sequencing showed that rs4917014 strongly alters the 3' UTR levels of IKZF1 in cis, and chromatin immunoprecipitation and sequencing analysis of the trans-regulated genes implicated IKZF1 as the causal gene. Variants associated with cholesterol metabolism and type 1 diabetes showed similar phenomena, indicating that large-scale eQTL mapping provides insight into the downstream effects of many trait-associated variants.
Nature genetics 2013
PUBMED: 24013639; DOI: 10.1038/ng.2756
-
Genome-wide SNP and CNV analysis identifies common and low-frequency variants associated with severe early-onset obesity.
Wellcome Trust Sanger Institute, Cambridge, UK.
Common and rare variants associated with body mass index (BMI) and obesity account for <5% of the variance in BMI. We performed SNP and copy number variation (CNV) association analyses in 1,509 children with obesity at the extreme tail (>3 s.d. from the mean) of the BMI distribution and 5,380 controls. Evaluation of 29 SNPs (P < 1 × 10^-5) in an additional 971 severely obese children and 1,990 controls identified 4 new loci associated with severe obesity (LEPR, PRKCH, PACS1 and RMST). A previously reported 43-kb deletion at the NEGR1 locus was significantly associated with severe obesity (P = 6.6 × 10^-7). However, this signal was entirely driven by a flanking 8-kb deletion; absence of this deletion increased risk for obesity (P = 6.1 × 10^-11). We found a significant burden of rare, single CNVs in severely obese cases (P < 0.0001). Integrative gene network pathway analysis of rare deletions indicated enrichment of genes affecting G protein-coupled receptors (GPCRs) involved in the neuronal regulation of energy homeostasis.
Funded by: Cancer Research UK; Medical Research Council; NIDA NIH HHS: R25 DA027995; Wellcome Trust: 084713, WT098051
Nature genetics 2013;45;5;513-7
PUBMED: 23563609; DOI: 10.1038/ng.2607
-
Genome-wide generation and systematic phenotyping of knockout mice reveals new roles for many genes.
Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.
Mutations in whole organisms are powerful ways of interrogating gene function in a realistic context. We describe a program, the Sanger Institute Mouse Genetics Project, that provides a step toward the aim of knocking out all genes and screening each line for a broad range of traits. We found that hitherto unpublished genes were as likely to reveal phenotypes as known genes, suggesting that novel genes represent a rich resource for investigating the molecular basis of disease. We found many unexpected phenotypes detected only because we screened for them, emphasizing the value of screening all mutants for a wide range of traits. Haploinsufficiency and pleiotropy were both surprisingly common. Forty-two percent of genes were essential for viability, and these were less likely to have a paralog and more likely to contribute to a protein complex than other genes. Phenotypic data and more than 900 mutants are openly available for further analysis.
Funded by: Cancer Research UK; Medical Research Council: RG45277 PCAG/116; NEI NIH HHS: 5K08EY020530-02, EY08213; Wellcome Trust: 098051
Cell 2013;154;2;452-64
PUBMED: 23870131; PMC: 3717207; DOI: 10.1016/j.cell.2013.06.022
-
Ischemic stroke is associated with the ABO locus: The EuroCLOT study.
Department of Twin Research and Genetic Epidemiology, King's College London, London, United Kingdom. frances.williams@kcl.ac.uk.
Objective: End-stage coagulation and the structure/function of fibrin are implicated in the pathogenesis of ischemic stroke. We explored whether genetic variants associated with end-stage coagulation in healthy volunteers account for the genetic predisposition to ischemic stroke and examined their influence on stroke subtype. Methods: Common genetic variants identified through genome-wide association studies of coagulation factors and fibrin structure/function in healthy twins (n = 2,100, Stage 1) were examined in ischemic stroke (n = 4,200 cases) using 2 independent samples of European ancestry (Stage 2). A third clinical collection having stroke subtyping (total 8,900 cases, 55,000 controls) was used for replication (Stage 3). Results: Stage 1 identified 524 single nucleotide polymorphisms (SNPs) from 23 linkage disequilibrium blocks having significant association (p < 5 × 10^-8) with 1 or more coagulation/fibrin phenotypes. The most striking associations included SNP rs5985 with factor XIII activity (p = 2.6 × 10^-186), rs10665 with FVII (p = 2.4 × 10^-47), and rs505922 in the ABO gene with both von Willebrand factor (p = 4.7 × 10^-57) and factor VIII (p = 1.2 × 10^-36). In Stage 2, the 23 independent SNPs were examined in stroke cases/noncases using MOnica Risk, Genetics, Archiving and Monograph (MORGAM) and Wellcome Trust Case Control Consortium 2 collections. SNP rs505922 was nominally associated with ischemic stroke (odds ratio = 0.94, 95% confidence interval = 0.88-0.99, p = 0.023). Independent replication in Meta-Stroke confirmed the rs505922 association with stroke, beta (standard error, SE) = 0.066 (0.02), p = 0.001, a finding specific to large-vessel and cardioembolic stroke (p = 0.001 and p < 0.001, respectively) but not seen with small-vessel stroke (p = 0.811). Interpretation: ABO gene variants are associated with large-vessel and cardioembolic stroke but not small-vessel disease. This work sheds light on the different pathogenic mechanisms underpinning stroke subtype.
Funded by: NCRR NIH HHS: UL1 RR033176; NHGRI NIH HHS: U01 HG004402, U01 HG004446; NHLBI NIH HHS: N01 HC035129, R01 HL075366, R01 HL080295, R01 HL085251, R01 HL087641, R01 HL087652, R01 HL093029, R01 HL105756; NIA NIH HHS: R01 AG008122, R01 AG015928, R01 AG016495, R01 AG020098, R01 AG023629, R01 AG027058, R01 AG031287, R01 AG033193; NIDDK NIH HHS: P30 DK063491; NINDS NIH HHS: R01 NS017950, R01 NS045012, U01 NS069208
Annals of neurology 2013;73;1;16-31
PUBMED: 23381943; PMC: 3582024; DOI: 10.1002/ana.23838
-
Sequencing and comparative analysis of the gorilla MHC genomic sequence.
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, UK.
Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC.
Funded by: Wellcome Trust
Database : the journal of biological databases and curation 2013;2013;bat011
PUBMED: 23589541; PMC: 3626023; DOI: 10.1093/database/bat011
-
TM4SF20 Ancestral Deletion and Susceptibility to a Pediatric Disorder of Early Language Delay and Cerebral White Matter Hyperintensities.
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.
White matter hyperintensities (WMHs) of the brain are important markers of aging and small-vessel disease. WMHs are rare in healthy children and, when observed, often occur with comorbid neuroinflammatory or vasculitic processes. Here, we describe a complex 4 kb deletion in 2q36.3 that segregates with early childhood communication disorders and WMH in 15 unrelated families predominantly from Southeast Asia. The premature brain aging phenotype with punctate and multifocal WMHs was observed in ∼70% of young carrier parents who underwent brain MRI. The complex deletion removes the penultimate exon 3 of TM4SF20, a gene encoding a transmembrane protein of unknown function. Minigene analysis showed that the resultant net loss of an exon introduces a premature stop codon, which, in turn, leads to the generation of a stable protein that fails to target to the plasma membrane and accumulates in the cytoplasm. Finally, we report this deletion to be enriched in individuals of Vietnamese Kinh descent, with an allele frequency of about 1%, embedded in an ancestral haplotype. Our data point to a constellation of early language delay and WMH phenotypes, driven by a likely toxic mechanism of TM4SF20 truncation, and highlight the importance of understanding and managing population-specific low-frequency pathogenic alleles.
Funded by: NINDS NIH HHS: K23 NS078056
American journal of human genetics 2013
PUBMED: 23810381; DOI: 10.1016/j.ajhg.2013.05.027
-
Go retro and get a GRIP.
Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1HH, UK. da1@sanger.ac.uk.
Gene retrocopy insertions are a source of new genes and new gene functions, and can now be identified using paired-end whole genome sequencing data.
Genome biology 2013;14;3;108
PUBMED: 23514103; DOI: 10.1186/gb-2013-14-3-108
-
Duplication and retention biases of essential and non-essential genes revealed by systematic knockdown analyses.
The Gurdon Institute and Department of Genetics, University of Cambridge, Cambridge, United Kingdom.
When a duplicate gene has no apparent loss-of-function phenotype, it is commonly considered that the phenotype has been masked as a result of functional redundancy with the remaining paralog. This is supported by indirect evidence showing that multi-copy genes show loss-of-function phenotypes less often than single-copy genes and by direct tests of phenotype masking using select gene sets. Here we take a systematic genome-wide RNA interference approach to assess phenotype masking in paralog pairs in the Caenorhabditis elegans genome. Remarkably, in contrast to expectations, we find that phenotype masking makes only a minor contribution to the low knockdown phenotype rate for duplicate genes. Instead, we find that non-essential genes are highly over-represented among duplicates, leading to a low observed loss-of-function phenotype rate. We further find that duplicate pairs derived from essential and non-essential genes have contrasting evolutionary dynamics: whereas non-essential genes are both more often successfully duplicated (fixed) and lost, essential genes are less often duplicated but upon successful duplication are maintained over longer periods. We expect the fundamental evolutionary duplication dynamics presented here to be broadly applicable.
PLoS genetics 2013;9;5;e1003330
PUBMED: 23675306; PMC: 3649981; DOI: 10.1371/journal.pgen.1003330
-
Bcl11a Controls Flt3 Expression in Early Hematopoietic Progenitors and Is Required for pDC Development In Vivo.
Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri, United States of America.
Bcl11a is a transcription factor known to regulate lymphoid and erythroid development. Recent bioinformatic analysis of global gene expression patterns has suggested a role for Bcl11a in the development of dendritic cell (DC) lineages. We tested this hypothesis by analyzing the development of DC and other lineages in Bcl11a (-/-) mice. We found that Bcl11a was required for expression of IL-7 receptor (IL-7R) and Flt3 in early hematopoietic progenitor cells. In addition, we found severely decreased numbers of plasmacytoid dendritic cells (pDCs) in Bcl11a (-/-) fetal livers and in the bone marrow of Bcl11a (-/-) fetal liver chimeras. Moreover, Bcl11a (-/-) cells showed severely impaired in vitro development of Flt3L-derived pDCs and classical DCs (cDCs). In contrast, we found normal in vitro development of DCs from Bcl11a (-/-) fetal liver cells treated with GM-CSF. These results suggest that the persistent cDC development observed in Bcl11a (-/-) fetal liver chimeras reflects derivation from a Bcl11a- and Flt3-independent pathway in vivo.
PloS one 2013;8;5;e64800
PUBMED: 23741395; PMC: 3669380; DOI: 10.1371/journal.pone.0064800
-
Pneumococcal capsular switching: a historical perspective.
Department of Zoology, University of Oxford, and.
Background. Changes in serotype prevalence among pneumococcal populations result from both serotype replacement and serotype (capsular) switching. Temporal changes in serotype distributions are well documented, but the contribution of capsular switching to such changes is unknown. Furthermore, it is unclear to what extent vaccine-induced selective pressures drive capsular switching. Methods. Serotype and multilocus sequence typing data for 426 pneumococci dated from 1937 through 2007 were analyzed. Whole-genome sequence data for a subset of isolates were used to investigate capsular switching events. Results. We identified 36 independent capsular switch events, 18 of which were explored in detail with whole-genome sequence data. Recombination fragment lengths were estimated for 11 events and ranged from approximately 19.0 kb to ≥58.2 kb. Two events took place no later than 1960, and the imported DNA included the capsular locus and the nearby penicillin-binding protein genes pbp2x and pbp1a. Conclusions. Capsular switching has been a regular occurrence among pneumococcal populations throughout the past 7 decades. Recombination of large DNA fragments (>30 kb), sometimes including the capsular locus and penicillin-binding protein genes, predated both vaccine introduction and widespread antibiotic use. This type of recombination has likely been an intrinsic feature throughout the history of pneumococcal evolution.
The Journal of infectious diseases 2013;207;3;439-49
PUBMED: 23175765; PMC: 3537446; DOI: 10.1093/infdis/jis703
-
INMEX--a web-based tool for integrative meta-analysis of expression data.
Department of Microbiology and Immunology, University of British Columbia, Vancouver, V6T 1Z3, Canada, Centre for Microbial Diseases and Immunity Research, University of British Columbia, Vancouver, V6T 1Z4, Canada, James Hogg Research Centre, University of British Columbia, Vancouver, V6Z 1Y6, Canada, Department of Computing Science, University of Alberta, T6G 2E8, Canada, Department of Biological Sciences, University of Alberta, T6G 2E8, Canada and Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, UK.
The widespread applications of various 'omics' technologies in biomedical research together with the emergence of public data repositories have resulted in a plethora of data sets for almost any given physiological state or disease condition. Properly combining or integrating these data sets with similar basic hypotheses can help reduce study bias, increase statistical power and improve overall biological understanding. However, the difficulties in data management and the complexities of analytical approaches have significantly limited data integration to enable meta-analysis. Here, we introduce integrative meta-analysis of expression data (INMEX), a user-friendly web-based tool designed to support meta-analysis of multiple gene-expression data sets, as well as to enable integration of data sets from gene expression and metabolomics experiments. INMEX contains three functional modules. The data preparation module supports flexible data processing, annotation and visualization of individual data sets. The statistical analysis module allows researchers to combine multiple data sets based on P-values, effect sizes, rank orders and other features. The significant genes can be examined in the functional analysis module for enriched Gene Ontology terms or Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, or expression profile visualization. INMEX has built-in support for common gene/metabolite identifiers (IDs), as well as 45 popular microarray platforms for human, mouse and rat. Complex operations are performed through a user-friendly web interface in a step-by-step manner. INMEX is freely available at http://www.inmex.ca.
Nucleic acids research 2013
PUBMED: 23766290; DOI: 10.1093/nar/gkt338
-
Serotonin: The Central Link between Bone Mass and Energy Metabolism
YADAV,V.K.
Translational Endocrinology of Bone 2013
DOI: 10.1016/B978-0-12-415784-2.00005-1; URL: http://dx.doi.org/10.1016/B97...8-0-12-415784-2.00005-1
-
Enhancement of microhomology-mediated genomic rearrangements by transient loss of mouse Bloom syndrome helicase.
Department of Social and Environmental Medicine, Graduate School of Medicine, Osaka University, Suita, Osaka 565-0871, Japan.
Bloom syndrome, an autosomal recessive disorder of the BLM gene, confers predisposition to a broad spectrum of early-onset cancers in multiple tissue types. Loss of genomic integrity is a primary hallmark of such human malignancies, but many studies using disease-affected specimens are limited in that they are retrospective and devoid of an appropriate experimental control. To overcome this, we devised an experimental system to recapitulate the early molecular events in genetically engineered mouse embryonic stem cells, in which cells undergoing loss of heterozygosity (LOH) can be enriched after inducible down-regulation of Blm expression, with or without site-directed DNA double-strand break (DSB) induction. Transient loss of BLM increased the rate of LOH, whose breakpoints were distributed along the chromosome. Combined with site-directed DSB induction, loss of BLM synergistically increased the rate of LOH and concentrated the breakpoints around the targeted chromosomal region. We characterized the LOH events using specifically tailored genomic tools, such as high-resolution array comparative genomic hybridization and high-density single nucleotide polymorphism genotyping, revealing that the combination of BLM suppression and DSB induction enhanced genomic rearrangements, including deletions and insertions, whose breakpoints were clustered in genomic inverted repeats and associated with junctional microhomologies. Our experimental approach successfully uncovered the detailed molecular mechanisms of as-yet-uncharacterized loss of heterozygosities and reveals the significant contribution of microhomology-mediated genomic rearrangements, which could be widely applicable to the early steps of cancer formation in general.
Genome research 2013;23;9;1462-73
PUBMED: 23908384; PMC: 3759722; DOI: 10.1101/gr.152744.112
-
Interferon-induced transmembrane protein-3 genetic variant rs12252-C is associated with severe influenza in Chinese individuals.
Beijing You'an Hospital, Capital Medical University, Beijing PO 100069, China.
The SNP rs12252-C allele alters the function of interferon-induced transmembrane protein-3, increasing the disease severity of influenza virus infection in Caucasians, but the allele is rare. However, rs12252-C is much more common in Han Chinese. Here we report that the CC genotype is found in 69% of Chinese patients with severe pandemic influenza A H1N1/09 virus infection compared with 25% in those with mild infection. Specifically, the CC genotype was estimated to confer a sixfold greater risk for severe infection than the CT and TT genotypes. More importantly, because the risk genotype occurs with such a high frequency, its effect translates to a large population-attributable risk of 54.3% for severe infection in the Chinese population studied compared with 5.4% in Northern Europeans. Interferon-induced transmembrane protein-3 genetic variants could, therefore, have a strong effect on the epidemiology of influenza in China and in people of Chinese descent.
Funded by: Medical Research Council; Wellcome Trust
Nature communications 2013;4;1418
PUBMED: 23361009; PMC: 3562464; DOI: 10.1038/ncomms2433


