Sanger Institute - Publications 2013

Number of papers published in 2013: 144

  • Sequencing ancient calcified dental plaque shows changes in oral microbiota with dietary shifts of the Neolithic and Industrial revolutions.

    Adler CJ, Dobney K, Weyrich LS, Kaidonis J, Walker AW, Haak W, Bradshaw CJ, Townsend G, Sołtysiak A, Alt KW, Parkhill J and Cooper A

    [1] Australian Centre for Ancient DNA, School of Earth and Environmental Sciences, The University of Adelaide, Adelaide, South Australia, Australia. [2] Environment Institute, The University of Adelaide, Adelaide, South Australia, Australia. [3] Institute of Dental Research, Westmead Millennium Institute, Faculty of Dentistry, University of Sydney, Sydney, New South Wales, Australia.

    The importance of commensal microbes for human health is increasingly recognized, yet the impacts of evolutionary changes in human diet and culture on commensal microbiota remain almost unknown. Two of the greatest dietary shifts in human evolution involved the adoption of carbohydrate-rich Neolithic (farming) diets (beginning ∼10,000 years before the present) and the more recent advent of industrially processed flour and sugar (in ∼1850). Here, we show that calcified dental plaque (dental calculus) on ancient teeth preserves a detailed genetic record throughout this period. Data from 34 early European skeletons indicate that the transition from hunter-gatherer to farming shifted the oral microbial community to a disease-associated configuration. The composition of oral microbiota remained unexpectedly constant between Neolithic and medieval times, after which (the now ubiquitous) cariogenic bacteria became dominant, apparently during the Industrial Revolution. Modern oral microbiotic ecosystems are markedly less diverse than historic populations, which might be contributing to chronic oral (and other) disease in postindustrial lifestyles.

    Nature genetics 2013

  • New insights into the genetic basis of TAR (thrombocytopenia-absent radii) syndrome.

    Albers CA, Newbury-Ecob R, Ouwehand WH and Ghevaert C

    Department of Haematology, University of Cambridge, UK; NHS Blood and Transplant, Cambridge, UK; Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK. Electronic address: c.albers@gen.umcn.nl.

    Thrombocytopenia with absent radii (TAR) syndrome is a rare disorder combining specific skeletal abnormalities with a reduced platelet count. Rare proximal microdeletions of 1q21.1 are found in the majority of patients but are also found in unaffected parents. Recently it was shown that TAR syndrome is caused by the compound inheritance of a low-frequency noncoding SNP and a rare null allele in RBM8A, a gene encoding the exon-junction complex subunit member Y14 located in the deleted region. This finding provides new insight into the complex inheritance pattern and new clues to the molecular mechanisms underlying TAR syndrome. We discuss TAR syndrome in the context of abnormal phenotypes associated with proximal and distal 1q21.1 microdeletion and microduplications with incomplete penetrance and variable expressivity.

    Current opinion in genetics & development 2013

  • Specificity and heterogeneity of terahertz radiation effect on gene expression in mouse mesenchymal stem cells.

    Alexandrov BS, Phipps ML, Alexandrov LB, Booshehri LG, Erat A, Zabolotny J, Mielke CH, Chen HT, Rodriguez G, Rasmussen KØ, Martinez JS, Bishop AR and Usheva A

    Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA; Harvard Medical School, Beth Israel Deaconess Medical Center, Department of Medicine, Boston, MA 02215, USA.

    We report that terahertz (THz) irradiation of mouse mesenchymal stem cells (mMSCs) with a single-frequency (SF) 2.52 THz laser or pulsed broadband (centered at 10 THz) source results in irradiation-specific heterogenic changes in gene expression. The THz effect depends on irradiation parameters such as the duration and type of THz source, and on the degree of stem cell differentiation. Our microarray survey and RT-PCR experiments demonstrate that prolonged broadband THz irradiation drives mMSCs toward differentiation, while 2-hour irradiation (regardless of THz sources) affects genes transcriptionally active in pluripotent stem cells. The strictly controlled experimental environment indicates minimal temperature changes, and the absence of any discernible response of heat shock and cellular stress genes implies a non-thermal response. Computer simulations of the core promoters of two pluripotency markers reveal association between gene upregulation and propensity for DNA breathing. We propose that THz radiation has potential for non-contact control of cellular gene expression.

    Scientific reports 2013;3;1184

  • Deciphering signatures of mutational processes operative in human cancer.

    Alexandrov LB, Nik-Zainal S, Wedge DC, Campbell PJ and Stratton MR

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    The genome of a cancer cell carries somatic mutations that are the cumulative consequences of the DNA damage and repair processes operative during the cellular lineage between the fertilized egg and the cancer cell. Remarkably, these mutational processes are poorly characterized. Global sequencing initiatives are yielding catalogs of somatic mutations from thousands of cancers, thus providing the unique opportunity to decipher the signatures of mutational processes operative in human cancer. However, until now there have been no theoretical models describing the signatures of mutational processes operative in cancer genomes and no systematic computational approaches are available to decipher these mutational signatures. Here, by modeling mutational processes as a blind source separation problem, we introduce a computational framework that effectively addresses these questions. Our approach provides a basis for characterizing mutational signatures from cancer-derived somatic mutational catalogs, paving the way to insights into the pathogenetic mechanism underlying all cancers.

    Funded by: Wellcome Trust: 093867, 098051, WT088340MA

    Cell reports 2013;3;1;246-59
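
    The abstract above frames signature extraction as a blind source separation problem but does not name the factorization. As a minimal sketch only, the idea can be illustrated with non-negative matrix factorization (NMF) using multiplicative updates, which is an assumption made here for illustration and not necessarily the authors' exact procedure. A catalog matrix (mutation types by samples) is approximated as signatures times exposures. The toy signatures, exposures and all function names below are hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, rank, iterations=300, seed=0, eps=1e-9):
    """Factor non-negative v (types x samples) into w (types x rank) and
    h (rank x samples) with multiplicative updates for squared error."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(rank)]
        # w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n_rows)]
    return w, h

def frobenius_error(v, w, h):
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, approx) for a, b in zip(ra, rb)) ** 0.5

# Hypothetical toy data: two signatures over six mutation types, four samples.
true_signatures = [[0.5, 0.0], [0.3, 0.1], [0.2, 0.1], [0.0, 0.3], [0.0, 0.3], [0.0, 0.2]]
true_exposures = [[100, 20, 60, 0], [10, 80, 40, 50]]
catalog = matmul(true_signatures, true_exposures)

# Same seed, so the zero-iteration call returns the starting point of the fit.
w0, h0 = nmf(catalog, rank=2, iterations=0)
initial_error = frobenius_error(catalog, w0, h0)
signatures, exposures = nmf(catalog, rank=2, iterations=300)
fitted_error = frobenius_error(catalog, signatures, exposures)
print(f"reconstruction error: {initial_error:.2f} -> {fitted_error:.4f}")
```

    In this sketch the number of signatures (`rank`) is supplied by hand; choosing it from the data is a separate problem that the sketch does not attempt.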

  • Inappropriately low hepcidin levels in patients with myelodysplastic syndrome carrying a somatic mutation of SF3B1.

    Ambaglio I, Malcovati L, Papaemmanuil E, Laarakkers CM, Della Porta MG, Gallì A, Da Vià MC, Bono E, Ubezio M, Travaglino E, Albertini R, Campbell PJ, Swinkels DW and Cazzola M

    luca.malcovati@unipv.it.

    Somatic mutations of the RNA splicing machinery have been recently identified in myelodysplastic syndromes. In particular, a strong association has been found between SF3B1 mutation and refractory anemia with ring sideroblasts, a condition characterized by ineffective erythropoiesis and parenchymal iron overload. We studied the relationship between SF3B1 mutation, erythroid activity and hepcidin levels in myelodysplastic syndrome patients. Erythroid activity was evaluated through the proportion of marrow erythroblasts, soluble transferrin receptor and serum growth differentiation factor 15. Significant relationships were found between SF3B1 mutation and marrow erythroblasts (P=0.001), soluble transferrin receptor (P=0.003) and serum growth differentiation factor 15 (P=0.033). Serum hepcidin varied considerably, and multivariable analysis showed that the hepcidin to ferritin ratio, a measure of adequacy of hepcidin levels relative to body iron stores, was inversely related to the SF3B1 mutation (P=0.013). These observations suggest that patients with SF3B1 mutation have inappropriately low hepcidin levels, which may explain their propensity to parenchymal iron loading.

    Haematologica 2013;98;3;420-3

  • The African coelacanth genome provides insights into tetrapod evolution.

    Amemiya CT, Alföldi J, Lee AP, Fan S, Philippe H, Maccallum I, Braasch I, Manousaki T, Schneider I, Rohner N, Organ C, Chalopin D, Smith JJ, Robinson M, Dorrington RA, Gerdol M, Aken B, Biscotti MA, Barucca M, Baurain D, Berlin AM, Blatch GL, Buonocore F, Burmester T, Campbell MS, Canapa A, Cannon JP, Christoffels A, De Moro G, Edkins AL, Fan L, Fausto AM, Feiner N, Forconi M, Gamieldien J, Gnerre S, Gnirke A, Goldstone JV, Haerty W, Hahn ME, Hesse U, Hoffmann S, Johnson J, Karchner SI, Kuraku S, Lara M, Levin JZ, Litman GW, Mauceli E, Miyake T, Mueller MG, Nelson DR, Nitsche A, Olmo E, Ota T, Pallavicini A, Panji S, Picone B, Ponting CP, Prohaska SJ, Przybylski D, Saha NR, Ravi V, Ribeiro FJ, Sauka-Spengler T, Scapigliati G, Searle SM, Sharpe T, Simakov O, Stadler PF, Stegeman JJ, Sumiyama K, Tabbaa D, Tafer H, Turner-Maier J, van Heusden P, White S, Williams L, Yandell M, Brinkmann H, Volff JN, Tabin CJ, Shubin N, Schartl M, Jaffe DB, Postlethwait JH, Venkatesh B, Di Palma F, Lander ES, Meyer A and Lindblad-Toh K

    Molecular Genetics Program, Benaroya Research Institute, Seattle, Washington 98101, USA. camemiya@benaroyaresearch.org

    The discovery of a living coelacanth specimen in 1938 was remarkable, as this lineage of lobe-finned fish was thought to have become extinct 70 million years ago. The modern coelacanth looks remarkably similar to many of its ancient relatives, and its evolutionary proximity to our own fish ancestors provides a glimpse of the fish that first walked on land. Here we report the genome sequence of the African coelacanth, Latimeria chalumnae. Through a phylogenomic analysis, we conclude that the lungfish, and not the coelacanth, is the closest living relative of tetrapods. Coelacanth protein-coding genes are significantly more slowly evolving than those of tetrapods, unlike other genomic features. Analyses of changes in genes and regulatory elements during the vertebrate adaptation to land highlight genes involved in immunity, nitrogen excretion and the development of fins, tail, ear, eye, brain and olfaction. Functional assays of enhancers involved in the fin-to-limb transition and in the emergence of extra-embryonic tissues show the importance of the coelacanth genome as a blueprint for understanding tetrapod evolution.

    Nature 2013;496;7445;311-6

  • The general population cohort in rural south-western Uganda: a platform for communicable and non-communicable disease studies.

    Asiki G, Murphy G, Nakiyingi-Miiro J, Seeley J, Nsubuga RN, Karabarinde A, Waswa L, Biraro S, Kasamba I, Pomilla C, Maher D, Young EH, Kamali A, Sandhu MS and On behalf of the GPC team

    Medical Research Council/Uganda Virus Research Institute (MRC/UVRI), Uganda Research Unit on AIDS, Entebbe, Uganda, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK, Wellcome Trust Sanger Institute, Hinxton, UK, London School of Hygiene and Tropical Medicine, London, UK, School of International Development, University of East Anglia, Norwich, UK and Wellcome Trust, UK (formerly with MRC/UVRI Uganda Research Unit on AIDS, Entebbe, Uganda).

    The General Population Cohort (GPC) was set up in 1989 to examine trends in HIV prevalence and incidence, and their determinants in rural south-western Uganda. Recently, the research questions have included the epidemiology and genetics of communicable and non-communicable diseases (NCDs) to address the limited data on the burden and risk factors for NCDs in sub-Saharan Africa. The cohort comprises all residents (52% aged ≥13 years, men and women in equal proportions) within one-half of a rural sub-county, residing in scattered houses, and largely farmers of three major ethnic groups. Data collected through annual surveys include: mapping for spatial analysis and participant location; census for individual socio-demographic and household socioeconomic status assessment; and a medical survey for health, lifestyle and biophysical and blood measurements to ascertain disease outcomes and risk factors for selected participants. This cohort offers a rich platform to investigate the interplay between communicable diseases and NCDs. There is robust infrastructure for data management, sample processing and storage, and diverse expertise in epidemiology, social and basic sciences. For any data access enquiries you may contact the director, MRC/UVRI, Uganda Research Unit on AIDS by email to mrc@mrcuganda.org or the corresponding author.

    International journal of epidemiology 2013

  • Effective Preparation of Plasmodium vivax Field Isolates for High-Throughput Whole Genome Sequencing.

    Auburn S, Marfurt J, Maslen G, Campino S, Ruano Rubio V, Manske M, Machunter B, Kenangalem E, Noviyanti R, Trianty L, Sebayang B, Wirjanata G, Sriprawat K, Alcock D, Macinnis B, Miotto O, Clark TG, Russell B, Anstey NM, Nosten F, Kwiatkowski DP and Price RN

    Global and Tropical Health Division, Menzies School of Health Research, Charles Darwin University, Darwin, Australia.

    Whole genome sequencing (WGS) of Plasmodium vivax is problematic due to the reliance on clinical isolates which are generally low in parasitaemia and sample volume. Furthermore, clinical isolates contain a significant contaminating background of host DNA which confounds efforts to map short read sequence of the target P. vivax DNA. Here, we discuss a methodology to significantly improve the success of P. vivax WGS on natural (non-adapted) patient isolates. Using 37 patient isolates from Indonesia, Thailand, and travellers, we assessed the application of CF11-based white blood cell filtration alone and in combination with short term ex vivo schizont maturation. Although CF11 filtration reduced human DNA contamination in 8 Indonesian isolates tested, additional short-term culture increased the P. vivax DNA yield from a median of 0.15 to 6.2 ng µl⁻¹ packed red blood cells (pRBCs) (p = 0.001) and reduced the human DNA percentage from a median of 33.9% to 6.22% (p = 0.008). Furthermore, post-CF11 and culture samples from Thailand gave a median P. vivax DNA yield of 2.34 ng µl⁻¹ pRBCs, and 2.65% human DNA. In 22 P. vivax patient isolates prepared with the 2-step method, we demonstrate high depth (median 654X coverage) and breadth (≥89%) of coverage on the Illumina GAII and HiSeq platforms. In contrast to the A+T-rich P. falciparum genome, negligible bias was observed in coverage depth between coding and non-coding regions of the P. vivax genome. This uniform coverage will greatly facilitate the detection of SNPs and copy number variants across the genome, enabling unbiased exploration of the natural diversity in P. vivax populations.

    PloS one 2013;8;1;e53160

  • FOXP2 Targets Show Evidence of Positive Selection in European Populations.

    Ayub Q, Yngvadottir B, Chen Y, Xue Y, Hu M, Vernes SC, Fisher SE and Tyler-Smith C

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. Electronic address: qa1@sanger.ac.uk.

    Forkhead box P2 (FOXP2) is a highly conserved transcription factor that has been implicated in human speech and language disorders and plays important roles in the plasticity of the developing brain. The pattern of nucleotide polymorphisms in FOXP2 in modern populations suggests that it has been the target of positive (Darwinian) selection during recent human evolution. In our study, we searched for evidence of selection that might have followed FOXP2 adaptations in modern humans. We examined whether or not putative FOXP2 targets identified by chromatin-immunoprecipitation genomic screening show evidence of positive selection. We developed an algorithm that, for any given gene list, systematically generates matched lists of control genes from the Ensembl database, collates summary statistics for three frequency-spectrum-based neutrality tests from the low-coverage resequencing data of the 1000 Genomes Project, and determines whether these statistics are significantly different between the given gene targets and the set of controls. Overall, there was strong evidence of selection of FOXP2 targets in Europeans, but not in the Han Chinese, Japanese, or Yoruba populations. Significant outliers included several genes linked to cellular movement, reproduction, development, and immune cell trafficking, and 13 of these constituted a significant network associated with cardiac arteriopathy. Strong signals of selection were observed for CNTNAP2 and RBFOX1, key neurally expressed genes that have been consistently identified as direct FOXP2 targets in multiple studies and that have themselves been associated with neurodevelopmental disorders involving language dysfunction.

    American journal of human genetics 2013
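
    The algorithm summarized in the abstract above (matched control gene lists, per-gene summary statistics, a test of targets against controls) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the matching covariate (gene length), the relative tolerance, the single generic statistic, the one-sided direction of the test and every function name are hypothetical, and the data are synthetic.

```python
import random
from statistics import mean

def matched_controls(targets, pool, covariate, tolerance, rng):
    """Pick, for each target gene, one unused control gene whose covariate
    lies within a relative tolerance of the target's covariate."""
    chosen = []
    used = set(targets)
    for gene in targets:
        candidates = [g for g in pool
                      if g not in used
                      and abs(covariate[g] - covariate[gene]) <= tolerance * covariate[gene]]
        if not candidates:
            continue  # no acceptable match for this target; skip it
        pick = rng.choice(candidates)
        used.add(pick)
        chosen.append(pick)
    return chosen

def empirical_p(targets, pool, covariate, statistic, n_sets=200, tolerance=0.2, seed=0):
    """One-sided empirical p-value: how often a matched control set has a mean
    statistic at least as low as the target set (assumed direction)."""
    rng = random.Random(seed)
    observed = mean(statistic[g] for g in targets)
    as_extreme = 0
    for _ in range(n_sets):
        controls = matched_controls(targets, pool, covariate, tolerance, rng)
        if not controls:
            raise ValueError("no matched controls could be drawn")
        if mean(statistic[g] for g in controls) <= observed:
            as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (as_extreme + 1) / (n_sets + 1)

# Synthetic example: 300 genes, the first 20 are "targets" with a shifted statistic.
data_rng = random.Random(1)
genes = [f"gene{i}" for i in range(300)]
length = {g: data_rng.uniform(1000, 5000) for g in genes}
stat = {g: data_rng.gauss(0, 1) for g in genes}
targets = genes[:20]
for g in targets:
    stat[g] -= 2.0
pool = genes[20:]

example_controls = matched_controls(targets, pool, length, 0.2, random.Random(2))
observed, p_value = empirical_p(targets, pool, length, stat)
print(f"mean target statistic {observed:.2f}, empirical p = {p_value:.4f}")
```

    Sampling many matched control sets, and not just one, gives a null distribution for the summary statistic, so the comparison does not depend on a single arbitrary draw of controls.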

  • Y-chromosome and mtDNA genetics reveal significant contrasts in affinities of modern Middle Eastern populations with European and African populations.

    Badro DA, Douaihy B, Haber M, Youhanna SC, Salloum A, Ghassibe-Sabbagh M, Johnsrud B, Khazen G, Matisoo-Smith E, Soria-Hernanz DF, Wells RS, Tyler-Smith C, Platt DE, Zalloua PA and Genographic Consortium

    The Lebanese American University, Chouran, Beirut, Lebanon.

    The Middle East was a funnel of human expansion out of Africa, a staging area for the Neolithic Agricultural Revolution, and the home to some of the earliest world empires. Post-LGM expansions into the region and subsequent population movements created a striking genetic mosaic with distinct sex-based genetic differentiation. While prior studies have examined the mtDNA and Y-chromosome contrast in focal populations in the Middle East, none have undertaken a broad-spectrum survey including North and sub-Saharan Africa, Europe, and Middle Eastern populations. In this study 5,174 mtDNA and 4,658 Y-chromosome samples were investigated using PCA, MDS, mean-linkage clustering, AMOVA, and Fisher exact tests of F_ST values, R_ST values, and haplogroup frequencies. Geographic differentiation in affinities of Middle Eastern populations with Africa and Europe showed distinct contrasts between mtDNA and Y-chromosome data. Specifically, Lebanon's mtDNA shows a very strong association to Europe, while Yemen shows very strong affinity with Egypt and North and East Africa. Previous Y-chromosome results showed a Levantine coastal-inland contrast marked by J1 and J2, and a very strong North African component was evident throughout the Middle East. Neither of these patterns was observed in the mtDNA. While J2 has penetrated into Europe, the pattern of Y-chromosome diversity in Lebanon does not show the widespread affinities with Europe indicated by the mtDNA data. Lastly, while each population shows evidence of connections with expansions that now define the Middle East, Africa, and Europe, many of the populations in the Middle East show distinctive mtDNA and Y-haplogroup characteristics that indicate long-standing settlement with relatively little impact from and movement into other populations.

    PloS one 2013;8;1;e54616

  • Metagenomic study of the viruses of African straw-coloured fruit bats: Detection of a chiropteran poxvirus and isolation of a novel adenovirus.

    Baker KS, Leggett RM, Bexfield NH, Alston M, Daly G, Todd S, Tachedjian M, Holmes CE, Crameri S, Wang LF, Heeney JL, Suu-Ire R, Kellam P, Cunningham AA, Wood JL, Caccamo M and Murcia PR

    University of Cambridge, Department of Veterinary Medicine, Madingley Rd, Cambridge, Cambridgeshire, CB3 0ES, United Kingdom; Institute of Zoology, Zoological Society of London, Regent's Park, NW1 4RY, United Kingdom. Electronic address: kf281@cam.ac.uk.

    Viral emergence as a result of zoonotic transmission constitutes a continuous public health threat. Emerging viruses such as SARS coronavirus, hantaviruses and henipaviruses have wildlife reservoirs. Characterising the viruses of candidate reservoir species in geographical hot spots for viral emergence is a sensible approach to develop tools to predict, prevent, or contain emergence events. Here, we explore the viruses of Eidolon helvum, an Old World fruit bat species widely distributed in Africa that lives in close proximity to humans. We identified a great abundance and diversity of novel herpes and papillomaviruses, described the isolation of a novel adenovirus, and detected, for the first time, sequences of a chiropteran poxvirus closely related to Molluscum contagiosum. In sum, E. helvum displays a wide variety of mammalian viruses, some of them genetically similar to known human pathogens, highlighting the possibility of zoonotic transmission.

    Virology 2013

  • A comparison of dense transposon insertion libraries in the Salmonella serovars Typhi and Typhimurium.

    Barquist L, Langridge GC, Turner DJ, Phan MD, Turner AK, Bateman A, Parkhill J, Wain J and Gardner PP

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK.

    Salmonella Typhi and Typhimurium diverged only ∼50 000 years ago, yet have very different host ranges and pathogenicity. Despite the availability of multiple whole-genome sequences, the genetic differences that have driven these changes in phenotype are only beginning to be understood. In this study, we use transposon-directed insertion-site sequencing to probe differences in gene requirements for competitive growth in rich media between these two closely related serovars. We identify a conserved core of 281 genes that are required for growth in both serovars, 228 of which are essential in Escherichia coli. We are able to identify active prophage elements through the requirement for their repressors. We also find distinct differences in requirements for genes involved in cell surface structure biogenesis and iron utilization. Finally, we demonstrate that transposon-directed insertion-site sequencing is not only applicable to the protein-coding content of the cell but also has sufficient resolution to generate hypotheses regarding the functions of non-coding RNAs (ncRNAs) as well. We are able to assign probable functions to a number of cis-regulatory ncRNA elements, as well as to infer likely differences in trans-acting ncRNA regulatory networks.

    Nucleic acids research 2013

  • Peripheral administration of prokineticin 2 potently reduces food intake and body weight in mice via the brainstem.

    Beale K, Gardiner J, Bewick G, Hostomska K, Patel N, Hussain S, Jayasena C, Ebling F, Jethwa P, Prosser H, Lattanzi R, Negri L, Ghatei M, Bloom S and Dhillo W

    Section of Investigative Medicine, Imperial College London, London, UK.

    BACKGROUND AND PURPOSE: Prokineticin 2 (PK2) has recently been shown to acutely reduce food intake in rodents. We aimed to determine the CNS sites and receptors that mediate the anorectic effects of peripherally administered PK2 and its chronic effects on glucose and energy homeostasis. EXPERIMENTAL APPROACH: We investigated neuronal activation following i.p. administration of PK2 using c-Fos-like immunoreactivity (CFL-IR). The anorectic effect of PK2 was examined in mice with targeted deletion of either prokineticin receptor 1 (PKR1) or prokineticin receptor 2 (PKR2), and in wild-type mice following administration of the PKR1 antagonist, PC1. The effect of i.p. PK2 administration on glucose homeostasis was investigated. Finally, the effect of long-term administration of PK2 on glucose and energy homeostasis in diet-induced obese (DIO) mice was determined. KEY RESULTS: I.p. PK2 administration significantly increased CFL-IR in the dorsal motor vagal nucleus of the brainstem. The anorectic effect of PK2 was maintained in mice lacking PKR2 but abolished in mice lacking PKR1 and in wild-type mice pre-treated with PC1. DIO mice treated chronically with PK2 had no changes in glucose levels but significantly reduced food intake and body weight compared to controls. CONCLUSIONS AND IMPLICATIONS: Together, our data suggest that the anorectic effects of peripherally administered PK2 are mediated via the brainstem and this effect requires PKR1 but not PKR2 signalling. Chronic administration of PK2 reduces food intake and body weight in a mouse model of human obesity, suggesting that PKR1-selective agonists have potential to be novel therapeutics for the treatment of obesity.

    British journal of pharmacology 2013;168;2;403-410

  • Microbial genomes as cheat sheets.

    Bennett HM

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Nature reviews. Microbiology 2013;11;5;302

  • Genome-wide meta-analysis identifies 11 new loci for anthropometric traits and provides insights into genetic architecture.

    Berndt SI, Gustafsson S, Mägi R, Ganna A, Wheeler E, Feitosa MF, Justice AE, Monda KL, Croteau-Chonka DC, Day FR, Esko T, Fall T, Ferreira T, Gentilini D, Jackson AU, Luan J, Randall JC, Vedantam S, Willer CJ, Winkler TW, Wood AR, Workalemahu T, Hu YJ, Lee SH, Liang L, Lin DY, Min JL, Neale BM, Thorleifsson G, Yang J, Albrecht E, Amin N, Bragg-Gresham JL, Cadby G, den Heijer M, Eklund N, Fischer K, Goel A, Hottenga JJ, Huffman JE, Jarick I, Johansson A, Johnson T, Kanoni S, Kleber ME, König IR, Kristiansson K, Kutalik Z, Lamina C, Lecoeur C, Li G, Mangino M, McArdle WL, Medina-Gomez C, Müller-Nurasyid M, Ngwa JS, Nolte IM, Paternoster L, Pechlivanis S, Perola M, Peters MJ, Preuss M, Rose LM, Shi J, Shungin D, Smith AV, Strawbridge RJ, Surakka I, Teumer A, Trip MD, Tyrer J, Van Vliet-Ostaptchouk JV, Vandenput L, Waite LL, Zhao JH, Absher D, Asselbergs FW, Atalay M, Attwood AP, Balmforth AJ, Basart H, Beilby J, Bonnycastle LL, Brambilla P, Bruinenberg M, Campbell H, Chasman DI, Chines PS, Collins FS, Connell JM, Cookson WO, de Faire U, de Vegt F, Dei M, Dimitriou M, Edkins S, Estrada K, Evans DM, Farrall M, Ferrario MM, Ferrières J, Franke L, Frau F, Gejman PV, Grallert H, Grönberg H, Gudnason V, Hall AS, Hall P, Hartikainen AL, Hayward C, Heard-Costa NL, Heath AC, Hebebrand J, Homuth G, Hu FB, Hunt SE, Hyppönen E, Iribarren C, Jacobs KB, Jansson JO, Jula A, Kähönen M, Kathiresan S, Kee F, Khaw KT, Kivimäki M, Koenig W, Kraja AT, Kumari M, Kuulasmaa K, Kuusisto J, Laitinen JH, Lakka TA, Langenberg C, Launer LJ, Lind L, Lindström J, Liu J, Liuzzi A, Lokki ML, Lorentzon M, Madden PA, Magnusson PK, Manunta P, Marek D, März W, Leach IM, McKnight B, Medland SE, Mihailov E, Milani L, Montgomery GW, Mooser V, Mühleisen TW, Munroe PB, Musk AW, Narisu N, Navis G, Nicholson G, Nohr EA, Ong KK, Oostra BA, Palmer CN, Palotie A, Peden JF, Pedersen N, Peters A, Polasek O, Pouta A, Pramstaller PP, Prokopenko I, Pütter C, Radhakrishnan A, Raitakari O, Rendon A, Rivadeneira F, Rudan I, Saaristo TE, Sambrook JG, Sanders AR, Sanna S, Saramies J, Schipf S, Schreiber S, Schunkert H, Shin SY, Signorini S, Sinisalo J, Skrobek B, Soranzo N, Stančáková A, Stark K, Stephens JC, Stirrups K, Stolk RP, Stumvoll M, Swift AJ, Theodoraki EV, Thorand B, Tregouet DA, Tremoli E, Van der Klauw MM, van Meurs JB, Vermeulen SH, Viikari J, Virtamo J, Vitart V, Waeber G, Wang Z, Widén E, Wild SH, Willemsen G, Winkelmann BR, Witteman JC, Wolffenbuttel BH, Wong A, Wright AF, Zillikens MC, Amouyel P, Boehm BO, Boerwinkle E, Boomsma DI, Caulfield MJ, Chanock SJ, Cupples LA, Cusi D, Dedoussis GV, Erdmann J, Eriksson JG, Franks PW, Froguel P, Gieger C, Gyllensten U, Hamsten A, Harris TB, Hengstenberg C, Hicks AA, Hingorani A, Hinney A, Hofman A, Hovingh KG, Hveem K, Illig T, Jarvelin MR, Jöckel KH, Keinanen-Kiukaanniemi SM, Kiemeney LA, Kuh D, Laakso M, Lehtimäki T, Levinson DF, Martin NG, Metspalu A, Morris AD, Nieminen MS, Njølstad I, Ohlsson C, Oldehinkel AJ, Ouwehand WH, Palmer LJ, Penninx B, Power C, Province MA, Psaty BM, Qi L, Rauramaa R, Ridker PM, Ripatti S, Salomaa V, Samani NJ, Snieder H, Sørensen TI, Spector TD, Stefansson K, Tönjes A, Tuomilehto J, Uitterlinden AG, Uusitupa M, van der Harst P, Vollenweider P, Wallaschofski H, Wareham NJ, Watkins H, Wichmann HE, Wilson JF, Abecasis GR, Assimes TL, Barroso I, Boehnke M, Borecki IB, Deloukas P, Fox CS, Frayling T, Groop LC, Haritunian T, Heid IM, Hunter D, Kaplan RC, Karpe F, Moffatt MF, Mohlke KL, O'Connell JR, Pawitan Y, Schadt EE, Schlessinger D, Steinthorsdottir V, Strachan DP, Thorsteinsdottir U, van Duijn CM, Visscher PM, Di Blasio AM, Hirschhorn JN, Lindgren CM, Morris AP, Meyre D, Scherag A, McCarthy MI, Speliotes EK, North KE, Loos RJ and Ingelsson E

    [1] US Department of Health and Human Services, Division of Cancer Epidemiology and Genetics, National Cancer Institute, US National Institutes of Health, Bethesda, Maryland, USA. [2].

    Approaches exploiting trait distribution extremes may be used to identify loci associated with common traits, but it is unknown whether these loci are generalizable to the broader population. In a genome-wide search for loci associated with the upper versus the lower 5th percentiles of body mass index, height and waist-to-hip ratio, as well as clinical classes of obesity, including up to 263,407 individuals of European ancestry, we identified 4 new loci (IGFBP4, H6PD, RSRC1 and PPP2R2A) influencing height detected in the distribution tails and 7 new loci (HNF4G, RPTOR, GNAT2, MRPS33P4, ADCY9, HS6ST3 and ZZZ3) for clinical classes of obesity. Further, we find a large overlap in genetic structure and the distribution of variants between traits based on extremes and the general population and little etiological heterogeneity between obesity subgroups.

    Nature genetics 2013

  • The evolutionary dynamics of influenza A virus adaptation to mammalian hosts.

    Bhatt S, Lam TT, Lycett SJ, Leigh Brown AJ, Bowden TA, Holmes EC, Guan Y, Wood JL, Brown IH, Kellam P, Combating Swine Influenza Consortium and Pybus OG

    Department of Zoology, University of Oxford, Oxford, UK.

    Few questions on infectious disease are more important than understanding how and why avian influenza A viruses successfully emerge in mammalian populations, yet little is known about the rate and nature of the virus' genetic adaptation in new hosts. Here, we measure, for the first time, the genomic rate of adaptive evolution of swine influenza viruses (SwIV) that originated in birds. By using a curated dataset of more than 24 000 human and swine influenza gene sequences, including 41 newly characterized genomes, we reconstructed the adaptive dynamics of three major SwIV lineages (Eurasian, EA; classical swine, CS; triple reassortant, TR). We found that, following the transfer of the EA lineage from birds to swine in the late 1970s, EA virus genes have undergone substantially faster adaptive evolution than those of the CS lineage, which had circulated among swine for decades. Further, the adaptation rates of the EA lineage antigenic haemagglutinin and neuraminidase genes were unexpectedly high and similar to those observed in human influenza A. We show that the successful establishment of avian influenza viruses in swine is associated with raised adaptive evolution across the entire genome for many years after zoonosis, reflecting the contribution of multiple mutations to the coordinated optimization of viral fitness in a new environment. These dynamics are replicated independently in the polymerase genes of the TR lineage, which established in swine following separate transmission from non-swine hosts.

    Philosophical transactions of the Royal Society of London. Series B, Biological sciences 2013;368;1614;20120382

  • Compression of FASTQ and SAM Format Sequencing Data.

    Bonfield JK and Mahoney MV

    Wellcome Trust Sanger Institute, Cambridge, United Kingdom.

    Storage and transmission of the data produced by modern DNA sequencing instruments has become a major concern, which prompted the Pistoia Alliance to pose the SequenceSqueeze contest for compression of FASTQ files. We present several compression entries from the competition, Fastqz and Samcomp/Fqzcomp, including the winning entry. These are compared against existing algorithms for both reference-based compression (CRAM, Goby) and non-reference-based compression (DSRC, BAM) and other recently published competition entries (Quip, SCALCE). The tools are shown to be the new Pareto frontier for FASTQ compression, offering state-of-the-art ratios at affordable CPU costs. All programs are freely available on SourceForge. Fastqz: https://sourceforge.net/projects/fastqz/, fqzcomp: https://sourceforge.net/projects/fqzcomp/, and samcomp: https://sourceforge.net/projects/samcomp/.

    PloS one 2013;8;3;e59190
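    The "Pareto frontier" claim in the abstract above can be illustrated with a minimal sketch: a tool is on the frontier if no other tool is at least as good on both compression ratio and CPU cost and strictly better on one. The tool names and measurements below are hypothetical placeholders, not results from the paper.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    tool: str           # hypothetical tool label, not a real benchmark entry
    ratio: float        # compressed size / original size (lower is better)
    cpu_seconds: float  # CPU time to compress (lower is better)


def pareto_frontier(results: List[Result]) -> List[Result]:
    """Return the results that no other result dominates on (ratio, cpu_seconds)."""
    frontier = []
    for r in results:
        dominated = any(
            o.ratio <= r.ratio
            and o.cpu_seconds <= r.cpu_seconds
            and (o.ratio < r.ratio or o.cpu_seconds < r.cpu_seconds)
            for o in results
        )
        if not dominated:
            frontier.append(r)
    # Order the frontier from cheapest to most expensive in CPU time.
    return sorted(frontier, key=lambda r: r.cpu_seconds)


# Hypothetical measurements, for illustration only.
RESULTS = [
    Result("tool_a", 0.30, 100.0),
    Result("tool_b", 0.25, 300.0),
    Result("tool_c", 0.35, 150.0),  # worse than tool_a on both axes
    Result("tool_d", 0.20, 900.0),
]
FRONTIER = [r.tool for r in pareto_frontier(RESULTS)]
```

    Here `tool_c` drops out because `tool_a` compresses better and runs faster; the other three each represent a different trade-off.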

  • A Single Multilocus Sequence Typing (MLST) Scheme for Seven Pathogenic Leptospira Species.

    Boonsilp S, Thaipadungpanit J, Amornchai P, Wuthiekanun V, Bailey MS, Holden MT, Zhang C, Jiang X, Koizumi N, Taylor K, Galloway R, Hoffmaster AR, Craig S, Smythe LD, Hartskeerl RA, Day NP, Chantratita N, Feil EJ, Aanensen DM, Spratt BG and Peacock SJ

    Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand; Department of Microbiology and Immunology, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand.

    Background: The available Leptospira multilocus sequence typing (MLST) scheme supported by an MLST website is limited to L. interrogans and L. kirschneri. Our aim was to broaden the utility of this scheme to incorporate a total of seven pathogenic species. We modified the existing scheme by replacing one of the seven MLST loci (fadD was changed to caiB), as the former gene did not appear to be present in some pathogenic species. Comparison of the original and modified schemes using data for L. interrogans and L. kirschneri demonstrated that the discriminatory power of the two schemes was not significantly different. The modified scheme was used to further characterize 325 isolates (L. alexanderi [n = 5], L. borgpetersenii [n = 34], L. interrogans [n = 222], L. kirschneri [n = 29], L. noguchii [n = 9], L. santarosai [n = 10], and L. weilii [n = 16]). Phylogenetic analysis using concatenated sequences of the 7 loci demonstrated that each species corresponded to a discrete clade, and that no strains were misclassified at the species level. Comparison between genotype and serovar was possible for 254 isolates. Of the 31 sequence types (STs) represented by at least two isolates, 18 STs included isolates assigned to two or three different serovars. Conversely, 14 serovars were identified that contained between 2 and 10 different STs. New observations were made on the global phylogeography of Leptospira spp., and the utility of MLST in making associations between human disease and specific maintenance hosts was demonstrated. Conclusion: The new MLST scheme, supported by an updated MLST website, allows the characterization and species assignment of isolates of the seven major pathogenic species associated with leptospirosis.

    PLoS neglected tropical diseases 2013;7;1;e1954
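    As a rough illustration of how a seven-locus MLST scheme assigns a sequence type (ST), the sketch below looks up an isolate's allele profile in a table of known profiles. Only `caiB` is named in the abstract; the other locus names, the allele numbers and the ST table are hypothetical placeholders, not the published scheme.

```python
from typing import Dict, Optional, Tuple

# Seven loci; "caiB" comes from the abstract, the rest are placeholder names.
LOCI = ("locus_1", "locus_2", "locus_3", "locus_4", "locus_5", "locus_6", "caiB")

# Hypothetical table mapping allele profiles (one allele number per locus) to STs.
ST_TABLE: Dict[Tuple[int, ...], int] = {
    (1, 1, 1, 1, 1, 1, 1): 1,
    (1, 2, 1, 1, 3, 1, 1): 2,
}


def assign_st(alleles: Dict[str, int]) -> Optional[int]:
    """Return the ST for an allele profile, or None if the profile is not in the table."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return ST_TABLE.get(profile)


known = dict(zip(LOCI, (1, 2, 1, 1, 3, 1, 1)))
novel = dict(zip(LOCI, (9, 9, 9, 9, 9, 9, 9)))
```

    A profile that is absent from the table (`novel` above) returns `None`; in a real scheme it would be a candidate for a new ST.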

  • Whole-genome sequencing to identify transmission of Mycobacterium abscessus between patients with cystic fibrosis: a retrospective cohort study.

    Bryant JM, Grogono DM, Greaves D, Foweraker J, Roddick I, Inns T, Reacher M, Haworth CS, Curran MD, Harris SR, Peacock SJ, Parkhill J and Floto RA

    Wellcome Trust Sanger Institute, Hinxton, UK.

    BACKGROUND: Increasing numbers of individuals with cystic fibrosis are becoming infected with the multidrug-resistant non-tuberculous mycobacterium (NTM) Mycobacterium abscessus, which causes progressive lung damage and is extremely challenging to treat. How this organism is acquired is not currently known, but there is growing concern that person-to-person transmission could occur. We aimed to define the mechanisms of acquisition of M abscessus in individuals with cystic fibrosis. METHOD: Whole genome sequencing and antimicrobial susceptibility testing were done on 168 consecutive isolates of M abscessus from 31 patients attending an adult cystic fibrosis centre in the UK between 2007 and 2011. In parallel, we undertook detailed environmental testing for NTM and defined potential opportunities for transmission between patients both in and out of hospital using epidemiological data and social network analysis. FINDINGS: Phylogenetic analysis revealed two clustered outbreaks of near-identical isolates of the M abscessus subspecies massiliense (from 11 patients), differing by less than ten base pairs. This variation represents less diversity than that seen within isolates from a single individual, strongly indicating between-patient transmission. All patients within these clusters had numerous opportunities for within-hospital transmission from other individuals, while comprehensive environmental sampling, initiated during the outbreak, failed to detect any potential point source of NTM infection. The clusters of M abscessus subspecies massiliense showed evidence of transmission of mutations acquired during infection of an individual to other patients. Thus, isolates with constitutive resistance to amikacin and clarithromycin were isolated from several individuals never previously exposed to long-term macrolides or aminoglycosides, further indicating cross-infection. INTERPRETATION: Whole genome sequencing has revealed frequent transmission of multidrug resistant NTM between patients with cystic fibrosis despite conventional cross-infection measures. Although the exact transmission route is yet to be established, our epidemiological analysis suggests that it could be indirect. FUNDING: The Wellcome Trust, Papworth Hospital, NIHR Cambridge Biomedical Research Centre, UK Health Protection Agency, Medical Research Council, and the UKCRC Translational Infection Research Initiative.

    Lancet 2013
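    The "differing by less than ten base pairs" criterion in the abstract above can be sketched as a pairwise SNP distance followed by threshold grouping. This is an illustrative simplification: the study used phylogenetic analysis, whereas the single-linkage grouping, the handling of `N` and the toy sequences below are assumptions made for the example.

```python
from itertools import combinations
from typing import Dict, List


def snp_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned sequences, ignoring 'N'."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y and x != "N" and y != "N")


def cluster(isolates: Dict[str, str], threshold: int = 10) -> List[List[str]]:
    """Single-linkage grouping: link any two isolates fewer than `threshold` SNPs apart."""
    parent = {name: name for name in isolates}

    def find(x: str) -> str:
        # Follow parent links to the group root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(isolates, 2):
        if snp_distance(isolates[a], isolates[b]) < threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for name in isolates:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Toy aligned sequences (hypothetical); a small threshold suits their short length.
ISOLATES = {
    "p1": "ACGTACGTACGT",
    "p2": "ACGTACGTACGA",
    "p3": "TGCATGCATGCA",
}
CLUSTERS = cluster(ISOLATES, threshold=3)
```

    With whole-genome data the default `threshold=10` would mirror the abstract's wording, although a distance cut-off alone cannot establish the direction or route of transmission.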

  • Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data.

    Bryant JM, Schürch AC, van Deutekom H, Harris SR, de Beer JL, de Jager V, Kremer K, van Hijum SA, Siezen RJ, Borgdorff M, Bentley SD, Parkhill J and van Soolingen D

    BACKGROUND: Mycobacterium tuberculosis is characterised by limited genomic diversity, which makes the application of whole genome sequencing particularly attractive for clinical and epidemiological investigation. However, in order to confidently infer transmission events, an accurate knowledge of the rate of change in the genome over relevant timescales is required. METHODS: We attempted to estimate a molecular clock by sequencing 199 isolates from epidemiologically linked tuberculosis cases, collected in the Netherlands spanning almost 16 years. RESULTS: Multiple analyses support an average mutation rate of ~0.3 SNPs per genome per year. However, all analyses revealed a very high degree of variation around this mean, making the confirmation of links proposed by epidemiology, and inference of novel links, difficult. Despite this, in some cases, the phylogenetic context of other strains provided evidence supporting the confident exclusion of previously inferred epidemiological links. CONCLUSIONS: This in-depth analysis of the molecular clock revealed that it is slow and variable over short time scales, which limits its usefulness in transmission studies. However, the superior resolution of whole genome sequencing can provide the phylogenetic context to allow the confident exclusion of possible transmission events previously inferred via traditional DNA fingerprinting techniques and epidemiological cluster investigation. Despite the slow generation of variation even at the whole genome level we conclude that the investigation of tuberculosis transmission will benefit greatly from routine whole genome sequencing.

    BMC infectious diseases 2013;13;1;110
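    To make the reported clock rate concrete, the sketch below converts ~0.3 SNPs per genome per year into an expected SNP count over a given time span, and into the probability of seeing at most k SNPs. The Poisson model is an assumption made for illustration only; the abstract stresses that variation around the mean is very high, so a simple Poisson model is likely to understate the real spread.

```python
import math

RATE_SNPS_PER_GENOME_PER_YEAR = 0.3  # average rate reported in the abstract


def expected_snps(years: float, rate: float = RATE_SNPS_PER_GENOME_PER_YEAR) -> float:
    """Expected number of SNPs accumulated over `years` at a constant rate."""
    return rate * years


def prob_at_most(k: int, years: float, rate: float = RATE_SNPS_PER_GENOME_PER_YEAR) -> float:
    """P(X <= k) for X ~ Poisson(rate * years); the Poisson model is an assumption."""
    lam = expected_snps(years, rate)
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))


# Over 10 years the expectation is 3 SNPs, yet zero SNPs remains possible.
TEN_YEAR_MEAN = expected_snps(10)
TEN_YEAR_P_ZERO = prob_at_most(0, 10)
```

    Even under this idealised model, isolates sampled a decade apart have about a 5% chance of showing no SNP differences, which illustrates why a small SNP distance alone cannot confirm a recent link.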

  • Headbobber: a combined morphogenetic and cochleosaccular mouse model to study 10qter deletions in human deafness.

    Buniello A, Hardisty-Hughes RE, Pass JC, Bober E, Smith RJ and Steel KP

    Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, United Kingdom; Wolfson Centre for Age-Related Diseases, King's College London, London, United Kingdom.

    The recessive mouse mutant headbobber () displays the characteristic behavioural traits associated with vestibular defects including headbobbing, circling and deafness. This mutation was caused by the insertion of a transgene into distal chromosome 7 affecting expression of native genes. We show that the inner ear of mutants lacks semicircular canals and cristae, and the saccule and utricle are fused together in a single utriculosaccular sac. Moreover, we detect severe abnormalities of the cochlear sensory hair cells, the stria vascularis looks severely disorganised, Reissner's membrane is collapsed and no endocochlear potential is detected. Myo7a and Kcnj10 expression analysis show a lack of the melanocyte-like intermediate cells in stria vascularis, which can explain the absence of endocochlear potential. We use Trp2 as a marker of melanoblasts migrating from the neural crest at E12.5 and show that they do not interdigitate into the developing strial epithelium, associated with abnormal persistence of the basal lamina in the cochlea. We perform array CGH, deep sequencing as well as an extensive expression analysis of candidate genes in the headbobber region of and littermate controls, and conclude that the headbobber phenotype is caused by: 1) effect of a 648 kb deletion on distal Chr7, resulting in the loss of three protein coding genes (, and ) with expression in the inner ear but unknown function; and 2) indirect, long range effect of the deletion on the expression of neighboring genes on Chr7, associated with downregulation of and homeobox transcription factors. Interestingly, deletions of the orthologous region in humans, affecting the same genes, have been reported in nineteen patients with common features including sensorineural hearing loss and vestibular problems. Therefore, we propose that headbobber is a useful model to gain insight into the mechanisms underlying deafness in human 10qter deletion syndrome.

    PloS one 2013;8;2;e56274

  • Missense mutations in β-1,3-N-acetylglucosaminyltransferase 1 (B3GNT1) cause Walker-Warburg syndrome.

    Buysse K, Riemersma M, Powell G, van Reeuwijk J, Chitayat D, Roscioli T, Kamsteeg EJ, van den Elzen C, van Beusekom E, Blaser S, Babul-Hirji R, Halliday W, Wright GJ, Stemple DL, Lin YY, Lefeber DJ and van Bokhoven H

    The authors wish it to be known that, in their opinion, the first five authors should be regarded as joint First Authors.

    Several known or putative glycosyltransferases are required for the synthesis of laminin-binding glycans on alpha-dystroglycan (αDG), including POMT1, POMT2, POMGnT1, LARGE, Fukutin, FKRP, ISPD and GTDC2. Mutations in these glycosyltransferase genes result in defective αDG glycosylation and reduced ligand binding by αDG causing a clinically heterogeneous group of congenital muscular dystrophies, commonly referred to as dystroglycanopathies. The most severe clinical form, Walker-Warburg syndrome (WWS), is characterized by congenital muscular dystrophy and severe neurological and ophthalmological defects. Here, we report two homozygous missense mutations in the β-1,3-N-acetylglucosaminyltransferase 1 (B3GNT1) gene in a family affected with WWS. Functional studies confirmed the pathogenicity of the mutations. First, expression of wild-type but not mutant B3GNT1 in human prostate cancer (PC3) cells led to increased levels of αDG glycosylation. Second, morpholino knockdown of the zebrafish b3gnt1 orthologue caused characteristic muscular defects and reduced αDG glycosylation. These functional studies identify an important role of B3GNT1 in the synthesis of the uncharacterized laminin-binding glycan of αDG and implicate B3GNT1 as a novel causative gene for WWS.

    Human molecular genetics 2013;22;9;1746-54

  • A CRISPR view of genome sequences.

    Cain AK and Boinett CJ

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. microbes@sanger.ac.uk.

    This month's Genome Watch explores recent applications of the CRISPR immune system for bacterial phylogenetic analysis and genome editing.

    Nature reviews. Microbiology 2013

  • Large-scale association analysis identifies new risk loci for coronary artery disease.

    CARDIoGRAMplusC4D Consortium, Deloukas P, Kanoni S, Willenborg C, Farrall M, Assimes TL, Thompson JR, Ingelsson E, Saleheen D, Erdmann J, Goldstein BA, Stirrups K, König IR, Cazier JB, Johansson A, Hall AS, Lee JY, Willer CJ, Chambers JC, Esko T, Folkersen L, Goel A, Grundberg E, Havulinna AS, Ho WK, Hopewell JC, Eriksson N, Kleber ME, Kristiansson K, Lundmark P, Lyytikäinen LP, Rafelt S, Shungin D, Strawbridge RJ, Thorleifsson G, Tikkanen E, Van Zuydam N, Voight BF, Waite LL, Zhang W, Ziegler A, Absher D, Altshuler D, Balmforth AJ, Barroso I, Braund PS, Burgdorf C, Claudi-Boehm S, Cox D, Dimitriou M, Do R, DIAGRAM Consortium, CARDIOGENICS Consortium, Doney AS, El Mokhtari N, Eriksson P, Fischer K, Fontanillas P, Franco-Cereceda A, Gigante B, Groop L, Gustafsson S, Hager J, Hallmans G, Han BG, Hunt SE, Kang HM, Illig T, Kessler T, Knowles JW, Kolovou G, Kuusisto J, Langenberg C, Langford C, Leander K, Lokki ML, Lundmark A, McCarthy MI, Meisinger C, Melander O, Mihailov E, Maouche S, Morris AD, Müller-Nurasyid M, MuTHER Consortium, Nikus K, Peden JF, Rayner NW, Rasheed A, Rosinger S, Rubin D, Rumpf MP, Schäfer A, Sivananthan M, Song C, Stewart AF, Tan ST, Thorgeirsson G, van der Schoot CE, Wagner PJ, Wellcome Trust Case Control Consortium, Wells GA, Wild PS, Yang TP, Amouyel P, Arveiler D, Basart H, Boehnke M, Boerwinkle E, Brambilla P, Cambien F, Cupples AL, de Faire U, Dehghan A, Diemert P, Epstein SE, Evans A, Ferrario MM, Ferrières J, Gauguier D, Go AS, Goodall AH, Gudnason V, Hazen SL, Holm H, Iribarren C, Jang Y, Kähönen M, Kee F, Kim HS, Klopp N, Koenig W, Kratzer W, Kuulasmaa K, Laakso M, Laaksonen R, Lee JY, Lind L, Ouwehand WH, Parish S, Park JE, Pedersen NL, Peters A, Quertermous T, Rader DJ, Salomaa V, Schadt E, Shah SH, Sinisalo J, Stark K, Stefansson K, Trégouët DA, Virtamo J, Wallentin L, Wareham N, Zimmermann ME, Nieminen MS, Hengstenberg C, Sandhu MS, Pastinen T, Syvänen AC, Hovingh GK, Dedoussis G, Franks PW, Lehtimäki T, Metspalu A, Zalloua PA, Siegbahn A, Schreiber S, Ripatti S, Blankenberg SS, Perola M, Clarke R, Boehm BO, O'Donnell C, Reilly MP, März W, Collins R, Kathiresan S, Hamsten A, Kooner JS, Thorsteinsdottir U, Danesh J, Palmer CN, Roberts R, Watkins H, Schunkert H and Samani NJ

    Coronary artery disease (CAD) is the commonest cause of death. Here, we report an association analysis in 63,746 CAD cases and 130,681 controls identifying 15 loci reaching genome-wide significance, taking the number of susceptibility loci for CAD to 46, and a further 104 independent variants (r² < 0.2) strongly associated with CAD at a 5% false discovery rate (FDR). Together, these variants explain approximately 10.6% of CAD heritability. Of the 46 genome-wide significant lead SNPs, 12 show a significant association with a lipid trait, and 5 show a significant association with blood pressure, but none is significantly associated with diabetes. Network analysis with 233 candidate genes (loci at 10% FDR) generated 5 interaction networks comprising 85% of these putative genes involved in CAD. The four most significant pathways mapping to these networks are linked to lipid metabolism and inflammation, underscoring the causal role of these activities in the genetic etiology of CAD. Our study provides insights into the genetic basis of CAD and identifies key biological pathways.

    Funded by: NHLBI NIH HHS: K24 HL107643, R01 HL103635, R01 HL111694

    Nature genetics 2013;45;1;25-33

  • BamView: visualizing and interpretation of next-generation sequencing read alignments.

    Carver T, Harris SR, Otto TD, Berriman M, Parkhill J and McQuillan JA

    Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK. artemis@sanger.ac.uk.

    So-called next-generation sequencing (NGS) has provided the ability to sequence on a massive scale at low cost, enabling biologists to perform powerful experiments and gain insight into biological processes. BamView has been developed to visualize and analyse sequence reads from NGS platforms, which have been aligned to a reference sequence. It is a desktop application for browsing the aligned or mapped reads [Ruffalo, M, LaFramboise, T, Koyutürk, M. Comparative analysis of algorithms for next-generation sequencing read alignment. Bioinformatics 2011;27:2790-6] at different levels of magnification, from nucleotide level, where the base qualities can be seen, to genome or chromosome level where overall coverage is shown. To enable in-depth investigation of NGS data, various views are provided that can be configured to highlight interesting aspects of the data. Multiple read alignment files can be overlaid to compare results from different experiments, and filters can be applied to facilitate the interpretation of the aligned reads. As well as being a standalone application, it can be used as an integrated part of the Artemis genome browser, where BamView allows the user to study NGS data in the context of the sequence and annotation of the reference genome. Single nucleotide polymorphism (SNP) density and candidate SNP sites can be highlighted and investigated, and read-pair information can be used to discover large structural insertions and deletions. The application will also perform simple analyses of the read mapping, including reporting the read counts and reads per kilobase per million mapped reads (RPKM) for genes selected by the user. Availability: BamView and Artemis are freely available software. These can be downloaded from their home pages: http://bamview.sourceforge.net/; http://www.sanger.ac.uk/resources/software/artemis/. Requirements: Java 1.6 or higher.

    Briefings in bioinformatics 2013;14;2;203-12
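    The RPKM figure mentioned in the abstract above follows the usual definition: reads mapped to a gene, divided by the gene length in kilobases and by the total mapped reads in millions. The sketch below implements that standard formula; it is not taken from BamView's source, and the function and parameter names are illustrative.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of gene per million mapped reads (standard definition)."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling factors.
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Example: 500 reads on a 2 kb gene in a library of 10 million mapped reads.
EXAMPLE_RPKM = rpkm(500, 2000, 10_000_000)
```

    In the example, 500 reads over 2 kb is 250 reads per kilobase, and dividing by 10 million mapped reads gives an RPKM of 25.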

  • Mcph1-Deficient Mice Reveal a Role for MCPH1 in Otitis Media.

    Chen J, Ingham N, Clare S, Raisen C, Vancollie VE, Ismail O, McIntyre RE, Tsang SH, Mahajan VB, Dougan G, Adams DJ, White JK and Steel KP

    Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, United Kingdom.

    Otitis media is a common reason for hearing loss, especially in children. Otitis media is a multifactorial disease and environmental factors, anatomic dysmorphology and genetic predisposition can all contribute to its pathogenesis. However, the reasons for the variable susceptibility to otitis media are elusive. MCPH1 mutations cause primary microcephaly in humans. So far, no hearing impairment has been reported either in the MCPH1 patients or mouse models with Mcph1 deficiency. In this study, Mcph1-deficient (Mcph1(tm1a/tm1a)) mice were produced using embryonic stem cells with a targeted mutation by the Sanger Institute's Mouse Genetics Project. Auditory brainstem response measurements revealed that Mcph1(tm1a/tm1a) mice had mild to moderate hearing impairment with around 70% penetrance. We found otitis media with effusion in the hearing-impaired Mcph1(tm1a/tm1a) mice by anatomic and histological examinations. Expression of Mcph1 in the epithelial cells of middle ear cavities supported its involvement in the development of otitis media. Other defects of Mcph1(tm1a/tm1a) mice included small skull sizes, increased micronuclei in red blood cells, increased B cells and ocular abnormalities. These findings not only recapitulated the defects found in other Mcph1-deficient mice or MCPH1 patients, but also revealed an unexpected phenotype, otitis media with hearing impairment, which suggests Mcph1 is a new gene underlying genetic predisposition to otitis media.

    PloS one 2013;8;3;e58156

  • Proteomic Comparison of Historic and Recently Emerged Hypervirulent Clostridium difficile Strains.

    Chen JW, Scaria J, Mao C, Sobral B, Zhang S, Lawley T and Chang YF

    Department of Population Medicine and Diagnostic Sciences, Cornell University, Ithaca, New York 14853, United States.

    Clostridium difficile in recent years has undergone rapid evolution and has emerged as a serious human pathogen. Proteomic approaches can improve the understanding of the diversity of this important pathogen, especially in comparing the adaptive ability of different C. difficile strains. In this study, TMT labeling and nanoLC-MS/MS driven proteomics were used to investigate the responses of four C. difficile strains to nutrient shift and osmotic shock. We detected 126 and 67 differentially expressed proteins in at least one strain under nutrition shift and osmotic shock, respectively. During nutrient shift, several components of the phosphotransferase system (PTS) were found to be differentially expressed, which indicated that the carbon catabolite repression (CCR) was relieved to allow the expression of enzymes and transporters responsible for the utilization of alternate carbon sources. Some classical osmotic shock associated proteins, such as GroEL, RecA, CspG, and CspF, and other stress proteins such as PurG and SerA were detected during osmotic shock. Furthermore, the recently emerged strains were found to contain a more robust gene network in response to both stress conditions. This work represents the first comparative proteomic analysis of historic and recently emerged hypervirulent C. difficile strains, complementing the previously published proteomics studies utilizing only one reference strain.

    Journal of proteome research 2013;12;3;1151-61

  • Hierarchical and spatially explicit clustering of DNA sequences with BAPS software.

    Cheng L, Connor TR, Sirén J, Aanensen DM and Corander J

    Department of Mathematics and Statistics, University of Helsinki, 00014, Finland; Cardiff School of Biosciences, Cardiff University, Cardiff, CF10 3AX, UK; Department of Infectious Disease Epidemiology, Imperial College London, Norfolk Place, London, W2 1PG, UK; Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Phylogeographical analyses have become commonplace for a myriad of organisms with the advent of cheap DNA sequencing technologies. Bayesian model-based clustering is a powerful tool for detecting important patterns in such data and can be used to decipher even quite subtle signals of systematic differences in molecular variation. Here we introduce two upgrades to the Bayesian Analysis of Population Structure (BAPS) software, which enable (1) spatially explicit modeling of variation in DNA sequences, and (2) hierarchical clustering of DNA sequence data to reveal nested genetic population structures. We provide a direct interface to map the results from spatial clustering with Google Maps using the portal http://www.spatialepidemiology.net/ and illustrate this approach using sequence data from Borrelia burgdorferi. The usefulness of hierarchical clustering is demonstrated through an analysis of the metapopulation structure within a bacterial population experiencing a high level of local horizontal gene transfer. The tools that are introduced are freely available at http://www.helsinki.fi/bsg/software/BAPS/.

    Molecular biology and evolution 2013

  • ISPD gene mutations are a common cause of congenital and limb-girdle muscular dystrophies.

    Cirak S, Foley AR, Herrmann R, Willer T, Yau S, Stevens E, Torelli S, Brodd L, Kamynina A, Vondracek P, Roper H, Longman C, Korinthenberg R, Marrosu G, Nürnberg P, UK10K Consortium, Michele DE, Plagnol V, Hurles M, Moore SA, Sewry CA, Campbell KP, Voit T and Muntoni F

    Dubowitz Neuromuscular Centre, UCL Institute of Child Health, University College London, 30 Guilford Street, London WC1N 1EH, UK. f.muntoni@ucl.ac.uk.

    Dystroglycanopathies are a clinically and genetically diverse group of recessively inherited conditions ranging from the most severe of the congenital muscular dystrophies, Walker-Warburg syndrome, to mild forms of adult-onset limb-girdle muscular dystrophy. Their hallmark is a reduction in the functional glycosylation of α-dystroglycan, which can be detected in muscle biopsies. An important part of this glycosylation is a unique O-mannosylation, essential for the interaction of α-dystroglycan with extracellular matrix proteins such as laminin-α2. Mutations in eight genes coding for proteins in the glycosylation pathway are responsible for ∼50% of dystroglycanopathy cases. Despite multiple efforts using traditional positional cloning, the causative genes for unsolved dystroglycanopathy cases have escaped discovery for several years. In a recent collaborative study, we discovered that loss-of-function recessive mutations in a novel gene, called isoprenoid synthase domain containing (ISPD), are a relatively common cause of Walker-Warburg syndrome. In this article, we report the involvement of the ISPD gene in milder dystroglycanopathy phenotypes ranging from congenital muscular dystrophy to limb-girdle muscular dystrophy and identified allelic ISPD variants in nine cases belonging to seven families. In two ambulant cases, there was evidence of structural brain involvement, whereas in seven, the clinical manifestation was restricted to a dystrophic skeletal muscle phenotype. Although the function of ISPD in mammals is not yet known, mutations in this gene clearly lead to a reduction in the functional glycosylation of α-dystroglycan, which not only causes the severe Walker-Warburg syndrome but is also a common cause of the milder forms of dystroglycanopathy.

    Brain : a journal of neurology 2013;136;Pt 1;269-81

  • Identification of seven loci affecting mean telomere length and their association with disease.

    Codd V, Nelson CP, Albrecht E, Mangino M, Deelen J, Buxton JL, Hottenga JJ, Fischer K, Esko T, Surakka I, Broer L, Nyholt DR, Mateo Leach I, Salo P, Hägg S, Matthews MK, Palmen J, Norata GD, O'Reilly PF, Saleheen D, Amin N, Balmforth AJ, Beekman M, de Boer RA, Böhringer S, Braund PS, Burton PR, Craen AJ, Denniff M, Dong Y, Douroudis K, Dubinina E, Eriksson JG, Garlaschelli K, Guo D, Hartikainen AL, Henders AK, Houwing-Duistermaat JJ, Kananen L, Karssen LC, Kettunen J, Klopp N, Lagou V, van Leeuwen EM, Madden PA, Mägi R, Magnusson PK, Männistö S, McCarthy MI, Medland SE, Mihailov E, Montgomery GW, Oostra BA, Palotie A, Peters A, Pollard H, Pouta A, Prokopenko I, Ripatti S, Salomaa V, Suchiman HE, Valdes AM, Verweij N, Viñuela A, Wang X, Wichmann HE, Widen E, Willemsen G, Wright MJ, Xia K, Xiao X, van Veldhuisen DJ, Catapano AL, Tobin MD, Hall AS, Blakemore AI, van Gilst WH, Zhu H, Consortium C, Erdmann J, Reilly MP, Kathiresan S, Schunkert H, Talmud PJ, Pedersen NL, Perola M, Ouwehand W, Kaprio J, Martin NG, van Duijn CM, Hovatta I, Gieger C, Metspalu A, Boomsma DI, Jarvelin MR, Slagboom PE, Thompson JR, Spector TD, van der Harst P and Samani NJ

    [1] Department of Cardiovascular Sciences, University of Leicester, Leicester, UK. [2] National Institute for Health Research Leicester Cardiovascular Biomedical Research Unit, Glenfield Hospital, Leicester, UK. [3].

    Interindividual variation in mean leukocyte telomere length (LTL) is associated with cancer and several age-associated diseases. We report here a genome-wide meta-analysis of 37,684 individuals with replication of selected variants in an additional 10,739 individuals. We identified seven loci, including five new loci, associated with mean LTL (P < 5 × 10⁻⁸). Five of the loci contain candidate genes (TERC, TERT, NAF1, OBFC1 and RTEL1) that are known to be involved in telomere biology. Lead SNPs at two loci (TERC and TERT) associate with several cancers and other diseases, including idiopathic pulmonary fibrosis. Moreover, a genetic risk score analysis combining lead variants at all 7 loci in 22,233 coronary artery disease cases and 64,762 controls showed an association of the alleles associated with shorter LTL with increased risk of coronary artery disease (21% (95% confidence interval, 5-35%) per standard deviation in LTL, P = 0.014). Our findings support a causal role of telomere-length variation in some age-related diseases.

    Nature genetics 2013;45;4;422-7

  • A genetic study of Wilson's disease in the United Kingdom.

    Coffey AJ, Durkie M, Hague S, McLay K, Emmerson J, Lo C, Klaffke S, Joyce CJ, Dhawan A, Hadzic N, Mieli-Vergani G, Kirk R, Elizabeth Allen K, Nicholl D, Wong S, Griffiths W, Smithson S, Giffin N, Taha A, Connolly S, Gillett GT, Tanner S, Bonham J, Sharrack B, Palotie A, Rattray M, Dalton A and Bandmann O

    Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, UK.

    Previous studies have failed to identify mutations in the Wilson's disease gene ATP7B in a significant number of clinically diagnosed cases. This has led to concerns about genetic heterogeneity for this condition but also suggested the presence of unusual mutational mechanisms. We now present our findings in 181 patients from the United Kingdom with clinically and biochemically confirmed Wilson's disease. A total of 116 different ATP7B mutations were detected, 32 of which are novel. The overall mutation detection frequency was 98%. The likelihood of mutations in genes other than ATP7B causing a Wilson's disease phenotype is therefore very low. We report the first cases with Wilson's disease due to segmental uniparental isodisomy as well as three patients with three ATP7B mutations and three families with Wilson's disease in two consecutive generations. We determined the genetic prevalence of Wilson's disease in the United Kingdom by sequencing the entire coding region and adjacent splice sites of ATP7B in 1000 control subjects. The frequency of all single nucleotide variants with in silico evidence of pathogenicity (Class 1 variant) was 0.056 or 0.040 if only those single nucleotide variants that had previously been reported as mutations in patients with Wilson's disease were included in the analysis (Class 2 variant). The frequency of heterozygote, putative or definite disease-associated ATP7B mutations was therefore considerably higher than the previously reported occurrence of 1:90 (or 0.011) for heterozygote ATP7B mutation carriers in the general population (P < 2.2 × 10⁻¹⁶ for Class 1 variants or P < 5 × 10⁻¹¹ for Class 2 variants only). Subsequent exclusion of four Class 2 variants without additional in silico evidence of pathogenicity led to a further reduction of the mutation frequency to 0.024. Using this most conservative approach, the calculated frequency of individuals predicted to carry two mutant pathogenic ATP7B alleles is 1:7026 and thus still considerably higher than the typically reported prevalence of Wilson's disease of 1:30 000 (P = 0.00093). Our study provides strong evidence for monogenic inheritance of Wilson's disease. It also has major implications for ATP7B analysis in clinical practice, namely the need to consider unusual genetic mechanisms such as uniparental disomy or the possible presence of three ATP7B mutations. The marked discrepancy between the genetic prevalence and the number of clinically diagnosed cases of Wilson's disease may be due to both reduced penetrance of ATP7B mutations and failure to diagnose patients with this eminently treatable disorder.

    Brain : a journal of neurology 2013
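    The step from a carrier frequency to the frequency of individuals with two mutant alleles can be approximated with Hardy-Weinberg proportions. The sketch below treats the allele frequency as half the carrier frequency, a common rare-allele approximation that is assumed here rather than taken from the paper. With the rounded value 0.024 it gives roughly 1 in 6,900, close to the reported 1:7026; the difference presumably reflects rounding of the published frequency.

```python
def homozygote_frequency(carrier_frequency: float) -> float:
    """Approximate frequency of individuals with two mutant alleles.

    Assumes Hardy-Weinberg proportions and a rare allele, so that the
    allele frequency q is about half the carrier frequency; the result is q**2.
    """
    q = carrier_frequency / 2.0
    return q * q


# Rounded conservative carrier frequency from the abstract.
ONE_IN_STUDY = 1.0 / homozygote_frequency(0.024)

# Previously reported carrier frequency of 1:90 from the abstract.
ONE_IN_PRIOR = 1.0 / homozygote_frequency(1.0 / 90.0)
```

    The same calculation applied to the older 1:90 carrier estimate gives about 1 in 32,400, in line with the typically reported prevalence of 1:30 000 quoted in the abstract.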

  • SMIM1 underlies the Vel blood group and influences red blood cell traits.

    Cvejic A, Haer-Wigman L, Stephens JC, Kostadima M, Smethurst PA, Frontini M, van den Akker E, Bertone P, Bielczyk-Maczyńska E, Farrow S, Fehrmann RS, Gray A, de Haas M, Haver VG, Jordan G, Karjalainen J, Kerstens HH, Kiddle G, Lloyd-Jones H, Needs M, Poole J, Soussan AA, Rendon A, Rieneck K, Sambrook JG, Schepers H, Silljé HH, Sipos B, Swinkels D, Tamuri AU, Verweij N, Watkins NA, Westra HJ, Stemple D, Franke L, Soranzo N, Stunnenberg HG, Goldman N, van der Harst P, van der Schoot CE, Ouwehand WH and Albers CA

    [1] Department of Haematology, University of Cambridge, Cambridge, UK. [2] Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK. [3].

    The blood group Vel was discovered 60 years ago, but the underlying gene is unknown. Individuals negative for the Vel antigen are rare and are required for the safe transfusion of patients with antibodies to Vel. To identify the responsible gene, we sequenced the exomes of five individuals negative for the Vel antigen and found that four were homozygous and one was heterozygous for a low-frequency 17-nucleotide frameshift deletion in the gene encoding the 78-amino-acid transmembrane protein SMIM1. A follow-up study showing that 59 of 64 Vel-negative individuals were homozygous for the same deletion and expression of the Vel antigen on SMIM1-transfected cells confirm SMIM1 as the gene underlying the Vel blood group. An expression quantitative trait locus (eQTL), the common SNP rs1175550, contributes to variable expression of the Vel antigen (P = 0.003) and influences the mean hemoglobin concentration of red blood cells (RBCs; P = 8.6 × 10⁻¹⁵). In vivo, zebrafish with smim1 knockdown showed a mild reduction in the number of RBCs, identifying SMIM1 as a new regulator of RBC formation. Our findings are of immediate relevance, as the homozygous presence of the deletion allows the unequivocal identification of Vel-negative blood donors.

    Nature genetics 2013

  • Histone deacetylase 1 and 2 are essential for normal T-cell development and genomic stability in mice.

    Dovey OM, Foster CT, Conte N, Edwards SA, Edwards JM, Singh R, Vassiliou G, Bradley A and Cowley SM

    Department of Biochemistry, University of Leicester, Leicester, UK.

    Histone deacetylase 1 and 2 (HDAC1/2) regulate chromatin structure as the catalytic core of the Sin3A, NuRD and CoREST co-repressor complexes. To better understand the key pathways regulated by HDAC1/2 in the adaptive immune system and inform their exploitation as drug targets, we have generated mice with a T-cell-specific deletion. Loss of either HDAC1 or HDAC2 alone has little effect, while dual inactivation results in a 5-fold reduction in thymocyte cellularity, accompanied by developmental arrest at the double-negative to double-positive transition. Transcriptome analysis revealed 892 misregulated genes in Hdac1/2 knock-out thymocytes, including down-regulation of LAT, Themis and Itk, key components of the T-cell receptor (TCR) signaling pathway. Down-regulation of these genes suggests a model in which HDAC1/2 deficiency results in defective propagation of TCR signaling, thus blocking development. Furthermore, mice with reduced HDAC1/2 activity (Hdac1 deleted and a single Hdac2 allele) develop a lethal pathology by 3 months of age, caused by neoplastic transformation of immature T cells in the thymus. Tumor cells become aneuploid, express increased levels of c-Myc and show elevated levels of the DNA damage marker, γH2AX. These data demonstrate a crucial role for HDAC1/2 in T-cell development and the maintenance of genomic stability.

    Funded by: Medical Research Council: G0600135

    Blood 2013;121;8;1335-44

  • The presence of methylation quantitative trait Loci indicates a direct genetic influence on the level of DNA methylation in adipose tissue.

    Drong AW, Nicholson G, Hedman AK, Meduri E, Grundberg E, Small KS, Shin SY, Bell JT, Karpe F, Soranzo N, Spector TD, McCarthy MI, Deloukas P, Rantalainen M, Lindgren CM and MolPAGE Consortia

    Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom.

    Genetic variants that associate with DNA methylation at CpG sites (methylation quantitative trait loci, meQTLs) offer a potential biological mechanism of action for disease associated SNPs. We investigated whether meQTLs exist in abdominal subcutaneous adipose tissue (SAT) and if CpG methylation associates with metabolic syndrome (MetSyn) phenotypes. We profiled 27,718 genomic regions in abdominal SAT samples of 38 unrelated individuals using differential methylation hybridization (DMH) together with genotypes at 5,227,243 SNPs and expression of 17,209 mRNA transcripts. Validation and replication of significant meQTLs was pursued in an independent cohort of 181 female twins. We find that, at 5% false discovery rate, methylation levels of 149 DMH regions associate with at least one SNP in a ±500 kilobase cis-region in our primary study. We sought to validate 19 of these in the replication study and find that five of these significantly associate with the corresponding meQTL SNPs from the primary study. We find that none of the 149 meQTL top SNPs is a significant expression quantitative trait locus in our expression data, but we observed association between expression levels of two mRNA transcripts and cis-methylation status. Our results indicate that DNA CpG methylation in abdominal SAT is partly under genetic control. This study provides a starting point for future investigations of DNA methylation in adipose tissue.

    PloS one 2013;8;2;e55923

  • Sequencing and Functional Annotation of Avian Pathogenic Escherichia coli Serogroup O78 Strains Reveal the Evolution of E. coli Lineages Pathogenic for Poultry via Distinct Mechanisms.

    Dziva F, Hauser H, Connor TR, van Diemen PM, Prescott G, Langridge GC, Eckert S, Chaudhuri RR, Ewers C, Mellata M, Mukhopadhyay S, Curtiss R, Dougan G, Wieler LH, Thomson NR, Pickard DJ and Stevens MP

    Enteric Bacterial Pathogens Laboratory, Institute for Animal Health, Compton, Berkshire, United Kingdom.

    Avian pathogenic Escherichia coli (APEC) causes respiratory and systemic disease in poultry. Sequencing of a multilocus sequence type 95 (ST95) serogroup O1 strain previously indicated that APEC resembles E. coli causing extraintestinal human diseases. We sequenced the genomes of two strains of another dominant APEC lineage (ST23 serogroup O78 strains χ7122 and IMT2125) and compared them to each other and to the reannotated APEC O1 sequence. For comparison, we also sequenced a human enterotoxigenic E. coli (ETEC) strain of the same ST23 serogroup O78 lineage. Phylogenetic analysis indicated that the APEC O78 strains were more closely related to human ST23 ETEC than to APEC O1, indicating that separation of pathotypes on the basis of their extraintestinal or diarrheagenic nature is not supported by their phylogeny. The accessory genome of APEC ST23 strains exhibited limited conservation of APEC O1 genomic islands and a distinct repertoire of virulence-associated loci. In light of this diversity, we surveyed the phenotype of 2,185 signature-tagged transposon mutants of χ7122 following intra-air sac inoculation of turkeys. This procedure identified novel APEC ST23 genes that play strain- and tissue-specific roles during infection. For example, genes mediating group 4 capsule synthesis were required for the virulence of χ7122 and were conserved in IMT2125 but absent from APEC O1. Our data reveal the genetic diversity of E. coli strains adapted to cause the same avian disease and indicate that the core genome of the ST23 lineage serves as a chassis for the evolution of E. coli strains adapted to cause avian or human disease via acquisition of distinct virulence genes.

    Infection and immunity 2013;81;3;838-49

  • The SHOCT Domain: A Widespread Domain Under-Represented in Model Organisms.

    Eberhardt RY, Bartholdson SJ, Punta M and Bateman A

    European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom ; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom.

    We have identified a new protein domain, which we have named the SHOCT domain (SHOrt C-Terminal domain). This domain is widespread in bacteria, with over a thousand examples, but we found that it is missing from the most commonly studied model organisms, despite being present in closely related species. Its predominantly C-terminal location, co-occurrence with numerous other domains and short size are reminiscent of the Gram-positive anchor motif; however, it is present in a much wider range of species. We suggest several hypotheses about the function of SHOCT, including oligomerisation and nucleic acid binding. Our initial experiments do not support its role as an oligomerisation domain.

    PloS one 2013;8;2;e57848

  • The DOT1L rs12982744 polymorphism is associated with osteoarthritis of the hip with genome-wide statistical significance in males.

    Evangelou E, Valdes AM, Castano-Betancourt MC, Doherty M, Doherty S, Esko T, Ingvarsson T, Ioannidis JP, Kloppenburg M, Metspalu A, Ntzani EE, Panoutsopoulou K, Slagboom PE, Southam L, Spector TD, Styrkarsdottir U, Stefanson K, Uitterlinden AG, Wheeler M, Zeggini E, Meulenbelt I, van Meurs JB and arcOGEN consortium, the TREAT-OA consortium

    Department of Hygiene and Epidemiology, University of Ioannina Medical School, University Campus, Ioannina, Greece.

    Annals of the rheumatic diseases 2013

  • ImmunoChip Study Implicates Antigen Presentation to T Cells in Narcolepsy.

    Faraco J, Lin L, Kornum BR, Kenny EE, Trynka G, Einen M, Rico TJ, Lichtner P, Dauvilliers Y, Arnulf I, Lecendreux M, Javidi S, Geisler P, Mayer G, Pizza F, Poli F, Plazzi G, Overeem S, Lammers GJ, Kemlink D, Sonka K, Nevsimalova S, Rouleau G, Desautels A, Montplaisir J, Frauscher B, Ehrmann L, Högl B, Jennum P, Bourgin P, Peraita-Adrados R, Iranzo A, Bassetti C, Chen WM, Concannon P, Thompson SD, Damotte V, Fontaine B, Breban M, Gieger C, Klopp N, Deloukas P, Wijmenga C, Hallmayer J, Onengut-Gumuscu S, Rich SS, Winkelmann J and Mignot E

    Center for Sleep Sciences and Medicine, Stanford University, Palo Alto, California, United States of America.

    Recent advances in the identification of susceptibility genes and environmental exposures provide broad support for a post-infectious autoimmune basis for narcolepsy/hypocretin (orexin) deficiency. We genotyped loci associated with other autoimmune and inflammatory diseases in 1,886 individuals with hypocretin-deficient narcolepsy and 10,421 controls, all of European ancestry, using a custom genotyping array (ImmunoChip). Three loci located outside the Human Leukocyte Antigen (HLA) region on chromosome 6 were significantly associated with disease risk. In addition to a strong signal in the T cell receptor alpha (TRA@), variants in two additional narcolepsy loci, Cathepsin H (CTSH) and Tumor necrosis factor (ligand) superfamily member 4 (TNFSF4, also called OX40L), attained genome-wide significance. These findings underline the importance of antigen presentation by HLA Class II to T cells in the pathophysiology of this autoimmune disease.

    PLoS genetics 2013;9;2;e1003270

  • Reprogramming by cell fusion: boosted by tets.

    Ficz G and Reik W

    Epigenetics Programme, The Babraham Institute, Cambridge CB22 3AT, UK.

    Pluripotent cells, when fused with somatic cells, have the dominant ability to reprogram the somatic genome. Work by Piccolo et al. (2013) shows that the Tet1 and Tet2 hydroxylases are important for DNA methylation reprogramming of pluripotency genes and parental imprints.

    Molecular cell 2013;49;6;1017-8

  • Ensembl 2013.

    Flicek P, Ahmed I, Amode MR, Barrell D, Beal K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fairley S, Fitzgerald S, Gil L, García-Girón C, Gordon L, Hourlier T, Hunt S, Juettemann T, Kähäri AK, Keenan S, Komorowska M, Kulesha E, Longden I, Maurel T, McLaren WM, Muffato M, Nag R, Overduin B, Pignatelli M, Pritchard B, Pritchard E, Riat HS, Ritchie GR, Ruffier M, Schuster M, Sheppard D, Sobral D, Taylor K, Thormann A, Trevanion S, White S, Wilder SP, Aken BL, Birney E, Cunningham F, Dunham I, Harrow J, Herrero J, Hubbard TJ, Johnson N, Kinsella R, Parker A, Spudich G, Yates A, Zadissa A and Searle SM

    European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton Cambridge CB10 1SD, UK. flicek@ebi.ac.uk

    The Ensembl project (http://www.ensembl.org) provides genome information for sequenced chordate genomes with a particular focus on human, mouse, zebrafish and rat. Our resources include evidence-based gene sets for all supported species; large-scale whole genome multiple species alignments across vertebrates and clade-specific alignments for eutherian mammals, primates, birds and fish; variation data resources for 17 species and regulation annotations based on ENCODE and other data sets. Ensembl data are accessible through the genome browser at http://www.ensembl.org and through other tools and programmatic interfaces.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/I025506/1; NHGRI NIH HHS: U01HG004695, U41HG006104, U54HG004563; Wellcome Trust: WT062023, WT079643

    Nucleic acids research 2013;41;Database issue;D48-55

  • Spindle checkpoint deficiency is tolerated by murine epidermal cells but not hair follicle stem cells.

    Foijer F, Ditommaso T, Donati G, Hautaviita K, Xie SZ, Heath E, Smyth I, Watt FM, Sorger PK and Bradley A

    Mouse Genomics, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, United Kingdom.

    The spindle assembly checkpoint (SAC) ensures correct chromosome segregation during mitosis by preventing aneuploidy, an event that is detrimental to the fitness and survival of normal cells but oncogenic in tumor cells. Deletion of SAC genes is incompatible with early mouse development, and RNAi-mediated depletion of SAC components in cultured cells results in rapid death. Here we describe the use of a conditional KO of mouse Mad2, an essential component of the SAC signaling cascade, as a means to selectively induce chromosome instability and aneuploidy in the epidermis of the skin. We observe that SAC inactivation is tolerated by interfollicular epidermal cells but results in depletion of hair follicle bulge stem cells. Eventually, a histologically normal epidermis develops within ∼1 mo after birth, albeit without any hair. Mad2-deficient cells in this epidermis exhibited abnormal transcription of metabolic genes, consistent with aneuploid cell state. Hair follicle bulge stem cells were completely absent, despite the continued presence of rudimentary hair follicles. These data demonstrate that different cell lineages within a single tissue respond differently to chromosome instability: some proliferating cell lineages can survive, but stem cells are highly sensitive.

    Proceedings of the National Academy of Sciences of the United States of America 2013

  • Genome Sequence of Klebsiella pneumoniae Ecl8, a Reference Strain for Targeted Genetic Manipulation.

    Fookes M, Yu J, De Majumdar S, Thomson N and Schneiders T

    Wellcome Trust, Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.

    We report the genome sequence of Klebsiella pneumoniae subsp. pneumoniae Ecl8, a spontaneous streptomycin-resistant mutant of strain ECL4, derived from NCIB 418. K. pneumoniae Ecl8 has been shown to be genetically tractable for targeted gene deletion strategies and so provides a platform for in-depth analyses of this species.

    Genome announcements 2013;1;1

  • A CpG Mutational Hotspot in a ONECUT Binding Site Accounts for the Prevalent Variant of Hemophilia B Leyden.

    Funnell AP, Wilson MD, Ballester B, Mak KS, Burdach J, Magan N, Pearson RC, Lemaigre FP, Stowell KM, Odom DT, Flicek P and Crossley M

    School of Biotechnology and Biomolecular Sciences, University of New South Wales, Kensington NSW 2052, Australia.

    Hemophilia B, or the "royal disease," arises from mutations in coagulation factor IX (F9). Mutations within the F9 promoter are associated with a remarkable hemophilia B subtype, termed hemophilia B Leyden, in which symptoms ameliorate after puberty. Mutations at the -5/-6 site (nucleotides -5 and -6 relative to the transcription start site, designated +1) account for the majority of Leyden cases and have been postulated to disrupt the binding of a transcriptional activator, the identity of which has remained elusive for more than 20 years. Here, we show that ONECUT transcription factors (ONECUT1 and ONECUT2) bind to the -5/-6 site. The various hemophilia B Leyden mutations that have been reported in this site inhibit ONECUT binding to varying degrees, which correlate well with their associated clinical severities. In addition, expression of F9 is crucially dependent on ONECUT factors in vivo, and as such, mice deficient in ONECUT1, ONECUT2, or both exhibit depleted levels of F9. Taken together, our findings establish ONECUT transcription factors as the missing hemophilia B Leyden regulators that operate through the -5/-6 site.

    American journal of human genetics 2013;92;3;460-7

  • Genome-wide haplotype analysis of cis expression quantitative trait Loci in monocytes.

    Garnier S, Truong V, Brocheton J, Zeller T, Rovital M, Wild PS, Ziegler A, Cardiogenics Consortium, Munzel T, Tiret L, Blankenberg S, Deloukas P, Erdmann J, Hengstenberg C, Samani NJ, Schunkert H, Ouwehand WH, Goodall AH, Cambien F and Trégouët DA

    INSERM, UMR_S 937, Pierre and Marie Curie University (UPMC, Paris 6), Paris, France ; ICAN Institute for Cardiometabolism and Nutrition, Pierre and Marie Curie University (UPMC, Paris 6), Paris, France.

    In order to assess whether gene expression variability could be influenced by several SNPs acting in cis, either through additive or more complex haplotype effects, a systematic genome-wide search for cis haplotype expression quantitative trait loci (eQTL) was conducted in a sample of 758 individuals, part of the Cardiogenics Transcriptomic Study, for which genome-wide monocyte expression and GWAS data were available. 19,805 RNA probes were assessed for cis haplotypic regulation through investigation of ∼2.1×10⁹ haplotypic combinations. 2,650 probes demonstrated haplotypic p-values >10⁴-fold smaller than the best single SNP p-value. Replication of significant haplotype effects was tested for 412 probes for which SNPs (or proxies) that defined the detected haplotypes were available in the Gutenberg Health Study composed of 1,374 individuals. At the Bonferroni correction level of 1.2×10⁻⁴ (∼0.05/412), 193 haplotypic signals replicated. 1000G imputation was then conducted, and 105 haplotypic signals still remained more informative than imputed SNPs. In-depth analysis of these 105 cis eQTL revealed that at 76 loci genetic associations were compatible with additive effects of several SNPs, while for the 29 remaining regions data could be compatible with a more complex haplotypic pattern. As 24 of the 105 cis eQTL have previously been reported to be disease-associated loci, this work highlights the need for conducting haplotype-based and 1000G imputed cis eQTL analysis before commencing functional studies at disease-associated loci.

    PLoS genetics 2013;9;1;e1003240

  • Mutations in C10orf11, Encoding a Melanocyte-Differentiation Gene, Cause Autosomal-Recessive Albinism.

    Grønskov K, Dooley CM, Ostergaard E, Kelsh RN, Hansen L, Levesque MP, Vilhelmsen K, Møllgård K, Stemple DL and Rosenberg T

    Applied Human Molecular Genetics, Kennedy Center, Copenhagen University Hospital, Rigshospitalet, DK-2100 Copenhagen, Denmark; Department of Cellular and Molecular Medicine, University of Copenhagen, DK-2200 Copenhagen, Denmark. Electronic address: karen.groenskov@regionh.dk.

    Autosomal-recessive albinism is a hypopigmentation disorder with a broad phenotypic range. A substantial fraction of individuals with albinism remain genetically unresolved, and it has been hypothesized that more genes are to be identified. By using homozygosity mapping of an inbred Faroese family, we identified a 3.5 Mb homozygous region (10q22.2-q22.3) on chromosome 10. The region contains five protein-coding genes, and sequencing of one of these, C10orf11, revealed a nonsense mutation that segregated with the disease and showed a recessive inheritance pattern. Investigation of additional albinism-affected individuals from the Faroe Islands revealed that five out of eight unrelated affected persons had the nonsense mutation in C10orf11. Screening of a cohort of autosomal-recessive-albinism-affected individuals residing in Denmark showed a homozygous 1 bp duplication in C10orf11 in an individual originating from Lithuania. Immunohistochemistry showed localization of C10orf11 in melanoblasts and melanocytes in human fetal tissue, but no localization was seen in retinal pigment epithelial cells. Knockdown of the zebrafish (Danio rerio) homolog with the use of morpholinos resulted in substantially decreased pigmentation and a reduction of the apparent number of pigmented melanocytes. The morphant phenotype was rescued by wild-type C10orf11, but not by mutant C10orf11. In conclusion, we have identified a melanocyte-differentiation gene, C10orf11, which when mutated causes autosomal-recessive albinism in humans.

    American journal of human genetics 2013

  • Massively parallel sequencing reveals the complex structure of an irradiated human chromosome on a mouse background in the tc1 model of down syndrome.

    Gribble SM, Wiseman FK, Clayton S, Prigmore E, Langley E, Yang F, Maguire S, Fu B, Rajan D, Sheppard O, Scott C, Hauser H, Stephens PJ, Stebbings LA, Ng BL, Fitzgerald T, Quail MA, Banerjee R, Rothkamm K, Tybulewicz VL, Fisher EM and Carter NP

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.

    Down syndrome (DS) is caused by trisomy of chromosome 21 (Hsa21) and presents a complex phenotype that arises from abnormal dosage of genes on this chromosome. However, the individual dosage-sensitive genes underlying each phenotype remain largely unknown. To help dissect genotype-phenotype correlations in this complex syndrome, the first fully transchromosomic mouse model, the Tc1 mouse, which carries a copy of human chromosome 21, was produced in 2005. The Tc1 strain is trisomic for the majority of genes that cause phenotypes associated with DS, and this freely available mouse strain has become used widely to study DS, the effects of gene dosage abnormalities, and the effect on the basic biology of cells when a mouse carries a freely segregating human chromosome. Tc1 mice were created by a process that included irradiation microcell-mediated chromosome transfer of Hsa21 into recipient mouse embryonic stem cells. Here, the combination of next generation sequencing, array-CGH and fluorescence in situ hybridization technologies has enabled us to identify unsuspected rearrangements of Hsa21 in this mouse model, revealing one deletion, six duplications and more than 25 de novo structural rearrangements. Our study is not only essential for informing functional studies of the Tc1 mouse but also (1) presents for the first time a detailed sequence analysis of the effects of gamma radiation on an entire human chromosome, which gives some mechanistic insight into the effects of radiation damage on DNA, and (2) overcomes specific technical difficulties of assaying a human chromosome on a mouse background where highly conserved sequences may confound the analysis. Sequence data generated in this study is deposited in the ENA database, Study Accession number: ERP000439.

    PloS one 2013;8;4;e60482

  • Gene-centric meta-analyses of 108 912 individuals confirm known body mass index loci and reveal three novel signals.

    Guo Y, Lanktree MB, Taylor KC, Hakonarson H, Lange LA, Keating BJ and The IBC 50K SNP array BMI Consortium

    List of authors is given in the Full Author List Section of Appendix.

    Recent genetic association studies have made progress in uncovering components of the genetic architecture of the body mass index (BMI). We used the ITMAT-Broad-Candidate Gene Association Resource (CARe) (IBC) array comprising up to 49 320 single nucleotide polymorphisms (SNPs) across ∼2100 metabolic and cardiovascular-related loci to genotype up to 108 912 individuals of European ancestry (EA), African-Americans, Hispanics and East Asians, from 46 studies, to provide additional insight into SNPs underpinning BMI. We used a five-phase study design: Phase I focused on meta-analysis of EA studies providing individual level genotype data; Phase II performed a replication of cohorts providing summary level EA data; Phase III meta-analyzed results from the first two phases; associated SNPs from Phase III were used for replication in Phase IV; finally in Phase V, a multi-ethnic meta-analysis of all samples from four ethnicities was performed. At an array-wide significance (P < 2.40E-06), we identify novel BMI associations in loci translocase of outer mitochondrial membrane 40 homolog (yeast) - apolipoprotein E - apolipoprotein C-I (TOMM40-APOE-APOC1) (rs2075650, P = 2.95E-10), sterol regulatory element binding transcription factor 2 (SREBF2, rs5996074, P = 9.43E-07) and neurotrophic tyrosine kinase, receptor, type 2 [NTRK2, a brain-derived neurotrophic factor (BDNF) receptor gene, rs1211166, P = 1.04E-06] in the Phase IV meta-analysis. Of 10 loci with previous evidence for BMI association represented on the IBC array, eight were replicated, with the remaining two showing nominal significance. Conditional analyses revealed two independent BMI-associated signals in BDNF and melanocortin 4 receptor (MC4R) regions. Of the 11 array-wide significant SNPs, three are associated with gene expression levels in both primary B-cells and monocytes; with rs4788099 in SH2B adaptor protein 1 (SH2B1) notably being associated with the expression of multiple genes in cis. These multi-ethnic meta-analyses expand our knowledge of BMI genetics.

    Funded by: NIA NIH HHS: R37 AG011099

    Human molecular genetics 2013;22;1;184-201

  • Genome-wide diversity in the levant reveals recent structuring by culture.

    Haber M, Gauguier D, Youhanna S, Patterson N, Moorjani P, Botigué LR, Platt DE, Matisoo-Smith E, Soria-Hernanz DF, Wells RS, Bertranpetit J, Tyler-Smith C, Comas D and Zalloua PA

    Institut de Biologia Evolutiva (CSIC-UPF), Departament de Ciències de la Salut i de la Vida, Universitat Pompeu Fabra, Barcelona, Spain.

    The Levant is a region in the Near East with an impressive record of continuous human existence and major cultural developments since the Paleolithic period. Genetic and archeological studies present solid evidence placing the Middle East and the Arabian Peninsula as the first stepping-stone outside Africa. There is, however, little understanding of demographic changes in the Middle East, particularly the Levant, after the first Out-of-Africa expansion and how the Levantine peoples relate genetically to each other and to their neighbors. In this study we analyze more than 500,000 genome-wide SNPs in 1,341 new samples from the Levant and compare them to samples from 48 populations worldwide. Our results show recent genetic stratifications in the Levant are driven by the religious affiliations of the populations within the region. Cultural changes within the last two millennia appear to have facilitated/maintained admixture between culturally similar populations from the Levant, Arabian Peninsula, and Africa. The same cultural changes seem to have resulted in genetic isolation of other groups by limiting admixture with culturally different neighboring populations. Consequently, Levant populations today fall into two main groups: one sharing more genetic characteristics with modern-day Europeans and Central Asians, and the other with closer genetic affinities to other Middle Easterners and Africans. Finally, we identify a putative Levantine ancestral component that diverged from other Middle Easterners ∼23,700-15,500 years ago during the last glacial period, and diverged from Europeans ∼15,900-9,100 years ago between the last glacial warming and the start of the Neolithic.

    Funded by: PEPFAR: 098051; Wellcome Trust

    PLoS genetics 2013;9;2;e1003316

  • Diagnostic pathway for the investigation of thrombocytosis.

    Harrison CN, Butt N, Campbell P, Conneally E, Drummond M, Green AR, Murrin R, Radia DH, Reilly JT and McMullin MF

    Department of Haematology, Guy's and St Thomas' Hospitals NHS Foundation Trust, London, UK.

    British journal of haematology 2013

  • Whole genome sequencing identifies zoonotic transmission of MRSA isolates with the novel mecA homologue mecC.

    Harrison EM, Paterson GK, Holden MT, Larsen J, Stegger M, Larsen AR, Petersen A, Skov RL, Christensen JM, Bak Zeuthen A, Heltberg O, Harris SR, Zadoks RN, Parkhill J, Peacock SJ and Holmes MA

    Department of Veterinary Medicine, University of Cambridge, Cambridge, UK.

    Several methicillin-resistant Staphylococcus aureus (MRSA) lineages that carry a novel mecA homologue (mecC) have recently been described in livestock and humans. In Denmark, two independent human cases of mecC-MRSA infection have been linked to a livestock reservoir. We investigated the molecular epidemiology of the associated MRSA isolates using whole genome sequencing (WGS). Single nucleotide polymorphisms (SNP) were defined and compared to a reference genome to place the isolates into a phylogenetic context. Phylogenetic analysis revealed two distinct farm-specific clusters comprising isolates from the human case and their own livestock, whereas human and animal isolates from the same farm only differed by a small number of SNPs, which supports the likelihood of zoonotic transmission. Further analyses identified a number of genes and mutations that may be associated with host interaction and virulence. This study demonstrates that mecC-MRSA ST130 isolates are capable of transmission between animals and humans, and underscores the potential of WGS in epidemiological investigations and source tracking of bacterial infections. →See accompanying article http://dx.doi.org/10.1002/emmm.201302622.

    EMBO molecular medicine 2013;5;4;509-15

  • A Staphylococcus xylosus Isolate with a New mecC Allotype.

    Harrison EM, Paterson GK, Holden MT, Morgan FJ, Larsen AR, Petersen A, Leroy S, De Vliegher S, Perreten V, Fox LK, Lam TJ, Sampimon OC, Zadoks RN, Peacock SJ, Parkhill J and Holmes MA

    University of Cambridge, Department of Veterinary Medicine, Cambridge, United Kingdom.

    Recently, a novel variant of mecA known as mecC (mecA(LGA251)) was identified in Staphylococcus aureus isolates from both humans and animals. In this study, we identified a Staphylococcus xylosus isolate that harbors a new allotype of the mecC gene, mecC1. Whole-genome sequencing revealed that mecC1 forms part of a class E mec complex (mecI-mecR1-mecC1-blaZ) located at the orfX locus as part of a likely staphylococcal cassette chromosome mec element (SCCmec) remnant, which also contains a number of other genes present on the type XI SCCmec.

    Antimicrobial agents and chemotherapy 2013;57;3;1524-8

  • VS-5584, a Novel and Highly Selective PI3K/mTOR Kinase Inhibitor for the Treatment of Cancer.

    Hart S, Novotny-Diermayr V, Goh KC, Williams M, Tan YC, Ong LC, Cheong A, Ng BK, Amalini C, Madan B, Nagaraj H, Jayaraman R, Pasha KM, Ethirajulu K, Chng WJ, Mustafa N, Goh BC, Benes C, McDermott U, Garnett M, Dymock B and Wood JM

    Corresponding Author: Stefan Hart, S*BIO Pte Ltd., 1 Science Park Road, #05-09 The Capricorn, Singapore 117528, Singapore. stefan.sbio@gmail.com.

    Dysregulation of the PI3K/mTOR pathway, either through amplifications, deletions, or as a direct result of mutations, has been closely linked to the development and progression of a wide range of cancers. Moreover, this pathway activation is a poor prognostic marker for many tumor types and confers resistance to various cancer therapies. Here, we describe VS-5584, a novel, low-molecular weight compound with equivalent potent activity against mTOR (IC₅₀ = 37 nmol/L) and all class I phosphoinositide 3-kinase (PI3K) isoforms (IC₅₀: PI3Kα = 16 nmol/L; PI3Kβ = 68 nmol/L; PI3Kγ = 25 nmol/L; PI3Kδ = 42 nmol/L), without relevant activity on 400 lipid and protein kinases. VS-5584 shows robust modulation of cellular PI3K/mTOR pathways, inhibiting phosphorylation of substrates downstream of PI3K and mTORC1/2. A large human cancer cell line panel screen (436 lines) revealed broad antiproliferative sensitivity and that cells harboring mutations in PIK3CA are generally more sensitive toward VS-5584 treatment. VS-5584 exhibits favorable pharmacokinetic properties after oral dosing in mice and is well tolerated. VS-5584 induces long-lasting and dose-dependent inhibition of PI3K/mTOR signaling in tumor tissue, leading to tumor growth inhibition in various rapalog-sensitive and -resistant human xenograft models. Furthermore, VS-5584 is synergistic with an EGF receptor inhibitor in a gastric tumor model. The unique selectivity profile and favorable pharmacologic and pharmaceutical properties of VS-5584 and its efficacy in a wide range of human tumor models support further investigations of VS-5584 in clinical trials.

    Molecular cancer therapeutics 2013;12;2;151-61

  • Mcl-1 and FBW7 control a dominant survival pathway underlying HDAC and Bcl-2 inhibitor synergy in squamous cell carcinoma.

    He L, Torres-Lockhart K, Forster N, Ramakrishnan S, Greninger P, Garnett MJ, McDermott U, Rothenberg SM, Benes CH and Ellisen LW

    Massachusetts General Hospital Cancer Center and Harvard Medical School, Boston, Massachusetts 02114, USA.

    Effective targeted therapeutics for squamous cell carcinoma (SCC) are lacking. Here, we uncover Mcl-1 as a dominant and tissue-specific survival factor in SCC, providing a roadmap for a new therapeutic approach. Treatment with the histone deacetylase (HDAC) inhibitor vorinostat regulates Bcl-2 family member expression to disable the Mcl-1 axis and thereby induce apoptosis in SCC cells. Although Mcl-1 dominance renders SCC cells resistant to the BH3-mimetic ABT-737, vorinostat primes them for sensitivity to ABT-737 by shuttling Bim from Mcl-1 to Bcl-2/Bcl-xl, resulting in dramatic synergy for this combination and sustained tumor regression in vivo. Moreover, somatic FBW7 mutation in SCC is associated with stabilized Mcl-1 and high Bim levels, resulting in a poor response to standard chemotherapy but a robust response to HDAC inhibitors and enhanced synergy with the combination vorinostat/ABT-737. Collectively, our findings provide a biochemical rationale and predictive markers for the application of this therapeutic combination in SCC.

    Funded by: NCI NIH HHS: BC093523; NIDCR NIH HHS: NIH KO8 DE-020139, R01 DE015945; Wellcome Trust: 086357

    Cancer discovery 2013;3;3;324-37

  • Emergence and global spread of epidemic healthcare-associated Clostridium difficile.

    He M, Miyajima F, Roberts P, Ellison L, Pickard DJ, Martin MJ, Connor TR, Harris SR, Fairley D, Bamford KB, D'Arc S, Brazier J, Brown D, Coia JE, Douce G, Gerding D, Kim HJ, Koh TH, Kato H, Senoh M, Louie T, Michell S, Butt E, Peacock SJ, Brown NM, Riley T, Songer G, Wilcox M, Pirmohamed M, Kuijper E, Hawkey P, Wren BW, Dougan G, Parkhill J and Lawley TD

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.

    Epidemic C. difficile (027/BI/NAP1) has rapidly emerged in the past decade as the leading cause of antibiotic-associated diarrhea worldwide. However, the key events in evolutionary history leading to its emergence and the subsequent patterns of global spread remain unknown. Here, we define the global population structure of C. difficile 027/BI/NAP1 using whole-genome sequencing and phylogenetic analysis. We show that two distinct epidemic lineages, FQR1 and FQR2, not one as previously thought, emerged in North America within a relatively short period after acquiring the same fluoroquinolone resistance-conferring mutation and a highly related conjugative transposon. The two epidemic lineages showed distinct patterns of global spread, and the FQR2 lineage spread more widely, leading to healthcare-associated outbreaks in the UK, continental Europe and Australia. Our analysis identifies key genetic changes linked to the rapid transcontinental dissemination of epidemic C. difficile 027/BI/NAP1 and highlights the routes by which it spreads through the global healthcare system.

    Funded by: Medical Research Council: 93614; Wellcome Trust: 086418, 098051

    Nature genetics 2013;45;1;109-13

  • A genome-wide association study of depressive symptoms.

    Hek K, Demirkan A, Lahti J, Terracciano A, Teumer A, Cornelis MC, Amin N, Bakshis E, Baumert J, Ding J, Liu Y, Marciante K, Meirelles O, Nalls MA, Sun YV, Vogelzangs N, Yu L, Bandinelli S, Benjamin EJ, Bennett DA, Boomsma D, Cannas A, Coker LH, de Geus E, De Jager PL, Diez-Roux AV, Purcell S, Hu FB, Rimm EB, Hunter DJ, Jensen MK, Curhan G, Rice K, Penman AD, Rotter JI, Sotoodehnia N, Emeny R, Eriksson JG, Evans DA, Ferrucci L, Fornage M, Gudnason V, Hofman A, Illig T, Kardia S, Kelly-Hayes M, Koenen K, Kraft P, Kuningas M, Massaro JM, Melzer D, Mulas A, Mulder CL, Murray A, Oostra BA, Palotie A, Penninx B, Petersmann A, Pilling LC, Psaty B, Rawal R, Reiman EM, Schulz A, Shulman JM, Singleton AB, Smith AV, Sutin AR, Uitterlinden AG, Völzke H, Widen E, Yaffe K, Zonderman AB, Cucca F, Harris T, Ladwig KH, Llewellyn DJ, Räikkönen K, Tanaka T, van Duijn CM, Grabe HJ, Launer LJ, Lunetta KL, Mosley TH, Newman AB, Tiemeier H and Murabito J

    Research Centre O3, Department of Psychiatry, Erasmus MC, Rotterdam, The Netherlands; Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands.

    Background: Depression is a heritable trait that exists on a continuum of varying severity and duration. Yet, the search for genetic variants associated with depression has had few successes. We exploit the entire continuum of depression to find common variants for depressive symptoms. Methods: In this genome-wide association study, we combined the results of 17 population-based studies assessing depressive symptoms with the Center for Epidemiological Studies Depression Scale. Replication of the independent top hits (p<1×10(-5)) was performed in five studies assessing depressive symptoms with other instruments. In addition, we performed a combined meta-analysis of all 22 discovery and replication studies. Results: The discovery sample comprised 34,549 individuals (mean age of 66.5) and no loci reached genome-wide significance (lowest p = 1.05×10(-7)). Seven independent single nucleotide polymorphisms were considered for replication. In the replication set (n = 16,709), we found suggestive association of one single nucleotide polymorphism with depressive symptoms (rs161645, 5q21, p = 9.19×10(-3)). This 5q21 region reached genome-wide significance (p = 4.78×10(-8)) in the overall meta-analysis combining discovery and replication studies (n = 51,258). Conclusions: The results suggest that only a large sample comprising more than 50,000 subjects may be sufficiently powered to detect genes for depressive symptoms.

    Biological psychiatry 2013;73;7;667-78

  • Aberrant 3' oligoadenylation of spliceosomal U6 small nuclear RNA in poikiloderma with neutropenia.

    Hilcenko C, Simpson PJ, Finch AJ, Bowler FR, Churcher MJ, Jin L, Packman LC, Shlien A, Campbell P, Kirwan M, Dokal I and Warren AJ

    MRC Laboratory of Molecular Biology, Cambridge, United Kingdom;

    Key Points: Crystal structure of human USB1 identifies it as a member of the LigT-like superfamily of 2H phosphoesterases. USB1 protects spliceosomal U6 small nuclear RNA from aberrant 3' oligoadenylation.

    Blood 2013;121;6;1028-38

  • A genomic portrait of the emergence, evolution and global spread of a methicillin resistant Staphylococcus aureus pandemic.

    Holden MT, Hsu LY, Kurt K, Weinert LA, Mather AE, Harris SR, Strommenger B, Layer F, Witte W, de Lencastre H, Skov R, Westh H, Zemlickova H, Coombs G, Kearns AM, Hill RL, Edgeworth J, Gould I, Gant V, Cooke J, Edwards GF, McAdam PR, Templeton KE, McCann A, Zhou Z, Castillo-Ramirez S, Feil EJ, Hudson LO, Enright MC, Balloux F, Aanensen DM, Spratt BG, Fitzgerald JR, Parkhill J, Achtman M, Bentley SD and Nübel U

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK;

    The widespread use of antibiotics in association with high-density clinical care has driven the emergence of drug-resistant bacteria that are adapted to thrive in hospitalised patients. Of particular concern are globally disseminated methicillin-resistant Staphylococcus aureus (MRSA) clones that cause outbreaks and epidemics associated with healthcare. The most rapidly spreading and tenacious healthcare-associated clone in Europe currently is EMRSA-15, a lineage that was first detected in the UK in the early 1990s and subsequently spread throughout Europe and beyond. To understand the genetic events that have accompanied the emergence of the EMRSA-15 pandemic, we obtained genome sequences for 193 isolates that were chosen for their geographical and temporal diversity, and belong to the same multilocus sequence type as EMRSA-15. Using phylogenomic methods, we were able to show that the current pandemic population of EMRSA-15 descends from a healthcare-associated MRSA epidemic that spread through England in the 1980s, which had itself previously emerged from a primarily community-associated methicillin-sensitive population. The emergence of fluoroquinolone resistance in this EMRSA-15 sub-clone in the English Midlands during the mid-1980s appears to have played a key role in triggering pandemic spread, and occurred shortly after the first clinical trials of this drug. Genome-based coalescence analysis estimated that the population of this sub-clone over the last twenty years has grown four times faster than its progenitor. Using comparative genomic analysis we were able to identify the molecular genetic basis of 99.8% of the antimicrobial resistance phenotypes of the isolates, highlighting the potential of pathogen genome sequencing as a diagnostic tool. We document the genetic changes associated with adaptation to the hospital environment and with increasing drug resistance over time, and how MRSA evolution likely has been influenced by country-specific drug use regimens.

    Genome research 2013

  • Arginine Catabolic Mobile Element in Methicillin-Resistant Staphylococcus aureus (MRSA) Clonal Group ST239-MRSA-III Isolates in Singapore: Implications for PCR-Based Screening Tests.

    Hon PY, Chan KS, Holden MT, Harris SR, Tan TY, Zu YB, Krishnan P, Oon LL, Koh TH and Hsu LY

    Department of Medicine, National University Health System, Singapore, Singapore.

    Antimicrobial agents and chemotherapy 2013;57;3;1563-4

  • WikiGWA: an open platform for collecting and using genome-wide association results.

    Huang J, Liu EY, Welch R, Willer C, Hindorff LA and Li Y

    Department of Human Genetics, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.

    The number of discovered genetic variants from genome-wide association (GWA) studies (GWAS) has been growing rapidly. Centralized efforts such as the National Human Genome Research Institute's GWAS catalog provide regular updates and a convenient interface for quick lookup. However, the catalog entries are manually curated and rely on data from published articles. Other tools such as SNPedia (http://www.snpedia.com) collect published results regarding functional consequences of genetic variations. Here, we propose an approach that allows individual investigators to share their GWA results through an open platform. Unlike the GWAS catalog or SNPedia, wikiGWA collects first-hand GWAS results, and on a much larger scale. Investigators are able not only to post a much larger amount of results, but also to post results from unpublished studies, which could alleviate publication bias and facilitate identification of weak signals. Our interface allows for flexible and fast queries, and the query results are formatted to work seamlessly with the LocusZoom program for visualization and annotation. We here describe wikiGWA, made publicly available at http://www.wikiGWA.org.

    Funded by: NHGRI NIH HHS: R01 HG006292, R01 HG006703

    European journal of human genetics : EJHG 2013;21;4;471-3

  • Olfaction and olfactory-mediated behaviour in psychiatric disease models.

    Huckins LM, Logan DW and Sánchez-Andrade G

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Rats and mice are the most widely used species for modelling psychiatric disease. Assessment of these rodent models typically involves the analysis of aberrant behaviour with behavioural interactions often being manipulated to generate the model. Rodents rely heavily on their excellent sense of smell and almost all their social interactions have a strong olfactory component. Therefore, experimental paradigms that exploit these olfactory-mediated behaviours are among the most robust available and are highly prevalent in psychiatric disease research. These include tests of aggression and maternal instinct, foraging, olfactory memory and habituation and the establishment of social hierarchies. An appreciation of the way that rodents regulate these behaviours in an ethological context can assist experimenters to generate better data from their models and to avoid common pitfalls. We describe some of the more commonly used behavioural paradigms from a rodent olfactory perspective and discuss their application in existing models of psychiatric disease. We introduce the four olfactory subsystems that integrate to mediate the behavioural responses and the types of sensory cue that promote them and discuss their control and practical implementation to improve experimental outcomes. In addition, because smell is critical for normal behaviour in rodents and yet olfactory dysfunction is often associated with neuropsychiatric disease, we introduce some tests for olfactory function that can be applied to rodent models of psychiatric disorders as part of behavioural analysis.

    Cell and tissue research 2013

  • Novel Loci Associated with Increased Risk of Sudden Cardiac Death in the Context of Coronary Artery Disease.

    Huertas-Vazquez A, Nelson CP, Guo X, Reinier K, Uy-Evanado A, Teodorescu C, Ayala J, Jerger K, Chugh H, WTCCC, Braund PS, Deloukas P, Hall AS, Balmforth AJ, Jones M, Taylor KD, Pulit SL, Newton-Cheh C, Gunson K, Jui J, Rotter JI, Albert CM, Samani NJ and Chugh SS

    The Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States of America.

    Background: Recent genome-wide association studies (GWAS) have identified novel loci associated with sudden cardiac death (SCD). Despite this progress, identified DNA variants account for a relatively small portion of overall SCD risk, suggesting that additional loci contributing to SCD susceptibility await discovery. The objective of this study was to identify novel DNA variation associated with SCD in the context of coronary artery disease (CAD). Using the MetaboChip custom array we conducted a case-control association analysis of 119,117 SNPs in 948 SCD cases (with underlying CAD) from the Oregon Sudden Unexpected Death Study (Oregon-SUDS) and 3,050 controls with CAD from the Wellcome Trust Case-Control Consortium (WTCCC). Two newly identified loci were significantly associated with increased risk of SCD after correction for multiple comparisons at: rs6730157 in the RAB3GAP1 gene on chromosome 2 (P = 4.93×10(-12), OR = 1.60) and rs2077316 in the ZNF365 gene on chromosome 10 (P = 3.64×10(-8), OR = 2.41). Conclusions: Our findings suggest that RAB3GAP1 and ZNF365 are relevant candidate genes for SCD and will contribute to the mechanistic understanding of SCD susceptibility.

    PloS one 2013;8;4;e59905

  • Astroglial IFITM3 mediates neuronal impairments following neonatal immune challenge in mice.

    Ibi D, Nagai T, Nakajima A, Mizoguchi H, Kawase T, Tsuboi D, Kano SI, Sato Y, Hayakawa M, Lange UC, Adams DJ, Surani MA, Satoh T, Sawa A, Kaibuchi K, Nabeshima T and Yamada K

    Department of Neuropsychopharmacology and Hospital Pharmacy, Nagoya University Graduate School of Medicine, Nagoya, Japan; Department of Chemical Pharmacology, Graduate School of Pharmaceutical Sciences, Meijo University, Nagoya, Japan.

    Interferon-induced transmembrane protein 3 (IFITM3) plays a crucial role in the antiviral responses of Type I interferons (IFNs). The role of IFITM3 in the central nervous system (CNS) is, however, largely unknown, despite the fact that its expression is increased in the brains of patients with neurologic and neuropsychiatric diseases. Here, we show the role of IFITM3 in long-lasting neuronal impairments in mice following polyriboinosinic-polyribocytidylic acid (polyI:C, a synthetic double-stranded RNA)-induced immune challenge during the early stages of development. We found that the induction of IFITM3 expression in the brain of mice treated with polyI:C was observed only in astrocytes. Cultured astrocytes were activated by polyI:C treatment, leading to an increase in the mRNA levels of inflammatory cytokines as well as Ifitm3. When cultured neurons were treated with the conditioned medium of polyI:C-treated astrocytes (polyI:C-ACM), neurite development was impaired. These polyI:C-ACM-induced neurodevelopmental abnormalities were alleviated by ifitm3(-/-) astrocyte-conditioned medium. Furthermore, decreases of MAP2 expression, spine density, and dendrite complexity in the frontal cortex as well as memory impairment were evident in polyI:C-treated wild-type mice, but such neuronal impairments were not observed in ifitm3(-/-) mice. We also found that IFITM3 proteins were localized to the early endosomes of astrocytes following polyI:C treatment and reduced endocytic activity. These findings suggest that the induction of IFITM3 expression in astrocytes by the activation of the innate immune system during the early stages of development has non-cell autonomous effects that affect subsequent neurodevelopment, leading to neuropathological impairments and brain dysfunction, by impairing endocytosis in astrocytes.

    Glia 2013

  • The role of high-throughput technologies in clinical cancer genomics.

    Idris SF, Ahmad SS, Scott MA, Vassiliou GS and Hadfield J

    Department of Hematology/Oncology, Cambridge University NHS Hospitals Foundation Trust, Cambridge, CB2 0QQ, UK.

    Cancer is a genetic disease driven by both heritable and somatic alterations in DNA, which underpin not only oncogenesis but also progression and eventual metastasis. The major impetus for elucidating the nature and function of somatic mutations in cancer genomes is the potential for the development of effective targeted anticancer therapies. Over the last decade, high-throughput technologies have allowed us unprecedented access to a host of cancer genomes, leading to an influx of new information about their pathobiology. The challenge now is to integrate such emerging information into clinical practice to achieve tangible benefits for cancer patients. This review examines the roles array-based comparative genomic hybridization and next-generation sequencing are playing in furthering our understanding of both hematological and solid-organ tumors. Furthermore, the authors discuss the current challenges in translating the role of these technologies from bench to bedside.

    Expert review of molecular diagnostics 2013;13;2;167-81

  • A Cell-surface Phylome for African Trypanosomes.

    Jackson AP, Allison HC, Barry JD, Field MC, Hertz-Fowler C and Berriman M

    Pathogen Genomics Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, England, United Kingdom ; Department of Infection Biology, Institute of Infection and Global Health, University of Liverpool, Liverpool, England, United Kingdom.

    The cell surface of Trypanosoma brucei, like many protistan blood parasites, is crucial for mediating host-parasite interactions and is instrumental to the initiation, maintenance and severity of infection. Previous comparisons with the related trypanosomatid parasites T. cruzi and Leishmania major suggest that the cell-surface proteome of T. brucei is largely taxon-specific. Here we compare genes predicted to encode cell surface proteins of T. brucei with those from two related African trypanosomes, T. congolense and T. vivax. We created a cell surface phylome (CSP) by estimating phylogenies for 79 gene families with putative surface functions to understand the more recent evolution of African trypanosome surface architecture. Our findings demonstrate that the transferrin receptor genes essential for bloodstream survival in T. brucei are conserved in T. congolense but absent from T. vivax and include an expanded gene family of insect stage-specific surface glycoproteins that includes many currently uncharacterized genes. We also identify species-specific features and innovations and confirm that these include most expression site-associated genes (ESAGs) in T. brucei, which are absent from T. congolense and T. vivax. The CSP presents the first global picture of the origins and dynamics of cell surface architecture in African trypanosomes, representing the principal differences in genomic repertoire between African trypanosome species and provides a basis from which to explore the developmental and pathological differences in surface architectures. All data can be accessed at: http://www.genedb.org/Page/trypanosoma_surface_phylome.

    PLoS neglected tropical diseases 2013;7;3;e2121

  • Genome-wide association analyses identify 18 new loci associated with serum urate concentrations.

    Köttgen A, Albrecht E, Teumer A, Vitart V, Krumsiek J, Hundertmark C, Pistis G, Ruggiero D, O'Seaghdha CM, Haller T, Yang Q, Tanaka T, Johnson AD, Kutalik Z, Smith AV, Shi J, Struchalin M, Middelberg RP, Brown MJ, Gaffo AL, Pirastu N, Li G, Hayward C, Zemunik T, Huffman J, Yengo L, Zhao JH, Demirkan A, Feitosa MF, Liu X, Malerba G, Lopez LM, van der Harst P, Li X, Kleber ME, Hicks AA, Nolte IM, Johansson A, Murgia F, Wild SH, Bakker SJ, Peden JF, Dehghan A, Steri M, Tenesa A, Lagou V, Salo P, Mangino M, Rose LM, Lehtimäki T, Woodward OM, Okada Y, Tin A, Müller C, Oldmeadow C, Putku M, Czamara D, Kraft P, Frogheri L, Thun GA, Grotevendt A, Gislason GK, Harris TB, Launer LJ, McArdle P, Shuldiner AR, Boerwinkle E, Coresh J, Schmidt H, Schallert M, Martin NG, Montgomery GW, Kubo M, Nakamura Y, Tanaka T, Munroe PB, Samani NJ, Jacobs DR, Liu K, D'Adamo P, Ulivi S, Rotter JI, Psaty BM, Vollenweider P, Waeber G, Campbell S, Devuyst O, Navarro P, Kolcic I, Hastie N, Balkau B, Froguel P, Esko T, Salumets A, Khaw KT, Langenberg C, Wareham NJ, Isaacs A, Kraja A, Zhang Q, Wild PS, Scott RJ, Holliday EG, Org E, Viigimaa M, Bandinelli S, Metter JE, Lupo A, Trabetti E, Sorice R, Döring A, Lattka E, Strauch K, Theis F, Waldenberger M, Wichmann HE, Davies G, Gow AJ, Bruinenberg M, LifeLines Cohort Study, Stolk RP, Kooner JS, Zhang W, Winkelmann BR, Boehm BO, Lucae S, Penninx BW, Smit JH, Curhan G, Mudgal P, Plenge RM, Portas L, Persico I, Kirin M, Wilson JF, Mateo Leach I, van Gilst WH, Goel A, Ongen H, Hofman A, Rivadeneira F, Uitterlinden AG, Imboden M, von Eckardstein A, Cucca F, Nagaraja R, Piras MG, Nauck M, Schurmann C, Budde K, Ernst F, Farrington SM, Theodoratou E, Prokopenko I, Stumvoll M, Jula A, Perola M, Salomaa V, Shin SY, Spector TD, Sala C, Ridker PM, Kähönen M, Viikari J, Hengstenberg C, Nelson CP, CARDIoGRAM Consortium, DIAGRAM Consortium, ICBP Consortium, MAGIC Consortium, Meschia JF, Nalls MA, Sharma P, Singleton AB, Kamatani N, Zeller T, Burnier M, Attia J, Laan M, Klopp N, Hillege HL, Kloiber S, Choi H, Pirastu M, Tore S, Probst-Hensch NM, Völzke H, Gudnason V, Parsa A, Schmidt R, Whitfield JB, Fornage M, Gasparini P, Siscovick DS, Polašek O, Campbell H, Rudan I, Bouatia-Naji N, Metspalu A, Loos RJ, van Duijn CM, Borecki IB, Ferrucci L, Gambaro G, Deary IJ, Wolffenbuttel BH, Chambers JC, März W, Pramstaller PP, Snieder H, Gyllensten U, Wright AF, Navis G, Watkins H, Witteman JC, Sanna S, Schipf S, Dunlop MG, Tönjes A, Ripatti S, Soranzo N, Toniolo D, Chasman DI, Raitakari O, Kao WH, Ciullo M, Fox CS, Caulfield M, Bochud M and Gieger C

    [1] Renal Division, Freiburg University Hospital, Freiburg, Germany. [2] Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA. [3].

    Elevated serum urate concentrations can cause gout, a prevalent and painful inflammatory arthritis. By combining data from >140,000 individuals of European ancestry within the Global Urate Genetics Consortium (GUGC), we identified and replicated 28 genome-wide significant loci in association with serum urate concentrations (18 new regions in or near TRIM46, INHBB, SFMBT1, TMEM171, VEGFA, BAZ1B, PRKAG2, STC1, HNF4G, A1CF, ATXN2, UBE2Q2, IGF1R, NFAT5, MAF, HLF, ACVR1B-ACVRL1 and B3GNT4). Associations for many of the loci were of similar magnitude in individuals of non-European ancestry. We further characterized these loci for associations with gout, transcript expression and the fractional excretion of urate. Network analyses implicate the inhibins-activins signaling pathways and glucose metabolism in systemic urate control. New candidate genes for serum urate concentration highlight the importance of metabolic control of urate production and excretion, which may have implications for the treatment and prevention of gout.

    Nature genetics 2013;45;2;145-54

  • Unusual features in organisation of capsular polysaccharide-related genes of C. jejuni strain X.

    Karlyshev AV, Quail MA, Parkhill J and Wren BW

    School of Life Sciences, Kingston University, Faculty of Science, Engineering and Computing, Penrhyn Road, Kingston-upon Thames, KT1 2EE, UK. Electronic address: a.karlyshev@kingston.ac.uk.

    PCR probing of the genome of Campylobacter jejuni strain X using conserved capsular polysaccharide (CPS)-related genes allowed elucidation of a complete sequence of the respective gene cluster (cps). This is the largest known Campylobacter cps cluster (38kb excluding flanking kps regions), which includes a number of genes not detected in other Campylobacter strains. Sequence analysis suggests genetic rearrangements both within and outside the cps gene cluster, a mechanism which may be responsible for mosaic organisation of sugar transferase-related genes leading to structural variability of the capsular polysaccharide (CPS).

    Gene 2013

  • RetroSeq: transposable element discovery from next-generation sequencing data.

    Keane TM, Wong K and Adams DJ

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK. tk2@sanger.ac.uk

    A significant proportion of eukaryote genomes consist of transposable element (TE)-derived sequence. These elements are known to have the capacity to modulate gene function and genome evolution. We have developed RetroSeq for detecting non-reference TE insertions from Illumina paired-end whole-genome sequencing data. We evaluate RetroSeq on a human trio from the 1000 Genomes Project, showing that it produces highly accurate TE calls. AVAILABILITY: RetroSeq is open-source and available from https://github.com/tk2/RetroSeq.

    Funded by: Cancer Research UK; Medical Research Council; Wellcome Trust

    Bioinformatics (Oxford, England) 2013;29;3;389-90

  • Different Patterns of Epstein-Barr Virus Latency in Endemic Burkitt Lymphoma (BL) Lead to Distinct Variants within the BL-Associated Gene Expression Signature.

    Kelly GL, Stylianou J, Rasaiyaah J, Wei W, Thomas W, Croom-Carter D, Kohler C, Spang R, Woodman C, Kellam P, Rickinson AB and Bell AI

    School of Cancer Sciences, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom.

    Epstein-Barr virus (EBV) is present in all cases of endemic Burkitt lymphoma (BL) but in few European/North American sporadic BLs. Gene expression arrays of sporadic tumors have defined a consensus BL profile within which tumors are classifiable as "molecular BL" (mBL). Where endemic BLs fall relative to this profile remains unclear, since they not only carry EBV but also display one of two different forms of virus latency. Here, we use early-passage BL cell lines from different tumors, and BL subclones from a single tumor, to compare EBV-negative cells with EBV-positive cells displaying either classical latency I EBV infection (where EBNA1 is the only EBV antigen expressed from the wild-type EBV genome) or Wp-restricted latency (where an EBNA2 gene-deleted virus genome broadens antigen expression to include the EBNA3A, -3B, and -3C proteins and BHRF1). Expression arrays show that both types of endemic BL fall within the mBL classification. However, while EBV-negative and latency I BLs show overlapping profiles, Wp-restricted BLs form a distinct subgroup, characterized by a detectable downregulation of the germinal center (GC)-associated marker Bcl6 and upregulation of genes marking early plasmacytoid differentiation, notably IRF4 and BLIMP1. Importantly, these same changes can be induced in EBV-negative or latency I BL cells by infection with an EBNA2-knockout virus. Thus, we infer that the distinct gene profile of Wp-restricted BLs does not reflect differences in the identity of the tumor progenitor cell per se but differences imposed on a common progenitor by broadened EBV gene expression.

    Journal of virology 2013;87;5;2882-94

  • A systematic genome-wide analysis of zebrafish protein-coding gene function.

    Kettleborough RN, Busch-Nentwich EM, Harvey SA, Dooley CM, de Bruijn E, van Eeden F, Sealy I, White RJ, Herd C, Nijman IJ, Fényes F, Mehroke S, Scahill C, Gibbons R, Wali N, Carruthers S, Hall A, Yen J, Cuppen E and Stemple DL

    [1] Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK [2].

    Since the publication of the human reference genome, the identities of specific genes associated with human diseases are being discovered at a rapid rate. A central problem is that the biological activity of these genes is often unclear. Detailed investigations in model vertebrate organisms, typically mice, have been essential for understanding the activities of many orthologues of these disease-associated genes. Although gene-targeting approaches and phenotype analysis have led to a detailed understanding of nearly 6,000 protein-coding genes, this number falls considerably short of the more than 22,000 mouse protein-coding genes. Similarly, in zebrafish genetics, one-by-one gene studies using positional cloning, insertional mutagenesis, antisense morpholino oligonucleotides, targeted re-sequencing, and zinc finger and TAL endonucleases have made substantial contributions to our understanding of the biological activity of vertebrate genes, but again the number of genes studied falls well short of the more than 26,000 zebrafish protein-coding genes. Importantly, for both mice and zebrafish, none of these strategies are particularly suited to the rapid generation of knockouts in thousands of genes and the assessment of their biological activity. Here we describe an active project that aims to identify and phenotype the disruptive mutations in every zebrafish protein-coding gene, using a well-annotated zebrafish reference genome sequence, high-throughput sequencing and efficient chemical mutagenesis. So far we have identified potentially disruptive mutations in more than 38% of all known zebrafish protein-coding genes. We have developed a multi-allelic phenotyping scheme to efficiently assess the effects of each allele during embryogenesis and have analysed the phenotypic consequences of over 1,000 alleles. All mutant alleles and data are available to the community and our phenotyping scheme is adaptable to phenotypic analysis beyond embryogenesis.

    Nature 2013

  • A genome-wide analysis of populations from European Russia reveals a new pole of genetic diversity in northern Europe.

    Khrunin AV, Khokhrin DV, Filippova IN, Esko T, Nelis M, Bebyakova NA, Bolotova NL, Klovins J, Nikitina-Zake L, Rehnström K, Ripatti S, Schreiber S, Franke A, Macek M, Krulišová V, Lubinski J, Metspalu A and Limborska SA

    Department of Molecular Bases of Human Genetics, Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia.

    Several studies have examined the fine-scale structure of human genetic variation in Europe. However, the European sets analyzed represent mainly northern, western, central, and southern Europe. Here, we report an analysis of approximately 166,000 single nucleotide polymorphisms in populations from eastern (northeastern) Europe: four Russian populations from European Russia, and three populations from the northernmost Finno-Ugric ethnicities (Veps and two contrast groups of Komi people). These were compared with several reference European samples, including Finns, Estonians, Latvians, Poles, Czechs, Germans, and Italians. The results obtained demonstrated genetic heterogeneity of populations living in the region studied. Russians from the central part of European Russia (Tver, Murom, and Kursk) exhibited similarities with populations from central-eastern Europe, and were distant from the Russian sample from northern Russia (Mezen district, Archangelsk region). Komi samples, especially Izhemski Komi, were significantly different from all other populations studied. These can be considered as a second pole of genetic diversity in northern Europe (in addition to the pole occupied by Finns), as they had a distinct ancestry component. Russians from Mezen and the Finnic-speaking Veps were positioned between the two poles, but differed from each other in the proportions of Komi and Finnic ancestries. In general, our data provide a more complete genetic map of Europe, accounting for the diversity in its most eastern (northeastern) populations.

    PloS one 2013;8;3;e58552

  • Current application and future perspectives of molecular typing methods to study Clostridium difficile infections.

    Knetsch C, Lawley T, Hensgens M, Corver J, Wilcox M and Kuijper E

    Section Experimental Microbiology, Department of Medical Microbiology, Center of Infectious Diseases, Leiden University Medical Center, Leiden, Netherlands.

    Euro surveillance : bulletin européen sur les maladies transmissibles = European communicable disease bulletin 2013;18;4

  • Host responses to melioidosis and tuberculosis are both dominated by interferon-mediated signaling.

    Koh GC, Schreiber MF, Bautista R, Maude RR, Dunachie S, Limmathurotsakul D, Day NP, Dougan G and Peacock SJ

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom; Department of Medicine, University of Cambridge, Cambridge, United Kingdom; Mahidol-Oxford Tropical Medicine Research Unit, Mahidol University, Bangkok, Thailand; Department of Infection and Tropical Diseases, Birmingham Heartlands Hospital, Birmingham, United Kingdom.

    Melioidosis (Burkholderia pseudomallei infection) is a common cause of community-acquired sepsis in Northeast Thailand and northern Australia. B. pseudomallei is a soil saprophyte endemic to Southeast Asia and northern Australia. The clinical presentation of melioidosis may mimic tuberculosis (both cause chronic suppurative lesions unresponsive to conventional antibiotics and both commonly affect the lungs). The two diseases have overlapping risk profiles (e.g., diabetes, corticosteroid use), and both B. pseudomallei and Mycobacterium tuberculosis are intracellular pathogens. There are however important differences: the majority of melioidosis cases are acute, not chronic, and present with severe sepsis and a mortality rate that approaches 50% despite appropriate antimicrobial therapy. By contrast, tuberculosis is characteristically a chronic illness with mortality <2% with appropriate antimicrobial chemotherapy. We examined the gene expression profiles of total peripheral leukocytes in two cohorts of patients, one with acute melioidosis (30 patients and 30 controls) and another with tuberculosis (20 patients and 24 controls). Interferon-mediated responses dominate the host response to both infections, and both type 1 and type 2 interferon responses are important. An 86-gene signature previously thought to be specific for tuberculosis is also found in melioidosis. We conclude that the host responses to melioidosis and to tuberculosis are similar: both are dominated by interferon-signalling pathways and this similarity means gene expression signatures from whole blood do not distinguish between these two diseases.

    PloS one 2013;8;1;e54961

  • Criteria for inference of chromothripsis in cancer genomes.

    Korbel JO and Campbell PJ

    Genome Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany. Electronic address: jan.korbel@embl.de.

    Chromothripsis scars the genome when localized chromosome shattering and repair occurs in a one-off catastrophe. Outcomes of this process are detectable as massive DNA rearrangements affecting one or a few chromosomes. Although recent findings suggest a crucial role of chromothripsis in cancer development, the reproducible inference of this process remains challenging, requiring that cataclysmic one-off rearrangements be distinguished from localized lesions that occur progressively. We describe conceptual criteria for the inference of chromothripsis, based on ruling out the alternative hypothesis that stepwise rearrangements occurred. Robust means of inference may facilitate in-depth studies on the impact of, and the mechanisms underlying, chromothripsis.

    Cell 2013;152;6;1226-36

  • Expression of recombinant ITGA2 and CD109 for the detection of human platelet antigen (HPA)-5 and -15 alloantibodies.

    Lane-Serff H, Sun Y, Metcalfe P and Wright GJ

    Cell Surface Signalling Laboratory, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.

    British journal of haematology 2013

  • Genomic, Transcriptomic, and Lipidomic Profiling Highlights the Role of Inflammation in Individuals With Low High-density Lipoprotein Cholesterol.

    Laurila PP, Surakka I, Sarin AP, Yetukuri L, Hyötyläinen T, Söderlund S, Naukkarinen J, Tang J, Kettunen J, Mirel DB, Soronen J, Lehtimäki T, Ruokonen A, Ehnholm C, Eriksson JG, Salomaa V, Jula A, Raitakari OT, Järvelin MR, Palotie A, Peltonen L, Oresic M, Jauhiainen M, Taskinen MR and Ripatti S

    Institute for Molecular Medicine, Finland, FIMM, Tukholmankatu 8, Helsinki 00290, Finland. samuli.ripatti@helsinki.fi.

    OBJECTIVE: Low high-density lipoprotein cholesterol (HDL-C) is associated with cardiometabolic pathologies. In this study, we investigate the biological pathways and individual genes behind low HDL-C by integrating results from 3 high-throughput data sources: adipose tissue transcriptomics, HDL lipidomics, and dense marker genotypes from Finnish individuals with low or high HDL-C (n=450). APPROACH AND RESULTS: In the pathway analysis of genetic data, we demonstrate that genetic variants within inflammatory pathways were enriched among low HDL-C associated single-nucleotide polymorphisms, and the expression of these pathways was upregulated in the adipose tissue of low HDL-C subjects. The lipidomic analysis highlighted the change in HDL particle quality toward a putatively more inflammatory and less vasoprotective state in subjects with low HDL-C, as evidenced by their decreased antioxidative plasmalogen contents. We show that the focal point of these inflammatory pathways seems to be the HLA region with its low HDL-associated alleles also associating with more abundant local transcript levels in adipose tissue, increased plasma vascular cell adhesion molecule 1 (VCAM1) levels, and decreased HDL particle plasmalogen contents, markers of adipose tissue inflammation, vascular inflammation, and HDL antioxidative potential, respectively. In a population-based look-up of the inflammatory pathway single-nucleotide polymorphisms in large Finnish cohorts (n=11 211), no association of the HLA region was detected for HDL-C as a quantitative trait, but with extreme HDL-C phenotypes, implying the presence of low or high HDL genes in addition to the population-genomewide association studies-identified HDL genes. CONCLUSIONS: Our study highlights the role of inflammation with a genetic component in subjects with low HDL-C and identifies novel cis-expression quantitative trait loci (cis-eQTL) variants in the HLA region to be associated with low HDL-C.

    Arteriosclerosis, thrombosis, and vascular biology 2013;33;4;847-57

  • Intestinal colonization resistance.

    Lawley TD and Walker AW

    Bacterial Pathogenesis Laboratory, Wellcome Trust Sanger Institute, Hinxton, UK. tl2@sanger.ac.uk

    Dense, complex microbial communities, collectively termed the microbiota, occupy a diverse array of niches along the length of the mammalian intestinal tract. During health and in the absence of antibiotic exposure the microbiota can effectively inhibit colonization and overgrowth by invading microbes such as pathogens. This phenomenon is called 'colonization resistance' and is associated with a stable and diverse microbiota in tandem with a controlled lack of inflammation, and involves specific interactions between the mucosal immune system and the microbiota. Here we overview the microbial ecology of the healthy mammalian intestinal tract and highlight the microbe-microbe and microbe-host interactions that promote colonization resistance. Emerging themes highlight immunological (T helper type 17/regulatory T-cell balance), microbiota (diverse and abundant) and metabolic (short-chain fatty acid) signatures of intestinal health and colonization resistance. Intestinal pathogens use specific virulence factors or exploit antibiotic use to subvert colonization resistance for their own benefit by triggering inflammation to disrupt the harmony of the intestinal ecosystem. A holistic view that incorporates immunological and microbiological facets of the intestinal ecosystem should facilitate the development of immunomodulatory and microbe-modulatory therapies that promote intestinal homeostasis and colonization resistance.

    Funded by: Medical Research Council: 93614; Wellcome Trust: 076964, 098051

    Immunology 2013;138;1;1-11

  • Common variants in the HLA-DRB1-HLA-DQA1 HLA class II region are associated with susceptibility to visceral leishmaniasis.

    LeishGEN Consortium, Wellcome Trust Case Control Consortium 2, Fakiola M, Strange A, Cordell HJ, Miller EN, Pirinen M, Su Z, Mishra A, Mehrotra S, Monteiro GR, Band G, Bellenguez C, Dronov S, Edkins S, Freeman C, Giannoulatou E, Gray E, Hunt SE, Lacerda HG, Langford C, Pearson R, Pontes NN, Rai M, Singh SP, Smith L, Sousa O, Vukcevic D, Bramon E, Brown MA, Casas JP, Corvin A, Duncanson A, Jankowski J, Markus HS, Mathew CG, Palmer CN, Plomin R, Rautanen A, Sawcer SJ, Trembath RC, Viswanathan AC, Wood NW, Wilson ME, Deloukas P, Peltonen L, Christiansen F, Witt C, Jeronimo SM, Sundar S, Spencer CC, Blackwell JM and Donnelly P

    [1] Cambridge Institute for Medical Research, University of Cambridge School of Clinical Medicine, Addenbrooke's Hospital, Cambridge, UK.

    To identify susceptibility loci for visceral leishmaniasis, we undertook genome-wide association studies in two populations: 989 cases and 1,089 controls from India and 357 cases in 308 Brazilian families (1,970 individuals). The HLA-DRB1-HLA-DQA1 locus was the only region to show strong evidence of association in both populations. Replication at this region was undertaken in a second Indian population comprising 941 cases and 990 controls, and combined analysis across the three cohorts for rs9271858 at this locus showed P(combined) = 2.76 × 10⁻¹⁷ and odds ratio (OR) = 1.41, 95% confidence interval (CI) = 1.30-1.52. A conditional analysis provided evidence for multiple associations within the HLA-DRB1-HLA-DQA1 region, and a model in which risk differed between three groups of haplotypes better explained the signal and was significant in the Indian discovery and replication cohorts. In conclusion, the HLA-DRB1-HLA-DQA1 HLA class II region contributes to visceral leishmaniasis susceptibility in India and Brazil, suggesting shared genetic risk factors for visceral leishmaniasis that cross the epidemiological divides of geography and parasite species.

    Nature genetics 2013;45;2;208-13

  • The piggyBac transposon displays local and distant reintegration preferences and can cause mutations at non-canonical integration sites.

    Li MA, Pettitt SJ, Eckert S, Ning Z, Rice S, Cadiñanos J, Yusa K, Conte N and Bradley A

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK, CB10 1SA.

    The DNA transposon piggyBac is widely used as a tool in mammalian experimental systems for transgenesis, mutagenesis and genome engineering. We have characterised genome-wide insertion site preferences of piggyBac by sequencing a large set of integration sites arising from transposition from two separate genomic loci and a plasmid donor in mouse embryonic stem cells. We found that piggyBac preferentially integrates locally to the excision site when mobilised from a chromosomal location, and identified other non-local regions of the genome with elevated insertion frequencies. PiggyBac insertions were associated with expressed genes and markers of open chromatin structure, and were excluded from heterochromatin. At the nucleotide level, piggyBac prefers to insert into TA-rich regions within a broader GC-rich context. We also found that piggyBac can insert into sites other than its known TTAA insertion site at low frequency (2%). Such insertions introduce mismatches that are repaired with signatures of host cell mismatch repair pathways. Transposons could be mobilised from plasmids with the observed mismatches, indicating that piggyBac could generate point mutations in the genome.

    Molecular and cellular biology 2013

  • The future role of genetic screening to detect newborns at risk of childhood-onset hearing loss.

    Linden Phillips L, Bitner-Glindzicz M, Lench N, Steel KP, Langford C, Dawson SJ, Davis A, Simpson S and Packer C

    National Institute for Health Research (NIHR) Horizon Scanning Centre, School of Health and Population Sciences, University of Birmingham, Birmingham, UK.

    Objective: To explore the future potential of genetic screening to detect newborns at risk of childhood-onset hearing loss. Design: An expert-led discussion of current and future developments in genetic technology and the knowledge base of genetic hearing loss to determine the viability of genetic screening and the implications for screening policy. Results and Discussion: Despite increasing pressure to adopt genetic technologies, a major barrier for genetic screening in hearing loss is the uncertain clinical significance of the identified mutations and their interactions. Only when a reliable estimate of the future risk of hearing loss can be made at a reasonable cost will genetic screening become viable. Given the speed of technological advancement, this may be within the next 10 years. Decision-makers should start to consider how genetic screening could augment current screening programmes as well as the associated data processing and storage requirements. Conclusion: In the interim, we suggest that decision-makers consider the benefits of (1) genetically testing all newborns and children with hearing loss, to determine aetiology and to increase knowledge of the genetic causes of hearing loss, and (2) screening pregnant women for the m.1555A>G mutation to reduce the risk of aminoglycoside antibiotic-associated hearing loss.

    International journal of audiology 2013;52;2;124-33

  • Dense genotyping of immune-related disease regions identifies nine new risk loci for primary sclerosing cholangitis.

    Liu JZ, Hov JR, Folseraas T, Ellinghaus E, Rushbrook SM, Doncheva NT, Andreassen OA, Weersma RK, Weismüller TJ, Eksteen B, Invernizzi P, Hirschfield GM, Gotthardt DN, Pares A, Ellinghaus D, Shah T, Juran BD, Milkiewicz P, Rust C, Schramm C, Müller T, Srivastava B, Dalekos G, Nöthen MM, Herms S, Winkelmann J, Mitrovic M, Braun F, Ponsioen CY, Croucher PJ, Sterneck M, Teufel A, Mason AL, Saarela J, Leppa V, Dorfman R, Alvaro D, Floreani A, Onengut-Gumuscu S, Rich SS, Thompson WK, Schork AJ, Næss S, Thomsen I, Mayr G, König IR, Hveem K, Cleynen I, Gutierrez-Achury J, Ricaño-Ponce I, van Heel D, Björnsson E, Sandford RN, Durie PR, Melum E, Vatn MH, Silverberg MS, Duerr RH, Padyukov L, Brand S, Sans M, Annese V, Achkar JP, Boberg KM, Marschall HU, Chazouillères O, Bowlus CL, Wijmenga C, Schrumpf E, Vermeire S, Albrecht M, The UK-PSCSC Consortium, The International IBD Genetics Consortium, Rioux JD, Alexander G, Bergquist A, Cho J, Schreiber S, Manns MP, Färkkilä M, Dale AM, Chapman RW, Lazaridis KN, The International PSC Study Group, Franke A, Anderson CA and Karlsen TH

    [1] Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Primary sclerosing cholangitis (PSC) is a severe liver disease of unknown etiology leading to fibrotic destruction of the bile ducts and ultimately to the need for liver transplantation. We compared 3,789 PSC cases of European ancestry to 25,079 population controls across 130,422 SNPs genotyped using the Immunochip. We identified 12 genome-wide significant associations outside the human leukocyte antigen (HLA) complex, 9 of which were new, increasing the number of known PSC risk loci to 16. Despite comorbidity with inflammatory bowel disease (IBD) in 72% of the cases, 6 of the 12 loci showed significantly stronger association with PSC than with IBD, suggesting overlapping yet distinct genetic architectures for these two diseases. We incorporated association statistics from 7 diseases clinically occurring with PSC in the analysis and found suggestive evidence for 33 additional pleiotropic PSC risk loci. Together with network analyses, these findings add to the genetic risk map of PSC and expand on the relationship between PSC and other immune-mediated diseases.

    Nature genetics 2013

  • Human Spermatogenic Failure Purges Deleterious Mutation Load from the Autosomes and Both Sex Chromosomes, including the Gene DMRT1.

    Lopes AM, Aston KI, Thompson E, Carvalho F, Gonçalves J, Huang N, Matthiesen R, Noordam MJ, Quintela I, Ramu A, Seabra C, Wilfert AB, Dai J, Downie JM, Fernandes S, Guo X, Sha J, Amorim A, Barros A, Carracedo A, Hu Z, Hurles ME, Moskovtsev S, Ober C, Paduch DA, Schiffman JD, Schlegel PN, Sousa M, Carrell DT and Conrad DF

    Institute of Molecular Pathology and Immunology of the University of Porto (IPATIMUP), Porto, Portugal.

    Gonadal failure, along with early pregnancy loss and perinatal death, may be an important filter that limits the propagation of harmful mutations in the human population. We hypothesized that men with spermatogenic impairment, a disease with unknown genetic architecture and a common cause of male infertility, are enriched for rare deleterious mutations compared to men with normal spermatogenesis. After assaying genomewide SNPs and CNVs in 323 Caucasian men with idiopathic spermatogenic impairment and more than 1,100 controls, we estimate that each rare autosomal deletion detected in our study multiplicatively changes a man's risk of disease by 10% (OR 1.10 [1.04-1.16], p<2×10⁻³), rare X-linked CNVs by 29% (OR 1.29 [1.11-1.50], p<1×10⁻³), and rare Y-linked duplications by 88% (OR 1.88 [1.13-3.13], p<0.03). By contrasting the properties of our case-specific CNVs with those of CNV callsets from cases of autism, schizophrenia, bipolar disorder, and intellectual disability, we propose that the CNV burden in spermatogenic impairment is distinct from the burden of large, dominant mutations described for neurodevelopmental disorders. We identified two patients with deletions of DMRT1, a gene on chromosome 9p24.3 orthologous to the putative sex determination locus of the avian ZW chromosome system. In an independent sample of Han Chinese men, we identified 3 more DMRT1 deletions in 979 cases of idiopathic azoospermia and none in 1,734 controls, and found none in an additional 4,519 controls from public databases. The combined results indicate that DMRT1 loss-of-function mutations are a risk factor and potential genetic cause of human spermatogenic failure (frequency of 0.38% in 1306 cases and 0% in 7,754 controls, p = 6.2×10⁻⁵). Our study identifies other recurrent CNVs as potential causes of idiopathic azoospermia and generates hypotheses for directing future studies on the genetic basis of male infertility and IVF outcomes.

    PLoS genetics 2013;9;3;e1003349

  • Genome-wide association study in a Chinese population identifies a susceptibility locus for type 2 diabetes at 7q32 near PAX4.

    Ma RC, Hu C, Tam CH, Zhang R, Kwan P, Leung TF, Thomas GN, Go MJ, Hara K, Sim X, Ho JS, Wang C, Li H, Lu L, Wang Y, Li JW, Wang Y, Lam VK, Wang J, Yu W, Kim YJ, Ng DP, Fujita H, Panoutsopoulou K, Day-Williams AG, Lee HM, Ng AC, Fang YJ, Kong AP, Jiang F, Ma X, Hou X, Tang S, Lu J, Yamauchi T, Tsui SK, Woo J, Leung PC, Zhang X, Tang NL, Sy HY, Liu J, Wong TY, Lee JY, Maeda S, Xu G, Cherny SS, Chan TF, Ng MC, Xiang K, Morris AP, DIAGRAM Consortium, Keildson S, The MuTHER Consortium, Hu R, Ji L, Lin X, Cho YS, Kadowaki T, Tai ES, Zeggini E, McCarthy MI, Hon KL, Baum L, Tomlinson B, So WY, Bao Y, Chan JC and Jia W

    Department of Medicine and Therapeutics, Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, New Territories, Hong Kong, SAR, People's Republic of China, rcwma@cuhk.edu.hk.

    AIMS/HYPOTHESIS: Most genetic variants identified for type 2 diabetes have been discovered in European populations. We performed genome-wide association studies (GWAS) in a Chinese population with the aim of identifying novel variants for type 2 diabetes in Asians. METHODS: We performed a meta-analysis of three GWAS comprising 684 patients with type 2 diabetes and 955 controls of Southern Han Chinese descent. We followed up the top signals in two independent Southern Han Chinese cohorts (totalling 10,383 cases and 6,974 controls), and performed in silico replication in multiple populations. RESULTS: We identified CDKN2A/B and four novel type 2 diabetes association signals with p < 1 × 10⁻⁵ from the meta-analysis. Thirteen variants within these four loci were followed up in two independent Chinese cohorts, and rs10229583 at 7q32 was found to be associated with type 2 diabetes in a combined analysis of 11,067 cases and 7,929 controls (p meta = 2.6 × 10⁻⁸; OR [95% CI] 1.18 [1.11, 1.25]). In silico replication revealed consistent associations across multiethnic groups, including five East Asian populations (p meta = 2.3 × 10⁻¹⁰) and a population of European descent (p = 8.6 × 10⁻³). The rs10229583 risk variant was associated with elevated fasting plasma glucose, impaired beta cell function in controls, and an earlier age at diagnosis for the cases. The novel variant lies within an islet-selective cluster of open regulatory elements. There was significant heterogeneity of effect between Han Chinese and individuals of European descent, Malaysians and Indians. CONCLUSIONS/INTERPRETATION: Our study identifies rs10229583 near PAX4 as a novel locus for type 2 diabetes in Chinese and other populations and provides new insights into the pathogenesis of type 2 diabetes.

    Diabetologia 2013

  • A follow-up linkage study of Finnish pre-eclampsia families identifies a new fetal susceptibility locus on chromosome 18.

    Majander KK, Villa PM, Kivinen K, Kere J and Laivuori H

    [1] Department of Medical Genetics, Haartman Institute, University of Helsinki, Helsinki, Finland. [2] Research Programs Unit, Women's Health, University of Helsinki, Helsinki, Finland.

    Pre-eclampsia is a common vascular disorder of pregnancy. It originates in the placenta and targets the maternal endothelium. According to epidemiological research, >50% of the liability to this disorder can be accounted for by genetic factors. Both maternal and fetal genes contribute to the risk, but especially the fetal genetic risk profile is still poorly understood. We have previously detected linkage signals in multiplex Finnish families on chromosomes 2p25, 4q32, and 9p13 using maternal phenotypes. We performed a linkage analysis using updated maternal phenotypes and an unprecedented linkage analysis using fetal phenotypes. Markers genotyped were available from 237 individuals in 15 Finnish families, including 72 affected mothers and 49 affected fetuses. The MERLIN software was used for sample and marker quality control and linkage analysis. The results were compared against the original ones obtained by using the GENEHUNTER 2.1 software. The previous identification of the maternal susceptibility locus to a genetic location at 21.70 cM near marker D2S168 on chromosome 2 was confirmed by using both maternal and fetal phenotypes (maternal non-parametric linkage (NPL) score 3.79, P=0.00008, LOD (logarithm (base 10) of odds)=2.20 and fetal NPL score 2.95, P=0.002, LOD=1.71). As a novel finding, we present a suggestive linkage to chromosome 18 at 86.80 cM near marker D18S64 (NPL score 2.51, P=0.006, LOD=1.20) using the fetal phenotype. We propose that chromosome 18 may harbor a new fetal susceptibility locus for pre-eclampsia. European Journal of Human Genetics advance online publication, 6 February 2013; doi:10.1038/ejhg.2013.6.

    European journal of human genetics : EJHG 2013

  • The challenge of increasing Pfam coverage of the human proteome.

    Mistry J, Coggill P, Eberhardt RY, Deiana A, Giansanti A, Finn RD, Bateman A and Punta M

    EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

    It is a worthy goal to completely characterize all human proteins in terms of their domains. Here, using the Pfam database, we asked how far we have progressed in this endeavour. Ninety per cent of proteins in the human proteome matched at least one of 5494 manually curated Pfam-A families. In contrast, human residue coverage by Pfam-A families was <45%, with 9418 automatically generated Pfam-B families adding a further 10%. Even after excluding predicted signal peptide regions and short regions (<50 consecutive residues) unlikely to harbour new families, for ∼38% of the human protein residues, there was no information in Pfam about conservation and evolutionary relationship with other protein regions. This uncovered portion of the human proteome was found to be distributed over almost 25 000 distinct protein regions. Comparison with proteins in the UniProtKB database suggested that the human regions that exhibited similarity to thousands of other sequences were often either divergent elements or N- or C-terminal extensions of existing families. Thirty-four per cent of regions, on the other hand, matched fewer than 100 sequences in UniProtKB. Most of these did not appear to share any relationship with existing Pfam-A families, suggesting that thousands of new families would need to be generated to cover them. Also, these latter regions were particularly rich in amino acid compositional bias such as the one associated with intrinsic disorder. This could represent a significant obstacle toward their inclusion into new Pfam families. Based on these observations, a major focus for increasing Pfam coverage of the human proteome will be to improve the definition of existing families. New families will also be built, prioritizing those that have been experimentally functionally characterized. Database URL: http://pfam.sanger.ac.uk/

    Database : the journal of biological databases and curation 2013;2013;bat023

  • Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions.

    Mistry J, Finn RD, Eddy SR, Bateman A and Punta M

    EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK and HHMI Janelia Farm Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA.

    Detection of protein homology via sequence similarity has important applications in biology, from protein structure and function prediction to reconstruction of phylogenies. Although current methods for aligning protein sequences are powerful, challenges remain, including problems with homologous overextension of alignments and with regions under convergent evolution. Here, we test the ability of the profile hidden Markov model method HMMER3 to correctly assign homologous sequences to >13 000 manually curated families from the Pfam database. We identify problem families using protein regions that match two or more Pfam families not currently annotated as related in Pfam. We find that HMMER3 E-value estimates seem to be less accurate for families that feature periodic patterns of compositional bias, such as the ones typically observed in coiled-coils. These results support the continued use of manually curated inclusion thresholds in the Pfam database, especially on the subset of families that have been identified as problematic in experiments such as these. They also highlight the need for developing new methods that can correct for this particular type of compositional bias.

    Nucleic acids research 2013

  • A powerful molecular synergy between mutant Nucleophosmin and Flt3-ITD drives acute myeloid leukemia in mice.

    Mupo A, Celani L, Dovey O, Cooper JL, Grove C, Rad R, Sportoletti P, Falini B, Bradley A and Vassiliou GS

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Leukemia 2013

  • Evolution of equine influenza virus in vaccinated horses.

    Murcia PR, Baillie GJ, Stack JC, Jervis C, Elton D, Mumford JA, Daly J, Kellam P, Grenfell BT, Holmes EC and Wood JL

    Medical Research Council-University of Glasgow Centre for Virus Research, Institute of Infection, Inflammation and Immunity, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, United Kingdom.

    Influenza A viruses are characterized by their ability to evade host immunity, even in vaccinated individuals. To determine how prior immunity shapes viral diversity in vivo, we studied the intra- and interhost evolution of equine influenza virus in vaccinated horses. Although the level and structure of genetic diversity were similar to those in naïve horses, intrahost bottlenecks may be more stringent in vaccinated animals, and mutations shared among horses often fall close to putative antigenic sites.

    Journal of virology 2013;87;8;4768-71

  • Meta-analysis investigating associations between healthy diet and fasting glucose and insulin levels and modification by loci associated with glucose homeostasis in data from 15 cohorts.

    Nettleton JA, Hivert MF, Lemaitre RN, McKeown NM, Mozaffarian D, Tanaka T, Wojczynski MK, Hruby A, Djoussé L, Ngwa JS, Follis JL, Dimitriou M, Ganna A, Houston DK, Kanoni S, Mikkilä V, Manichaikul A, Ntalla I, Renström F, Sonestedt E, van Rooij FJ, Bandinelli S, de Koning L, Ericson U, Hassanali N, Kiefte-de Jong JC, Lohman KK, Raitakari O, Papoutsakis C, Sjogren P, Stirrups K, Ax E, Deloukas P, Groves CJ, Jacques PF, Johansson I, Liu Y, McCarthy MI, North K, Viikari J, Zillikens MC, Dupuis J, Hofman A, Kolovou G, Mukamal K, Prokopenko I, Rolandsson O, Seppälä I, Cupples LA, Hu FB, Kähönen M, Uitterlinden AG, Borecki IB, Ferrucci L, Jacobs DR, Kritchevsky SB, Orho-Melander M, Pankow JS, Lehtimäki T, Witteman JC, Ingelsson E, Siscovick DS, Dedoussis G, Meigs JB and Franks PW

    Whether loci that influence fasting glucose (FG) and fasting insulin (FI) levels, as identified by genome-wide association studies, modify associations of diet with FG or FI is unknown. We utilized data from 15 US and European cohort studies comprising 51,289 persons without diabetes to test whether genotype and diet interact to influence FG or FI concentration. We constructed a diet score using study-specific quartile rankings for intakes of whole grains, fish, fruits, vegetables, and nuts/seeds (favorable) and red/processed meats, sweets, sugared beverages, and fried potatoes (unfavorable). We used linear regression within studies, followed by inverse-variance-weighted meta-analysis, to quantify 1) associations of diet score with FG and FI levels and 2) interactions of diet score with 16 FG-associated loci and 2 FI-associated loci. Diet score (per unit increase) was inversely associated with FG (β = -0.004 mmol/L, 95% confidence interval: -0.005, -0.003) and FI (β = -0.008 ln-pmol/L, 95% confidence interval: -0.009, -0.007) levels after adjustment for demographic factors, lifestyle, and body mass index. Genotype variation at the studied loci did not modify these associations. Healthier diets were associated with lower FG and FI concentrations regardless of genotype at previously replicated FG- and FI-associated loci. Studies focusing on genomic regions that do not yield highly statistically significant associations from main-effect genome-wide association studies may be more fruitful in identifying diet-gene interactions.

    American journal of epidemiology 2013;177;2;103-15

  • Binding of nucleoid-associated protein Fis to DNA is regulated by DNA breathing dynamics.

    Nowak-Lovato K, Alexandrov LB, Banisadr A, Bauer AL, Bishop AR, Usheva A, Mu F, Hong-Geller E, Rasmussen KØ, Hlavacek WS and Alexandrov BS

    Bioscience Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America.

    Physicochemical properties of DNA, such as shape, affect protein-DNA recognition. However, the properties of DNA that are most relevant for predicting the binding sites of particular transcription factors (TFs) or classes of TFs have yet to be fully understood. Here, using a model that accurately captures the melting behavior and breathing dynamics (spontaneous local openings of the double helix) of double-stranded DNA, we simulated the dynamics of known binding sites of the TF and nucleoid-associated protein Fis in Escherichia coli. Our study involves simulations of breathing dynamics, analysis of large published in vitro and genomic datasets, and targeted experimental tests of our predictions. Our simulation results and available in vitro binding data indicate a strong correlation between DNA breathing dynamics and Fis binding. Indeed, we can define an average DNA breathing profile that is characteristic of Fis binding sites. This profile is significantly enriched among the identified in vivo E. coli Fis binding sites. To test our understanding of how Fis binding is influenced by DNA breathing dynamics, we designed base-pair substitutions, mismatch, and methylation modifications of DNA regions that are known to interact (or not interact) with Fis. The goal in each case was to make the local DNA breathing dynamics either closer to or farther from the breathing profile characteristic of a strong Fis binding site. For the modified DNA segments, we found that Fis-DNA binding, as assessed by gel-shift assay, changed in accordance with our expectations. We conclude that Fis binding is associated with DNA breathing dynamics, which in turn may be regulated by various nucleotide modifications.

    PLoS computational biology 2013;9;1;e1002881

  • Chlamydia trachomatis clinical isolates identified as tetracycline resistant do not exhibit resistance in vitro: whole-genome sequencing reveals a mutation in porB but no evidence for tetracycline resistance genes.

    O'Neill CE, Seth-Smith HM, Van Der Pol B, Harris SR, Thomson NR, Cutcliffe LT and Clarke IN

    Faculty of Medicine, CES Academic Unit, Level C, South Block, University of Southampton, Southampton General Hospital, Tremona Road, Southampton, UK.

    Chlamydia trachomatis is the most common bacterial sexually transmitted infection worldwide and the leading cause of preventable blindness in developing countries. Tetracycline is commonly the drug of choice for treating C. trachomatis infections, but cases of antibiotic resistance in clinical isolates have previously been reported. Here, we used antibiotic resistance assays and whole-genome sequencing to interrogate the hypothesis that two clinical isolates (IU824 and IU888) have acquired mechanisms of antibiotic resistance. Immunofluorescence staining was used to identify C. trachomatis inclusions in cell cultures grown in the presence of tetracycline; however, only antibiotic-free control cultures yielded the strong fluorescence associated with the presence of chlamydial inclusions. Infectivity was lost upon passage of harvested cultures grown in the presence of tetracycline into antibiotic-free medium, so we conclude that these isolates were phenotypically sensitive to tetracycline. Comparisons of the genome and plasmid sequences for the two isolates with tetracycline-sensitive strains did not identify regions of low sequence identity that could accommodate horizontally acquired resistance genes, and the tetracycline binding region of the 16S rRNA gene was identical to that of the sensitive control strains. The porB gene of strain IU824, however, was found to contain a premature stop codon not previously identified, which is noteworthy but unlikely to be related to tetracycline resistance. In conclusion, we found no evidence of tetracycline resistance in the two strains investigated, and it seems most likely that the small, aberrant inclusions previously identified resulted from the high chlamydial load used in the original antibiotic resistance assays.

    Microbiology (Reading, England) 2013;159;Pt 4;748-56

  • Getting ready for the human phenome project: the 2012 forum of the human variome project.

    Oetting WS, Robinson PN, Greenblatt MS, Cotton RG, Beck T, Carey JC, Doelken SC, Girdea M, Groza T, Hamilton CM, Hamosh A, Kerner B, Macarthur JA, Maglott DR, Mons B, Rehm HL, Schofield PN, Searle BA, Smedley D, Smith CL, Bernstein IT, Zankl A and Zhao EY

    Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, Minnesota.

    A forum of the Human Variome Project (HVP) was held as a satellite to the 2012 Annual Meeting of the American Society of Human Genetics in San Francisco, California. The theme of this meeting was "Getting Ready for the Human Phenome Project." Understanding the genetic contribution to both rare single-gene "Mendelian" disorders and more complex common diseases will require integration of research efforts among many fields and better defined phenotypes. The HVP is dedicated to bringing together researchers and research populations throughout the world to provide the resources to investigate the impact of genetic variation on disease. To this end, there needs to be a greater sharing of phenotype and genotype data. For this to occur, many databases that currently exist will need to become interoperable to allow for the combining of cohorts with similar phenotypes to increase statistical power for studies attempting to identify novel disease genes or causative genetic variants. Improved systems and tools that enhance the collection of phenotype data from clinicians are urgently needed. This meeting begins the HVP's effort toward this important goal.

    Human mutation 2013;34;4;661-6

  • Efficient depletion of host DNA contamination in malaria clinical sequencing.

    Oyola SO, Gu Y, Manske M, Otto TD, O'Brien J, Alcock D, Macinnis B, Berriman M, Newbold CI, Kwiatkowski DP, Swerdlow HP and Quail MA

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom. Samuel.oyola@sanger.ac.uk

    The cost of whole-genome sequencing (WGS) is decreasing rapidly as next-generation sequencing technology continues to advance, and the prospect of making WGS available for public health applications is becoming a reality. So far, a number of studies have demonstrated the use of WGS as an epidemiological tool for typing and controlling outbreaks of microbial pathogens. Success of these applications is hugely dependent on efficient generation of clean genetic material that is free from host DNA contamination for rapid preparation of sequencing libraries. The presence of large amounts of host DNA severely affects the efficiency of characterizing pathogens using WGS and is therefore a serious impediment to clinical and epidemiological sequencing for health care and public health applications. We have developed a simple enzymatic treatment method that takes advantage of the methylation of human DNA to selectively deplete host contamination from clinical samples prior to sequencing. Using malaria clinical samples with over 80% human host DNA contamination, we show that the enzymatic treatment enriches Plasmodium falciparum DNA up to ∼9-fold and generates high-quality, nonbiased sequence reads covering >98% of 86,158 catalogued typeable single-nucleotide polymorphism loci.

    Funded by: Wellcome Trust: 079355/Z/06/Z

    Journal of clinical microbiology 2013;51;3;745-51

  • Tailoring the models of transcription.

    Pance A

    The Wellcome Trust Sanger Institute, Genome Campus Hinxton, Cambridge CB10 1SA, UK. ap9@sanger.ac.uk.

    Molecular biology is a rapidly evolving field that has led to the development of increasingly sophisticated technologies to improve our capacity to study cellular processes in much finer detail. Transcription is the first step in protein expression and the major point of regulation of the components that determine the characteristics, fate and functions of cells. The study of transcriptional regulation has been greatly facilitated by the development of reporter genes and transcription factor expression vectors, which have become versatile tools for manipulating promoters, as well as transcription factors in order to examine their function. The understanding of promoter complexity and transcription factor structure offers an insight into the mechanisms of transcriptional control and their impact on cell behaviour. This review focuses on some of the many applications of molecular cut-and-paste tools for the manipulation of promoters and transcription factors leading to the understanding of crucial aspects of transcriptional regulation.

    International journal of molecular sciences 2013;14;4;7583-97

  • Proteomic and Genetic Analyses Demonstrate that Plasmodium berghei Blood Stages Export a Large and Diverse Repertoire of Proteins.

    Pasini EM, Braks JA, Fonager J, Klop O, Aime E, Spaccapelo R, Otto TD, Berriman M, Hiss JA, Thomas AW, Mann M, Janse CJ, Kocken CH and Franke-Fayard B

    Biomedical Primate Research Centre, 2288 GJ Rijswijk, The Netherlands.

    Malaria parasites actively remodel the infected red blood cell (irbc) by exporting proteins into the host cell cytoplasm. The human parasite Plasmodium falciparum exports particularly large numbers of proteins, including proteins that establish a vesicular network allowing the trafficking of proteins onto the surface of irbcs that are responsible for tissue sequestration. Like P. falciparum, the rodent parasite P. berghei ANKA sequesters via irbc interactions with the host receptor CD36. We have applied proteomic, genomic, and reverse-genetic approaches to identify P. berghei proteins potentially involved in the transport of proteins to the irbc surface. A comparative proteomics analysis of P. berghei non-sequestering and sequestering parasites was used to determine changes in the irbc membrane associated with sequestration. Subsequent tagging experiments identified 13 proteins (Plasmodium export element (PEXEL)-positive as well as PEXEL-negative) that are exported into the irbc cytoplasm and have distinct localization patterns: a dispersed and/or patchy distribution, a punctate vesicle-like pattern in the cytoplasm, or a distinct location at the irbc membrane. Members of the PEXEL-negative BIR and PEXEL-positive Pb-fam-3 show a dispersed localization in the irbc cytoplasm, but not at the irbc surface. Two of the identified exported proteins are transported to the irbc membrane and were named erythrocyte membrane associated proteins. EMAP1 is a member of the PEXEL-negative Pb-fam-1 family, and EMAP2 is a PEXEL-positive protein encoded by a single copy gene; neither protein plays a direct role in sequestration. Our observations clearly indicate that P. berghei traffics a diverse range of proteins to different cellular locations via mechanisms that are analogous to those employed by P. falciparum. This information can be exploited to generate transgenic humanized rodent P. berghei parasites expressing chimeric P. berghei/P. falciparum proteins on the surface of rodent irbc, thereby opening new avenues for in vivo screening adjunct therapies that block sequestration.

    Molecular & cellular proteomics : MCP 2013;12;2;426-48

  • Maps of open chromatin highlight cell type-restricted patterns of regulatory sequence variation at hematological trait loci.

    Paul DS, Albers CA, Rendon A, Voss K, Stephens J, van der Harst P, Chambers JC, Soranzo N, Ouwehand WH and Deloukas P

    Wellcome Trust Sanger Institute.

    Nearly three-quarters of the 143 genetic signals associated with platelet and erythrocyte phenotypes identified by meta-analyses of genome-wide association (GWA) studies are located at non-protein coding regions. Here, we assessed the role of candidate regulatory variants associated with cell type-restricted, closely related hematological quantitative traits in biologically relevant hematopoietic cell types. We used formaldehyde-assisted isolation of regulatory elements followed by next-generation sequencing (FAIRE-seq) to map regions of open chromatin in three primary human blood cells of the myeloid lineage. In the precursors of platelets and erythrocytes, as well as in monocytes, we found that open chromatin signatures reflect the corresponding hematopoietic lineages of the studied cell types, and associate with the cell type-specific gene expression patterns. Dependent on their signal strength, open chromatin regions showed correlation with promoter and enhancer histone marks, distance to the transcription start site and ontology classes of nearby genes. Cell type-restricted regions of open chromatin were enriched in sequence variants associated with hematological indices. The majority (63.6%) of such candidate functional variants at platelet quantitative trait loci (QTLs) coincided with binding sites of five transcription factors key in regulating megakaryopoiesis. We experimentally tested 13 candidate regulatory variants at 10 platelet QTLs, and found that 10 (76.9%) affected protein binding, suggesting that this is a frequent mechanism by which regulatory variants influence quantitative trait levels. Our findings demonstrate that combining large-scale GWA data with open chromatin profiles of relevant cell types can be a powerful means of dissecting the genetic architecture of closely related quantitative traits.

    Genome research 2013

  • Meander: visually exploring the structural variome using space-filling curves.

    Pavlopoulos GA, Kumar P, Sifrim A, Sakai R, Lin ML, Voet T, Moreau Y and Aerts J

    Department of Electrical Engineering (ESAT/SCD), University of Leuven, Kasteelpark Arenberg 10, Box 2446, 3001 Leuven, Belgium, iMinds Future Health Department, University of Leuven, Kasteelpark Arenberg 10, Box 2446, 3001 Leuven, Belgium, Division of Basic Sciences, University of Crete, Medical School, Heraklion, 71110 Crete, Greece, Laboratory of Reproductive Genomics, Department of Human Genetics, University of Leuven, Herestraat 49, 3000 Leuven, Belgium and Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton - Cambridge, CB10 1SA, UK.

    The introduction of next generation sequencing methods in genome studies has made it possible to shift research from a gene-centric approach to a genome wide view. Although methods and tools to detect single nucleotide polymorphisms are becoming more mature, methods to identify and visualize structural variation (SV) are still in their infancy. Most genome browsers can only compare a given sequence to a reference genome, so direct comparison of multiple individuals remains a challenge. Efficient approaches to explore and visualize SVs and to compare two or more individuals directly are therefore desirable. In this article, we present a visualization approach that uses space-filling Hilbert curves to explore SVs based on both read-depth and paired-end information. An interactive open-source Java application, called Meander, implements the proposed methodology, and its functionality is demonstrated using two cases. With Meander, users can explore variations at different levels of resolution and simultaneously compare up to four different individuals against a common reference. The application was developed using Java version 1.6 and Processing.org and can be run on any platform. It can be found at http://homes.esat.kuleuven.be/~bioiuser/meander.

    Nucleic acids research 2013
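
    The space-filling layout mentioned in the Meander abstract above can be illustrated with a minimal sketch. This is not Meander's code (Meander is a Java application); it is the standard iterative mapping from a position along a Hilbert curve to grid coordinates, written in Python. The function names `hilbert_d2xy` and `layout_bins`, and the idea of one read-depth value per genomic bin, are assumptions made for illustration only.

```python
def hilbert_d2xy(order, d):
    """Map distance d along a Hilbert curve to (x, y) on a 2**order x 2**order grid."""
    n = 1 << order
    if not 0 <= d < n * n:
        raise ValueError("d must lie in [0, 4**order)")
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def layout_bins(values, order):
    """Place a 1D list of per-bin values (hypothetical read depths) on a 2D grid.

    Bins that are neighbours along the genome stay neighbours on the grid,
    which is the property that makes the curve useful for visualization.
    Cells beyond the end of `values` are left as None.
    """
    n = 1 << order
    if len(values) > n * n:
        raise ValueError("too many bins for this curve order")
    grid = [[None] * n for _ in range(n)]
    for d, value in enumerate(values):
        x, y = hilbert_d2xy(order, d)
        grid[y][x] = value
    return grid
```

    For example, `layout_bins(depths, 8)` would arrange up to 65,536 bins on a 256 x 256 grid; choosing a higher or lower `order` changes the resolution at which the same sequence is viewed.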

  • Computational proteomics pitfalls and challenges: HavanaBioinfo 2012 Workshop report.

    Perez-Riverol Y, Hermjakob H, Kohlbacher O, Martens L, Creasy D, Cox J, Leprevost F, Shan BP, Pérez-Nueno VI, Blazejczyk M, Punta M, Vierlinger K, Valiente PA, Leon K, Chinea G, Guirola O, Bringas R, Cabrera G, Guillen G, Padron G, Gonzalez LJ and Besada V

    Center for Genetic Engineering and Biotechnology, Ave 31 e/158 y 190, Cubanacán, Playa, Havana, Cuba; European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK. Electronic address: yasset.perez@biocomp.cigb.edu.cu.

    The workshop "Bioinformatics for Biotechnology Applications (HavanaBioinfo 2012)", held December 8-11, 2012 in Havana, aimed at exploring new bioinformatics tools and approaches for large-scale proteomics, genomics and chemoinformatics. Major conclusions of the workshop include the following: (i) development of new applications and bioinformatics tools for proteomic repository analysis is crucial; current proteomic repositories contain enough data (spectra/identifications) that can be used to increase the annotations in protein databases and to generate new tools for protein identification; (ii) spectral libraries, de novo sequencing and database search tools should be combined to increase the number of protein identifications; (iii) protein probabilities and FDR are not yet sufficiently mature; (iv) computational proteomics software needs to become more intuitive; and at the same time appropriate education and training should be provided to help in the efficient exchange of knowledge between mass spectrometrists and experimental biologists and bioinformaticians in order to increase their bioinformatics background, especially statistics knowledge.

    Journal of proteomics 2013

  • Recombination-mediated genetic engineering of Plasmodium berghei DNA.

    Pfander C, Anar B, Brochet M, Rayner JC and Billker O

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.

    DNA of Plasmodium berghei is difficult to manipulate in Escherichia coli by conventional restriction and ligation methods due to its high content of adenine and thymine (AT) nucleotides. This limits our ability to clone large genes and to generate complex vectors for modifying the parasite genome. We here describe a protocol for using lambda Red recombinase to modify inserts of a P. berghei genomic DNA library constructed in a linear, low-copy, phage-derived vector. The method uses primer extensions of 50 bp, which provide sufficient homology for an antibiotic resistance marker to recombine efficiently with a P. berghei genomic DNA insert in E. coli. In a subsequent in vitro Gateway reaction the bacterial marker is replaced with a cassette for selection in P. berghei. The insert is then released and used for transfection. The basic techniques we describe here can be adapted to generate highly efficient vectors for gene deletion, tagging, targeted mutagenesis, or genetic complementation with larger genomic regions.

    Methods in molecular biology (Clifton, N.J.) 2013;923;127-38

  • A genome-wide mutagenesis screen identifies multiple genes contributing to Vi capsular expression in Salmonella Typhi.

    Pickard D, Kingsley RA, Hale C, Turner K, Sivaraman K, Wetter M, Langridge G and Dougan G

    The Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.

    A transposon-based, genome-wide mutagenesis screen exploiting the killing activity of a lytic ViII bacteriophage was used to identify Salmonella Typhi genes that contribute to Vi polysaccharide capsule expression. Genes enriched in the screen included those within the viaB locus (tviABCDE, vexABCDE) as well as oxyR, barA/sirA and yrfF, which have not previously been associated with Vi expression. The role of these genes in Vi expression was confirmed by constructing defined null mutant derivatives of S. Typhi and these were negative for Vi expression as determined by agglutination assays with Vi-specific sera or susceptibility to Vi-targeting bacteriophage. Transcriptome analysis confirmed a reduction in expression from the viaB locus in these S. Typhi mutant derivatives and defined regulatory networks associated with Vi expression.

    Journal of bacteriology 2013

  • A meta-analysis of thyroid-related traits reveals novel loci and gender-specific differences in the regulation of thyroid function.

    Porcu E, Medici M, Pistis G, Volpato CB, Wilson SG, Cappola AR, Bos SD, Deelen J, den Heijer M, Freathy RM, Lahti J, Liu C, Lopez LM, Nolte IM, O'Connell JR, Tanaka T, Trompet S, Arnold A, Bandinelli S, Beekman M, Böhringer S, Brown SJ, Buckley BM, Camaschella C, de Craen AJ, Davies G, de Visser MC, Ford I, Forsen T, Frayling TM, Fugazzola L, Gögele M, Hattersley AT, Hermus AR, Hofman A, Houwing-Duistermaat JJ, Jensen RA, Kajantie E, Kloppenburg M, Lim EM, Masciullo C, Mariotti S, Minelli C, Mitchell BD, Nagaraja R, Netea-Maier RT, Palotie A, Persani L, Piras MG, Psaty BM, Räikkönen K, Richards JB, Rivadeneira F, Sala C, Sabra MM, Sattar N, Shields BM, Soranzo N, Starr JM, Stott DJ, Sweep FC, Usala G, van der Klauw MM, van Heemst D, van Mullem A, Vermeulen SH, Visser WE, Walsh JP, Westendorp RG, Widen E, Zhai G, Cucca F, Deary IJ, Eriksson JG, Ferrucci L, Fox CS, Jukema JW, Kiemeney LA, Pramstaller PP, Schlessinger D, Shuldiner AR, Slagboom EP, Uitterlinden AG, Vaidya B, Visser TJ, Wolffenbuttel BH, Meulenbelt I, Rotter JI, Spector TD, Hicks AA, Toniolo D, Sanna S, Peeters RP and Naitza S

    Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche, c/o Cittadella Universitaria di Monserrato, Monserrato, Cagliari, Italy ; Dipartimento di Scienze Biomediche, Università di Sassari, Sassari, Italy.

    Thyroid hormone is essential for normal metabolism and development, and overt abnormalities in thyroid function lead to common endocrine disorders affecting approximately 10% of individuals over their life span. In addition, even mild alterations in thyroid function are associated with weight changes, atrial fibrillation, osteoporosis, and psychiatric disorders. To identify novel variants underlying thyroid function, we performed a large meta-analysis of genome-wide association studies for serum levels of the highly heritable thyroid function markers TSH and FT4, in up to 26,420 and 17,520 euthyroid subjects, respectively. Here we report 26 independent associations, including several novel loci for TSH (PDE10A, VEGFA, IGFBP5, NFIA, SOX9, PRDM11, FGF7, INSR, ABO, MIR1179, NRG1, MBIP, ITPK1, SASH1, GLIS3) and FT4 (LHX3, FOXE1, AADAT, NETO1/FBXO15, LPCAT2/CAPNS2). Notably, only limited overlap was detected between TSH and FT4 associated signals, in spite of the feedback regulation of their circulating levels by the hypothalamic-pituitary-thyroid axis. Five of the reported loci (PDE8B, PDE10A, MAF/LOC440389, NETO1/FBXO15, and LPCAT2/CAPNS2) show strong gender-specific differences, which offer clues for the known sexual dimorphism in thyroid function and related pathologies. Importantly, the TSH-associated loci contribute not only to variation within the normal range, but also to TSH values outside the reference range, suggesting that they may be involved in thyroid dysfunction. Overall, our findings explain, respectively, 5.64% and 2.30% of total TSH and FT4 trait variance, and they improve the current knowledge of the regulation of hypothalamic-pituitary-thyroid axis function and the consequences of genetic variation for hypo- or hyperthyroidism.

    PLoS genetics 2013;9;2;e1003266

  • Comparative Study of Transcriptome Profiles of Mechanical- and Skin-Transformed Schistosoma mansoni Schistosomula.

    Protasio AV, Dunne DW and Berriman M

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom ; Department of Pathology, University of Cambridge, Cambridge, United Kingdom.

    Schistosome infection begins with the penetration of cercariae through healthy unbroken host skin. This process leads to the transformation of the free-living larvae into obligate parasites called schistosomula. This irreversible transformation, which occurs in as little as two hours, involves casting the cercaria tail and complete remodelling of the surface membrane. At this stage, parasites are vulnerable to host immune attack and oxidative stress. Consequently, the mechanisms by which the parasite recognises and swiftly adapts to the human host are still the subject of many studies, especially in the context of development of intervention strategies against schistosomiasis infection. Because obtaining enough material from in vivo infections is not always feasible for such studies, the transformation process is often mimicked in the laboratory by application of shear pressure to a cercarial sample resulting in mechanically transformed (MT) schistosomula. These parasites share remarkable morphological and biochemical similarity to the naturally transformed counterparts and have been considered a good proxy for parasites undergoing natural infection. Relying on this equivalency, MT schistosomula have been used almost exclusively in high-throughput studies of gene expression, identification of drug targets and identification of effective drugs against schistosomes. However, the transcriptional equivalency between skin-transformed (ST) and MT schistosomula has never been proven. In our approach to compare these two types of schistosomula preparations and to explore differences in gene expression triggered by the presence of a skin barrier, we performed RNA-seq transcriptome profiling of ST and MT schistosomula at 24 hours post transformation. We report that these two very distinct schistosomula preparations differ only in the expression of 38 genes (out of ∼11,000), providing convincing evidence to resolve the skin vs. mechanical long-lasting controversy.

    PLoS neglected tropical diseases 2013;7;3;e2091

  • Targeting MYCN in Neuroblastoma by BET Bromodomain Inhibition.

    Puissant A, Frumm SM, Alexe G, Bassil CF, Qi J, Chanthery YH, Nekritz EA, Zeid R, Gustafson WC, Greninger P, Garnett MJ, McDermott U, Benes CH, Kung AL, Weiss WA, Bradner JE and Stegmaier K

    Departments of [1] Pediatric Oncology and [2] Medical Oncology, Dana-Farber Cancer Institute; [3] Boston Children's Hospital; [4] Department of Medicine, Harvard Medical School; [5] Bioinformatics Graduate Program, Boston University, Boston; [6] The Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge; [7] Massachusetts General Hospital Cancer Center, Harvard Medical School, Charlestown, Massachusetts; [8] Department of Pediatrics, Helen Diller Family Comprehensive Cancer Center; [9] Departments of Neurology and Neurosurgery, Brain Tumor Research Center, University of California, San Francisco, San Francisco, California; and [10] Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom.

    Bromodomain inhibition comprises a promising therapeutic strategy in cancer, particularly for hematologic malignancies. To date, however, genomic biomarkers to direct clinical translation have been lacking. We conducted a cell-based screen of genetically defined cancer cell lines using a prototypical inhibitor of BET bromodomains. Integration of genetic features with chemosensitivity data revealed a robust correlation between MYCN amplification and sensitivity to bromodomain inhibition. We characterized the mechanistic and translational significance of this finding in neuroblastoma, a childhood cancer with frequent amplification of MYCN. Genome-wide expression analysis showed downregulation of the MYCN transcriptional program accompanied by suppression of MYCN transcription. Functionally, bromodomain-mediated inhibition of MYCN impaired growth and induced apoptosis in neuroblastoma. BRD4 knockdown phenocopied these effects, establishing BET bromodomains as transcriptional regulators of MYCN. BET inhibition conferred a significant survival advantage in 3 in vivo neuroblastoma models, providing a compelling rationale for developing BET bromodomain inhibitors in patients with neuroblastoma.

    Funded by: NCI NIH HHS: P01 CA081403, R01 CA102321

    Cancer discovery 2013

  • SpoIVA and SipL Are Clostridium difficile Spore Morphogenetic Proteins.

    Putnam EE, Nock AM, Lawley TD and Shen A

    Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont, USA.

    Clostridium difficile is a major nosocomial pathogen whose infections are difficult to treat because of their frequent recurrence. The spores of C. difficile are responsible for these clinical features, as they resist common disinfectants and antibiotic treatment. Although spores are the major transmissive form of C. difficile, little is known about their composition or morphogenesis. Spore morphogenesis has been well characterized for Bacillus sp., but Bacillus sp. spore coat proteins are poorly conserved in Clostridium sp. Of the known spore morphogenetic proteins in Bacillus subtilis, SpoIVA is one of the most highly conserved in the Bacilli and the Clostridia. Using genetic analyses, we demonstrate that SpoIVA is required for proper spore morphogenesis in C. difficile. In particular, a spoIVA mutant exhibits defects in spore coat localization but not cortex formation. Our study also identifies SipL, a previously uncharacterized protein found in proteomic studies of C. difficile spores, as another critical spore morphogenetic protein, since a sipL mutant phenocopies a spoIVA mutant. Biochemical analyses and mutational analyses indicate that SpoIVA and SipL directly interact. This interaction depends on the Walker A ATP binding motif of SpoIVA and the LysM domain of SipL. Collectively, these results provide the first insights into spore morphogenesis in C. difficile.

    Funded by: NIGMS NIH HHS: R00 GM092934

    Journal of bacteriology 2013;195;6;1214-25

  • Identification and prioritization of novel uncharacterized peptidases for biochemical characterization.

    Rawlings ND

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Genome sequencing projects are generating enormous amounts of biological data that require analysis, which in turn identifies genes and proteins that require characterization. Enzymes that act on proteins are especially difficult to characterize because of the time required to distinguish one from another. This is particularly true of peptidases, the enzymes that activate, inactivate and degrade proteins. This article aims to identify clusters of sequences each of which represents the species variants of a single putative peptidase that is widely distributed and thus merits biochemical characterization. The MEROPS database maintains large collections of sequences, references, substrate cleavage positions and inhibitor interactions of peptidases and their homologues. MEROPS also maintains a hierarchical classification of peptidase homologues, in which sequences are clustered as species variants of a single peptidase; homologous sequences are assembled into a family; and families are clustered into a clan. For each family, an alignment and a phylogenetic tree are generated. By assigning an identifier to a peptidase that has been biochemically characterized from a particular species (called a holotype), the identifier can be automatically extended to sequences from other species that cluster with the holotype. This permits transference of annotation from the holotype to other members of the cluster. By extending this concept to all peptidase homologues (including those of unknown function that have not been characterized) from model organisms representing all the major divisions of cellular life, clusters of sequences representing putative peptidases can also be identified. The 42 most widely distributed of these putative peptidases have been identified and discussed here and are prioritized as ideal candidates for biochemical characterization. Database URL: http://merops.sanger.ac.uk.

    Database : the journal of biological databases and curation 2013;2013;bat022

  • Genes involved in host-parasite interactions can be revealed by their correlated expression.

    Reid AJ and Berriman M

    Parasite genomics group, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK. ar11@sanger.ac.uk

    Molecular interactions between a parasite and its host are key to the ability of the parasite to enter the host and persist. Our understanding of the genes and proteins involved in these interactions is limited. To better understand these processes it would be advantageous to have a range of methods to predict pairs of genes involved in such interactions. Correlated gene expression profiles can be used to identify molecular interactions within a species. Here we have extended the concept to different species, showing that genes with correlated expression are more likely to encode proteins that directly or indirectly participate in host-parasite interaction. We go on to examine our predictions of molecular interactions between the malaria parasite and both its mammalian host and insect vector. Our approach could be applied to study any interaction between species, for example, between a host and its parasites or pathogens, but also symbiotic and commensal pairings.

    Funded by: Wellcome Trust: 098051

    Nucleic acids research 2013;41;3;1508-18
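
    The idea of cross-species correlated expression in the abstract above can be sketched as follows. The abstract does not say which correlation measure or data layout the authors used, so this minimal Python example assumes Pearson correlation over expression values measured in the same samples for host and parasite. The function names `pearson` and `rank_pairs` and the gene labels in the usage note are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length expression profiles."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_pairs(host, parasite):
    """Score every host/parasite gene pair and sort by descending correlation.

    `host` and `parasite` map a gene name to its expression values, with
    index i referring to the same sample (assumed matched) in both species.
    """
    pairs = [
        (h_gene, p_gene, pearson(h_vals, p_vals))
        for h_gene, h_vals in host.items()
        for p_gene, p_vals in parasite.items()
    ]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)
```

    Calling `rank_pairs({"hostA": [1, 2, 3]}, {"parB": [2, 4, 6], "parC": [3, 1, 2]})` would list the `hostA`/`parB` pair first; highly ranked pairs are only candidates for interaction and would still need independent validation.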

  • Secretory meningiomas are defined by combined KLF4 K409Q and TRAF7 mutations.

    Reuss DE, Piro RM, Jones DT, Simon M, Ketter R, Kool M, Becker A, Sahm F, Pusch S, Meyer J, Hagenlocher C, Schweizer L, Capper D, Kickingereder P, Mucha J, Koelsche C, Jäger N, Santarius T, Tarpey PS, Stephens PJ, Futreal PA, Wellenreuther R, Kraus J, Lenartz D, Herold-Mende C, Hartmann C, Mawrin C, Giese N, Eils R, Collins VP, König R, Wiestler OD, Pfister SM and von Deimling A

    Department of Neuropathology, Institute of Pathology, Ruprecht-Karls-University Heidelberg, Im Neuenheimer Feld 224, 69120, Heidelberg, Germany.

    Meningiomas are among the most frequent intracranial tumors. The secretory variant of meningioma is characterized by glandular differentiation, formation of intracellular lumina and pseudopsammoma bodies, expression of a distinct pattern of cytokeratins and clinically by pronounced perifocal brain edema. Here we describe whole-exome sequencing analysis of DNA from 16 secretory meningiomas and corresponding constitutional tissues. All secretory meningiomas invariably harbored a mutation in both KLF4 and TRAF7. Validation in an independent cohort of 14 secretory meningiomas by Sanger sequencing or derived cleaved amplified polymorphic sequence (dCAPS) assay detected the same pattern, with KLF4 mutations observed in a total of 30/30 and TRAF7 mutations in 29/30 of these tumors. All KLF4 mutations were identical, affected codon 409 and resulted in a lysine to glutamine exchange (K409Q). KLF4 mutations were not found in 89 non-secretory meningiomas, 267 other intracranial tumors including gliomas, glioneuronal tumors, pituitary adenomas and metastases, 59 peripheral nerve sheath tumors and 52 pancreatic tumors. TRAF7 mutations were restricted to the WD40 domains. While KLF4 mutations were exclusively seen in secretory meningiomas, TRAF7 mutations were also observed in 7/89 (8 %) of non-secretory meningiomas. KLF4 and TRAF7 mutations were mutually exclusive with NF2 mutations. In conclusion, our findings suggest an essential contribution of combined KLF4 K409Q and TRAF7 mutations in the genesis of secretory meningioma and demonstrate a role for TRAF7 alterations in other non-NF2 meningiomas.

    Acta neuropathologica 2013

  • A pilot study of rapid whole-genome sequencing for the investigation of a Legionella outbreak.

    Reuter S, Harrison TG, Köser CU, Ellington MJ, Smith GP, Parkhill J, Peacock SJ, Bentley SD and Török ME

    The Wellcome Trust Sanger Institute, Hinxton, UK.

    Objectives: Epidemiological investigations of Legionnaires' disease outbreaks rely on the rapid identification and typing of clinical and environmental Legionella isolates in order to identify and control the source of infection. Rapid bacterial whole-genome sequencing (WGS) is an emerging technology that has the potential to rapidly discriminate outbreak from non-outbreak isolates in a clinically relevant time frame. Methods: We performed a pilot study to determine the feasibility of using bacterial WGS to differentiate outbreak from non-outbreak isolates collected during an outbreak of Legionnaires' disease. Seven Legionella isolates (three clinical and four environmental) were obtained from the reference laboratory and sequenced using the Illumina MiSeq platform at Addenbrooke's Hospital, Cambridge. Bioinformatic analysis was performed blinded to the epidemiological data at the Wellcome Trust Sanger Institute. Results: We were able to distinguish outbreak from non-outbreak isolates using bacterial WGS, and to confirm the probable environmental source. Our analysis also highlighted constraints, which were the small number of Legionella pneumophila isolates available for sequencing, and the limited number of published genomes for comparison. Conclusions: We have demonstrated the feasibility of using rapid WGS to investigate an outbreak of Legionnaires' disease. Future work includes building larger genomic databases of L pneumophila from both clinical and environmental sources, developing automated data interpretation software, and conducting a cost-benefit analysis of WGS versus current typing methods.

    BMJ open 2013;3;1

  • Mosaic PPM1D mutations are associated with predisposition to breast and ovarian cancer.

    Ruark E, Snape K, Humburg P, Loveday C, Bajrami I, Brough R, Rodrigues DN, Renwick A, Seal S, Ramsay E, Duarte Sdel V, Rivas MA, Warren-Perry M, Zachariou A, Campion-Flora A, Hanks S, Murray A, Pour NA, Douglas J, Gregory L, Rimmer A, Walker NM, Yang TP, Adlard JW, Barwell J, Berg J, Brady AF, Brewer C, Brice G, Chapman C, Cook J, Davidson R, Donaldson A, Douglas F, Eccles D, Evans DG, Greenhalgh L, Henderson A, Izatt L, Kumar A, Lalloo F, Miedzybrodzka Z, Morrison PJ, Paterson J, Porteous M, Rogers MT, Shanley S, Walker L, Gore M, Houlston R, Brown MA, Caufield MJ, Deloukas P, McCarthy MI, Todd JA, Breast and Ovarian Cancer Susceptibility Collaboration, Wellcome Trust Case Control Consortium, Turnbull C, Reis-Filho JS, Ashworth A, Antoniou AC, Lord CJ, Donnelly P and Rahman N

    Division of Genetics & Epidemiology, The Institute of Cancer Research, Sutton SM2 5NG, UK.

    Improved sequencing technologies offer unprecedented opportunities for investigating the role of rare genetic variation in common disease. However, there are considerable challenges with respect to study design, data analysis and replication. Using pooled next-generation sequencing of 507 genes implicated in the repair of DNA in 1,150 samples, an analytical strategy focused on protein-truncating variants (PTVs) and a large-scale sequencing case-control replication experiment in 13,642 individuals, here we show that rare PTVs in the p53-inducible protein phosphatase PPM1D are associated with predisposition to breast cancer and ovarian cancer. PPM1D PTV mutations were present in 25 out of 7,781 cases versus 1 out of 5,861 controls (P = 1.12 × 10⁻⁵), including 18 mutations in 6,912 individuals with breast cancer (P = 2.42 × 10⁻⁴) and 12 mutations in 1,121 individuals with ovarian cancer (P = 3.10 × 10⁻⁹). Notably, all of the identified PPM1D PTVs were mosaic in lymphocyte DNA and clustered within a 370-base-pair region in the final exon of the gene, carboxy-terminal to the phosphatase catalytic domain. Functional studies demonstrate that the mutations result in enhanced suppression of p53 in response to ionizing radiation exposure, suggesting that the mutant alleles encode hyperactive PPM1D isoforms. Thus, although the mutations cause premature protein truncation, they do not result in the simple loss-of-function effect typically associated with this class of variant, but instead probably have a gain-of-function effect. Our results have implications for the detection and management of breast and ovarian cancer risk. More generally, these data provide new insights into the role of rare and of mosaic genetic variants in common conditions, and the use of sequencing in their identification.

    Funded by: Cancer Research UK: C12292/A11174; Medical Research Council: G0000934, G0900747 91070; Wellcome Trust: 068545/Z/02, 090532/Z/09/Z, 091157

    Nature 2013;493;7432;406-10

  • Characterization and comparative analysis of the complete Haemonchus contortus β-tubulin gene family and implications for benzimidazole resistance in strongylid nematodes.

    Saunders GI, Wasmuth JD, Beech R, Laing R, Hunt M, Naghra H, Cotton JA, Berriman M, Britton C and Gilleard JS

    Institute of Infection, Immunity and Inflammation, College of Medical, Veterinary and Life Sciences, University of Glasgow, 464 Bearsden Road, Glasgow, Scotland G61 1QH, UK; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Parasitic nematode β-tubulin genes are of particular interest because they are the targets of benzimidazole drugs. However, in spite of this, the full β-tubulin gene family has not been characterized for any parasitic nematode to date. Haemonchus contortus is the parasite species for which we understand benzimidazole resistance the best and its close phylogenetic relationship with Caenorhabditis elegans potentially allows inferences of gene function by comparative analysis. Consequently, we have characterized the full β-tubulin gene family in H. contortus. Further to the previously identified Hco-tbb-iso-1 and Hco-tbb-iso-2 genes, we have characterized two additional family members designated Hco-tbb-iso-3 and Hco-tbb-iso-4. We show that Hco-tbb-iso-1 is not a one-to-one orthologue with Cel-ben-1, the only β-tubulin gene in C. elegans that is a benzimidazole drug target. Instead, both Hco-tbb-iso-1 and Hco-tbb-iso-2 have a complex evolutionary relationship with three C. elegans β-tubulin genes: Cel-ben-1, Cel-tbb-1 and Cel-tbb-2. Furthermore, we show that both Hco-tbb-iso-1 and Hco-tbb-iso-2 are highly expressed in adult worms; in contrast, Hco-tbb-iso-3 and Hco-tbb-iso-4 are expressed only at very low levels and are orthologous to the Cel-mec-7 and Cel-tbb-4 genes, respectively, suggesting that they have specialized functional roles. Indeed, we have found that the expression pattern of Hco-tbb-iso-3 in H. contortus is identical to that of Cel-mec-7 in C. elegans, being expressed in just six "touch receptor" mechano-sensory neurons. These results suggest that further investigation is warranted into the potential involvement of strongylid isotype-2 β-tubulin genes in mechanisms of benzimidazole resistance.

    International journal for parasitology 2013;43;6;465-75

  • Exome sequencing identifies DYNC2H1 mutations as a common cause of asphyxiating thoracic dystrophy (Jeune syndrome) without major polydactyly, renal or retinal involvement.

    Schmidts M, Arts HH, Bongers EM, Yap Z, Oud MM, Antony D, Duijkers L, Emes RD, Stalker J, Yntema JB, Plagnol V, Hoischen A, Gilissen C, Forsythe E, Lausch E, Veltman JA, Roeleveld N, Superti-Furga A, Kutkowska-Kazmierczak A, Kamsteeg EJ, Elçioglu N, van Maarle MC, Graul-Neumann LM, Devriendt K, Smithson SF, Wellesley D, Verbeek NE, Hennekam RC, Kayserili H, Scambler PJ, Beales PL, UK10K, Knoers NV, Roepman R and Mitchison HM

    Molecular Medicine Unit, Birth Defects Research Centre, University College London (UCL) Institute of Child Health, London, UK.

    BACKGROUND: Jeune asphyxiating thoracic dystrophy (JATD) is a rare, often lethal, recessively inherited chondrodysplasia characterised by shortened ribs and long bones, sometimes accompanied by polydactyly, and renal, liver and retinal disease. Mutations in intraflagellar transport (IFT) genes cause JATD, including the IFT dynein-2 motor subunit gene DYNC2H1. Genetic heterogeneity and the large DYNC2H1 gene size have hindered JATD genetic diagnosis. AIMS AND METHODS: To determine its contribution to JATD, we screened DYNC2H1 in 71 JATD patients, combining SNP mapping, Sanger sequencing and exome sequencing. RESULTS AND CONCLUSIONS: We detected 34 DYNC2H1 mutations in 29/71 (41%) patients from 19/57 families (33%), showing it as a major cause of JATD especially in Northern European patients. This included 13 early protein termination mutations (nonsense/frameshift, deletion, splice site) but no patients carried these in combination, suggesting the human phenotype is at least partly hypomorphic. In addition, 21 missense mutations were distributed across DYNC2H1 and these showed some clustering to functional domains, especially the ATP motor domain. DYNC2H1 patients largely lacked significant extra-skeletal involvement, demonstrating an important genotype-phenotype correlation in JATD. Significant variability exists in the course and severity of the thoracic phenotype, both between affected siblings with identical DYNC2H1 alleles and among individuals with different alleles, which suggests the DYNC2H1 phenotype might be subject to modifier alleles, non-genetic or epigenetic factors. Assessment of fibroblasts from patients showed accumulation of anterograde IFT proteins in the ciliary tips, confirming defects similar to patients with other retrograde IFT machinery mutations, which may be of undervalued potential for diagnostic purposes.

    Journal of medical genetics 2013

  • Mechanisms controlling the temporal degradation of Nek2A and Kif18A by the APC/C-Cdc20 complex.

    Sedgwick GG, Hayward DG, Di Fiore B, Pardo M, Yu L, Pines J and Nilsson J

    The Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark.

    The Anaphase Promoting Complex/Cyclosome (APC/C) in complex with its co-activator Cdc20 is responsible for targeting proteins for ubiquitin-mediated degradation during mitosis. The activity of APC/C-Cdc20 is inhibited during prometaphase by the Spindle Assembly Checkpoint (SAC), yet certain substrates escape this inhibition. Nek2A degradation during prometaphase depends on direct binding of Nek2A to the APC/C via a C-terminal MR dipeptide, but whether this motif alone is sufficient is not clear. Here, we identify Kif18A as a novel APC/C-Cdc20 substrate and show that Kif18A degradation depends on a C-terminal LR motif. However, in contrast to Nek2A, Kif18A is not degraded until anaphase, showing that additional mechanisms contribute to Nek2A degradation. We find that dimerization via the leucine zipper, in combination with the MR motif, is required for stable Nek2A binding to and ubiquitination by the APC/C. Nek2A and the mitotic checkpoint complex (MCC) have an overlap in APC/C subunit requirements for binding and we propose that Nek2A binds with high affinity to apo-APC/C and is degraded by the pool of Cdc20 that avoids inhibition by the SAC.

    Funded by: Wellcome Trust: 079643/Z/06/Z

    The EMBO journal 2013;32;2;303-14

  • Conceptual links between DNA methylation reprogramming in the early embryo and primordial germ cells.

    Seisenberger S, Peat JR and Reik W

    Epigenetics Programme, The Babraham Institute, Cambridge CB22 3AT, UK. Electronic address: stefanie.seisenberger@babraham.ac.uk.

    DNA methylation is a carrier of important regulatory information that undergoes global reprogramming in the mammalian germ line, including pre-implantation embryos and primordial germ cells (PGCs). A flurry of recent studies have employed technical advances to generate global profiles of methylation and hydroxymethylation in these cells, unravelling the dynamics of methylation erasure at single locus resolution. Active demethylation in the zygote, involving extensive oxidation, is followed by passive loss over early cell divisions. Certain gamete-contributed methylation marks appear to have evolved non-canonical mechanisms for targeted maintenance of methylation in the face of these processes. These protected sequences include the imprinting control regions (ICRs) required for parental imprinting but also a surprising number of other regions. Such targeted maintenance mechanisms may also operate at certain sequences during early PGC migration when global passive demethylation occurs. In later gonadal PGCs, imprints must be reset and this may be achieved through the targeting of active mechanisms including oxidation. Thus, emerging evidence paints a complex picture whereby active and passive demethylation pathways operate synergistically and in parallel to ensure robust erasure in the early embryo and PGCs.

    Current opinion in cell biology 2013

  • Reprogramming DNA methylation in the mammalian life cycle: building and breaking epigenetic barriers.

    Seisenberger S, Peat JR, Hore TA, Santos F, Dean W and Reik W

    Epigenetics Programme, The Babraham Institute, Cambridge CB22 3AT, UK.

    In mammalian development, epigenetic modifications, including DNA methylation patterns, play a crucial role in defining cell fate but also represent epigenetic barriers that restrict developmental potential. At two points in the life cycle, DNA methylation marks are reprogrammed on a global scale, concomitant with restoration of developmental potency. DNA methylation patterns are subsequently re-established with the commitment towards a distinct cell fate. This reprogramming of DNA methylation takes place firstly on fertilization in the zygote, and secondly in primordial germ cells (PGCs), which are the direct progenitors of sperm or oocyte. In each reprogramming window, a unique set of mechanisms regulates DNA methylation erasure and re-establishment. Recent advances have uncovered roles for the TET3 hydroxylase and passive demethylation, together with base excision repair (BER) and the elongator complex, in methylation erasure from the zygote. Deamination by AID, BER and passive demethylation have been implicated in reprogramming in PGCs, but the process in its entirety is still poorly understood. In this review, we discuss the dynamics of DNA methylation reprogramming in PGCs and the zygote, the mechanisms involved and the biological significance of these events. Advances in our understanding of such natural epigenetic reprogramming are beginning to aid enhancement of experimental reprogramming in which the role of potential mechanisms can be investigated in vitro. Conversely, insights into in vitro reprogramming techniques may aid our understanding of epigenetic reprogramming in the germline and supply important clues in reprogramming for therapies in regenerative medicine.

    Philosophical transactions of the Royal Society of London. Series B, Biological sciences 2013;368;1609;20110330

  • Playing the 'next-generation game'.

    Seth-Smith H

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. microbes@sanger.ac.uk.

    Advances in single-molecule DNA sequencing are enabling research into the fine resolution of DNA structure, and rapid, direct sequencing of pathogen genomes.

    Nature reviews. Microbiology 2013;11;2;74

  • Whole-genome sequences of Chlamydia trachomatis directly from clinical samples without culture.

    Seth-Smith HM, Harris SR, Skilton RJ, Radebe FM, Golparian D, Shipitsyna E, Duy PT, Scott P, Cutcliffe LT, O'Neill C, Parmar S, Pitt R, Baker S, Ison CA, Marsh P, Jalal H, Lewis DA, Unemo M, Clarke IN, Parkhill J and Thomson NR

    Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, United Kingdom.

    The use of whole-genome sequencing as a tool for the study of infectious bacteria is of growing clinical interest. Chlamydia trachomatis is responsible for sexually transmitted infections and the blinding disease trachoma, which affect hundreds of millions of people worldwide. Recombination is widespread within the genome of C. trachomatis, thus whole-genome sequencing is necessary to understand the evolution, diversity, and epidemiology of this pathogen. Culture of C. trachomatis has, until now, been a prerequisite to obtain DNA for whole-genome sequencing; however, as C. trachomatis is an obligate intracellular pathogen, this procedure is technically demanding and time consuming. Discarded clinical samples represent a large resource for sequencing the genomes of pathogens, yet clinical swabs frequently contain very low levels of C. trachomatis DNA and large amounts of contaminating microbial and human DNA. To determine whether it is possible to obtain whole-genome sequences from bacteria without the need for culture, we have devised an approach that combines immunomagnetic separation (IMS) for targeted bacterial enrichment with multiple displacement amplification (MDA) for whole-genome amplification. Using IMS-MDA in conjunction with high-throughput multiplexed Illumina sequencing, we have produced the first whole bacterial genome sequences direct from clinical samples. We also show that this method can be used to generate genome data from nonviable archived samples. This method will prove a useful tool in answering questions relating to the biology of many difficult-to-culture or fastidious bacteria of clinical concern.

    Genome research 2013

  • Genome Sequence of Chlamydia psittaci Strain 01DC12 Originating from Swine.

    Seth-Smith HM, Sait M, Sachse K, Gaede W, Longbottom D and Thomson NR

    Pathogen Genomics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, United Kingdom.

    Chlamydia psittaci is the etiological agent of psittacosis and is a zoonotic pathogen infecting birds and a variety of mammalian hosts. Here we report the genome sequence of the porcine strain 01DC12 which is representative of a novel clade of C. psittaci belonging to ompA genotype E.

    Genome announcements 2013;1;1

  • Stress-induced lipocalin-2 controls dendritic spine formation and neuronal activity in the amygdala.

    Skrzypiec AE, Shah RS, Schiavon E, Baker E, Skene N, Pawlak R and Mucha M

    University of Exeter Medical School, Exeter, United Kingdom.

    Behavioural adaptation to psychological stress is dependent on neuronal plasticity and dysfunction at this cellular level may underlie the pathogenesis of affective disorders such as depression and post-traumatic stress disorder. Taking advantage of genome-wide microarray assay, we performed detailed studies of stress-affected transcripts in the amygdala - an area which forms part of the innate fear circuit in mammals. Having previously demonstrated the role of lipocalin-2 (Lcn-2) in promoting stress-induced changes in dendritic spine morphology/function and neuronal excitability in the mouse hippocampus, we show here that the Lcn-2 gene is one of the most highly upregulated transcripts detected by microarray analysis in the amygdala after acute restraint-induced psychological stress. This is associated with increased Lcn-2 protein synthesis, which is found on immunohistochemistry to be predominantly localised to neurons. Stress-naïve Lcn-2(-/-) mice show a higher spine density in the basolateral amygdala and a 2-fold higher neuronal firing rate compared to wild-type mice. Unlike their wild-type counterparts, Lcn-2(-/-) mice did not show an increase in dendritic spine density in response to stress but did show a distinct pattern of spine morphology. Thus, amygdala-specific neuronal responses to Lcn-2 may represent a mechanism for behavioural adaptation to psychological stress.

    PloS one 2013;8;4;e61046

  • Sequencing of the sea lamprey (Petromyzon marinus) genome provides insights into vertebrate evolution.

    Smith JJ, Kuraku S, Holt C, Sauka-Spengler T, Jiang N, Campbell MS, Yandell MD, Manousaki T, Meyer A, Bloom OE, Morgan JR, Buxbaum JD, Sachidanandam R, Sims C, Garruss AS, Cook M, Krumlauf R, Wiedemann LM, Sower SA, Decatur WA, Hall JA, Amemiya CT, Saha NR, Buckley KM, Rast JP, Das S, Hirano M, McCurley N, Guo P, Rohner N, Tabin CJ, Piccinelli P, Elgar G, Ruffier M, Aken BL, Searle SM, Muffato M, Pignatelli M, Herrero J, Jones M, Brown CT, Chung-Davidson YW, Nanlohy KG, Libants SV, Yeh CY, McCauley DW, Langeland JA, Pancer Z, Fritzsch B, de Jong PJ, Zhu B, Fulton LL, Theising B, Flicek P, Bronner ME, Warren WC, Clifton SW, Wilson RK and Li W

    [1] Department of Biology, University of Kentucky, Lexington, Kentucky, USA. [2] Benaroya Research Institute at Virginia Mason, Seattle, Washington, USA.

    Lampreys are representatives of an ancient vertebrate lineage that diverged from our own ∼500 million years ago. By virtue of this deeply shared ancestry, the sea lamprey (P. marinus) genome is uniquely poised to provide insight into the ancestry of vertebrate genomes and the underlying principles of vertebrate biology. Here, we present the first lamprey whole-genome sequence and assembly. We note challenges faced owing to its high content of repetitive elements and GC bases, as well as the absence of broad-scale sequence information from closely related species. Analyses of the assembly indicate that two whole-genome duplications likely occurred before the divergence of ancestral lamprey and gnathostome lineages. Moreover, the results help define key evolutionary events within vertebrate lineages, including the origin of myelin-associated proteins and the development of appendages. The lamprey genome provides an important resource for reconstructing vertebrate origins and the evolutionary events that have shaped the genomes of extant organisms.

    Nature genetics 2013

  • Sherlock Genomes - viral investigator.

    Smith SE and Wash RS

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. microbes@sanger.ac.uk.

    This month's Genome Watch highlights how deep sequencing technologies have vastly reduced the time and prior knowledge needed to generate viral genomes.

    Nature reviews. Microbiology 2013;11;3;150

  • Genetic variants from lipid-related pathways and risk for incident myocardial infarction.

    Song C, Pedersen NL, Reynolds CA, Sabater-Lleal M, Kanoni S, Willenborg C, CARDIoGRAMplusC4D Consortium, Syvänen AC, Watkins H, Hamsten A, Prince JA and Ingelsson E

    Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.

    Background: Circulating lipid levels, as well as several familial lipid metabolism disorders, are strongly associated with initiation and progression of atherosclerosis and incidence of myocardial infarction (MI). Objectives: We hypothesized that genetic variants associated with circulating lipid levels would also be associated with MI incidence, and have tested this in three independent samples. Using age- and sex-adjusted additive genetic models, we analyzed 554 single nucleotide polymorphisms (SNPs) in 41 candidate gene regions proposed to be involved in lipid-related pathways potentially predisposing to incidence of MI in 2,602 participants of the Swedish Twin Register (STR; 57% women). All associations with nominal P<0.01 were further investigated in the Uppsala Longitudinal Study of Adult Men (ULSAM; N = 1,142). Results: In the present study, we report associations of lipid-related SNPs with incident MI in two community-based longitudinal studies with in silico replication in a meta-analysis of genome-wide association studies. Overall, there were 9 SNPs in STR with nominal P-value <0.01 that were successfully genotyped in ULSAM. rs4149313 located in ABCA1 was associated with MI incidence in both longitudinal study samples with nominal significance (hazard ratio, 1.36 and 1.40; P-value, 0.004 and 0.015 in STR and ULSAM, respectively). In silico replication supported the association of rs4149313 with coronary artery disease in an independent meta-analysis including 173,975 individuals of European descent from the CARDIoGRAMplusC4D consortium (odds ratio, 1.03; P-value, 0.048). Conclusions: rs4149313 is one of the few amino acid changing variants in ABCA1 known to associate with reduced cholesterol efflux. Our results are suggestive of a weak association between this variant and the development of atherosclerosis and MI.

    PloS one 2013;8;3;e60454

  • The intermediate filament protein, vimentin, is a regulator of NOD2 activity.

    Stevens C, Henderson P, Nimmo ER, Soares DC, Dogan B, Simpson KW, Barrett JC, International Inflammatory Bowel Disease Genetics Consortium, Wilson DC and Satsangi J

    Centre for Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, UK; craig.stevens@ed.ac.uk.

    Objective: Mutations in the nucleotide-binding oligomerisation domain-containing protein 2 (NOD2) gene remain the strongest genetic determinants for Crohn's disease (CD). Having previously identified vimentin as a novel NOD2-interacting protein, the authors aimed to investigate the regulatory effects of vimentin on NOD2 function and the association of variants in Vim with CD susceptibility.

    Design: Coimmunoprecipitation, fluorescent microscopy and fractionation were used to confirm the interaction between NOD2 and vimentin. HEK293 cells stably expressing wild-type NOD2 or a NOD2 frameshift variant (L1007fs) and SW480 colonic epithelial cells were used alongside the vimentin inhibitor, withaferin A (WFA), to assess effects on NOD2 function using the nuclear factor-kappaB (NF-κB) reporter gene, green fluorescent protein-LC3-based autophagy, and bacterial gentamicin protection assays. International genome-wide association meta-analysis data were used to test for associations of single-nucleotide polymorphisms in Vim with CD susceptibility.

    Results: The leucine-rich repeat domain of NOD2 contained the elements required for vimentin binding; CD-associated polymorphisms disrupted this interaction. NOD2 and vimentin colocalised at the cell plasma membrane, and cytosolic mislocalisation of the L1007fs and R702W variants correlated with an inability to interact with vimentin. Use of WFA demonstrated that vimentin was required for NOD2-dependent NF-κB activation and muramyl dipeptide-induced autophagy induction, and that NOD2 and vimentin regulated the invasion and survival properties of a CD-associated adherent-invasive Escherichia coli strain. Genetic analysis revealed an association signal across the haplotype block containing Vim.

    Conclusion: Vimentin is an important regulator of NOD2 function and a potential novel therapeutic target in the treatment of CD. In addition, Vim is a candidate susceptibility gene for CD, supporting the functional data.

    Gut 2013;62;5;695-707

  • Mutations in B3GALNT2 Cause Congenital Muscular Dystrophy and Hypoglycosylation of α-Dystroglycan.

    Stevens E, Carss KJ, Cirak S, Foley AR, Torelli S, Willer T, Tambunan DE, Yau S, Brodd L, Sewry CA, Feng L, Haliloglu G, Orhan D, Dobyns WB, Enns GM, Manning M, Krause A, Salih MA, Walsh CA, Hurles M, Campbell KP, Manzini MC, UK10K Consortium, Stemple D, Lin YY and Muntoni F

    Dubowitz Neuromuscular Centre, UCL Institute of Child Health, London WC1N 1EH, UK.

    Mutations in several known or putative glycosyltransferases cause glycosylation defects in α-dystroglycan (α-DG), an integral component of the dystrophin glycoprotein complex. The hypoglycosylation reduces the ability of α-DG to bind laminin and other extracellular matrix ligands and is responsible for the pathogenesis of an inherited subset of muscular dystrophies known as the dystroglycanopathies. By exome and Sanger sequencing we identified two individuals affected by a dystroglycanopathy with mutations in β-1,3-N-acetylgalactosaminyltransferase 2 (B3GALNT2). B3GALNT2 transfers N-acetyl galactosamine (GalNAc) in a β-1,3 linkage to N-acetyl glucosamine (GlcNAc). A subsequent study of a separate cohort of individuals identified recessive mutations in four additional cases that were all affected by dystroglycanopathy with structural brain involvement. We show that functional dystroglycan glycosylation was reduced in the fibroblasts and muscle (when available) of these individuals via flow cytometry, immunoblotting, and immunocytochemistry. B3GALNT2 localized to the endoplasmic reticulum, and this localization was perturbed by some of the missense mutations identified. Moreover, knockdown of b3galnt2 in zebrafish recapitulated the human congenital muscular dystrophy phenotype with reduced motility, brain abnormalities, and disordered muscle fibers with evidence of damage to both the myosepta and the sarcolemma. Functional dystroglycan glycosylation was also reduced in the b3galnt2 knockdown zebrafish embryos. Together these results demonstrate a role for B3GALNT2 in the glycosylation of α-DG and show that B3GALNT2 mutations can cause dystroglycanopathy with muscle and brain involvement.

    American journal of human genetics 2013

  • Harnessing the genome: development of a hierarchical typing scheme for meticillin-resistant Staphylococcus aureus.

    Stone MJ, Wain J, Ivens A, Feltwell T, Kearns AM and Bamford KB

    Department of Microbiology, Imperial College Healthcare NHS Trust, London, UK.

    A major barrier to using genome sequencing in medical microbiology is the ability to interpret the data. New schemes that provide information about the importance of sequence variation in both clinical and public health settings are required. Meticillin-resistant Staphylococcus aureus (MRSA) is an important nosocomial pathogen that is being observed with increasing frequency in community settings. Better tools are needed to improve our understanding of its transmissibility and micro-epidemiology in order to develop effective interventions. Using DNA microarray technology we identified a set of 20 binary targets whose presence or absence could be determined by PCR, producing a PCR binary typing scheme (PCR-BT). This was combined with multi-locus sequence type-based, sequence nucleotide polymorphism typing to form a hierarchical typing scheme. When applied to a set of epidemiologically unrelated isolates, a high degree of concordance was observed with PFGE (98.8 %). The scheme was able to detect the presence or absence of an outbreak strain in eight out of nine outbreak investigations, demonstrating epidemiological concordance. PCR-BT was better than PFGE at distinguishing between outbreak strains, particularly where epidemic MRSA-15 was involved. The method developed here is a rapid, digital typing scheme for S. aureus for use in both micro- and macro-epidemiological investigations that has the advantage of being suitable for use in routine diagnostic laboratories. The targets are defined and therefore the types can be defined by any platform capable of detecting the sequences used, including whole genome sequencing.

    Journal of medical microbiology 2013;62;Pt 1;36-45
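    The "digital" nature of the PCR binary typing scheme described in the entry above can be illustrated with a minimal sketch: each isolate is reduced to a fixed-order string of presence/absence calls, and two isolates are compared by counting differing targets. The target names, the example calls, and the use of Hamming distance are assumptions made for illustration; the abstract does not name the 20 targets or state how profiles were compared.

```python
from typing import Mapping, Sequence


def binary_type(presence: Mapping[str, bool], targets: Sequence[str]) -> str:
    """Encode presence/absence calls as a bit string in a fixed target order."""
    return "".join("1" if presence.get(t, False) else "0" for t in targets)


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length binary types differ."""
    if len(a) != len(b):
        raise ValueError("binary types must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical target names; the real scheme uses 20 defined PCR targets.
TARGETS = [f"target_{i:02d}" for i in range(1, 21)]

# Hypothetical calls: isolate_b differs from isolate_a at a single target.
isolate_a = {t: i % 2 == 0 for i, t in enumerate(TARGETS)}
isolate_b = dict(isolate_a, target_02=True)

code_a = binary_type(isolate_a, TARGETS)
code_b = binary_type(isolate_b, TARGETS)

print(code_a, code_b, hamming(code_a, code_b))
print("possible profiles with 20 binary targets:", 2 ** len(TARGETS))
```

    Because the profile is just an ordered list of defined targets, the same string could in principle be produced from PCR results or from whole genome sequence data, which is the portability the abstract points to.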

  • Journeys into the genome of cancer cells.

    Stratton MR

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK. mrs@sanger.ac.uk.

    EMBO molecular medicine 2013

  • Detecting low-affinity extracellular protein interactions using protein microarrays.

    Sun Y and Wright GJ

    Cell Surface Signalling Laboratory, Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom.

    Low-affinity extracellular protein interactions are critical for cellular recognition processes, but are not generally detected by methods that can be applied in a high-throughput manner. This unit describes a protein microarray platform that significantly improves the throughput of assays capable of detecting transient extracellular protein interactions. These methodological improvements now permit screening for novel extracellular receptor-ligand interactions on a genome-wide scale. Curr. Protoc. Protein Sci. 72:27.5.1-27.5.15. © 2013 by John Wiley & Sons, Inc.

    Current protocols in protein science / editorial board, John E. Coligan ... [et al.] 2013;Chapter 27;Unit27.5

  • Rapid whole-genome sequencing for investigation of a suspected tuberculosis outbreak.

    Török ME, Reuter S, Bryant J, Köser CU, Stinchcombe SV, Nazareth B, Ellington MJ, Bentley SD, Smith GP, Parkhill J and Peacock SJ

    Department of Medicine, University of Cambridge, Cambridge, United Kingdom.

    Two Southeast Asian students attending the same school in the United Kingdom presented with pulmonary tuberculosis. An epidemiological investigation failed to link the two cases, and drug resistance profiles of the Mycobacterium tuberculosis isolates were discrepant. Whole-genome sequencing of the isolates found them to be genetically identical, suggesting a missed transmission event.

    Journal of clinical microbiology 2013;51;2;611-4

  • The genomes of four tapeworm species reveal adaptations to parasitism.

    Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, Aslett M, Beasley H, Bennett HM, Cai J, Camicia F, Clark R, Cucher M, De Silva N, Day TA, Deplazes P, Estrada K, Fernández C, Holland PW, Hou J, Hu S, Huckvale T, Hung SS, Kamenetzky L, Keane JA, Kiss F, Koziol U, Lambert O, Liu K, Luo X, Luo Y, Macchiaroli N, Nichol S, Paps J, Parkinson J, Pouchkina-Stantcheva N, Riddiford N, Rosenzvit M, Salinas G, Wasmuth JD, Zamanian M, Zheng Y, Taenia solium Genome Consortium, Cai X, Soberón X, Olson PD, Laclette JP, Brehm K and Berriman M

    Parasite Genomics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Tapeworms (Cestoda) cause neglected diseases that can be fatal and are difficult to treat, owing to inefficient drugs. Here we present an analysis of tapeworm genome sequences using the human-infective species Echinococcus multilocularis, E. granulosus, Taenia solium and the laboratory model Hymenolepis microstoma as examples. The 115- to 141-megabase genomes offer insights into the evolution of parasitism. Synteny is maintained with distantly related blood flukes but we find extreme losses of genes and pathways that are ubiquitous in other animals, including 34 homeobox families and several determinants of stem cell fate. Tapeworms have specialized detoxification pathways, metabolism that is finely tuned to rely on nutrients scavenged from their hosts, and species-specific expansions of non-canonical heat shock proteins and families of known antigens. We identify new potential drug targets, including some on which existing pharmaceuticals may act. The genomes provide a rich resource to underpin the development of urgently needed treatments and control.

    Funded by: Biotechnology and Biological Sciences Research Council: BBG0038151; Canadian Institutes of Health Research: MOP#84556; FIC NIH HHS: TW008588; Wellcome Trust: 098051

    Nature 2013;496;7443;57-63

  • Public Health Value of Next-Generation DNA Sequencing of Enterohemorrhagic Escherichia coli Isolates from an Outbreak.

    Underwood AP, Dallman T, Thomson NR, Williams M, Harker K, Perry N, Adak B, Willshaw G, Cheasty T, Green J, Dougan G, Parkhill J and Wain J

    Health Protection Agency, London, United Kingdom.

    In 2009, an outbreak of enterohemorrhagic Escherichia coli (EHEC) on an open farm infected 93 persons, and approximately 22% of these individuals developed hemolytic-uremic syndrome (HUS). Genome sequencing was used to investigate outbreak-derived animal and human EHEC isolates. Phylogeny based on the whole-genome sequence was used to place outbreak isolates in the context of the overall E. coli species and the O157:H7 sequence type 11 (ST11) subgroup. Four informative single nucleotide polymorphisms (SNPs) were identified and used to design an assay to type 122 other outbreak isolates. The SNP phylogeny demonstrated that the outbreak strain was from a lineage distinct from previously reported O157:H7 ST11 EHEC and was not a member of the hypervirulent clade 8. The strain harbored determinants for two Stx2 verotoxins and other putative virulence factors. When linked to the epidemiological information, the sequence data indicate that gross contamination of a single outbreak strain occurred across the farm prior to the first clinical report of HUS. The most likely explanation for these results is that a single successful strain of EHEC spread from a single introduction through the farm by clonal expansion and that contamination of the environment (including the possible colonization of several animals) led ultimately to human cases.

    Journal of clinical microbiology 2013;51;1;232-7

  • Histone methyltransferase MLL3 contributes to genome-scale circadian transcription.

    Valekunja UK, Edgar RS, Oklejewicz M, van der Horst GT, O'Neill JS, Tamanini F, Turner DJ and Reddy AB

    Department of Clinical Neurosciences, University of Cambridge Metabolic Research Laboratories, Institute of Metabolic Science, National Institute for Health Research Cambridge Biomedical Research Centre, Addenbrooke's Hospital, University of Cambridge, Cambridge CB2 0QQ, United Kingdom.

    Daily cyclical expression of thousands of genes in tissues such as the liver is orchestrated by the molecular circadian clock, the disruption of which is implicated in metabolic disorders and cancer. Although we understand much about the circadian transcription factors that can switch gene expression on and off, it is still unclear how global changes in rhythmic transcription are controlled at the genomic level. Here, we demonstrate circadian modification of an activating histone mark at a significant proportion of gene loci that undergo daily transcription, implicating widespread epigenetic modification as a key node regulated by the clockwork. Furthermore, we identify the histone-remodelling enzyme mixed lineage leukemia (MLL)3 as a clock-controlled factor that is able to directly and indirectly modulate over a hundred epigenetically targeted circadian "output" genes in the liver. Importantly, catalytic inactivation of the histone methyltransferase activity of MLL3 also severely compromises the oscillation of "core" clock gene promoters, including Bmal1, mCry1, mPer2, and Rev-erbα, suggesting that rhythmic histone methylation is vital for robust transcriptional oscillator function. This highlights a pathway by which the clockwork exerts genome-wide control over transcription, which is critical for sustaining temporal programming of tissue physiology.

    Proceedings of the National Academy of Sciences of the United States of America 2013;110;4;1554-9

  • The miRNA Profile of Human Pancreatic Islets and Beta-Cells and Relationship to Type 2 Diabetes Pathogenesis.

    van de Bunt M, Gaulton KJ, Parts L, Moran I, Johnson PR, Lindgren CM, Ferrer J, Gloyn AL and McCarthy MI

    Oxford Centre for Diabetes, Endocrinology & Metabolism, University of Oxford, Oxford, United Kingdom ; Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom.

    Recent advances in the understanding of the genetics of type 2 diabetes (T2D) susceptibility have focused attention on the regulation of transcriptional activity within the pancreatic beta-cell. MicroRNAs (miRNAs) represent an important component of regulatory control, and have proven roles in the development of human disease and control of glucose homeostasis. We set out to establish the miRNA profile of human pancreatic islets and of enriched beta-cell populations, and to explore their potential involvement in T2D susceptibility. We used Illumina small RNA sequencing to profile the miRNA fraction in three preparations each of primary human islets and of enriched beta-cells generated by fluorescence-activated cell sorting. In total, 366 miRNAs were found to be expressed (i.e. >100 cumulative reads) in islets and 346 in beta-cells; of the total of 384 unique miRNAs, 328 were shared. A comparison of the islet-cell miRNA profile with those of 15 other human tissues identified 40 miRNAs predominantly expressed (i.e. >50% of all reads seen across the tissues) in islets. Several highly-expressed islet miRNAs, such as miR-375, have established roles in the regulation of islet function, but others (e.g. miR-27b-3p, miR-192-5p) have not previously been described in the context of islet biology. As a first step towards exploring the role of islet-expressed miRNAs and their predicted mRNA targets in T2D pathogenesis, we looked at published T2D association signals across these sites. We found evidence that predicted mRNA targets of islet-expressed miRNAs were globally enriched for signals of T2D association (p-values <0.01, q-values <0.1). At six loci with genome-wide evidence for T2D association (AP3S2, KCNK16, NOTCH2, SLC30A8, VPS26A, and WFS1) predicted mRNA target sites for islet-expressed miRNAs overlapped potentially causal variants. In conclusion, we have described the miRNA profile of human islets and beta-cells and provide evidence linking islet miRNAs to T2D pathogenesis.

    PloS one 2013;8;1;e55272
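    The counts quoted in the entry above (366 miRNAs in islets, 346 in beta-cells, 328 shared, 384 unique) are consistent with simple inclusion-exclusion, and the ">100 cumulative reads" rule is a plain threshold filter. The sketch below shows both. The read counts and the placeholder names `miR-x` and `miR-y` are hypothetical; only the four totals and the threshold come from the abstract.

```python
def expressed(read_counts, min_reads=100):
    """Return the miRNAs whose cumulative read count exceeds min_reads."""
    return {name for name, reads in read_counts.items() if reads > min_reads}


# Toy cumulative read counts (hypothetical values, for illustration only).
islet_counts = {"miR-375": 50_000, "miR-27b-3p": 1_200, "miR-x": 40}
beta_counts = {"miR-375": 42_000, "miR-27b-3p": 90, "miR-y": 300}

islet_set = expressed(islet_counts)
beta_set = expressed(beta_counts)

# Inclusion-exclusion holds for any two sets: |A or B| = |A| + |B| - |A and B|.
assert len(islet_set | beta_set) == (
    len(islet_set) + len(beta_set) - len(islet_set & beta_set)
)

# The same identity applied to the totals reported in the abstract.
islets, beta_cells, shared = 366, 346, 328
reported_total = islets + beta_cells - shared
islet_only = islets - shared
beta_only = beta_cells - shared

print("unique miRNAs:", reported_total)
print("islet-only:", islet_only, "beta-cell-only:", beta_only)
```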

  • The Molecular Genetic Architecture of Self-Employment.

    van der Loos MJ, Rietveld CA, Eklund N, Koellinger PD, Rivadeneira F, Abecasis GR, Ankra-Badu GA, Baumeister SE, Benjamin DJ, Biffar R, Blankenberg S, Boomsma DI, Cesarini D, Cucca F, de Geus EJ, Dedoussis G, Deloukas P, Dimitriou M, Eiriksdottir G, Eriksson J, Gieger C, Gudnason V, Höhne B, Holle R, Hottenga JJ, Isaacs A, Järvelin MR, Johannesson M, Kaakinen M, Kähönen M, Kanoni S, Laaksonen MA, Lahti J, Launer LJ, Lehtimäki T, Loitfelder M, Magnusson PK, Naitza S, Oostra BA, Perola M, Petrovic K, Quaye L, Raitakari O, Ripatti S, Scheet P, Schlessinger D, Schmidt CO, Schmidt H, Schmidt R, Senft A, Smith AV, Spector TD, Surakka I, Svento R, Terracciano A, Tikkanen E, van Duijn CM, Viikari J, Völzke H, Wichmann HE, Wild PS, Willems SM, Willemsen G, van Rooij FJ, Groenen PJ, Uitterlinden AG, Hofman A and Thurik AR

    Department of Applied Economics, Erasmus School of Economics, Erasmus University Rotterdam, Rotterdam, The Netherlands ; Department of Epidemiology, Erasmus Medical Center, Rotterdam, The Netherlands.

    Economic variables such as income, education, and occupation are known to affect mortality and morbidity, such as cardiovascular disease, and have also been shown to be partly heritable. However, very little is known about which genes influence economic variables, although these genes may have both a direct and an indirect effect on health. We report results from the first large-scale collaboration that studies the molecular genetic architecture of an economic variable, entrepreneurship, that was operationalized using self-employment, a widely-available proxy. Our results suggest that common SNPs when considered jointly explain about half of the narrow-sense heritability of self-employment estimated in twin data (σg²/σP² = 25%, h² = 55%). However, a meta-analysis of genome-wide association studies across sixteen studies comprising 50,627 participants did not identify genome-wide significant SNPs. 58 SNPs with p < 10⁻⁵ were tested in a replication sample (n = 3,271), but none replicated. Furthermore, a gene-based test shows that none of the genes that were previously suggested in the literature to influence entrepreneurship reveal significant associations. Finally, SNP-based genetic scores that use results from the meta-analysis capture less than 0.2% of the variance in self-employment in an independent sample (p≥0.039). Our results are consistent with a highly polygenic molecular genetic architecture of self-employment, with many genetic variants of small effect. Although self-employment is a multi-faceted, heavily environmentally influenced, and biologically distal trait, our results are similar to those for other genetically complex and biologically more proximate outcomes, such as height, intelligence, personality, and several diseases.

    PloS one 2013;8;4;e60542

  • Cancer of mice and men: Old twists and new tails.

    van der Weyden L and Adams DJ

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK.

    In this review we set out to celebrate the contribution that mouse models of human cancer have made to our understanding of the fundamental mechanisms driving tumourigenesis. We take the opportunity to look forward to how the mouse will be used to model cancer and the tools and technologies that will be applied, and indulge in looking back at the key advances the mouse has made possible.

    The Journal of pathology 2013

  • Jdp2 downregulates Trp53 transcription to promote leukaemogenesis in the context of Trp53 heterozygosity.

    van der Weyden L, Rust AG, McIntyre RE, Robles-Espinoza CD, del Castillo Velasco-Herrera M, Strogantsev R, Ferguson-Smith AC, McCarthy S, Keane TM, Arends MJ and Adams DJ

    Wellcome Trust Sanger Institute, Cambridge, UK.

    We performed a genetic screen in mice to identify candidate genes that are associated with leukaemogenesis in the context of Trp53 heterozygosity. To do this we generated Trp53 heterozygous mice carrying the T2/Onc transposon and SB11 transposase alleles to allow transposon-mediated insertional mutagenesis to occur. From the resulting leukaemias/lymphomas that developed in these mice, we identified nine loci that are potentially associated with tumour formation in the context of Trp53 heterozygosity, including AB041803 and the Jun dimerization protein 2 (Jdp2). We show that Jdp2 transcriptionally regulates the Trp53 promoter, via an atypical AP-1 site, and that Jdp2 expression negatively regulates Trp53 expression levels. This study is the first to identify a genetic mechanism for tumour formation in the context of Trp53 heterozygosity.

    Funded by: Cancer Research UK; Wellcome Trust

    Oncogene 2013;32;3;397-402

  • Beyond the Sympathetic Tone: The New Brown Fat Activators.

    Villarroya F and Vidal-Puig A

    Departament de Bioquimica i Biologia Molecular, Institute of Biomedicine (IBUB), University of Barcelona, and CIBER Fisiopatologia de la Obesidad y Nutrición, Av Diagonal 643, 08028 Barcelona, Catalonia, Spain. Electronic address: fvillarroya@ub.edu.

    If we could avoid the side effects associated with global sympathetic activation, activating brown adipose tissue to increase thermogenesis would be a safe way to lose weight. The discovery of adrenergic-independent brown fat activators opens the prospect of developing this alternative way to efficiently and safely induce negative energy balance.

    Cell metabolism 2013

  • Whole-genome sequencing to delineate Mycobacterium tuberculosis outbreaks: a retrospective observational study.

    Walker TM, Ip CL, Harrell RH, Evans JT, Kapatai G, Dedicoat MJ, Eyre DW, Wilson DJ, Hawkey PM, Crook DW, Parkhill J, Harris D, Walker AS, Bowden R, Monk P, Smith EG and Peto TE

    Nuffield Department of Medicine, John Radcliffe Hospital, University of Oxford, Oxford, UK. timothy.walker@ndm.ox.ac.uk

    Background: Tuberculosis incidence in the UK has risen in the past decade. Disease control depends on epidemiological data, which can be difficult to obtain. Whole-genome sequencing can detect microevolution within Mycobacterium tuberculosis strains. We aimed to estimate the genetic diversity of related M tuberculosis strains in the UK Midlands and to investigate how this measurement might be used to investigate community outbreaks.

    Methods: In a retrospective observational study, we used Illumina technology to sequence M tuberculosis genomes from an archive of frozen cultures. We characterised isolates into four groups: cross-sectional, longitudinal, household, and community. We measured pairwise nucleotide differences within hosts and between hosts in household outbreaks and estimated the rate of change in DNA sequences. We used the findings to interpret network diagrams constructed from 11 community clusters derived from mycobacterial interspersed repetitive-unit-variable-number tandem-repeat data.

    Findings: We sequenced 390 separate isolates from 254 patients, including representatives from all five major lineages of M tuberculosis. The estimated rate of change in DNA sequences was 0.5 single nucleotide polymorphisms (SNPs) per genome per year (95% CI 0.3-0.7) in longitudinal isolates from 30 individuals and 25 families. Divergence is rarely higher than five SNPs in 3 years. 109 (96%) of 114 paired isolates from individuals and households differed by five or fewer SNPs. More than five SNPs separated isolates from none of 69 epidemiologically linked patients, two (15%) of 13 possibly linked patients, and 13 (17%) of 75 epidemiologically unlinked patients (three-way comparison exact p<0.0001). Genetic trees and clinical and epidemiological data suggest that super-spreaders were present in two community clusters.

    Interpretation: Whole-genome sequencing can delineate outbreaks of tuberculosis and allows inference about direction of transmission between cases. The technique could identify super-spreaders and predict the existence of undiagnosed cases, potentially leading to early treatment of infectious patients and their contacts.

    Funding: Medical Research Council, Wellcome Trust, National Institute for Health Research, and the Health Protection Agency.

    Funded by: Biotechnology and Biological Sciences Research Council; Department of Health: G0800778; Medical Research Council; Wellcome Trust: 087646/Z/08/Z, 098051

    The Lancet infectious diseases 2013;13;2;137-46
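    The statement in the entry above that divergence "is rarely higher than five SNPs in 3 years" can be sanity-checked against the reported rate of 0.5 SNPs per genome per year. The sketch below assumes, purely for illustration, that SNPs accumulate along a single lineage as a Poisson process; the abstract does not say the authors used this model, and the distance between two isolates that share an ancestor would cover more total branch length than a single lineage.

```python
import math


def poisson_tail(mean: float, threshold: int) -> float:
    """Return P(X > threshold) for X ~ Poisson(mean)."""
    cdf = sum(
        math.exp(-mean) * mean ** k / math.factorial(k)
        for k in range(threshold + 1)
    )
    return 1.0 - cdf


RATE = 0.5      # SNPs per genome per year (point estimate from the abstract)
YEARS = 3       # time window quoted in the abstract
THRESHOLD = 5   # SNP cut-off quoted in the abstract

expected_snps = RATE * YEARS
p_exceed = poisson_tail(expected_snps, THRESHOLD)

print(f"expected SNPs in {YEARS} years: {expected_snps}")
print(f"P(more than {THRESHOLD} SNPs) under the Poisson assumption: {p_exceed:.4f}")
```

    Under this assumed model the probability of exceeding five SNPs in three years is well below 1%, which agrees with the qualitative claim; it is not a reproduction of the study's analysis.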

  • Genetic basis of Y-linked hearing impairment.

    Wang Q, Xue Y, Zhang Y, Long Q, Asan, Yang F, Turner DJ, Fitzgerald T, Ng BL, Zhao Y, Chen Y, Liu Q, Yang W, Han D, Quail MA, Swerdlow H, Burton J, Fahey C, Ning Z, Hurles ME, Carter NP, Yang H and Tyler-Smith C

    Department of Otolaryngology, Head and Neck Surgery, Chinese PLA Institute of Otolaryngology, Chinese PLA General Hospital, Beijing, China.

    A single Mendelian trait has been mapped to the human Y chromosome: Y-linked hearing impairment. The molecular basis of this disorder is unknown. Here, we report the detailed characterization of the DFNY1 Y chromosome and its comparison with a closely related Y chromosome from an unaffected branch of the family. The DFNY1 chromosome carries a complex rearrangement, including duplication of several noncontiguous segments of the Y chromosome and insertion of ∼160 kb of DNA from chromosome 1, in the pericentric region of Yp. This segment of chromosome 1 is derived entirely from within a known hearing impairment locus, DFNA49. We suggest that a third copy of one or more genes from the shared segment of chromosome 1 might be responsible for the hearing-loss phenotype.

    Funded by: Wellcome Trust: 098051

    American journal of human genetics 2013;92;2;301-6

  • Viral population analysis and minority-variant detection using short read next-generation sequencing.

    Watson SJ, Welkers MR, Depledge DP, Coulter E, Breuer JM, de Jong MD and Kellam P

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    RNA viruses within infected individuals exist as a population of evolutionary-related variants. Owing to evolutionary change affecting the constitution of this population, the frequency and/or occurrence of individual viral variants can show marked or subtle fluctuations. Since the development of massively parallel sequencing platforms, such viral populations can now be investigated to unprecedented resolution. A critical problem with such analyses is the presence of sequencing-related errors that obscure the identification of true biological variants present at low frequency. Here, we report the development and assessment of the Quality Assessment of Short Read (QUASR) Pipeline (http://sourceforge.net/projects/quasr) specific for virus genome short read analysis that minimizes sequencing errors from multiple deep-sequencing platforms, and enables post-mapping analysis of the minority variants within the viral population. QUASR significantly reduces the error-related noise in deep-sequencing datasets, resulting in increased mapping accuracy and reduction of erroneous mutations. Using QUASR, we have determined influenza virus genome dynamics in sequential samples from an in vitro evolution of 2009 pandemic H1N1 (A/H1N1/09) influenza from samples sequenced on both the Roche 454 GSFLX and Illumina GAIIx platforms. Importantly, concordance between the 454 and Illumina sequencing allowed unambiguous minority-variant detection and accurate determination of virus population turnover in vitro.

    Philosophical transactions of the Royal Society of London. Series B, Biological sciences 2013;368;1614;20120205
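    The general idea behind minority-variant detection in the entry above (discard low-quality base calls, then report non-consensus bases above a frequency cut-off) can be sketched as follows. This is not the QUASR pipeline or its interface: the function name, the quality and frequency thresholds, and the toy pileup are all assumptions chosen for illustration.

```python
from collections import Counter


def minority_variants(pileup, min_quality=30, min_freq=0.01):
    """Return {base: frequency} for non-consensus bases at one position.

    pileup is a list of (base, phred_quality) tuples. Calls below
    min_quality are dropped before frequencies are computed.
    """
    kept = [base for base, qual in pileup if qual >= min_quality]
    if not kept:
        return {}
    counts = Counter(kept)
    consensus, _ = counts.most_common(1)[0]
    depth = len(kept)
    return {
        base: n / depth
        for base, n in counts.items()
        if base != consensus and n / depth >= min_freq
    }


# Toy pileup at a single genome position (hypothetical data).
pileup = [("A", 38)] * 95 + [("G", 37)] * 4 + [("T", 12)] + [("C", 35)]

# The low-quality T is discarded; C falls below the 2% frequency cut-off.
result = minority_variants(pileup, min_quality=30, min_freq=0.02)
print(result)
```

    The quality filter matters because, as the abstract notes, sequencing errors can otherwise be mistaken for true low-frequency variants.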

  • A calibrated human Y-chromosomal phylogeny based on resequencing.

    Wei W, Ayub Q, Chen Y, McCarthy S, Hou Y, Carbone I, Xue Y and Tyler-Smith C

    The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom.

    We have identified variants present in high-coverage complete sequences of 36 diverse human Y chromosomes from Africa, Europe, South Asia, East Asia, and the Americas, representing eight major haplogroups. After restricting our analysis to 8.97 Mb of the unique male-specific Y sequence, we identified 6662 high-confidence variants, including single-nucleotide polymorphisms (SNPs), multi-nucleotide polymorphisms (MNPs), and indels. We constructed phylogenetic trees using these variants, or subsets of them, and recapitulated the known structure of the tree. Assuming a male mutation rate of 1 × 10⁻⁹ per base pair per year, the time depth of the tree (haplogroups A3-R) was ~101,000-115,000 yr, and the lineages found outside Africa dated to 57,000-74,000 yr, both as expected. In addition, we dated a striking Paleolithic male lineage expansion to 41,000-52,000 yr ago and the node representing the major European Y lineage, R1b, to 4000-13,000 yr ago, supporting a Neolithic origin for these modern European Y chromosomes. In all, we provide a nearly 10-fold increase in the number of Y markers with phylogenetic information, and novel historical insights derived from placing them on a calibrated phylogenetic tree.

    Funded by: Wellcome Trust: 098051

    Genome research 2013;23;2;388-95
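    The calibration in the entry above rests on two quoted figures: a mutation rate of 1 × 10⁻⁹ per base pair per year and 8.97 Mb of analysed sequence. A minimal strict-clock conversion between mutation counts and years is sketched below. The strict-clock assumption and the function names are illustrative only; the abstract does not describe the dating method actually used.

```python
MUTATION_RATE = 1e-9   # per base pair per year (from the abstract)
ANALYSED_BP = 8.97e6   # 8.97 Mb of male-specific Y sequence (from the abstract)


def mutations_per_year(rate: float, length_bp: float) -> float:
    """Expected new mutations per year along one lineage."""
    return rate * length_bp


def years_from_mutations(n_mutations: float, rate: float, length_bp: float) -> float:
    """Time needed to accumulate n_mutations along one lineage (strict clock)."""
    return n_mutations / mutations_per_year(rate, length_bp)


def expected_mutations(years: float, rate: float, length_bp: float) -> float:
    """Mutations expected along one lineage over the given number of years."""
    return years * mutations_per_year(rate, length_bp)


per_year = mutations_per_year(MUTATION_RATE, ANALYSED_BP)
years_per_mutation = years_from_mutations(1, MUTATION_RATE, ANALYSED_BP)
low = expected_mutations(101_000, MUTATION_RATE, ANALYSED_BP)
high = expected_mutations(115_000, MUTATION_RATE, ANALYSED_BP)

print(f"mutations per lineage per year: {per_year:.5f}")
print(f"years per mutation: {years_per_mutation:.1f}")
print(f"root-to-tip mutations implied by the quoted tree depth: {low:.0f}-{high:.0f}")
```

    Under these assumptions one mutation accrues roughly every 111 years on a lineage, so the quoted tree depth of ~101,000-115,000 yr corresponds to on the order of a thousand mutations from root to tip.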

  • Genome-wide SNP and CNV analysis identifies common and low-frequency variants associated with severe early-onset obesity.

    Wheeler E, Huang N, Bochukova EG, Keogh JM, Lindsay S, Garg S, Henning E, Blackburn H, Loos RJ, Wareham NJ, O'Rahilly S, Hurles ME, Barroso I and Farooqi IS

    Wellcome Trust Sanger Institute, Cambridge, UK.

    Common and rare variants associated with body mass index (BMI) and obesity account for <5% of the variance in BMI. We performed SNP and copy number variation (CNV) association analyses in 1,509 children with obesity at the extreme tail (>3 s.d. from the mean) of the BMI distribution and 5,380 controls. Evaluation of 29 SNPs (P < 1 × 10⁻⁵) in an additional 971 severely obese children and 1,990 controls identified 4 new loci associated with severe obesity (LEPR, PRKCH, PACS1 and RMST). A previously reported 43-kb deletion at the NEGR1 locus was significantly associated with severe obesity (P = 6.6 × 10⁻⁷). However, this signal was entirely driven by a flanking 8-kb deletion; absence of this deletion increased risk for obesity (P = 6.1 × 10⁻¹¹). We found a significant burden of rare, single CNVs in severely obese cases (P < 0.0001). Integrative gene network pathway analysis of rare deletions indicated enrichment of genes affecting G protein-coupled receptors (GPCRs) involved in the neuronal regulation of energy homeostasis.

    Nature genetics 2013

  • Ischemic stroke is associated with the ABO locus: The EuroCLOT study.

    Williams FM, Carter AM, Hysi PG, Surdulescu G, Hodgkiss D, Soranzo N, Traylor M, Bevan S, Dichgans M, Rothwell PM, Sudlow C, Farrall M, Silander K, Kaunisto M, Wagner P, Saarela O, Kuulasmaa K, Virtamo J, Salomaa V, Amouyel P, Arveiler D, Ferrieres J, Wiklund PG, Ikram MA, Hofman A, Boncoraglio GB, Parati EA, Helgadottir A, Gretarsdottir S, Thorsteinsdottir U, Thorleifsson G, Stefansson K, Seshadri S, Destefano A, Gschwendtner A, Psaty B, Longstreth W, Mitchell BD, Cheng YC, Clarke R, Ferrario M, Bis JC, Levi C, Attia J, Holliday EG, Scott RJ, Fornage M, Sharma P, Furie KL, Rosand J, Nalls M, Meschia J, Mosely TH, Evans A, Palotie A, Markus HS, Grant PJ, Spector TD and EuroCLOT Investigators; the Wellcome Trust Case Control Consortium 2; MOnica Risk, Genetics, Archiving and Monograph; MetaStroke; and the International Stroke Genetics Consortium

    Department of Twin Research and Genetic Epidemiology, King's College London, London, United Kingdom. frances.williams@kcl.ac.uk.

    Objective: End-stage coagulation and the structure/function of fibrin are implicated in the pathogenesis of ischemic stroke. We explored whether genetic variants associated with end-stage coagulation in healthy volunteers account for the genetic predisposition to ischemic stroke and examined their influence on stroke subtype. Methods: Common genetic variants identified through genome-wide association studies of coagulation factors and fibrin structure/function in healthy twins (n = 2,100, Stage 1) were examined in ischemic stroke (n = 4,200 cases) using 2 independent samples of European ancestry (Stage 2). A third clinical collection having stroke subtyping (total 8,900 cases, 55,000 controls) was used for replication (Stage 3). Results: Stage 1 identified 524 single nucleotide polymorphisms (SNPs) from 23 linkage disequilibrium blocks having significant association (p < 5 × 10⁻⁸) with 1 or more coagulation/fibrin phenotypes. The most striking associations included SNP rs5985 with factor XIII activity (p = 2.6 × 10⁻¹⁸⁶), rs10665 with FVII (p = 2.4 × 10⁻⁴⁷), and rs505922 in the ABO gene with both von Willebrand factor (p = 4.7 × 10⁻⁵⁷) and factor VIII (p = 1.2 × 10⁻³⁶). In Stage 2, the 23 independent SNPs were examined in stroke cases/noncases using MOnica Risk, Genetics, Archiving and Monograph (MORGAM) and Wellcome Trust Case Control Consortium 2 collections. SNP rs505922 was nominally associated with ischemic stroke (odds ratio = 0.94, 95% confidence interval = 0.88-0.99, p = 0.023). Independent replication in Meta-Stroke confirmed the rs505922 association with stroke, beta (standard error, SE) = 0.066 (0.02), p = 0.001, a finding specific to large-vessel and cardioembolic stroke (p = 0.001 and p < 0.001, respectively) but not seen with small-vessel stroke (p = 0.811). Interpretation: ABO gene variants are associated with large-vessel and cardioembolic stroke but not small-vessel disease. This work sheds light on the different pathogenic mechanisms underpinning stroke subtype.

    Funded by: NCRR NIH HHS: UL1 RR033176; NHGRI NIH HHS: U01 HG004402, U01 HG004446; NHLBI NIH HHS: N01 HC035129, R01 HL075366, R01 HL080295, R01 HL085251, R01 HL087641, R01 HL087652, R01 HL093029, R01 HL105756; NIA NIH HHS: R01 AG008122, R01 AG015928, R01 AG016495, R01 AG020098, R01 AG023629, R01 AG027058, R01 AG031287, R01 AG033193; NIDDK NIH HHS: P30 DK063491; NINDS NIH HHS: R01 NS017950, R01 NS045012, U01 NS069208

    Annals of neurology 2013;73;1;16-31

  • Sequencing and comparative analysis of the gorilla MHC genomic sequence.

    Wilming LG, Hart EA, Coggill PC, Horton R, Gilbert JG, Clee C, Jones M, Lloyd C, Palmer S, Sims S, Whitehead S, Wiley D, Beck S and Harrow JL

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, UK.

    Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC.

    Database : the journal of biological databases and curation 2013;2013;bat011

  • Go retro and get a GRIP.

    Wong K, Adams DJ and Keane TM

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1HH, UK. da1@sanger.ac.uk.

    Gene retrocopy insertions are a source of new genes and new gene functions, and can now be identified using paired-end whole genome sequencing data.

    Genome biology 2013;14;3;108

  • Pneumococcal capsular switching: a historical perspective.

    Wyres KL, Lambertsen LM, Croucher NJ, McGee L, von Gottberg A, Liñares J, Jacobs MR, Kristinsson KG, Beall BW, Klugman KP, Parkhill J, Hakenbeck R, Bentley SD and Brueggemann AB

    Department of Zoology, University of Oxford, and.

    Background. Changes in serotype prevalence among pneumococcal populations result from both serotype replacement and serotype (capsular) switching. Temporal changes in serotype distributions are well documented, but the contribution of capsular switching to such changes is unknown. Furthermore, it is unclear to what extent vaccine-induced selective pressures drive capsular switching. Methods. Serotype and multilocus sequence typing data for 426 pneumococci dated from 1937 through 2007 were analyzed. Whole-genome sequence data for a subset of isolates were used to investigate capsular switching events. Results. We identified 36 independent capsular switch events, 18 of which were explored in detail with whole-genome sequence data. Recombination fragment lengths were estimated for 11 events and ranged from approximately 19.0 kb to ≥58.2 kb. Two events took place no later than 1960, and the imported DNA included the capsular locus and the nearby penicillin-binding protein genes pbp2x and pbp1a. Conclusions. Capsular switching has been a regular occurrence among pneumococcal populations throughout the past 7 decades. Recombination of large DNA fragments (>30 kb), sometimes including the capsular locus and penicillin-binding protein genes, predated both vaccine introduction and widespread antibiotic use. This type of recombination has likely been an intrinsic feature throughout the history of pneumococcal evolution.

    The Journal of infectious diseases 2013;207;3;439-49

  • Interferon-induced transmembrane protein-3 genetic variant rs12252-C is associated with severe influenza in Chinese individuals.

    Zhang YH, Zhao Y, Li N, Peng YC, Giannoulatou E, Jin RH, Yan HP, Wu H, Liu JH, Liu N, Wang DY, Shu YL, Ho LP, Kellam P, McMichael A and Dong T

    Beijing You'an Hospital, Capital Medical University, Beijing PO 100069, China.

    The SNP rs12252-C allele alters the function of interferon-induced transmembrane protein-3, increasing the disease severity of influenza virus infection in Caucasians, but the allele is rare. However, rs12252-C is much more common in Han Chinese. Here we report that the CC genotype is found in 69% of Chinese patients with severe pandemic influenza A H1N1/09 virus infection compared with 25% in those with mild infection. Specifically, the CC genotype was estimated to confer a sixfold greater risk for severe infection than the CT and TT genotypes. More importantly, because the risk genotype occurs with such a high frequency, its effect translates to a large population-attributable risk of 54.3% for severe infection in the Chinese population studied compared with 5.4% in Northern Europeans. Interferon-induced transmembrane protein-3 genetic variants could, therefore, have a strong effect on the epidemiology of influenza in China and in people of Chinese descent.

    Funded by: Medical Research Council; Wellcome Trust

    Nature communications 2013;4;1418
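    The "sixfold greater risk" quoted in the entry above is in line with a crude odds ratio computed directly from the two percentages in the abstract (CC genotype in 69% of severe cases versus 25% of mild cases). The sketch below shows that arithmetic. It is an unadjusted calculation from rounded percentages, so it is an illustration and not expected to match the paper's own estimate exactly.

```python
def odds(p: float) -> float:
    """Convert a proportion into odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_group1: float, p_group2: float) -> float:
    """Crude odds ratio comparing a proportion in two groups."""
    return odds(p_group1) / odds(p_group2)


CC_IN_SEVERE = 0.69  # CC genotype frequency in severe infection (abstract)
CC_IN_MILD = 0.25    # CC genotype frequency in mild infection (abstract)

crude_or = odds_ratio(CC_IN_SEVERE, CC_IN_MILD)
print(f"crude odds ratio for CC versus CT/TT: {crude_or:.2f}")
```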
