Genomics of quantitative variation

Nicole Soranzo, leader of the Genomics of quantitative trait variation in humans team, is interested in the use of quantitative intermediate traits to unravel novel mechanisms underlying common, complex diseases such as cardiovascular and metabolic diseases.


Background

Cardiovascular disease is the leading cause of death in developed countries, principally resulting from coronary artery disease (CAD) and myocardial infarction (MI). Heritability estimates of CAD (38-54%) reflect the complex pathophysiology of this disease, where different genetic and environmental triggers are likely to contribute to distinct clinical sub-phenotypes, the understanding and definition of which can lead to better treatments. Our research combines large-scale genetic analyses, emerging initiatives in stem cell biology and advances in metabolic phenotyping to characterize key processes underlying predisposition to CAD and MI and to provide new opportunities for experimental treatment.

Research

Our approach

Our group uses large-scale genetic association analyses to identify genetic determinants of quantitative cardiometabolic traits in deeply phenotyped population-based cohorts. More recently, we have begun using massively parallel whole-exome and whole-genome sequencing to investigate the contribution of low-frequency and rare genetic variants.
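
As an illustration of what a single-variant association test for a quantitative trait can look like, the sketch below regresses a trait on allele count (0, 1 or 2 copies) by ordinary least squares. It is a minimal teaching example under an assumed additive model, not the group's actual pipeline: the function name `additive_association` and the toy data are hypothetical, and real analyses would also adjust for covariates, relatedness and population structure.

```python
import math


def additive_association(genotypes, trait):
    """Regress a quantitative trait on allele count (0, 1, 2) by least squares.

    Returns (beta, standard_error, t_statistic) for the per-allele effect.
    Hypothetical helper for illustration only.
    """
    n = len(genotypes)
    if n != len(trait) or n < 3:
        raise ValueError("need matching lists with at least three individuals")

    mean_g = sum(genotypes) / n
    mean_y = sum(trait) / n

    # Sum of squares of the genotype and its cross-product with the trait.
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("variant is monomorphic in this sample")
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, trait))

    # Per-allele effect (slope) and intercept of the fitted line.
    beta = sxy / sxx
    intercept = mean_y - beta * mean_g

    # Residual variance gives the standard error of the slope.
    rss = sum((y - (intercept + beta * g)) ** 2 for g, y in zip(genotypes, trait))
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = beta / se if se > 0 else math.inf
    return beta, se, t_stat


# Toy data: six individuals, trait rising by about one unit per allele.
beta, se, t_stat = additive_association(
    [0, 0, 1, 1, 2, 2],
    [1.0, 1.2, 2.0, 2.2, 3.0, 3.2],
)
```

On this toy data the estimated per-allele effect is about 1.0. In a genome-wide setting the same test would be repeated for every variant, with a stringent significance threshold to account for multiple testing.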

Ongoing projects

Common and rare genetic determinants of cardiometabolic traits

Within the cohorts component of the Wellcome Trust-funded UK10K project, we are generating low-coverage whole-genome sequence data for 4,000 individuals from the UK. We will test these sequence variants for association with a suite of key cardiometabolic traits, with the aim of exploring the role of rare and low-frequency variants. We will also extend these analyses to genetically isolated populations from Italy (INGI Val Borbera and Friuli Venezia Giulia). In these two populations, we will use low-coverage sequencing of individuals and imputation of sequenced variants into GWAS datasets to investigate the role of rare and common variants in the same traits investigated in the UK10K sample.

Metabolomic genetics

Large numbers of drugs fail in phase 2 and 3 clinical trials due to lack of efficacy, even though the drugs may be effective in a subset of the tested clinical population. It is hoped that decisions about patient stratification will be enhanced by the application of high-throughput metabolomics technologies based on liquid chromatography-mass spectrometry (LC-MS). This project is funded by the drug company Pfizer and aims to explore the use of metabolomics technology to stratify a healthy UK population by metabolic profile. We have extended the metabolomic phenotyping (Suhre et al., Nature 2011), and we will analyze genetic associations in this extended dataset, overlaying the results with information on complex trait loci, gene expression patterns and variation in underlying clinical phenotypes to assess the usefulness of metabolomics for stratifying the patient population.

Genetic and epigenetic determinants of hematopoiesis

The hematopoietic system provides a good model system to inform interpretation of association studies owing to:

  1. simple phenotypes at the cellular level;
  2. nearly unlimited access to suitable tissue with good ability for in vitro manipulation;
  3. suitable model organisms;
  4. widespread clinical relevance.

Our group has led, together with many collaborators, the discovery of nearly 100 loci affecting variation in blood cell elements through genome-wide association studies. We have further sought to combine these genetic discoveries with a host of integrative analyses and functional approaches, including protein-protein interaction networks, in vitro differentiation of hematopoietic stem cells (HSCs) towards red cell and platelet precursors, and silencing experiments in model organisms (fly, zebrafish and mouse). Our results to date support the notion that the formation and survival of blood cells in healthy individuals are regulated by a host of previously unknown regulators, which are predominantly active in the late stages of lineage commitment and affect blood cell formation in a largely lineage-specific manner.

As an extension of this work, we now aim to identify and characterize in greater depth genes implicated in hematopoietic development in the EU FP7-funded BLUEPRINT project, which will generate reference genomes and epigenomes of at least 100 specific blood cell types. Our group will be responsible for the genomic (through whole-genome sequencing) and epigenetic characterization of two cell types in 200 individuals, with the aim of characterizing the effect of human variation on the epigenomic landscape.

Selected Publications

  • New gene functions in megakaryopoiesis and platelet formation.

    Gieger C, Radhakrishnan A, Cvejic A, Tang W, Porcu E, Pistis G, Serbanovic-Canic J, Elling U, Goodall AH, Labrune Y, Lopez LM, Mägi R, Meacham S, Okada Y, Pirastu N, Sorice R, Teumer A, Voss K, Zhang W, Ramirez-Solis R, Bis JC, Ellinghaus D, Gögele M, Hottenga JJ, Langenberg C, Kovacs P, O'Reilly PF, Shin SY, Esko T, Hartiala J, Kanoni S, Murgia F, Parsa A, Stephens J, van der Harst P, Ellen van der Schoot C, Allayee H, Attwood A, Balkau B, Bastardot F, Basu S, Baumeister SE, Biino G, Bomba L, Bonnefond A, Cambien F, Chambers JC, Cucca F, D'Adamo P, Davies G, de Boer RA, de Geus EJ, Döring A, Elliott P, Erdmann J, Evans DM, Falchi M, Feng W, Folsom AR, Frazer IH, Gibson QD, Glazer NL, Hammond C, Hartikainen AL, Heckbert SR, Hengstenberg C, Hersch M, Illig T, Loos RJ, Jolley J, Khaw KT, Kühnel B, Kyrtsonis MC, Lagou V, Lloyd-Jones H, Lumley T, Mangino M, Maschio A, Mateo Leach I, McKnight B, Memari Y, Mitchell BD, Montgomery GW, Nakamura Y, Nauck M, Navis G, Nöthlings U, Nolte IM, Porteous DJ, Pouta A, Pramstaller PP, Pullat J, Ring SM, Rotter JI, Ruggiero D, Ruokonen A, Sala C, Samani NJ, Sambrook J, Schlessinger D, Schreiber S, Schunkert H, Scott J, Smith NL, Snieder H, Starr JM, Stumvoll M, Takahashi A, Tang WH, Taylor K, Tenesa A, Lay Thein S, Tönjes A, Uda M, Ulivi S, van Veldhuisen DJ, Visscher PM, Völker U, Wichmann HE, Wiggins KL, Willemsen G, Yang TP, Hua Zhao J, Zitting P, Bradley JR, Dedoussis GV, Gasparini P, Hazen SL, Metspalu A, Pirastu M, Shuldiner AR, Joost van Pelt L, Zwaginga JJ, Boomsma DI, Deary IJ, Franke A, Froguel P, Ganesh SK, Jarvelin MR, Martin NG, Meisinger C, Psaty BM, Spector TD, Wareham NJ, Akkerman JW, Ciullo M, Deloukas P, Greinacher A, Jupe S, Kamatani N, Khadake J, Kooner JS, Penninger J, Prokopenko I, Stemple D, Toniolo D, Wernisch L, Sanna S, Hicks AA, Rendon A, Ferreira MA, Ouwehand WH and Soranzo N

    Nature 2011;480;7376;201-8

  • Human metabolic individuality in biomedical and pharmaceutical research.

    Suhre K, Shin SY, Petersen AK, Mohney RP, Meredith D, Wägele B, Altmaier E, CARDIoGRAM, Deloukas P, Erdmann J, Grundberg E, Hammond CJ, de Angelis MH, Kastenmüller G, Köttgen A, Kronenberg F, Mangino M, Meisinger C, Meitinger T, Mewes HW, Milburn MV, Prehn C, Raffler J, Ried JS, Römisch-Margl W, Samani NJ, Small KS, Wichmann HE, Zhai G, Illig T, Spector TD, Adamski J, Soranzo N and Gieger C

    Nature 2011;477;7362;54-60

  • Common variants at 10 genomic loci influence hemoglobin A1C levels via glycemic and nonglycemic pathways.

    Soranzo N, Sanna S, Wheeler E, Gieger C, Radke D, Dupuis J, Bouatia-Naji N, Langenberg C, Prokopenko I, Stolerman E, Sandhu MS, Heeney MM, Devaney JM, Reilly MP, Ricketts SL, Stewart AF, Voight BF, Willenborg C, Wright B, Altshuler D, Arking D, Balkau B, Barnes D, Boerwinkle E, Böhm B, Bonnefond A, Bonnycastle LL, Boomsma DI, Bornstein SR, Böttcher Y, Bumpstead S, Burnett-Miller MS, Campbell H, Cao A, Chambers J, Clark R, Collins FS, Coresh J, de Geus EJ, Dei M, Deloukas P, Döring A, Egan JM, Elosua R, Ferrucci L, Forouhi N, Fox CS, Franklin C, Franzosi MG, Gallina S, Goel A, Graessler J, Grallert H, Greinacher A, Hadley D, Hall A, Hamsten A, Hayward C, Heath S, Herder C, Homuth G, Hottenga JJ, Hunter-Merrill R, Illig T, Jackson AU, Jula A, Kleber M, Knouff CW, Kong A, Kooner J, Köttgen A, Kovacs P, Krohn K, Kühnel B, Kuusisto J, Laakso M, Lathrop M, Lecoeur C, Li M, Li M, Loos RJ, Luan J, Lyssenko V, Mägi R, Magnusson PK, Mälarstig A, Mangino M, Martínez-Larrad MT, März W, McArdle WL, McPherson R, Meisinger C, Meitinger T, Melander O, Mohlke KL, Mooser VE, Morken MA, Narisu N, Nathan DM, Nauck M, O'Donnell C, Oexle K, Olla N, Pankow JS, Payne F, Peden JF, Pedersen NL, Peltonen L, Perola M, Polasek O, Porcu E, Rader DJ, Rathmann W, Ripatti S, Rocheleau G, Roden M, Rudan I, Salomaa V, Saxena R, Schlessinger D, Schunkert H, Schwarz P, Seedorf U, Selvin E, Serrano-Ríos M, Shrader P, Silveira A, Siscovick D, Song K, Spector TD, Stefansson K, Steinthorsdottir V, Strachan DP, Strawbridge R, Stumvoll M, Surakka I, Swift AJ, Tanaka T, Teumer A, Thorleifsson G, Thorsteinsdottir U, Tönjes A, Usala G, Vitart V, Völzke H, Wallaschofski H, Waterworth DM, Watkins H, Wichmann HE, Wild SH, Willemsen G, Williams GH, Wilson JF, Winkelmann J, Wright AF, WTCCC, Zabena C, Zhao JH, Epstein SE, Erdmann J, Hakonarson HH, Kathiresan S, Khaw KT, Roberts R, Samani NJ, Fleming MD, Sladek R, Abecasis G, Boehnke M, Froguel P, Groop L, McCarthy MI, Kao WH, Florez JC, Uda M, Wareham NJ, Barroso I and Meigs JB

    Diabetes 2010;59;12;3229-39

  • A genome-wide perspective of genetic variation in human metabolism.

    Illig T, Gieger C, Zhai G, Römisch-Margl W, Wang-Sattler R, Prehn C, Altmaier E, Kastenmüller G, Kato BS, Mewes HW, Meitinger T, de Angelis MH, Kronenberg F, Soranzo N, Wichmann HE, Spector TD, Adamski J and Suhre K

    Nature genetics 2010;42;2;137-41

  • New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk.

    Dupuis J, Langenberg C, Prokopenko I, Saxena R, Soranzo N, Jackson AU, Wheeler E, Glazer NL, Bouatia-Naji N, Gloyn AL, Lindgren CM, Mägi R, Morris AP, Randall J, Johnson T, Elliott P, Rybin D, Thorleifsson G, Steinthorsdottir V, Henneman P, Grallert H, Dehghan A, Hottenga JJ, Franklin CS, Navarro P, Song K, Goel A, Perry JR, Egan JM, Lajunen T, Grarup N, Sparsø T, Doney A, Voight BF, Stringham HM, Li M, Kanoni S, Shrader P, Cavalcanti-Proença C, Kumari M, Qi L, Timpson NJ, Gieger C, Zabena C, Rocheleau G, Ingelsson E, An P, O'Connell J, Luan J, Elliott A, McCarroll SA, Payne F, Roccasecca RM, Pattou F, Sethupathy P, Ardlie K, Ariyurek Y, Balkau B, Barter P, Beilby JP, Ben-Shlomo Y, Benediktsson R, Bennett AJ, Bergmann S, Bochud M, Boerwinkle E, Bonnefond A, Bonnycastle LL, Borch-Johnsen K, Böttcher Y, Brunner E, Bumpstead SJ, Charpentier G, Chen YD, Chines P, Clarke R, Coin LJ, Cooper MN, Cornelis M, Crawford G, Crisponi L, Day IN, de Geus EJ, Delplanque J, Dina C, Erdos MR, Fedson AC, Fischer-Rosinsky A, Forouhi NG, Fox CS, Frants R, Franzosi MG, Galan P, Goodarzi MO, Graessler J, Groves CJ, Grundy S, Gwilliam R, Gyllensten U, Hadjadj S, Hallmans G, Hammond N, Han X, Hartikainen AL, Hassanali N, Hayward C, Heath SC, Hercberg S, Herder C, Hicks AA, Hillman DR, Hingorani AD, Hofman A, Hui J, Hung J, Isomaa B, Johnson PR, Jørgensen T, Jula A, Kaakinen M, Kaprio J, Kesaniemi YA, Kivimaki M, Knight B, Koskinen S, Kovacs P, Kyvik KO, Lathrop GM, Lawlor DA, Le Bacquer O, Lecoeur C, Li Y, Lyssenko V, Mahley R, Mangino M, Manning AK, Martínez-Larrad MT, McAteer JB, McCulloch LJ, McPherson R, Meisinger C, Melzer D, Meyre D, Mitchell BD, Morken MA, Mukherjee S, Naitza S, Narisu N, Neville MJ, Oostra BA, Orrù M, Pakyz R, Palmer CN, Paolisso G, Pattaro C, Pearson D, Peden JF, Pedersen NL, Perola M, Pfeiffer AF, Pichler I, Polasek O, Posthuma D, Potter SC, Pouta A, Province MA, Psaty BM, Rathmann W, Rayner NW, Rice K, Ripatti S, Rivadeneira F, Roden M, Rolandsson O, Sandbaek A, Sandhu M, Sanna S, Sayer AA, Scheet P, Scott LJ, Seedorf U, Sharp SJ, Shields B, Sigurðsson G, Sijbrands EJ, Silveira A, Simpson L, Singleton A, Smith NL, Sovio U, Swift A, Syddall H, Syvänen AC, Tanaka T, Thorand B, Tichet J, Tönjes A, Tuomi T, Uitterlinden AG, van Dijk KW, van Hoek M, Varma D, Visvikis-Siest S, Vitart V, Vogelzangs N, Waeber G, Wagner PJ, Walley A, Walters GB, Ward KL, Watkins H, Weedon MN, Wild SH, Willemsen G, Witteman JC, Yarnell JW, Zeggini E, Zelenika D, Zethelius B, Zhai G, Zhao JH, Zillikens MC, DIAGRAM Consortium, GIANT Consortium, Global BPgen Consortium, Borecki IB, Loos RJ, Meneton P, Magnusson PK, Nathan DM, Williams GH, Hattersley AT, Silander K, Salomaa V, Smith GD, Bornstein SR, Schwarz P, Spranger J, Karpe F, Shuldiner AR, Cooper C, Dedoussis GV, Serrano-Ríos M, Morris AD, Lind L, Palmer LJ, Hu FB, Franks PW, Ebrahim S, Marmot M, Kao WH, Pankow JS, Sampson MJ, Kuusisto J, Laakso M, Hansen T, Pedersen O, Pramstaller PP, Wichmann HE, Illig T, Rudan I, Wright AF, Stumvoll M, Campbell H, Wilson JF, Anders Hamsten on behalf of Procardis Consortium, MAGIC investigators, Bergman RN, Buchanan TA, Collins FS, Mohlke KL, Tuomilehto J, Valle TT, Altshuler D, Rotter JI, Siscovick DS, Penninx BW, Boomsma DI, Deloukas P, Spector TD, Frayling TM, Ferrucci L, Kong A, Thorsteinsdottir U, Stefansson K, van Duijn CM, Aulchenko YS, Cao A, Scuteri A, Schlessinger D, Uda M, Ruokonen A, Jarvelin MR, Waterworth DM, Vollenweider P, Peltonen L, Mooser V, Abecasis GR, Wareham NJ, Sladek R, Froguel P, Watanabe RM, Meigs JB, Groop L, Boehnke M, McCarthy MI, Florez JC and Barroso I

    Nature genetics 2010;42;2;105-16

  • A genome-wide meta-analysis identifies 22 loci associated with eight hematological parameters in the HaemGen consortium.

    Soranzo N, Spector TD, Mangino M, Kühnel B, Rendon A, Teumer A, Willenborg C, Wright B, Chen L, Li M, Salo P, Voight BF, Burns P, Laskowski RA, Xue Y, Menzel S, Altshuler D, Bradley JR, Bumpstead S, Burnett MS, Devaney J, Döring A, Elosua R, Epstein SE, Erber W, Falchi M, Garner SF, Ghori MJ, Goodall AH, Gwilliam R, Hakonarson HH, Hall AS, Hammond N, Hengstenberg C, Illig T, König IR, Knouff CW, McPherson R, Melander O, Mooser V, Nauck M, Nieminen MS, O'Donnell CJ, Peltonen L, Potter SC, Prokisch H, Rader DJ, Rice CM, Roberts R, Salomaa V, Sambrook J, Schreiber S, Schunkert H, Schwartz SM, Serbanovic-Canic J, Sinisalo J, Siscovick DS, Stark K, Surakka I, Stephens J, Thompson JR, Völker U, Völzke H, Watkins NA, Wells GA, Wichmann HE, Van Heel DA, Tyler-Smith C, Thein SL, Kathiresan S, Perola M, Reilly MP, Stewart AF, Erdmann J, Samani NJ, Meisinger C, Greinacher A, Deloukas P, Ouwehand WH and Gieger C

    Nature genetics 2009;41;11;1182-90

  • Multiple loci influence erythrocyte phenotypes in the CHARGE Consortium.

    Ganesh SK, Zakai NA, van Rooij FJ, Soranzo N, Smith AV, Nalls MA, Chen MH, Kottgen A, Glazer NL, Dehghan A, Kuhnel B, Aspelund T, Yang Q, Tanaka T, Jaffe A, Bis JC, Verwoert GC, Teumer A, Fox CS, Guralnik JM, Ehret GB, Rice K, Felix JF, Rendon A, Eiriksdottir G, Levy D, Patel KV, Boerwinkle E, Rotter JI, Hofman A, Sambrook JG, Hernandez DG, Zheng G, Bandinelli S, Singleton AB, Coresh J, Lumley T, Uitterlinden AG, Vangils JM, Launer LJ, Cupples LA, Oostra BA, Zwaginga JJ, Ouwehand WH, Thein SL, Meisinger C, Deloukas P, Nauck M, Spector TD, Gieger C, Gudnason V, van Duijn CM, Psaty BM, Ferrucci L, Chakravarti A, Greinacher A, O'Donnell CJ, Witteman JC, Furth S, Cushman M, Harris TB and Lin JP

    Nature genetics 2009;41;11;1191-8

  • Large scale association analysis of novel genetic loci for coronary artery disease.

    Coronary Artery Disease Consortium, Samani NJ, Deloukas P, Erdmann J, Hengstenberg C, Kuulasmaa K, McGinnis R, Schunkert H, Soranzo N, Thompson J, Tiret L and Ziegler A

    Arteriosclerosis, thrombosis, and vascular biology 2009;29;5;774-80

  • A novel variant on chromosome 7q22.3 associated with mean platelet volume, counts, and function.

    Soranzo N, Rendon A, Gieger C, Jones CI, Watkins NA, Menzel S, Döring A, Stephens J, Prokisch H, Erber W, Potter SC, Bray SL, Burns P, Jolley J, Falchi M, Kühnel B, Erdmann J, Schunkert H, Samani NJ, Illig T, Garner SF, Rankin A, Meisinger C, Bradley JR, Thein SL, Goodall AH, Spector TD, Deloukas P and Ouwehand WH

    Blood 2009;113;16;3831-7

  • Meta-analysis of genome-wide scans for human adult stature identifies novel loci and associations with measures of skeletal frame size.

    Soranzo N, Rivadeneira F, Chinappen-Horsley U, Malkina I, Richards JB, Hammond N, Stolk L, Nica A, Inouye M, Hofman A, Stephens J, Wheeler E, Arp P, Gwilliam R, Jhamai PM, Potter S, Chaney A, Ghori MJ, Ravindrarajah R, Ermakov S, Estrada K, Pols HA, Williams FM, McArdle WL, van Meurs JB, Loos RJ, Dermitzakis ET, Ahmadi KR, Hart DJ, Ouwehand WH, Wareham NJ, Barroso I, Sandhu MS, Strachan DP, Livshits G, Spector TD, Uitterlinden AG and Deloukas P

    PLoS genetics 2009;5;4;e1000445

  • A genome-wide association study identifies three loci associated with mean platelet volume.

    Meisinger C, Prokisch H, Gieger C, Soranzo N, Mehta D, Rosskopf D, Lichtner P, Klopp N, Stephens J, Watkins NA, Deloukas P, Greinacher A, Koenig W, Nauck M, Rimmbach C, Völzke H, Peters A, Illig T, Ouwehand WH, Meitinger T, Wichmann HE and Döring A

    American journal of human genetics 2009;84;1;66-71

  • Variants in MTNR1B influence fasting glucose levels.

    Prokopenko I, Langenberg C, Florez JC, Saxena R, Soranzo N, Thorleifsson G, Loos RJ, Manning AK, Jackson AU, Aulchenko Y, Potter SC, Erdos MR, Sanna S, Hottenga JJ, Wheeler E, Kaakinen M, Lyssenko V, Chen WM, Ahmadi K, Beckmann JS, Bergman RN, Bochud M, Bonnycastle LL, Buchanan TA, Cao A, Cervino A, Coin L, Collins FS, Crisponi L, de Geus EJ, Dehghan A, Deloukas P, Doney AS, Elliott P, Freimer N, Gateva V, Herder C, Hofman A, Hughes TE, Hunt S, Illig T, Inouye M, Isomaa B, Johnson T, Kong A, Krestyaninova M, Kuusisto J, Laakso M, Lim N, Lindblad U, Lindgren CM, McCann OT, Mohlke KL, Morris AD, Naitza S, Orrù M, Palmer CN, Pouta A, Randall J, Rathmann W, Saramies J, Scheet P, Scott LJ, Scuteri A, Sharp S, Sijbrands E, Smit JH, Song K, Steinthorsdottir V, Stringham HM, Tuomi T, Tuomilehto J, Uitterlinden AG, Voight BF, Waterworth D, Wichmann HE, Willemsen G, Witteman JC, Yuan X, Zhao JH, Zeggini E, Schlessinger D, Sandhu M, Boomsma DI, Uda M, Spector TD, Penninx BW, Altshuler D, Vollenweider P, Jarvelin MR, Lakatta E, Waeber G, Fox CS, Peltonen L, Groop LC, Mooser V, Cupples LA, Thorsteinsdottir U, Boehnke M, Barroso I, Van Duijn C, Dupuis J, Watanabe RM, Stefansson K, McCarthy MI, Wareham NJ, Meigs JB and Abecasis GR

    Nature genetics 2009;41;1;77-81

Team

Team members

Helena Bouman
Postdoctoral Fellow
Lu Chen
Postdoctoral Fellow
Heather Elding
Postdoctoral Fellow
Valentina Iotchkova
EBI-Sanger Postdoctoral Fellow
Daniel Mead
Project Manager
Nicole Soranzo
Group Leader
Klaudia Walter
Staff Scientist, kw8@sanger.ac.uk

Helena Bouman

- Postdoctoral Fellow

Heleen obtained an MSc degree in Pharmacy from Utrecht University, the Netherlands, in 2010, after which she commenced the Selective Utrecht Medical Master (a 4-year clinician-scientist training program). In 2012 she obtained her PhD on antiplatelet therapy and platelet function, in a shared position between the Cardiovascular Research Institute Maastricht (CARIM) and the St. Antonius Hospital, Nieuwegein, both in the Netherlands. Her interests are in identifying genetic and non-genetic determinants of atherothrombotic disease that can be used to predict future events and guide therapy.

Research

Heleen joined the group of Nicole Soranzo at the Wellcome Trust Sanger Institute as a postdoctoral fellow in June 2014. She works on identifying genetic variants associated with blood phenotypes in UK Biobank and the Uganda General Population Cohort.

References

  • The relevance of P2Y12-receptor gene variation for the outcome of clopidogrel-treated patients undergoing elective coronary stent implantation: a clinical follow-up.

    Bouman HJ, van Werkum JW, Rudež G, Hackeng CM, Leebeek FW, ten Cate H, ten Berg JM and de Maat MP

    Thrombosis and haemostasis 2012;107;1;189-91

  • Variability in on-treatment platelet reactivity explained by CYP2C19*2 genotype is modest in clopidogrel pretreated patients undergoing coronary stenting.

    Bouman HJ, Harmsze AM, van Werkum JW, Breet NJ, Bergmeijer TO, Ten Cate H, Hackeng CM, Deneer VH and Ten Berg JM

    Department of Cardiology, St Antonius Hospital, CM Nieuwegein, The Netherlands. jurtenberg@gmail.com

    Background: An inadequate response to clopidogrel is mainly attributable to the variable formation of its active metabolite. The CYP2C19*2 loss-of-function polymorphism leads to reduced generation of the active metabolite and is, similarly to high on-treatment platelet reactivity (HPR), associated with recurrent atherothrombotic events following coronary stent implantation.

    Aim: To determine the relative contribution of CYP2C19*2 genotype to HPR.

    Methods and results: CYP2C19*2 genotyping and platelet function testing using 5 and 20 μmol/l ADP-induced light transmittance aggregometry (LTA), the PlateletWorks assay and the VerifyNow P2Y12 assay, were performed in 1069 clopidogrel pretreated patients undergoing elective coronary stenting (POPular study, http://clinicalTrials.gov/ NCT00352014). The relative contributions of CYP2C19*2 genotype and clinical variables to the interindividual variability of on-treatment platelet reactivity and the occurrence of HPR were established using multivariate regression models. CYP2C19*2 carrier status was associated with a more frequent occurrence of HPR. CYP2C19*2 genotype alone could explain 5.0%, 6.2%, 4.4% and 3.7% of the variability in 5 and 20 μmol/l ADP-induced LTA, the PlateletWorks assay and the VerifyNow P2Y12 assay, respectively, which increased to 13.0%, 15.2%, 5.6% and 20.6% when clinical variables were considered as well. Besides the CYP2C19*2 genotype, multiple clinical variables could be identified as independent predictors of HPR, including age, gender, body mass index, diabetes mellitus, clopidogrel loading dose regimen, use of amlodipine and platelet count.

    Conclusion: The CYP2C19*2 loss-of-function polymorphism is associated with a more frequent occurrence of HPR. However, the part of the interindividual variability in on-treatment platelet reactivity explained by CYP2C19*2 genotype is modest.

    Heart (British Cardiac Society) 2011;97;15;1239-44

  • A case-control study on platelet reactivity in patients with coronary stent thrombosis.

    Bouman HJ, van Werkum JW, Breet NJ, ten Cate H, Hackeng CM and ten Berg JM

    St Antonius Center for Platelet Function Research, St Antonius Hospital, Nieuwegein, The Netherlands.

    Background: The pathophysiology of stent thrombosis (ST) has evolved from the identification of single causative factors to a complex multifactorial model.

    Objectives: The aim of the present study was to investigate whether patients with a history of ST exhibit heightened platelet reactivity to clopidogrel and aspirin.

    Methods: Pretreatment and on-treatment platelet reactivity to clopidogrel and aspirin, as well as dual antiplatelet therapy resistance, was determined in 84 patients with a history of definite ST (cases: 41 early ST; 43 late ST) and in 103 control patients with a previously implanted coronary stent but no ST after the index procedure. Platelet function was evaluated with optical aggregometry, the VerifyNow P2Y12 and aspirin assays, the PFA-100 Innovance P2Y* cartridge, the flow cytometric vasodilator-stimulated phosphoprotein assay and urine 11-dehydrothromboxane B2 measurement before and after the administration of a 600-mg loading dose of clopidogrel and 100 mg of aspirin. The study was registered at ClinicalTrials.gov, number NCT01012544.

    Results: Patients with a history of early ST clearly demonstrated higher on-clopidogrel platelet reactivity than controls. Patients with both early and late ST exhibited heightened on-aspirin platelet reactivity status, and dual antiplatelet therapy resistance was more frequent.

    Conclusions: Patients with a history of early ST exhibit a poor response to clopidogrel. Furthermore, both early and late ST are strongly and independently associated with heightened on-aspirin platelet reactivity, and dual antiplatelet therapy resistance is more frequent.

    Journal of thrombosis and haemostasis : JTH 2011;9;5;909-16

  • Paraoxonase-1 is a major determinant of clopidogrel efficacy.

    Bouman HJ, Schömig E, van Werkum JW, Velder J, Hackeng CM, Hirschhäuser C, Waldmann C, Schmalz HG, ten Berg JM and Taubert D

    Department of Cardiology, St. Antonius Hospital Nieuwegein, Nieuwegein, The Netherlands.

    Clinical efficacy of the antiplatelet drug clopidogrel is hampered by its variable biotransformation into the active metabolite. The variability in the clinical response to clopidogrel treatment has been attributed to genetic factors, but the specific genes and mechanisms underlying clopidogrel bioactivation remain unclear. Using in vitro metabolomic profiling techniques, we identified paraoxonase-1 (PON1) as the crucial enzyme for clopidogrel bioactivation, with its common Q192R polymorphism determining the rate of active metabolite formation. We tested the clinical relevance of the PON1 Q192R genotype in a population of individuals with coronary artery disease who underwent stent implantation and received clopidogrel therapy. PON1 QQ192 homozygous individuals showed a considerably higher risk than RR192 homozygous individuals of stent thrombosis, lower PON1 plasma activity, lower plasma concentrations of active metabolite and lower platelet inhibition. Thus, we identified PON1 as a key factor for the bioactivation and clinical activity of clopidogrel. These findings have therapeutic implications and may be exploited to prospectively assess the clinical efficacy of clopidogrel.

    Nature medicine 2011;17;1;110-6

  • Impact of CYP2C19 variant genotypes on clinical efficacy of antiplatelet treatment with clopidogrel: systematic review and meta-analysis.

    Bauer T, Bouman HJ, van Werkum JW, Ford NF, ten Berg JM and Taubert D

    Department of Pharmacology, University Hospital of Cologne, D-50931 Cologne, Germany.

    Objective: To evaluate the accumulated information from genetic association studies investigating the impact of variants of the cytochrome P450 (CYP) 2C19 genotype on the clinical efficacy of clopidogrel.

    Design: Systematic review and meta-analysis with a structured search algorithm and prespecified eligibility criteria for retrieval of relevant studies; dominant genetic model assumptions and quantitative methods for calculating summary effect estimates from study level odds ratios; systematic assessment of bias within and between studies; and grading of the cumulative evidence by consensus criteria.

    Data sources: Medline, Embase, the Cochrane Library, online databases, contents pages and bibliographies of general medical, cardiovascular, pharmacological, and genetic journals.

    Eligibility criteria for selecting studies: Original full length reports assessing the cumulative incidence of major adverse cardiovascular events or stent thrombosis over a follow-up period of at least a month in association with carrier status for the loss of function or gain of function CYP2C19 allele in adult patients with coronary artery disease and a clinical presentation of acute coronary syndrome or stable angina pectoris who were taking clopidogrel.

    Results: 15 studies met the inclusion criteria. The random effects summary odds ratio for stent thrombosis in carriers of at least one CYP2C19 loss of function allele versus non-carriers combining nine studies was 1.77 (95% confidence interval 1.31 to 2.40; P < 0.001). This nominally significant odds ratio was subject to considerable bias across the studies (small study effect bias and replication diversity). The adjustment for these quality modifiers tended to abolish the association. The corresponding random effects summary odds ratio of major adverse cardiovascular events for 12 studies combined was 1.11 (0.89 to 1.39; P = 0.36). The random effects summary odds ratio of stent thrombosis in carriers versus non-carriers of at least one CYP2C19*17 gain of function allele for three studies combined was 0.99 (0.60 to 1.62; P = 0.96), and the corresponding odds ratio of major adverse cardiovascular events in five studies was 0.93 (0.75 to 1.14; P = 0.48). The overall quality of epidemiological evidence was graded as low, which excludes reliable clinical assessments.

    Conclusions: Accumulated information from genetic association studies does not indicate a substantial or consistent influence of CYP2C19 gene polymorphisms on the clinical efficacy of clopidogrel. The current evidence does not support the use of individualised antiplatelet regimens guided by CYP2C19 genotype.

    BMJ (Clinical research ed.) 2011;343;d4588

  • Which platelet function test is suitable to monitor clopidogrel responsiveness? A pharmacokinetic analysis on the active metabolite of clopidogrel.

    Bouman HJ, Parlak E, van Werkum JW, Breet NJ, ten Cate H, Hackeng CM, ten Berg JM and Taubert D

    Department of Cardiology, St Antonius Hospital, Nieuwegein, the Netherlands.

    Background: Multiple platelet function tests claim to be P2Y12-pathway specific and capable of capturing the biological activity of clopidogrel.

    Objectives: The aim of the present study was to determine which platelet function test provides the best reflection of the in vivo plasma levels of the active metabolite of clopidogrel (AMC).

    Methods: Clopidogrel-naive patients scheduled for elective percutaneous coronary intervention (PCI) received a 600 mg loading dose of clopidogrel and 100 mg of aspirin. For pharmacokinetic analysis, blood was drawn at 0, 20, 40, 60, 90, 120, 180, 240 and 360 min after clopidogrel loading and peak plasma concentrations (Cmax) of the AMC were quantified with liquid chromatography-tandem mass spectrometry (LC-MS/MS). Platelet function testing was performed at baseline and 360 min after the clopidogrel loading.

    Results: The VASP-assay, the VerifyNow P2Y12-assay and 20 μmol/L adenosine diphosphate (ADP)-induced light transmittance aggregometry (LTA) showed strong correlations with Cmax of the AMC (VASP: R² = 0.56, P < 0.001; VerifyNow platelet reactivity units (PRU): R² = 0.48, P < 0.001; VerifyNow %inhibition: R² = 0.59, P < 0.001; 20 μmol/L ADP-induced LTA: R² = 0.47, P < 0.001). Agreement with Cmax of the AMC was less evident for 5 μmol/L ADP-induced LTA or whole blood aggregometry (WBA), whereas the IMPACT-R ADP test did not show any correlation with plasma levels of the AMC.

    Conclusion: The flow cytometric VASP-assay, the VerifyNow P2Y12 assay and, although to a lesser extent, 20 μmol/L ADP-induced LTA correlate best with the maximal plasma level of the AMC, suggesting these may be the preferred platelet function tests for monitoring the responsiveness to clopidogrel.

    Journal of thrombosis and haemostasis : JTH 2010;8;3;482-8

  • The influence of variation in the P2Y12 receptor gene on in vitro platelet inhibition with the direct P2Y12 antagonist cangrelor.

    Bouman HJ, van Werkum JW, Rudez G, Leebeek FW, Kruit A, Hackeng CM, Ten Berg JM, de Maat MP and Ruven HJ

    Department of Cardiology, St Antonius Hospital, P.O. Box 2500, 3435 CM Nieuwegein, The Netherlands. heleenbouman@gmail.com

    Novel P2Y12 inhibitors are in development to overcome the occurrence of atherothrombotic events associated with poor responsiveness to the widely used P2Y12 inhibitor clopidogrel. Cangrelor is an intravenously administered P2Y12 inhibitor that does not need metabolic conversion to an active metabolite for its antiplatelet action, and as a consequence exhibits a more potent and consistent antiplatelet profile as compared to clopidogrel. It was the objective of this study to determine the contribution of variation in the P2Y12 receptor gene to platelet aggregation after in vitro partial P2Y12 receptor blockade with the direct antagonist cangrelor. Optical aggregometry was performed at baseline and after in vitro addition of 0.05 and 0.25 microM cangrelor to the platelet-rich plasma of 254 healthy subjects. Five haplotype-tagging (ht)-SNPs covering the entire P2Y12 receptor gene were genotyped (rs6798347C>t, rs6787801T>c, rs9859552C>a, rs6801273A>g and rs2046934T>c [T744C]) and haplotypes were inferred. The minor c allele of SNP rs6787801 was associated with a 5% lower 20 microM ADP-induced peak platelet aggregation (0.05 microM cangrelor, p<0.05). Aa homozygotes for SNP rs9859552 showed 20% and 17% less inhibition of platelet aggregation with cangrelor when compared to CC homozygotes (0.05 and 0.25 microM cangrelor respectively; p<0.05). Results of the haplotype analyses were consistent with those of the single SNPs. Polymorphisms of the P2Y12 receptor gene contribute significantly to the interindividual variability in platelet inhibition after partial in vitro blockade with the P2Y12 antagonist cangrelor.

    Thrombosis and haemostasis 2010;103;2;379-86

  • Cytochrome P-450 polymorphisms and response to clopidogrel.

    Taubert D, Bouman HJ and van Werkum JW

    The New England journal of medicine 2009;360;21;2249-50; author reply 2251

Lu Chen

- Postdoctoral Fellow

Lu graduated with a Bachelor of Bioengineering from Huaqiao University and a Master of Science in Zoology from Xiamen University, China. He obtained his PhD in Biochemistry at the University of Bath in 2011. He then joined Nicole Soranzo’s group at the Wellcome Trust Sanger Institute and the Department of Haematology at the University of Cambridge as a postdoctoral fellow.

Research

Lu has been working mainly on three projects as a member of the BLUEPRINT and UK10K consortia: (1) The effect of common sequence variation on the epigenome landscape using deep genetic, transcriptomic and epigenetic data of monocytes, neutrophils and CD4+ T cells from 200 healthy individuals in the BLUEPRINT project. (2) Transcriptional diversity during lineage commitment of human blood progenitors in the BLUEPRINT project. (3) Genome-wide association study of nine quantitative traits (Hgb, MCV, PCV, PLT, RBC, WBC, MCH, MCHC and IL6) using whole-genome sequencing from the UK10K project.

References

  • Transcriptional diversity during lineage commitment of human blood progenitors.

    Chen L, Kostadima M, Martens JH, Canu G, Garcia SP, Turro E, Downes K, Macaulay IC, Bielczyk-Maczynska E, Coe S, Farrow S, Poudel P, Burden F, Jansen SB, Astle WJ, Attwood A, Bariana T, de Bono B, Breschi A, Chambers JC, BRIDGE Consortium, Choudry FA, Clarke L, Coupland P, van der Ent M, Erber WN, Jansen JH, Favier R, Fenech ME, Foad N, Freson K, van Geet C, Gomez K, Guigo R, Hampshire D, Kelly AM, Kerstens HH, Kooner JS, Laffan M, Lentaigne C, Labalette C, Martin T, Meacham S, Mumford A, Nürnberg S, Palumbo E, van der Reijden BA, Richardson D, Sammut SJ, Slodkowicz G, Tamuri AU, Vasquez L, Voss K, Watt S, Westbury S, Flicek P, Loos R, Goldman N, Bertone P, Read RJ, Richardson S, Cvejic A, Soranzo N, Ouwehand WH, Stunnenberg HG, Frontini M and Rendon A

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge CB2 0PT, UK. National Health Service (NHS) Blood and Transplant, Cambridge Biomedical Campus, Cambridge CB2 0PT, UK.

    Blood cells derive from hematopoietic stem cells through stepwise fating events. To characterize gene expression programs driving lineage choice, we sequenced RNA from eight primary human hematopoietic progenitor populations representing the major myeloid commitment stages and the main lymphoid stage. We identified extensive cell type-specific expression changes: 6711 genes and 10,724 transcripts, enriched in non-protein-coding elements at early stages of differentiation. In addition, we found 7881 novel splice junctions and 2301 differentially used alternative splicing events, enriched in genes involved in regulatory processes. We demonstrated experimentally cell-specific isoform usage, identifying nuclear factor I/B (NFIB) as a regulator of maturation of the megakaryocyte, the platelet precursor. Our data highlight the complexity of fating events in closely related progenitor populations, the understanding of which is essential for the advancement of transplantation and regenerative medicine.

    Funded by: British Heart Foundation: FS/12/27/29405, RG/09/12/28096, RP-PG-0310-1002; Cancer Research UK: 14953, C45041/A14953; Medical Research Council: MR/K023489/1; Wellcome Trust: 082961/Z/07/Z, 084183/Z/07/Z, 095908, WT091310, WT098051

    Science (New York, N.Y.) 2014;345;6204;1251033

  • An atlas of genetic influences on human blood metabolites.

    Shin SY, Fauman EB, Petersen AK, Krumsiek J, Santos R, Huang J, Arnold M, Erte I, Forgetta V, Yang TP, Walter K, Menni C, Chen L, Vasquez L, Valdes AM, Hyde CL, Wang V, Ziemek D, Roberts P, Xi L, Grundberg E, Multiple Tissue Human Expression Resource (MuTHER) Consortium, Waldenberger M, Richards JB, Mohney RP, Milburn MV, John SL, Trimmer J, Theis FJ, Overington JP, Suhre K, Brosnan MJ, Gieger C, Kastenmüller G, Spector TD and Soranzo N

    Department of Human Genetics, Wellcome Trust Sanger Institute, Hinxton, UK.

    Genome-wide association scans with high-throughput metabolic profiling provide unprecedented insights into how genetic variation influences metabolism and complex disease. Here we report the most comprehensive exploration of genetic loci influencing human metabolism thus far, comprising 7,824 adult individuals from 2 European population studies. We report genome-wide significant associations at 145 metabolic loci and their biochemical connectivity with more than 400 metabolites in human blood. We extensively characterize the resulting in vivo blueprint of metabolism in human blood by integrating it with information on gene expression, heritability and overlap with known loci for complex disorders, inborn errors of metabolism and pharmacological targets. We further developed a database and web-based resources for data mining and results visualization. Our findings provide new insights into the role of inherited variation in blood metabolic diversity and identify potential new opportunities for drug development and for understanding disease.

    Funded by: Wellcome Trust: WT091310, WT098051

    Nature genetics 2014;46;6;543-50

  • Correcting for differential transcript coverage reveals a strong relationship between alternative splicing and organism complexity.

    Chen L, Bush SJ, Tovar-Corona JM, Castillo-Morales A and Urrutia AO

    Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom.

    What at the genomic level underlies organism complexity? Although several genomic features have been associated with organism complexity, in the case of alternative splicing, which has long been proposed to explain the variation in complexity, no such link has been established. Here, we analyzed over 39 million expressed sequence tags available for 47 eukaryotic species with fully sequenced genomes to obtain a comparable index of alternative splicing estimates, which corrects for the distorting effect of a variable number of transcripts per species--an important obstacle for comparative studies of alternative splicing. We find that alternative splicing has steadily increased over the last 1,400 My of eukaryotic evolution and is strongly associated with organism complexity, assayed as the number of cell types. Importantly, this association is not explained as a by-product of covariance between alternative splicing with other variables previously linked to complexity including gene content, protein length, proteome disorder, and protein interactivity. In addition, we found no evidence to suggest that the relationship of alternative splicing to cell type number is explained by drift due to reduced N(e) in more complex species. Taken together, our results firmly establish alternative splicing as a significant predictor of organism complexity and are, in principle, consistent with an important role of transcript diversification through alternative splicing as a means of determining a genome's functional information capacity.

    Molecular biology and evolution 2014;31;6;1402-13

  • Presence-absence variation in A. thaliana is primarily associated with genomic signatures consistent with relaxed selective constraints.

    Bush SJ, Castillo-Morales A, Tovar-Corona JM, Chen L, Kover PX and Urrutia AO

    Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom.

    The sequencing of multiple genomes of the same plant species has revealed polymorphic gene and exon loss. Genes associated with disease resistance are overrepresented among those showing structural variations, suggesting an adaptive role for gene and exon presence-absence variation (PAV). To shed light on the possible functional relevance of polymorphic coding region loss and the mechanisms driving this process, we characterized genes that have lost entire exons or their whole coding regions in 17 fully sequenced Arabidopsis thaliana accessions. We found that although a significant enrichment in genes associated with certain functional categories is observed, PAV events are largely restricted to genes with signatures of reduced essentiality: PAV genes tend to be newer additions to the genome, tissue specific, and lowly expressed. In addition, PAV genes are located in regions of lower gene density and higher transposable element density. Partial coding region PAV events were associated with only a marginal reduction in gene expression level in the affected accession and occurred in genes with higher levels of alternative splicing in the Col-0 accession. Together, these results suggest that although adaptive scenarios cannot be ruled out, PAV events can be explained without invoking them.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/F022697/1

    Molecular biology and evolution 2014;31;1;59-69

  • A rare variant in APOC3 is associated with plasma triglyceride and VLDL levels in Europeans.

    Timpson NJ, Walter K, Min JL, Tachmazidou I, Malerba G, Shin SY, Chen L, Futema M, Southam L, Iotchkova V, Cocca M, Huang J, Memari Y, McCarthy S, Danecek P, Muddyman D, Mangino M, Menni C, Perry JR, Ring SM, Gaye A, Dedoussis G, Farmaki AE, Burton P, Talmud PJ, Gambaro G, Spector TD, Smith GD, Durbin R, Richards JB, Humphries SE, Zeggini E, Soranzo N and UK10K Consortium Members

    MRC Integrative Epidemiology Unit at the University of Bristol, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, UK.

    The analysis of rich catalogues of genetic variation from population-based sequencing provides an opportunity to screen for functional effects. Here we report a rare variant in APOC3 (rs138326449-A, minor allele frequency ~0.25% (UK)) associated with plasma triglyceride (TG) levels (-1.43 s.d. (s.e.=0.27) per minor allele, P-value=8.0 × 10(-8)) discovered in 3,202 individuals with low read-depth, whole-genome sequence. We replicate this in 12,831 participants from five additional samples of Northern and Southern European origin (-1.0 s.d. (s.e.=0.173), P-value=7.32 × 10(-9)). This is consistent with an effect between 0.5 and 1.5 mmol l(-1) dependent on population. We show that a single predicted splice donor variant is responsible for association signals and is independent of known common variants. Analyses suggest an independent relationship between rs138326449 and high-density lipoprotein (HDL) levels. This represents one of the first examples of a rare, large effect variant identified from whole-genome sequencing at a population scale.

    Funded by: British Heart Foundation: PG008/08; Medical Research Council: G1001799, MC_UU_12013/1-9, MC_UU_12013/3; Wellcome Trust: 076113, 091310, 092731, 095219, 098051, WT091310, WT095219MA, WT098051

    Nature communications 2014;5;4871

  • Evidence for deep phylogenetic conservation of exonic splice-related constraints: splice-related skews at exonic ends in the brown alga Ectocarpus are common and resemble those seen in humans.

    Wu X, Tronholm A, Cáceres EF, Tovar-Corona JM, Chen L, Urrutia AO and Hurst LD

    Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom.

    The control of RNA splicing is often modulated by exonic motifs near splice sites. Chief among these are exonic splice enhancers (ESEs). Well-described ESEs in mammals are purine rich and cause predictable skews in codon and amino acid usage toward exonic ends. Looking across species, those with relatively abundant intronic sequence are those with the more profound end of exon skews, indicative of exonization of splice site recognition. To date, the only intron-rich species that have been analyzed are mammals, precluding any conclusions about the likely ancestral condition. Here, we examine the patterns of codon and amino acid usage in the vicinity of exon-intron junctions in the brown alga Ectocarpus siliculosus, a species with abundant large introns, known SR proteins, and classical splice sites. We find that amino acids and codons preferred/avoided at both 3' and 5' ends in Ectocarpus, of which there are many, tend, on average, to also be preferred/avoided at the same exon ends in humans. Moreover, the preferences observed at the 5' ends of exons are largely the same as those at the 3' ends, a symmetry trend only previously observed in animals. We predict putative hexameric ESEs in Ectocarpus and show that these are purine rich and that there are many more of these identified as functional ESEs in humans than expected by chance. These results are consistent with deep phylogenetic conservation of SR protein binding motifs. Assuming codons preferred near boundaries are "splice optimal" codons, in Ectocarpus, unlike Drosophila, splice optimal and translationally optimal codons are not mutually exclusive. The exclusivity of translationally optimal and splice optimal codon sets is thus not universal.

    Genome biology and evolution 2013;5;9;1731-45

  • Alternative splicing: a potential source of functional innovation in the eukaryotic genome.

    Chen L, Tovar-Corona JM and Urrutia AO

    Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK.

    Alternative splicing (AS) is a common posttranscriptional process in eukaryotic organisms, by which multiple distinct functional transcripts are produced from a single gene. The release of the human genome draft revealed a much smaller number of genes than anticipated. Because of its potential role in expanding protein diversity, interest in alternative splicing has been increasing over the last decade. Although recent studies have shown that 94% of human multiexon genes undergo AS, the evolution of AS and thus its potential role in functional innovation in eukaryotic genomes remain largely unexplored. Here we review available evidence regarding the evolution of AS prevalence and functional role. In addition we stress the need to correct for the strong effect of transcript coverage in AS detection and set out a strategy to ultimately elucidate the extent of the role of AS in functional innovation on a genomic scale.

    International journal of evolutionary biology 2012;2012;596274

  • Increased levels of noisy splicing in cancers, but not for oncogene-derived transcripts.

    Chen L, Tovar-Corona JM and Urrutia AO

    Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK.

    Recent genome-wide analyses have detected numerous cancer-specific alternative splicing (AS) events. Whether transcripts containing cancer-specific AS events are likely to be translated into functional proteins or simply reflect noisy splicing, thereby determining their clinical relevance, is not known. Here we show that consistent with a noisy-splicing model, cancer-specific AS events generally tend to be rare, containing more premature stop codons and have less identifiable functional domains in both the human and mouse. Interestingly, common cancer-derived AS transcripts from tumour suppressor and oncogenes show marked changes in premature stop-codon frequency; with tumour suppressor genes exhibiting increased levels of premature stop codons whereas oncogenes have the opposite pattern. We conclude that tumours tend to have faithful oncogene splicing and a higher incidence of premature stop codons among tumour suppressor and cancer-specific splice variants showing the importance of considering splicing noise when analysing cancer-specific splicing changes.

    Human molecular genetics 2011;20;22;4422-9

  • Spatiotemporal expression of Pax genes in amphioxus: insights into Pax-related organogenesis and evolution.

    Chen L, Zhang Q, Wang W and Wang Y

    Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, School of Life Sciences, Xiamen University, Xiamen, China.

    The expression of four AmphiPax genes in 16 developmental stages and different organs in amphioxus (Branchiostoma belcheri) was investigated, finding those genes expressed throughout amphioxus life with temporal-specific (especially during embryogenesis and metamorphosis) and spatial-specific patterns. This study suggests that duplicated Pax genes in vertebrates might maintain most of their ancestral functions and also expand their expression patterns after the divergence of protochordates and vertebrates.

    Science China. Life sciences 2010;53;8;1031-40

Heather Elding

- Postdoctoral Fellow

Heather graduated with a B.Sc Honours in Human Genetics from University College London, UK in 2010, where she continued on to read a PhD in Genetic Epidemiology at the same university. Her PhD work focused on dissecting the genetics of complex traits, with particular interest in investigating the genetics of Inflammatory Bowel Disease.

Research

Heather joined the Soranzo Team in October 2014 as a Postdoctoral Fellow to work on the INTERVAL blood donations project and on UK Biobank.

References

  • Refinement in localization and identification of gene regions associated with Crohn disease.

    Elding H, Lau W, Swallow DM and Maniatis N

    Research Department of Genetics Evolution & Environment, University College London, Gower Street, London WC1E 6BT, UK.

    The risk of Crohn disease (CD) has a large genetic component. A recent meta-analysis of 6 genome-wide association studies reported 71 chromosomal intervals but does not account for all of the known genetic contribution. Here, we refine localization of the previously reported intervals and also identify additional CD susceptibility genes using a mapping approach that localizes causal variants based on genetic maps in linkage disequilibrium units (LDU maps). Using 2 of the 6 cohorts, 66 of the 71 previously reported loci are confirmed and more precise location estimates for these intervals are given. We identify 78 additional gene regions that pass genome-wide significance, providing strong evidence for 144 genes. Additionally, 56 nominally significant signals, but with more stringent and precise colocalization, are identified. In total, we provide evidence for 200 gene regions confirming that CD is truly multifactorial and complex in nature. Many identified genes have functions that are compatible with involvement in immune/inflammatory processes and seem to have a large effect in individuals with extra ileal as well as ileal inflammation. The precise locations and the evidence that some genes reflect phenotypic subgroups will help identify functional variants and will lead to greater insight of CD etiology.

    Funded by: Wellcome Trust

    American journal of human genetics 2013;92;1;107-13

  • Dissecting the genetics of complex inheritance: linkage disequilibrium mapping provides insight into Crohn disease.

    Elding H, Lau W, Swallow DM and Maniatis N

    Research Department of Genetics, Evolution, and Environment, University College London, London, UK.

    Family studies for Crohn disease (CD) report extensive linkage on chromosome 16q and pinpoint NOD2 as a possible causative locus. However, linkage is also observed in families that do not bear the most frequent NOD2 causative mutations, but no other signals on 16q have been found so far in published genome-wide association studies. Our aim is to identify this missing genetic contribution. We apply a powerful genetic mapping approach to the Wellcome Trust Case-Control Consortium and the National Institute of Diabetes and Digestive and Kidney Diseases genome-wide association data on CD. This method takes into account the underlying structure of linkage disequilibrium (LD) by using genetic distances from LD maps and provides a location for the causal agent. We find genetic heterogeneity within the NOD2 locus and also show an independent and unsuspected involvement of the neighboring gene, CYLD. We find associations with the IRF8 region and the region containing CDH1 and CDH3, as well as substantial phenotypic and genetic heterogeneity for CD itself. The genes are known to be involved in inflammation and immune dysregulation. These findings provide insight into the genetics of CD and suggest promising directions for understanding disease heterogeneity. The application of this method thus paves the way for understanding complex inheritance in general, leading to the dissection of different pathways and ultimately, personalized treatment.

    Funded by: Wellcome Trust

    American journal of human genetics 2011;89;6;798-805

Valentina Iotchkova

- EBI-Sanger Postdoctoral Fellow

Valentina Iotchkova studied Mathematics and Statistics (2004-2009), followed by a DPhil in Statistical Genetics (2009-2013) in Dr Jonathan Marchini's group at the University of Oxford on Bayesian method development for multivariate phenotype analysis for genome-wide association studies.

Research

She is currently an EBI-Sanger postdoctoral fellow (ESPOD) shared between Dr Nicole Soranzo's group at Sanger and Dr Ewan Birney's group at EBI, working on modelling the genetic-epigenetic regulatory pathways in hematopoietic cells. Her interests lie in statistical method development for the analysis of high-dimensional datasets, in the quest to unravel the genetic basis of phenotypic variation.

References

  • A rare variant in APOC3 is associated with plasma triglyceride and VLDL levels in Europeans.

    Timpson NJ, Walter K, Min JL, Tachmazidou I, Malerba G, Shin SY, Chen L, Futema M, Southam L, Iotchkova V, Cocca M, Huang J, Memari Y, McCarthy S, Danecek P, Muddyman D, Mangino M, Menni C, Perry JR, Ring SM, Gaye A, Dedoussis G, Farmaki AE, Burton P, Talmud PJ, Gambaro G, Spector TD, Smith GD, Durbin R, Richards JB, Humphries SE, Zeggini E, Soranzo N and UK10K Consortium Members

    MRC Integrative Epidemiology Unit at the University of Bristol, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, UK.

    The analysis of rich catalogues of genetic variation from population-based sequencing provides an opportunity to screen for functional effects. Here we report a rare variant in APOC3 (rs138326449-A, minor allele frequency ~0.25% (UK)) associated with plasma triglyceride (TG) levels (-1.43 s.d. (s.e.=0.27) per minor allele, P-value=8.0 × 10(-8)) discovered in 3,202 individuals with low read-depth, whole-genome sequence. We replicate this in 12,831 participants from five additional samples of Northern and Southern European origin (-1.0 s.d. (s.e.=0.173), P-value=7.32 × 10(-9)). This is consistent with an effect between 0.5 and 1.5 mmol l(-1) dependent on population. We show that a single predicted splice donor variant is responsible for association signals and is independent of known common variants. Analyses suggest an independent relationship between rs138326449 and high-density lipoprotein (HDL) levels. This represents one of the first examples of a rare, large effect variant identified from whole-genome sequencing at a population scale.

    Funded by: British Heart Foundation: PG008/08; Medical Research Council: G1001799, MC_UU_12013/1-9, MC_UU_12013/3; Wellcome Trust: 076113, 091310, 092731, 095219, 098051, WT091310, WT095219MA, WT098051

    Nature communications 2014;5;4871

Daniel Mead

- Project Manager

I graduated from the University of Aberdeen with a BSc in Genetics and Immunology in 2004, before taking a year out to help the family pub business in Yorkshire. I began my scientific career working as a Forensic DNA Analyst and Robot Support Technician at the Forensic Science Service in Huntingdon for 3 years, before joining the Kwiatkowski Group in 2008, which I left in late 2014. I now work for Nicole Soranzo in the Genomics of Quantitative Variation group as a project manager.

Research

My work as a Project Manager is to act as the primary contact for managing Team 151 project delivery with respect to its activities. My role is to monitor, manage and report on the progress of samples and data sets through the production pipelines and to help deliver, in a timely fashion, a demanding and high visibility science project.

I am required to maintain a high quality routine of project management and reporting in a complex production and analysis system that contains a large number of processes, steps, participating teams, requirements, and end-users.

References

  • Whole-Genome Scans Provide Evidence of Adaptive Evolution in Malawian Plasmodium falciparum Isolates.

    Ocholla H, Preston MD, Mipando M, Jensen AT, Campino S, MacInnis B, Alcock D, Terlouw A, Zongo I, Ouedraogo JB, Djimde AA, Assefa S, Doumbo OK, Borrmann S, Nzila A, Marsh K, Fairhurst RM, Nosten F, Anderson TJ, Kwiatkowski DP, Craig A, Clark TG and Montgomery J

    Malawi-Liverpool-Wellcome Trust Clinical Research Programme Liverpool School of Tropical Medicine, Pembroke Place, Liverpool.

    Background: Selection by host immunity and antimalarial drugs has driven extensive adaptive evolution in Plasmodium falciparum and continues to produce ever-changing landscapes of genetic variation.

    Methods: We performed whole-genome sequencing of 69 P. falciparum isolates from Malawi and used population genetics approaches to investigate genetic diversity and population structure and identify loci under selection.

    Results: High genetic diversity (π = 2.4 × 10(-4)), moderately high multiplicity of infection (2.7), and low linkage disequilibrium (500-bp) were observed in Chikhwawa District, Malawi, an area of high malaria transmission. Allele frequency-based tests provided evidence of recent population growth in Malawi and detected potential targets of host immunity and candidate vaccine antigens. Comparison of the sequence variation between isolates from Malawi and those from 5 geographically dispersed countries (Kenya, Burkina Faso, Mali, Cambodia, and Thailand) detected population genetic differences between Africa and Asia, within Southeast Asia, and within Africa. Haplotype-based tests of selection to sequence data from all 6 populations identified signals of directional selection at known drug-resistance loci, including pfcrt, pfdhps, pfmdr1, and pfgch1.

    Conclusions: The sequence variations observed at drug-resistance loci reflect differences in each country's historical use of antimalarial drugs and may be useful in formulating local malaria treatment guidelines.

    The Journal of infectious diseases 2014;210;12;1991-2000

  • Optimized Whole-Genome Amplification Strategy for Extremely AT-Biased Template.

    Oyola SO, Manske M, Campino S, Claessens A, Hamilton WL, Kekre M, Drury E, Mead D, Gu Y, Miles A, MacInnis B, Newbold C, Berriman M and Kwiatkowski DP

    Wellcome Trust Sanger Institute, Hinxton, UK. so1@sanger.ac.uk

    Pathogen genome sequencing directly from clinical samples is quickly gaining importance in genetic and medical research studies. However, low DNA yield from blood-borne pathogens is often a limiting factor. The problem worsens in extremely base-biased genomes such as the AT-rich Plasmodium falciparum. We present a strategy for whole-genome amplification (WGA) of low-yield samples from P. falciparum prior to short-read sequencing. We have developed WGA conditions that incorporate tetramethylammonium chloride for improved amplification and coverage of AT-rich regions of the genome. We show that this method reduces amplification bias and chimera formation. Our data show that this method is suitable for as low as 10 pg input DNA, and offers the possibility of sequencing the parasite genome from small blood samples.

    DNA research : an international journal for rapid publication of reports on genes and genomes 2014

  • A genome wide association study of Plasmodium falciparum susceptibility to 22 antimalarial drugs in Kenya.

    Wendler JP, Okombo J, Amato R, Miotto O, Kiara SM, Mwai L, Pole L, O'Brien J, Manske M, Alcock D, Drury E, Sanders M, Oyola SO, Malangone C, Jyothi D, Miles A, Rockett KA, MacInnis BL, Marsh K, Bejon P, Nzila A and Kwiatkowski DP

    Medical Research Council (MRC) Centre for Genomics and Global Health, University of Oxford, Oxford, United Kingdom; Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom; Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom.

    Background: Drug resistance remains a chief concern for malaria control. In order to determine the genetic markers of drug resistant parasites, we tested the genome-wide associations (GWA) of sequence-based genotypes from 35 Kenyan P. falciparum parasites with the activities of 22 antimalarial drugs.

    Parasites isolated from children with acute febrile malaria were adapted to culture, and sensitivity was determined by in vitro growth in the presence of anti-malarial drugs. Parasites were genotyped using whole genome sequencing techniques. Associations between 6250 single nucleotide polymorphisms (SNPs) and resistance to individual anti-malarial agents were determined, with false discovery rate adjustment for multiple hypothesis testing. We identified expected associations in the pfcrt region with chloroquine (CQ) activity, and other novel loci associated with amodiaquine, quinazoline, and quinine activities. Signals for CQ and primaquine (PQ) overlap in and around pfcrt, and interestingly the phenotypes are inversely related for these two drugs. We catalog the variation in dhfr, dhps, mdr1, nhe, and crt, including novel SNPs, and confirm the presence of a dhfr-164L quadruple mutant in coastal Kenya. Mutations implicated in sulfadoxine-pyrimethamine resistance are at or near fixation in this sample set.

    Sequence-based GWA studies are powerful tools for phenotypic association tests. Using this approach on falciparum parasites from coastal Kenya we identified known and previously unreported genes associated with phenotypic resistance to anti-malarial drugs, and observe in high-resolution haplotype visualizations a possible signature of an inverse selective relationship between CQ and PQ.

    Funded by: Medical Research Council: G0600718, G1002624; Wellcome Trust: 090532/Z/09/Z, 090770/Z/09/Z, 098051

    PloS one 2014;9;5;e96486

  • Multiple populations of artemisinin-resistant Plasmodium falciparum in Cambodia.

    Miotto O, Almagro-Garcia J, Manske M, Macinnis B, Campino S, Rockett KA, Amaratunga C, Lim P, Suon S, Sreng S, Anderson JM, Duong S, Nguon C, Chuor CM, Saunders D, Se Y, Lon C, Fukuda MM, Amenga-Etego L, Hodgson AV, Asoala V, Imwong M, Takala-Harrison S, Nosten F, Su XZ, Ringwald P, Ariey F, Dolecek C, Hien TT, Boni MF, Thai CQ, Amambua-Ngwa A, Conway DJ, Djimdé AA, Doumbo OK, Zongo I, Ouedraogo JB, Alcock D, Drury E, Auburn S, Koch O, Sanders M, Hubbart C, Maslen G, Ruano-Rubio V, Jyothi D, Miles A, O'Brien J, Gamble C, Oyola SO, Rayner JC, Newbold CI, Berriman M, Spencer CC, McVean G, Day NP, White NJ, Bethell D, Dondorp AM, Plowe CV, Fairhurst RM and Kwiatkowski DP

    Medical Research Council MRC Centre for Genomics and Global Health, University of Oxford, Oxford, UK.

    We describe an analysis of genome variation in 825 P. falciparum samples from Asia and Africa that identifies an unusual pattern of parasite population structure at the epicenter of artemisinin resistance in western Cambodia. Within this relatively small geographic area, we have discovered several distinct but apparently sympatric parasite subpopulations with extremely high levels of genetic differentiation. Of particular interest are three subpopulations, all associated with clinical resistance to artemisinin, which have skewed allele frequency spectra and high levels of haplotype homozygosity, indicative of founder effects and recent population expansion. We provide a catalog of SNPs that show high levels of differentiation in the artemisinin-resistant subpopulations, including codon variants in transporter proteins and DNA mismatch repair proteins. These data provide a population-level genetic framework for investigating the biological origins of artemisinin resistance and for defining molecular markers to assist in its elimination.

    Funded by: Howard Hughes Medical Institute: 55005502; Medical Research Council: G0600718, G19/9, MC_U190081987; Wellcome Trust: 082370, 089275, 089276, 090532, 090532/Z/09/Z, 090770, 090770/Z/09/Z, 093956, 098051, G0600718

    Nature genetics 2013;45;6;648-55

  • Efficient depletion of host DNA contamination in malaria clinical sequencing.

    Oyola SO, Gu Y, Manske M, Otto TD, O'Brien J, Alcock D, Macinnis B, Berriman M, Newbold CI, Kwiatkowski DP, Swerdlow HP and Quail MA

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom. Samuel.oyola@sanger.ac.uk

    The cost of whole-genome sequencing (WGS) is decreasing rapidly as next-generation sequencing technology continues to advance, and the prospect of making WGS available for public health applications is becoming a reality. So far, a number of studies have demonstrated the use of WGS as an epidemiological tool for typing and controlling outbreaks of microbial pathogens. Success of these applications is hugely dependent on efficient generation of clean genetic material that is free from host DNA contamination for rapid preparation of sequencing libraries. The presence of large amounts of host DNA severely affects the efficiency of characterizing pathogens using WGS and is therefore a serious impediment to clinical and epidemiological sequencing for health care and public health applications. We have developed a simple enzymatic treatment method that takes advantage of the methylation of human DNA to selectively deplete host contamination from clinical samples prior to sequencing. Using malaria clinical samples with over 80% human host DNA contamination, we show that the enzymatic treatment enriches Plasmodium falciparum DNA up to ∼9-fold and generates high-quality, nonbiased sequence reads covering >98% of 86,158 catalogued typeable single-nucleotide polymorphism loci.

    Funded by: Medical Research Council: G19/9; Wellcome Trust: 079355/Z/06/Z, 090532

    Journal of clinical microbiology 2013;51;3;745-51

  • Effective preparation of Plasmodium vivax field isolates for high-throughput whole genome sequencing.

    Auburn S, Marfurt J, Maslen G, Campino S, Ruano Rubio V, Manske M, Machunter B, Kenangalem E, Noviyanti R, Trianty L, Sebayang B, Wirjanata G, Sriprawat K, Alcock D, Macinnis B, Miotto O, Clark TG, Russell B, Anstey NM, Nosten F, Kwiatkowski DP and Price RN

    Global and Tropical Health Division, Menzies School of Health Research, Charles Darwin University, Darwin, Australia. sarah.auburn@menzies.edu.au

    Whole genome sequencing (WGS) of Plasmodium vivax is problematic due to the reliance on clinical isolates which are generally low in parasitaemia and sample volume. Furthermore, clinical isolates contain a significant contaminating background of host DNA which confounds efforts to map short read sequence of the target P. vivax DNA. Here, we discuss a methodology to significantly improve the success of P. vivax WGS on natural (non-adapted) patient isolates. Using 37 patient isolates from Indonesia, Thailand, and travellers, we assessed the application of CF11-based white blood cell filtration alone and in combination with short term ex vivo schizont maturation. Although CF11 filtration reduced human DNA contamination in 8 Indonesian isolates tested, additional short-term culture increased the P. vivax DNA yield from a median of 0.15 to 6.2 ng µl⁻¹ packed red blood cells (pRBCs) (p = 0.001) and reduced the human DNA percentage from a median of 33.9% to 6.22% (p = 0.008). Furthermore, post-CF11 and culture samples from Thailand gave a median P. vivax DNA yield of 2.34 ng µl⁻¹ pRBCs, and 2.65% human DNA. In 22 P. vivax patient isolates prepared with the 2-step method, we demonstrate high depth (median 654X coverage) and breadth (≥89%) of coverage on the Illumina GAII and HiSeq platforms. In contrast to the A+T-rich P. falciparum genome, negligible bias was observed in coverage depth between coding and non-coding regions of the P. vivax genome. This uniform coverage will greatly facilitate the detection of SNPs and copy number variants across the genome, enabling unbiased exploration of the natural diversity in P. vivax populations.

    Funded by: Medical Research Council: G19/9; Wellcome Trust: 089275, 090532, 091625

    PloS one 2013;8;1;e53160

  • Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing.

    Manske M, Miotto O, Campino S, Auburn S, Almagro-Garcia J, Maslen G, O'Brien J, Djimde A, Doumbo O, Zongo I, Ouedraogo JB, Michon P, Mueller I, Siba P, Nzila A, Borrmann S, Kiara SM, Marsh K, Jiang H, Su XZ, Amaratunga C, Fairhurst R, Socheat D, Nosten F, Imwong M, White NJ, Sanders M, Anastasi E, Alcock D, Drury E, Oyola S, Quail MA, Turner DJ, Ruano-Rubio V, Jyothi D, Amenga-Etego L, Hubbart C, Jeffreys A, Rowlands K, Sutherland C, Roper C, Mangano V, Modiano D, Tan JC, Ferdig MT, Amambua-Ngwa A, Conway DJ, Takala-Harrison S, Plowe CV, Rayner JC, Rockett KA, Clark TG, Newbold CI, Berriman M, MacInnis B and Kwiatkowski DP

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

    Malaria elimination strategies require surveillance of the parasite population for genetic changes that demand a public health response, such as new forms of drug resistance. Here we describe methods for the large-scale analysis of genetic variation in Plasmodium falciparum by deep sequencing of parasite DNA obtained from the blood of patients with malaria, either directly or after short-term culture. Analysis of 86,158 exonic single nucleotide polymorphisms that passed genotyping quality control in 227 samples from Africa, Asia and Oceania provides genome-wide estimates of allele frequency distribution, population structure and linkage disequilibrium. By comparing the genetic diversity of individual infections with that of the local parasite population, we derive a metric of within-host diversity that is related to the level of inbreeding in the population. An open-access web application has been established for the exploration of regional differences in allele frequency and of highly differentiated loci in the P. falciparum genome.

    Funded by: Howard Hughes Medical Institute: 55005502; Medical Research Council: G0600718, G19/9; Wellcome Trust: 075491/Z/04, 077012/Z/05/Z, 082370, 089275, 090532, 090532/Z/09/Z, 090770, 090770/Z/09/Z, 092654, 093956, 098051

    Nature 2012;487;7407;375-9

  • An effective method to purify Plasmodium falciparum DNA directly from clinical blood samples for whole genome high-throughput sequencing.

    Auburn S, Campino S, Clark TG, Djimde AA, Zongo I, Pinches R, Manske M, Mangano V, Alcock D, Anastasi E, Maslen G, Macinnis B, Rockett K, Modiano D, Newbold CI, Doumbo OK, Ouédraogo JB and Kwiatkowski DP

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom. sa3@sanger.ac.uk

    Highly parallel sequencing technologies permit cost-effective whole genome sequencing of hundreds of Plasmodium parasites. The ability to sequence clinical Plasmodium samples, extracted directly from patient blood without a culture step, presents a unique opportunity to sample the diversity of "natural" parasite populations in high resolution clinical and epidemiological studies. A major challenge to sequencing clinical Plasmodium samples is the abundance of human DNA, which may substantially reduce the yield of Plasmodium sequence. We tested a range of human white blood cell (WBC) depletion methods on P. falciparum-infected patient samples in search of a method displaying an optimal balance of WBC-removal efficacy, cost, simplicity, and applicability to low resource settings. In the first of a two-part study, combinations of three different WBC depletion methods were tested on 43 patient blood samples in Mali. A two-step combination of Lymphoprep plus Plasmodipur best fitted our requirements, although moderate variability was observed in human DNA quantity. This approach was further assessed in a larger sample of 76 patients from Burkina Faso. WBC-removal efficacy remained high (<30% human DNA in >70% samples) and lower variation was observed in human DNA quantities. In order to assess the Plasmodium sequence yield at different human DNA proportions, 59 samples with up to 60% human DNA contamination were sequenced on the Illumina Genome Analyzer platform. An average ~40-fold coverage of the genome was observed per lane for samples with ≤ 30% human DNA. Even in low resource settings, using a simple two-step combination of Lymphoprep plus Plasmodipur, over 70% of clinical sample preparations should exhibit sufficiently low human DNA quantities to enable ~40-fold sequence coverage of the P. falciparum genome using a single lane on the Illumina Genome Analyzer platform. This approach should greatly facilitate large-scale clinical and epidemiologic studies of P. falciparum.

    Funded by: Howard Hughes Medical Institute: 55005502; Medical Research Council: G0600718, G19/9; Wellcome Trust: 090532, 090770

    PloS one 2011;6;7;e22213

  • Genome-wide and fine-resolution association analysis of malaria in West Africa.

    Jallow M, Teo YY, Small KS, Rockett KA, Deloukas P, Clark TG, Kivinen K, Bojang KA, Conway DJ, Pinder M, Sirugo G, Sisay-Joof F, Usen S, Auburn S, Bumpstead SJ, Campino S, Coffey A, Dunham A, Fry AE, Green A, Gwilliam R, Hunt SE, Inouye M, Jeffreys AE, Mendy A, Palotie A, Potter S, Ragoussis J, Rogers J, Rowlands K, Somaskantharajah E, Whittaker P, Widden C, Donnelly P, Howie B, Marchini J, Morris A, SanJoaquin M, Achidi EA, Agbenyega T, Allen A, Amodu O, Corran P, Djimde A, Dolo A, Doumbo OK, Drakeley C, Dunstan S, Evans J, Farrar J, Fernando D, Hien TT, Horstmann RD, Ibrahim M, Karunaweera N, Kokwaro G, Koram KA, Lemnge M, Makani J, Marsh K, Michon P, Modiano D, Molyneux ME, Mueller I, Parker M, Peshu N, Plowe CV, Puijalon O, Reeder J, Reyburn H, Riley EM, Sakuntabhai A, Singhasivanon P, Sirima S, Tall A, Taylor TE, Thera M, Troye-Blomberg M, Williams TN, Wilson M, Kwiatkowski DP, Wellcome Trust Case Control Consortium and Malaria Genomic Epidemiology Network

    MRC Laboratories, Fajara, Banjul, Gambia.

    We report a genome-wide association (GWA) study of severe malaria in The Gambia. The initial GWA scan included 2,500 children genotyped on the Affymetrix 500K GeneChip, and a replication study included 3,400 children. We used this to examine the performance of GWA methods in Africa. We found considerable population stratification, and also that signals of association at known malaria resistance loci were greatly attenuated owing to weak linkage disequilibrium (LD). To investigate possible solutions to the problem of low LD, we focused on the HbS locus, sequencing this region of the genome in 62 Gambian individuals and then using these data to conduct multipoint imputation in the GWA samples. This increased the signal of association, from P = 4 × 10⁻⁷ to P = 4 × 10⁻¹⁴, with the peak of the signal located precisely at the HbS causal variant. Our findings provide proof of principle that fine-resolution multipoint imputation, based on population-specific sequencing data, can substantially boost authentic GWA signals and enable fine mapping of causal variants in African populations.

    Funded by: Chief Scientist Office: CZB/4/540, ETM/137, ETM/75; Howard Hughes Medical Institute; Medical Research Council: G0600230, G0600230(77610), G0600329, G0600718, G0800759, G19/9, G9828345, MC_U190081977, MC_U190081993; NIAID NIH HHS: U19 AI065683, U19 AI065683-04; Wellcome Trust: 061858, 064890, 076113, 076934, 077011, 077383, 077383/Z/05/Z, 081682, 089062

    Nature genetics 2009;41;6;657-65

  • A global network for investigating the genomic epidemiology of malaria.

    Malaria Genomic Epidemiology Network

    The University of Buea, PO Box 63, Buea, South West Province, Cameroon.

    Large-scale studies of genomic variation could assist efforts to eliminate malaria. But there are scientific, ethical and practical challenges to carrying out such studies in developing countries, where the burden of disease is greatest. The Malaria Genomic Epidemiology Network (MalariaGEN) is now working to overcome these obstacles, using a consortial approach that brings together researchers from 21 countries.

    Funded by: Medical Research Council: G0200454, G0200454(62635), G0600230, G0600230(77610), G0600718, G19/9; Wellcome Trust: 076934, 077383, 077383/Z/05/Z

    Nature 2008;456;7223;732-7

Nicole Soranzo

- Group Leader

Nicole graduated in biological sciences at the University of Milano, Italy, with a dissertation on plant population and evolutionary genetics. She later obtained a PhD in genetics from the University of Dundee, and undertook post-doctoral training in human population and statistical genetics at University College London, conducting applied and methodological work in evolutionary genetics and association studies. In 2005 Nicole joined the pharmacogenomics department at Johnson and Johnson Pharmaceutical Research and Development (Raritan, USA). Since 2007 she has been employed at the Wellcome Trust Sanger Institute, and since 2009 she has led her own team.

Research

Our group uses large-scale genetic association analyses to identify genetic determinants of quantitative cardiometabolic traits in deeply phenotyped population-based cohorts. More recently, we have begun using massively parallel whole-exome and whole-genome sequencing to investigate the contribution of low-frequency and rare genetic variants.

Ongoing projects focus on: 1) common and rare genetic determinants of cardiometabolic traits; 2) metabolomic genetics; and 3) genetic and epigenetic determinants of hematopoiesis.

References

  • Seventy-five genetic loci influencing the human red blood cell.

    van der Harst P, Zhang W, Mateo Leach I, Rendon A, Verweij N, Sehmi J, Paul DS, Elling U, Allayee H, Li X, Radhakrishnan A, Tan ST, Voss K, Weichenberger CX, Albers CA, Al-Hussani A, Asselbergs FW, Ciullo M, Danjou F, Dina C, Esko T, Evans DM, Franke L, Gögele M, Hartiala J, Hersch M, Holm H, Hottenga JJ, Kanoni S, Kleber ME, Lagou V, Langenberg C, Lopez LM, Lyytikäinen LP, Melander O, Murgia F, Nolte IM, O'Reilly PF, Padmanabhan S, Parsa A, Pirastu N, Porcu E, Portas L, Prokopenko I, Ried JS, Shin SY, Tang CS, Teumer A, Traglia M, Ulivi S, Westra HJ, Yang J, Zhao JH, Anni F, Abdellaoui A, Attwood A, Balkau B, Bandinelli S, Bastardot F, Benyamin B, Boehm BO, Cookson WO, Das D, de Bakker PI, de Boer RA, de Geus EJ, de Moor MH, Dimitriou M, Domingues FS, Döring A, Engström G, Eyjolfsson GI, Ferrucci L, Fischer K, Galanello R, Garner SF, Genser B, Gibson QD, Girotto G, Gudbjartsson DF, Harris SE, Hartikainen AL, Hastie CE, Hedblad B, Illig T, Jolley J, Kähönen M, Kema IP, Kemp JP, Liang L, Lloyd-Jones H, Loos RJ, Meacham S, Medland SE, Meisinger C, Memari Y, Mihailov E, Miller K, Moffatt MF, Nauck M, Novatchkova M, Nutile T, Olafsson I, Onundarson PT, Parracciani D, Penninx BW, Perseu L, Piga A, Pistis G, Pouta A, Puc U, Raitakari O, Ring SM, Robino A, Ruggiero D, Ruokonen A, Saint-Pierre A, Sala C, Salumets A, Sambrook J, Schepers H, Schmidt CO, Silljé HH, Sladek R, Smit JH, Starr JM, Stephens J, Sulem P, Tanaka T, Thorsteinsdottir U, Tragante V, van Gilst WH, van Pelt LJ, van Veldhuisen DJ, Völker U, Whitfield JB, Willemsen G, Winkelmann BR, Wirnsberger G, Algra A, Cucca F, d'Adamo AP, Danesh J, Deary IJ, Dominiczak AF, Elliott P, Fortina P, Froguel P, Gasparini P, Greinacher A, Hazen SL, Jarvelin MR, Khaw KT, Lehtimäki T, Maerz W, Martin NG, Metspalu A, Mitchell BD, Montgomery GW, Moore C, Navis G, Pirastu M, Pramstaller PP, Ramirez-Solis R, Schadt E, Scott J, Shuldiner AR, Smith GD, Smith JG, Snieder H, Sorice R, Spector TD, Stefansson K, Stumvoll M, Tang WH, Toniolo D, Tönjes A, Visscher PM, Vollenweider P, Wareham NJ, Wolffenbuttel BH, Boomsma DI, Beckmann JS, Dedoussis GV, Deloukas P, Ferreira MA, Sanna S, Uda M, Hicks AA, Penninger JM, Gieger C, Kooner JS, Ouwehand WH, Soranzo N and Chambers JC

    Department of Cardiology, University of Groningen, University Medical Center Groningen, 9700 RB Groningen, The Netherlands. p.van.der.harst@umcg.nl

    Anaemia is a chief determinant of global ill health, contributing to cognitive impairment, growth retardation and impaired physical capacity. To understand further the genetic factors influencing red blood cells, we carried out a genome-wide association study of haemoglobin concentration and related parameters in up to 135,367 individuals. Here we identify 75 independent genetic loci associated with one or more red blood cell phenotypes at P < 10⁻⁸, which together explain 4-9% of the phenotypic variance per trait. Using expression quantitative trait loci and bioinformatic strategies, we identify 121 candidate genes enriched in functions relevant to red blood cell biology. The candidate genes are expressed preferentially in red blood cell precursors, and 43 have haematopoietic phenotypes in Mus musculus or Drosophila melanogaster. Through open-chromatin and coding-variant analyses we identify potential causal genetic variants at 41 loci. Our findings provide extensive new insights into genetic mechanisms and biological pathways controlling red blood cell formation and function.

    Funded by: British Heart Foundation: RG/08/014/24067; Cancer Research UK: 14136; Chief Scientist Office: CZB/4/505, ETM/55; Medical Research Council: G0600705, G0700704, G0801056, G1000143, G9815508, MC_U106179471, MC_U106188470; NCATS NIH HHS: UL1 TR000439; NCI NIH HHS: R01 CA165001; NCRR NIH HHS: K12 RR023250, U54 RR020278, UL1 RR025005; NHGRI NIH HHS: T32 HG002536, U01 HG004402; NHLBI NIH HHS: HHSN268201100005C, HHSN268201100006C, HHSN268201100007C, HHSN268201100008C, HHSN268201100009C, HHSN268201100010C, HHSN268201100011C, HHSN268201100012C, P01 HL076491, P01 HL098055, P20 HL113452, R01 HL059367, R01 HL086694, R01 HL087641, R01 HL087679, R01 HL088119, R01 HL103866, R01 HL103931, U01 HL072515, U01 HL084756; NIA NIH HHS: N01AG12109, R01 AG018728; NICHD NIH HHS: R01 HD042157; NIDA NIH HHS: HHSN271201100005C; NIDDK NIH HHS: P30 DK072488; NIGMS NIH HHS: R01 GM053275, U01 GM074518; NIMH NIH HHS: R01 MH081802, RL1 MH083268, U24 MH068457; NLM NIH HHS: R01 LM010098; Wellcome Trust: 092731, 097117

    Nature 2012;492;7429;369-75

  • New gene functions in megakaryopoiesis and platelet formation.

    Gieger C, Radhakrishnan A, Cvejic A, Tang W, Porcu E, Pistis G, Serbanovic-Canic J, Elling U, Goodall AH, Labrune Y, Lopez LM, Mägi R, Meacham S, Okada Y, Pirastu N, Sorice R, Teumer A, Voss K, Zhang W, Ramirez-Solis R, Bis JC, Ellinghaus D, Gögele M, Hottenga JJ, Langenberg C, Kovacs P, O'Reilly PF, Shin SY, Esko T, Hartiala J, Kanoni S, Murgia F, Parsa A, Stephens J, van der Harst P, Ellen van der Schoot C, Allayee H, Attwood A, Balkau B, Bastardot F, Basu S, Baumeister SE, Biino G, Bomba L, Bonnefond A, Cambien F, Chambers JC, Cucca F, D'Adamo P, Davies G, de Boer RA, de Geus EJ, Döring A, Elliott P, Erdmann J, Evans DM, Falchi M, Feng W, Folsom AR, Frazer IH, Gibson QD, Glazer NL, Hammond C, Hartikainen AL, Heckbert SR, Hengstenberg C, Hersch M, Illig T, Loos RJ, Jolley J, Khaw KT, Kühnel B, Kyrtsonis MC, Lagou V, Lloyd-Jones H, Lumley T, Mangino M, Maschio A, Mateo Leach I, McKnight B, Memari Y, Mitchell BD, Montgomery GW, Nakamura Y, Nauck M, Navis G, Nöthlings U, Nolte IM, Porteous DJ, Pouta A, Pramstaller PP, Pullat J, Ring SM, Rotter JI, Ruggiero D, Ruokonen A, Sala C, Samani NJ, Sambrook J, Schlessinger D, Schreiber S, Schunkert H, Scott J, Smith NL, Snieder H, Starr JM, Stumvoll M, Takahashi A, Tang WH, Taylor K, Tenesa A, Lay Thein S, Tönjes A, Uda M, Ulivi S, van Veldhuisen DJ, Visscher PM, Völker U, Wichmann HE, Wiggins KL, Willemsen G, Yang TP, Hua Zhao J, Zitting P, Bradley JR, Dedoussis GV, Gasparini P, Hazen SL, Metspalu A, Pirastu M, Shuldiner AR, Joost van Pelt L, Zwaginga JJ, Boomsma DI, Deary IJ, Franke A, Froguel P, Ganesh SK, Jarvelin MR, Martin NG, Meisinger C, Psaty BM, Spector TD, Wareham NJ, Akkerman JW, Ciullo M, Deloukas P, Greinacher A, Jupe S, Kamatani N, Khadake J, Kooner JS, Penninger J, Prokopenko I, Stemple D, Toniolo D, Wernisch L, Sanna S, Hicks AA, Rendon A, Ferreira MA, Ouwehand WH and Soranzo N

    Institute of Genetic Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Ingolstädter Landstr 1, 85764 Neuherberg, Germany. christian.gieger@helmholtz-muenchen.de

    Platelets are the second most abundant cell type in blood and are essential for maintaining haemostasis. Their count and volume are tightly controlled within narrow physiological ranges, but there is only limited understanding of the molecular processes controlling both traits. Here we carried out a high-powered meta-analysis of genome-wide association studies (GWAS) in up to 66,867 individuals of European ancestry, followed by extensive biological and functional assessment. We identified 68 genomic loci reliably associated with platelet count and volume mapping to established and putative novel regulators of megakaryopoiesis and platelet formation. These genes show megakaryocyte-specific gene expression patterns and extensive network connectivity. Using gene silencing in Danio rerio and Drosophila melanogaster, we identified 11 of the genes as novel regulators of blood cell formation. Taken together, our findings advance understanding of novel gene functions controlling fate-determining events during megakaryopoiesis and platelet formation, providing a new example of successful translation of GWAS to function.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/F019394/1; British Heart Foundation: RG/09/012/28096; Chief Scientist Office: CZB/4/505, ETM/55; Medical Research Council: G0601966, G0700704, G0700931, G0701120, G0701863, G0801056, G1000143, MC_U105260799, MC_U106179471, MC_U106188470; NCRR NIH HHS: K12 RR023250, K12 RR023250-05, M01 RR016500-08, U54 RR020278-06, UL1 RR025005, UL1 RR025005-05; NHGRI NIH HHS: P41 HG003751, T32 HG002536; NHLBI NIH HHS: N01 HC055015, N01 HC055016, N01 HC055018, N01 HC055019, N01 HC055020, N01 HC055021, N01 HC055022, N01 HC085079, P01 HL076491, P01 HL076491-09, P01 HL098055, P01 HL098055-03, R01 HL059367, R01 HL059367-11, R01 HL068986, R01 HL068986-06, R01 HL073410-08, R01 HL085251, R01 HL085251-04, R01 HL086694, R01 HL086694-05, R01 HL087641, R01 HL087641-03, R01 HL087679-03, R01 HL088119, R01 HL088119-04, R01 HL103866, R01 HL103866-03, R01 HL105756, U01 HL072515, U01 HL072515-06, U01 HL084756, U01 HL084756-03; NIA NIH HHS: R01 AG018728, R01 AG018728-05S1; NICHD NIH HHS: R01 HD042157-01A1; NIDDK NIH HHS: P30 DK072488, P30 DK072488-08; NIGMS NIH HHS: R01 GM053275, R01 GM053275-14, U01 GM074518, U01 GM074518-04; NIMH NIH HHS: RL1 MH083268, RL1 MH083268-05; Wellcome Trust: 092731, 098051, WT077037/Z/05/Z, WT077047/Z/05/Z, WT082597/Z/07/Z

    Nature 2011;480;7376;201-8

  • Silencing of RhoA nucleotide exchange factor, ARHGEF3, reveals its unexpected role in iron uptake.

    Serbanovic-Canic J, Cvejic A, Soranzo N, Stemple DL, Ouwehand WH and Freson K

    Department of Haematology, University of Cambridge and NHS Blood and Transplant, Cambridge, UK.

    Genomewide association meta-analysis studies have identified > 100 independent genetic loci associated with blood cell indices, including volume and count of platelets and erythrocytes. Although several of these loci encode known regulators of hematopoiesis, the mechanism by which most sequence variants exert their effect on blood cell formation remains elusive. An example is the Rho guanine nucleotide exchange factor, ARHGEF3, which was previously implicated by genomewide association meta-analysis studies in bone cell biology. Here, we report on the unexpected role of ARHGEF3 in regulation of iron uptake and erythroid cell maturation. Although early erythroid differentiation progressed normally, silencing of arhgef3 in Danio rerio resulted in microcytic and hypochromic anemia. This was rescued by intracellular supplementation of iron, showing that arhgef3-depleted erythroid cells are fully capable of hemoglobinization. Disruption of the arhgef3 target, RhoA, also produced severe anemia, which was, again, corrected by iron injection. Moreover, silencing of ARHGEF3 in erythromyeloblastoid cells K562 showed that the uptake of transferrin was severely impaired. Taken together, this is the first study to provide evidence for ARHGEF3 being a regulator of transferrin uptake in erythroid cells, through activation of RHOA.

    Funded by: Wellcome Trust: WT 077037/Z/05/Z, WT077047/Z/05/Z, WT082597/Z/07/Z

    Blood 2011;118;18;4967-76

  • Human metabolic individuality in biomedical and pharmaceutical research.

    Suhre K, Shin SY, Petersen AK, Mohney RP, Meredith D, Wägele B, Altmaier E, CARDIoGRAM, Deloukas P, Erdmann J, Grundberg E, Hammond CJ, de Angelis MH, Kastenmüller G, Köttgen A, Kronenberg F, Mangino M, Meisinger C, Meitinger T, Mewes HW, Milburn MV, Prehn C, Raffler J, Ried JS, Römisch-Margl W, Samani NJ, Small KS, Wichmann HE, Zhai G, Illig T, Spector TD, Adamski J, Soranzo N and Gieger C

    Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Ingolstädter Landstraße 1, 85764 Neuherberg, Germany. karsten@suhre.fr

    Genome-wide association studies (GWAS) have identified many risk loci for complex diseases, but effect sizes are typically small and information on the underlying biological processes is often lacking. Associations with metabolic traits as functional intermediates can overcome these problems and potentially inform individualized therapy. Here we report a comprehensive analysis of genotype-dependent metabolic phenotypes using a GWAS with non-targeted metabolomics. We identified 37 genetic loci associated with blood metabolite concentrations, of which 25 show effect sizes that are unusually high for GWAS and account for 10-60% differences in metabolite levels per allele copy. Our associations provide new functional insights for many disease-related associations that have been reported in previous studies, including those for cardiovascular and kidney disorders, type 2 diabetes, cancer, gout, venous thromboembolism and Crohn's disease. The study advances our knowledge of the genetic basis of metabolic individuality in humans and generates many new hypotheses for biomedical and pharmaceutical research.

    Funded by: Biotechnology and Biological Sciences Research Council; British Heart Foundation; Canadian Institutes of Health Research: MOP172605, MOP77682, MOP‐82810; Cancer Research UK; Medical Research Council; NHLBI NIH HHS: 1R01HL103931‐01, HL087647, N01‐HC‐55015, N01‐HC‐55016, N01‐HC‐55018, N01‐HC‐55019, N01‐HC‐55020, N01‐HC‐55021, N01‐HC‐55022, P01 HL098055, P01HL076491‐06, P01HL087018, R01 HL087647, R01 HL087676, R01HL089650‐02; NIA NIH HHS: N01‐AG‐12100; NIDDK NIH HHS: R01DK080732; Wellcome Trust: 091746, 091746/Z/10/Z

    Nature 2011;477;7362;54-60

  • Maps of open chromatin guide the functional follow-up of genome-wide association signals: application to hematological traits.

    Paul DS, Nisbet JP, Yang TP, Meacham S, Rendon A, Hautaviita K, Tallila J, White J, Tijssen MR, Sivapalaratnam S, Basart H, Trip MD, Cardiogenics Consortium, MuTHER Consortium, Göttgens B, Soranzo N, Ouwehand WH and Deloukas P

    Wellcome Trust Sanger Institute, Hinxton, United Kingdom. dp5@sanger.ac.uk

    Turning genetic discoveries identified in genome-wide association (GWA) studies into biological mechanisms is an important challenge in human genetics. Many GWA signals map outside exons, suggesting that the associated variants may lie within regulatory regions. We applied the formaldehyde-assisted isolation of regulatory elements (FAIRE) method in a megakaryocytic and an erythroblastoid cell line to map active regulatory elements at known loci associated with hematological quantitative traits, coronary artery disease, and myocardial infarction. We showed that the two cell types exhibit distinct patterns of open chromatin and that cell-specific open chromatin can guide the finding of functional variants. We identified an open chromatin region at chromosome 7q22.3 in megakaryocytes but not erythroblasts, which harbors the common non-coding sequence variant rs342293 known to be associated with platelet volume and function. Resequencing of this open chromatin region in 643 individuals provided strong evidence that rs342293 is the only putative causative variant in this region. We demonstrated that the C- and G-alleles differentially bind the transcription factor EVI1 affecting PIK3CG gene expression in platelets and macrophages. A protein-protein interaction network including up- and down-regulated genes in Pik3cg knockout mice indicated that PIK3CG is associated with gene pathways with an established role in platelet membrane biogenesis and thrombus formation. Thus, rs342293 is the functional common variant at this locus; to the best of our knowledge this is the first such variant to be elucidated among the known platelet quantitative trait loci (QTLs). Our data suggested a molecular mechanism by which a non-coding GWA index SNP modulates platelet phenotype.

    Funded by: British Heart Foundation: RG/09/12/28096; Medical Research Council: G0800784, G0900339, MC_U105260799; Wellcome Trust: 081917/Z/07/Z, 091746/Z/10/Z

    PLoS genetics 2011;7;6;e1002139

  • Common variants at 10 genomic loci influence hemoglobin A₁(C) levels via glycemic and nonglycemic pathways.

    Soranzo N, Sanna S, Wheeler E, Gieger C, Radke D, Dupuis J, Bouatia-Naji N, Langenberg C, Prokopenko I, Stolerman E, Sandhu MS, Heeney MM, Devaney JM, Reilly MP, Ricketts SL, Stewart AF, Voight BF, Willenborg C, Wright B, Altshuler D, Arking D, Balkau B, Barnes D, Boerwinkle E, Böhm B, Bonnefond A, Bonnycastle LL, Boomsma DI, Bornstein SR, Böttcher Y, Bumpstead S, Burnett-Miller MS, Campbell H, Cao A, Chambers J, Clark R, Collins FS, Coresh J, de Geus EJ, Dei M, Deloukas P, Döring A, Egan JM, Elosua R, Ferrucci L, Forouhi N, Fox CS, Franklin C, Franzosi MG, Gallina S, Goel A, Graessler J, Grallert H, Greinacher A, Hadley D, Hall A, Hamsten A, Hayward C, Heath S, Herder C, Homuth G, Hottenga JJ, Hunter-Merrill R, Illig T, Jackson AU, Jula A, Kleber M, Knouff CW, Kong A, Kooner J, Köttgen A, Kovacs P, Krohn K, Kühnel B, Kuusisto J, Laakso M, Lathrop M, Lecoeur C, Li M, Li M, Loos RJ, Luan J, Lyssenko V, Mägi R, Magnusson PK, Mälarstig A, Mangino M, Martínez-Larrad MT, März W, McArdle WL, McPherson R, Meisinger C, Meitinger T, Melander O, Mohlke KL, Mooser VE, Morken MA, Narisu N, Nathan DM, Nauck M, O'Donnell C, Oexle K, Olla N, Pankow JS, Payne F, Peden JF, Pedersen NL, Peltonen L, Perola M, Polasek O, Porcu E, Rader DJ, Rathmann W, Ripatti S, Rocheleau G, Roden M, Rudan I, Salomaa V, Saxena R, Schlessinger D, Schunkert H, Schwarz P, Seedorf U, Selvin E, Serrano-Ríos M, Shrader P, Silveira A, Siscovick D, Song K, Spector TD, Stefansson K, Steinthorsdottir V, Strachan DP, Strawbridge R, Stumvoll M, Surakka I, Swift AJ, Tanaka T, Teumer A, Thorleifsson G, Thorsteinsdottir U, Tönjes A, Usala G, Vitart V, Völzke H, Wallaschofski H, Waterworth DM, Watkins H, Wichmann HE, Wild SH, Willemsen G, Williams GH, Wilson JF, Winkelmann J, Wright AF, WTCCC, Zabena C, Zhao JH, Epstein SE, Erdmann J, Hakonarson HH, Kathiresan S, Khaw KT, Roberts R, Samani NJ, Fleming MD, Sladek R, Abecasis G, Boehnke M, Froguel P, Groop L, McCarthy MI, Kao WH, Florez JC, Uda M, Wareham NJ, Barroso I and Meigs JB

    Human Genetics, Wellcome Trust Sanger Institute, Hinxton, U.K.

    Objective: Glycated hemoglobin (HbA₁(c)), used to monitor and diagnose diabetes, is influenced by average glycemia over a 2- to 3-month period. Genetic factors affecting expression, turnover, and abnormal glycation of hemoglobin could also be associated with increased levels of HbA₁(c). We aimed to identify such genetic factors and investigate the extent to which they influence diabetes classification based on HbA₁(c) levels.

    Research design and methods: We studied associations with HbA1c in up to 46,368 nondiabetic adults of European descent from 23 genome-wide association studies (GWAS) and 8 cohorts with de novo genotyped single nucleotide polymorphisms (SNPs). We combined studies using inverse-variance meta-analysis and tested mediation by glycemia using conditional analyses. We estimated the global effect of HbA1c loci using a multilocus risk score, and used net reclassification to estimate genetic effects on diabetes screening.

    Results: Ten loci reached genome-wide significant association with HbA1c, including six new loci near FN3K (lead SNP/P value, rs1046896/P = 1.6 × 10⁻²⁶), HFE (rs1800562/P = 2.6 × 10⁻²⁰), TMPRSS6 (rs855791/P = 2.7 × 10⁻¹⁴), ANK1 (rs4737009/P = 6.1 × 10⁻¹²), SPTA1 (rs2779116/P = 2.8 × 10⁻⁹) and ATP11A/TUBGCP3 (rs7998202/P = 5.2 × 10⁻⁹), and four known HbA1c loci: HK1 (rs16926246/P = 3.1 × 10⁻⁵⁴), MTNR1B (rs1387153/P = 4.0 × 10⁻¹¹), GCK (rs1799884/P = 1.5 × 10⁻²⁰) and G6PC2/ABCB11 (rs552976/P = 8.2 × 10⁻¹⁸). We show that associations with HbA1c are partly a function of hyperglycemia associated with 3 of the 10 loci (GCK, G6PC2 and MTNR1B). The seven nonglycemic loci accounted for a 0.19 (% HbA1c) difference between the extreme 10% tails of the risk score, and would reclassify ∼2% of a general white population screened for diabetes with HbA1c.

    Conclusions: GWAS identified 10 genetic loci reproducibly associated with HbA1c. Six are novel and seven map to loci where rarer variants cause hereditary anemias and iron storage disorders. Common variants at these loci likely influence HbA1c levels via erythrocyte biology, and confer a small but detectable reclassification of diabetes diagnosis by HbA1c.

    Funded by: Chief Scientist Office: CZB/4/710; Medical Research Council: G0401527, G0701863, MC_QA137934, MC_U106179471, MC_U106188470, MC_U127561128, MC_UP_A100_1003; NIDDK NIH HHS: R01 DK072193

    Diabetes 2010;59;12;3229-39

  • New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk.

    Dupuis J, Langenberg C, Prokopenko I, Saxena R, Soranzo N, Jackson AU, Wheeler E, Glazer NL, Bouatia-Naji N, Gloyn AL, Lindgren CM, Mägi R, Morris AP, Randall J, Johnson T, Elliott P, Rybin D, Thorleifsson G, Steinthorsdottir V, Henneman P, Grallert H, Dehghan A, Hottenga JJ, Franklin CS, Navarro P, Song K, Goel A, Perry JR, Egan JM, Lajunen T, Grarup N, Sparsø T, Doney A, Voight BF, Stringham HM, Li M, Kanoni S, Shrader P, Cavalcanti-Proença C, Kumari M, Qi L, Timpson NJ, Gieger C, Zabena C, Rocheleau G, Ingelsson E, An P, O'Connell J, Luan J, Elliott A, McCarroll SA, Payne F, Roccasecca RM, Pattou F, Sethupathy P, Ardlie K, Ariyurek Y, Balkau B, Barter P, Beilby JP, Ben-Shlomo Y, Benediktsson R, Bennett AJ, Bergmann S, Bochud M, Boerwinkle E, Bonnefond A, Bonnycastle LL, Borch-Johnsen K, Böttcher Y, Brunner E, Bumpstead SJ, Charpentier G, Chen YD, Chines P, Clarke R, Coin LJ, Cooper MN, Cornelis M, Crawford G, Crisponi L, Day IN, de Geus EJ, Delplanque J, Dina C, Erdos MR, Fedson AC, Fischer-Rosinsky A, Forouhi NG, Fox CS, Frants R, Franzosi MG, Galan P, Goodarzi MO, Graessler J, Groves CJ, Grundy S, Gwilliam R, Gyllensten U, Hadjadj S, Hallmans G, Hammond N, Han X, Hartikainen AL, Hassanali N, Hayward C, Heath SC, Hercberg S, Herder C, Hicks AA, Hillman DR, Hingorani AD, Hofman A, Hui J, Hung J, Isomaa B, Johnson PR, Jørgensen T, Jula A, Kaakinen M, Kaprio J, Kesaniemi YA, Kivimaki M, Knight B, Koskinen S, Kovacs P, Kyvik KO, Lathrop GM, Lawlor DA, Le Bacquer O, Lecoeur C, Li Y, Lyssenko V, Mahley R, Mangino M, Manning AK, Martínez-Larrad MT, McAteer JB, McCulloch LJ, McPherson R, Meisinger C, Melzer D, Meyre D, Mitchell BD, Morken MA, Mukherjee S, Naitza S, Narisu N, Neville MJ, Oostra BA, Orrù M, Pakyz R, Palmer CN, Paolisso G, Pattaro C, Pearson D, Peden JF, Pedersen NL, Perola M, Pfeiffer AF, Pichler I, Polasek O, Posthuma D, Potter SC, Pouta A, Province MA, Psaty BM, Rathmann W, Rayner NW, Rice K, Ripatti S, Rivadeneira F, Roden M, Rolandsson O, 
    Sandbaek A, Sandhu M, Sanna S, Sayer AA, Scheet P, Scott LJ, Seedorf U, Sharp SJ, Shields B, Sigurðsson G, Sijbrands EJ, Silveira A, Simpson L, Singleton A, Smith NL, Sovio U, Swift A, Syddall H, Syvänen AC, Tanaka T, Thorand B, Tichet J, Tönjes A, Tuomi T, Uitterlinden AG, van Dijk KW, van Hoek M, Varma D, Visvikis-Siest S, Vitart V, Vogelzangs N, Waeber G, Wagner PJ, Walley A, Walters GB, Ward KL, Watkins H, Weedon MN, Wild SH, Willemsen G, Witteman JC, Yarnell JW, Zeggini E, Zelenika D, Zethelius B, Zhai G, Zhao JH, Zillikens MC, DIAGRAM Consortium, GIANT Consortium, Global BPgen Consortium, Borecki IB, Loos RJ, Meneton P, Magnusson PK, Nathan DM, Williams GH, Hattersley AT, Silander K, Salomaa V, Smith GD, Bornstein SR, Schwarz P, Spranger J, Karpe F, Shuldiner AR, Cooper C, Dedoussis GV, Serrano-Ríos M, Morris AD, Lind L, Palmer LJ, Hu FB, Franks PW, Ebrahim S, Marmot M, Kao WH, Pankow JS, Sampson MJ, Kuusisto J, Laakso M, Hansen T, Pedersen O, Pramstaller PP, Wichmann HE, Illig T, Rudan I, Wright AF, Stumvoll M, Campbell H, Wilson JF, Anders Hamsten on behalf of Procardis Consortium, MAGIC investigators, Bergman RN, Buchanan TA, Collins FS, Mohlke KL, Tuomilehto J, Valle TT, Altshuler D, Rotter JI, Siscovick DS, Penninx BW, Boomsma DI, Deloukas P, Spector TD, Frayling TM, Ferrucci L, Kong A, Thorsteinsdottir U, Stefansson K, van Duijn CM, Aulchenko YS, Cao A, Scuteri A, Schlessinger D, Uda M, Ruokonen A, Jarvelin MR, Waterworth DM, Vollenweider P, Peltonen L, Mooser V, Abecasis GR, Wareham NJ, Sladek R, Froguel P, Watanabe RM, Meigs JB, Groop L, Boehnke M, McCarthy MI, Florez JC and Barroso I

    Department of Biostatistics, Boston University School of Public Health, Massachusetts, USA.

    Levels of circulating glucose are tightly regulated. To identify new loci influencing glycemic traits, we performed meta-analyses of 21 genome-wide association studies informative for fasting glucose, fasting insulin and indices of beta-cell function (HOMA-B) and insulin resistance (HOMA-IR) in up to 46,186 nondiabetic participants. Follow-up of 25 loci in up to 76,558 additional subjects identified 16 loci associated with fasting glucose and HOMA-B and two loci associated with fasting insulin and HOMA-IR. These include nine loci newly associated with fasting glucose (in or near ADCY5, MADD, ADRA2A, CRY2, FADS1, GLIS3, SLC2A2, PROX1 and C2CD4B) and one influencing fasting insulin and HOMA-IR (near IGF1). We also demonstrated association of ADCY5, PROX1, GCK, GCKR and DGKB-TMEM195 with type 2 diabetes. Within these loci, likely biological candidate genes influence signal transduction, cell proliferation, development, glucose-sensing and circadian regulation. Our results demonstrate that genetic studies of glycemic traits can identify type 2 diabetes risk loci, as well as loci containing gene variants that are associated with a modest elevation in glucose levels but are not associated with overt diabetes.

    Funded by: Chief Scientist Office: CZB/4/710; Medical Research Council: G0600705, G0601261, G0700222, G0700222(81696), G0701863, G0801056, G19/35, MC_U106179471, MC_U106188470, MC_U127561128, MC_U127592696, MC_U137686854, MC_U137686857, MC_UP_A620_1014, MC_UP_A620_1015; NIDDK NIH HHS: K24 DK080140, P30 DK040561, P30 DK040561-14, P30 DK072488, R01 DK029867, R01 DK072193, R01 DK078616, R01 DK078616-01A1; The Dunhill Medical Trust: R69/0208; Wellcome Trust: 064890, 077011, 077016, 081682, 088885, 089061, 091746

    Nature genetics 2010;42;2;105-16

  • A novel variant on chromosome 7q22.3 associated with mean platelet volume, counts, and function.

    Soranzo N, Rendon A, Gieger C, Jones CI, Watkins NA, Menzel S, Döring A, Stephens J, Prokisch H, Erber W, Potter SC, Bray SL, Burns P, Jolley J, Falchi M, Kühnel B, Erdmann J, Schunkert H, Samani NJ, Illig T, Garner SF, Rankin A, Meisinger C, Bradley JR, Thein SL, Goodall AH, Spector TD, Deloukas P and Ouwehand WH

    Human Genetics Department, Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom. ns6@sanger.ac.uk

    Mean platelet volume (MPV) and platelet count (PLT) are highly heritable and tightly regulated traits. We performed a genome-wide association study for MPV and identified one SNP, rs342293, as having highly significant and reproducible association with MPV (per-G allele effect 0.016 ± 0.001 log fL; P < 1.08 × 10⁻²⁴) and PLT (per-G effect -4.55 ± 0.80 × 10⁹/L; P < 7.19 × 10⁻⁸) in 8586 healthy subjects. Whole-genome expression analysis in the 1-Mb region showed a significant association with platelet transcript levels for PIK3CG (n = 35; P = .047). The G allele at rs342293 was also associated with decreased binding of annexin V to platelets activated with collagen-related peptide (n = 84; P = .003). The region 7q22.3 identifies the first QTL influencing platelet volume, counts, and function in healthy subjects. Notably, the association signal maps to a chromosome region implicated in myeloid malignancies, indicating this site as an important regulatory site for hematopoiesis. The identification of loci regulating MPV by this and other studies will increase our insight in the processes of megakaryopoiesis and proplatelet formation, and it may aid the identification of genes that are somatically mutated in essential thrombocytosis.

    Funded by: Medical Research Council: G0000111; Wellcome Trust: 072856, 077011, 079771, 082597, 084183

    Blood 2009;113;16;3831-7

  • Meta-analysis of genome-wide scans for human adult stature identifies novel loci and associations with measures of skeletal frame size.

    Soranzo N, Rivadeneira F, Chinappen-Horsley U, Malkina I, Richards JB, Hammond N, Stolk L, Nica A, Inouye M, Hofman A, Stephens J, Wheeler E, Arp P, Gwilliam R, Jhamai PM, Potter S, Chaney A, Ghori MJ, Ravindrarajah R, Ermakov S, Estrada K, Pols HA, Williams FM, McArdle WL, van Meurs JB, Loos RJ, Dermitzakis ET, Ahmadi KR, Hart DJ, Ouwehand WH, Wareham NJ, Barroso I, Sandhu MS, Strachan DP, Livshits G, Spector TD, Uitterlinden AG and Deloukas P

    Human Genetics Department, Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom.

    Recent genome-wide (GW) scans have identified several independent loci affecting human stature, but their contribution through the different skeletal components of height is still poorly understood. We carried out a genome-wide scan in 12,611 participants, followed by replication in an additional 7,187 individuals, and identified 17 genomic regions with GW-significant association with height. Of these, two are entirely novel (rs11809207 in CATSPER4, combined P-value = 6.1 × 10⁻⁸ and rs910316 in TMED10, P-value = 1.4 × 10⁻⁷) and two had previously been described with weak statistical support (rs10472828 in NPR3, P-value = 3 × 10⁻⁷ and rs849141 in JAZF1, P-value = 3.2 × 10⁻¹¹). One locus (rs1182188 at GNA12) identifies the first height eQTL. We also assessed the contribution of height loci to the upper- (trunk) and lower-body (hip axis and femur) skeletal components of height. We find evidence for several loci associated with trunk length (including rs6570507 in GPR126, P-value = 4 × 10⁻⁵ and rs6817306 in LCORL, P-value = 4 × 10⁻⁴), hip axis length (including rs6830062 at LCORL, P-value = 4.8 × 10⁻⁴ and rs4911494 at UQCC, P-value = 1.9 × 10⁻⁴), and femur length (including rs710841 at PRKG2, P-value = 2.4 × 10⁻⁵ and rs10946808 at HIST1H1D, P-value = 6.4 × 10⁻⁶). Finally, we used conditional analyses to explore a possible differential contribution of the height loci to these different skeletal size measurements. In addition to validating four novel loci controlling adult stature, our study represents the first effort to assess the contribution of genetic loci to three skeletal components of height. Further statistical tests in larger numbers of individuals will be required to verify if the height loci affect height preferentially through these subcomponents of height.

    Funded by: Medical Research Council: G0000934, G0701863, MC_QA137934, MC_U106188470; Wellcome Trust: 068545/Z/02

    PLoS genetics 2009;5;4;e1000445

  • Variants in MTNR1B influence fasting glucose levels.

    Prokopenko I, Langenberg C, Florez JC, Saxena R, Soranzo N, Thorleifsson G, Loos RJ, Manning AK, Jackson AU, Aulchenko Y, Potter SC, Erdos MR, Sanna S, Hottenga JJ, Wheeler E, Kaakinen M, Lyssenko V, Chen WM, Ahmadi K, Beckmann JS, Bergman RN, Bochud M, Bonnycastle LL, Buchanan TA, Cao A, Cervino A, Coin L, Collins FS, Crisponi L, de Geus EJ, Dehghan A, Deloukas P, Doney AS, Elliott P, Freimer N, Gateva V, Herder C, Hofman A, Hughes TE, Hunt S, Illig T, Inouye M, Isomaa B, Johnson T, Kong A, Krestyaninova M, Kuusisto J, Laakso M, Lim N, Lindblad U, Lindgren CM, McCann OT, Mohlke KL, Morris AD, Naitza S, Orrù M, Palmer CN, Pouta A, Randall J, Rathmann W, Saramies J, Scheet P, Scott LJ, Scuteri A, Sharp S, Sijbrands E, Smit JH, Song K, Steinthorsdottir V, Stringham HM, Tuomi T, Tuomilehto J, Uitterlinden AG, Voight BF, Waterworth D, Wichmann HE, Willemsen G, Witteman JC, Yuan X, Zhao JH, Zeggini E, Schlessinger D, Sandhu M, Boomsma DI, Uda M, Spector TD, Penninx BW, Altshuler D, Vollenweider P, Jarvelin MR, Lakatta E, Waeber G, Fox CS, Peltonen L, Groop LC, Mooser V, Cupples LA, Thorsteinsdottir U, Boehnke M, Barroso I, Van Duijn C, Dupuis J, Watanabe RM, Stefansson K, McCarthy MI, Wareham NJ, Meigs JB and Abecasis GR

    [1] Oxford Centre for Diabetes, Endocrinology and Metabolism, University of Oxford, Oxford OX3 7LJ, UK. [2] Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK. [3] These authors contributed equally to this work.

    To identify previously unknown genetic loci associated with fasting glucose concentrations, we examined the leading association signals in ten genome-wide association scans involving a total of 36,610 individuals of European descent. Variants in the gene encoding melatonin receptor 1B (MTNR1B) were consistently associated with fasting glucose across all ten studies. The strongest signal was observed at rs10830963, where each G allele (frequency 0.30 in HapMap CEU) was associated with an increase of 0.07 (95% CI = 0.06-0.08) mmol/l in fasting glucose levels (P = 3.2 × 10⁻⁵⁰) and reduced beta-cell function as measured by homeostasis model assessment (HOMA-B, P = 1.1 × 10⁻¹⁵). The same allele was associated with an increased risk of type 2 diabetes (odds ratio = 1.09 (1.05-1.12), per G allele P = 3.3 × 10⁻⁷) in a meta-analysis of 13 case-control studies totaling 18,236 cases and 64,453 controls. Our analyses also confirm previous associations of fasting glucose with variants at the G6PC2 (rs560887, P = 1.1 × 10⁻⁵⁷) and GCK (rs4607517, P = 1.0 × 10⁻²⁵) loci.

    Funded by: Medical Research Council: G0000649, G016121, G0500539, G0601261, G0701863, MC_U106179471, MC_U106188470; NCRR NIH HHS: RR-163736; NHGRI NIH HHS: HG-02651, R01 HG002651, R01 HG002651-05; NHLBI NIH HHS: HC-25195, HL-084729, HL-087679, N01 HC025195, N02-HL-6-4278, R01 HL087679-02, U01 HL084729, U01 HL084729-03; NIDA NIH HHS: DA-021519, U54 DA021519, U54 DA021519-04; NIDDK NIH HHS: DK-062370, DK-065978, DK-072193, DK-078616, DK-080140, DK069922, K23 DK065978, K23 DK065978-05, K24 DK080140, K24 DK080140-01, K24 DK080140-02, R01 DK029867, R01 DK062370, R01 DK062370-05, R01 DK069922-02, R01 DK072193, R01 DK072193-04, R01 DK078616, R01 DK078616-01A1; NIMH NIH HHS: MH059160, R01 MH059160, R01 MH059160-04; Wellcome Trust: 076113, 077011, 077016, 079557, 083948, 089061, GR069224, GR072960

    Nature genetics 2009;41;1;77-81
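
The HbA1c study listed above reports that its cohorts were combined by inverse-variance meta-analysis. As a rough illustration only, and not the authors' actual pipeline, a fixed-effect version of that calculation can be sketched in Python. The function name and the per-cohort numbers are hypothetical, and real analyses typically add steps such as genomic control and heterogeneity tests that are left out here.

```python
import math

def inverse_variance_meta(estimates):
    """Fixed-effect inverse-variance meta-analysis (illustrative sketch).

    estimates: list of (beta, standard_error) pairs, one per cohort.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for _, se in estimates]
    total_weight = sum(weights)

    # The pooled effect is the weighted mean of the per-cohort effects.
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Two-sided p-value from the standard normal distribution.
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value

# Hypothetical per-allele effects and standard errors from three cohorts.
cohorts = [(0.05, 0.02), (0.08, 0.03), (0.06, 0.01)]
beta, se, z, p = inverse_variance_meta(cohorts)
print(f"pooled beta = {beta:.4f}, se = {se:.4f}, z = {z:.2f}, p = {p:.3g}")
```

Cohorts with smaller standard errors dominate the pooled estimate, which is why large, precisely measured studies carry most of the weight in a meta-analysis of this kind.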

Klaudia Walter

kw8@sanger.ac.uk Staff Scientist

Klaudia graduated with a Mag.rer.nat. degree in mathematics and physical education from the University of Vienna, Austria. Later she completed an MSc in statistics at the University of Sheffield in 2003 and a PhD in statistical methods in comparative genomics at the University of Cambridge in 2007.

From 2007 to 2011 she was a postdoctoral fellow in Matt Hurles' group at the WTSI, working mainly on structural variation in the 1000 Genomes Project.

Research

Since June 2011 she has been working as a statistical geneticist on the UK10K project in Nicole Soranzo's group at the WTSI.

References

  • Estimating genome-wide significance for whole-genome sequencing studies.

    Xu C, Tachmazidou I, Walter K, Ciampi A, Zeggini E, Greenwood CM and UK10K Consortium

    Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Canada; Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Canada.

    Although a standard genome-wide significance level has been accepted for the testing of association between common genetic variants and disease, the era of whole-genome sequencing (WGS) requires a new threshold. The allele frequency spectrum of sequence-identified variants is very different from common variants, and the identified rare genetic variation is usually jointly analyzed in a series of genomic windows or regions. In nearby or overlapping windows, these test statistics will be correlated, and the degree of correlation is likely to depend on the choice of window size, overlap, and the test statistic. Furthermore, multiple analyses may be performed using different windows or test statistics. Here we propose an empirical approach for estimating genome-wide significance thresholds for data arising from WGS studies, and we demonstrate that the empirical threshold can be efficiently estimated by extrapolating from calculations performed on a small genomic region. Because analysis of WGS may need to be repeated with different choices of test statistics or windows, this prediction approach makes it computationally feasible to estimate genome-wide significance thresholds for different analysis choices. Based on UK10K whole-genome sequence data, we derive genome-wide significance thresholds ranging between 2.5 × 10⁻⁸ and 8 × 10⁻⁸ for our analytic choices in window-based testing, and thresholds of 0.6 × 10⁻⁸ to 1.5 × 10⁻⁸ for a combined analytic strategy of testing common variants using single-SNP tests together with rare variants analyzed with our sliding-window test strategy.

    Funded by: Canadian Institutes of Health Research: MOP-115110; Wellcome Trust: WT091310, WT098051

    Genetic epidemiology 2014;38;4;281-90

  • A rare variant in APOC3 is associated with plasma triglyceride and VLDL levels in Europeans.

    Timpson NJ, Walter K, Min JL, Tachmazidou I, Malerba G, Shin SY, Chen L, Futema M, Southam L, Iotchkova V, Cocca M, Huang J, Memari Y, McCarthy S, Danecek P, Muddyman D, Mangino M, Menni C, Perry JR, Ring SM, Gaye A, Dedoussis G, Farmaki AE, Burton P, Talmud PJ, Gambaro G, Spector TD, Smith GD, Durbin R, Richards JB, Humphries SE, Zeggini E, Soranzo N and UK10K Consortium Members

    MRC Integrative Epidemiology Unit at the University of Bristol, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, UK.

    The analysis of rich catalogues of genetic variation from population-based sequencing provides an opportunity to screen for functional effects. Here we report a rare variant in APOC3 (rs138326449-A, minor allele frequency ~0.25% (UK)) associated with plasma triglyceride (TG) levels (-1.43 s.d. (s.e.=0.27) per minor allele, P-value=8.0 × 10⁻⁸) discovered in 3,202 individuals with low read-depth, whole-genome sequence. We replicate this in 12,831 participants from five additional samples of Northern and Southern European origin (-1.0 s.d. (s.e.=0.173), P-value=7.32 × 10⁻⁹). This is consistent with an effect between 0.5 and 1.5 mmol l⁻¹ dependent on population. We show that a single predicted splice donor variant is responsible for association signals and is independent of known common variants. Analyses suggest an independent relationship between rs138326449 and high-density lipoprotein (HDL) levels. This represents one of the first examples of a rare, large effect variant identified from whole-genome sequencing at a population scale.

    Funded by: British Heart Foundation: PG008/08; Medical Research Council: G1001799, MC_UU_12013/1-9, MC_UU_12013/3; Wellcome Trust: 076113, 091310, 092731, 095219, 098051, WT091310, WT095219MA, WT098051

    Nature communications 2014;5;4871

  • An integrated map of genetic variation from 1,092 human genomes.

    1000 Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, Kang HM, Marth GT and McVean GA

    By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methods to integrate information across several algorithms and diverse data sources, we provide a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. We show that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations.

    Funded by: Biotechnology and Biological Sciences Research Council: BB/I021213/1; British Heart Foundation: RG/09/012/28096, RG/09/12/28096; Howard Hughes Medical Institute; Medical Research Council: G0701805, G0801823, G0900747, G0900747(91070); NCI NIH HHS: R01 CA166661, R01CA166661; NCRR NIH HHS: G12 RR003050, UL1RR024131; NHGRI NIH HHS: P01 HG004120, P01HG4120, P41HG2371, P41HG4221, R01 HG002898, R01 HG004960, R01 HG007022, R01HG2898, R01HG3698, R01HG4719, R01HG4960, R01HG5701, RC2HG5552, RC2HG5581, U01 HG005728, U01 HG006513, U01HG5208, U01HG5209, U01HG5211, U01HG5214, U01HG5715, U01HG5725, U01HG5728, U01HG6513, U01HG6569, U41HG4568, U54 HG003273, U54HG3067, U54HG3079, U54HG3273; NHLBI NIH HHS: HL078885, R01HL95045, RC2HL102925, T32HL94284; NIA NIH HHS: P30 AG038072; NIAID NIH HHS: AI077439, AI2009061; NIEHS NIH HHS: ES015794; NIGMS NIH HHS: R01GM59290, T32GM7748, T32GM8283; NIH HHS: DP2OD6514; NIMH NIH HHS: R01MH84698; NIMHD NIH HHS: G12 MD007579; NLM NIH HHS: T15 LM007056, T15LM7033; PHS HHS: HHSN268201100040C; Wellcome Trust: 085532, 086084, 090532, 095908, WT085475/Z/08/Z, WT085532AIA, WT086084/Z/08/Z, WT089250/Z/09/Z, WT090532/Z/09/Z, WT095552/Z/11/Z, WT098051

    Nature 2012;491;7422;56-65

  • A systematic survey of loss-of-function variants in human protein-coding genes.

    MacArthur DG, Balasubramanian S, Frankish A, Huang N, Morris J, Walter K, Jostins L, Habegger L, Pickrell JK, Montgomery SB, Albers CA, Zhang ZD, Conrad DF, Lunter G, Zheng H, Ayub Q, DePristo MA, Banks E, Hu M, Handsaker RE, Rosenfeld JA, Fromer M, Jin M, Mu XJ, Khurana E, Ye K, Kay M, Saunders GI, Suner MM, Hunt T, Barnes IH, Amid C, Carvalho-Silva DR, Bignell AH, Snow C, Yngvadottir B, Bumpstead S, Cooper DN, Xue Y, Romero IG, 1000 Genomes Project Consortium, Wang J, Li Y, Gibbs RA, McCarroll SA, Dermitzakis ET, Pritchard JK, Barrett JC, Harrow J, Hurles ME, Gerstein MB and Tyler-Smith C

    Wellcome Trust Sanger Institute, Hinxton, UK. macarthur@atgu.mgh.harvard.edu

    Genome-sequencing studies indicate that all humans carry many genetic variants predicted to cause loss of function (LoF) of protein-coding genes, suggesting unexpected redundancy in the human genome. Here we apply stringent filters to 2951 putative LoF variants obtained from 185 human genomes to determine their true prevalence and properties. We estimate that human genomes typically contain ~100 genuine LoF variants with ~20 genes completely inactivated. We identify rare and likely deleterious LoF alleles, including 26 known and 21 predicted severe disease-causing variants, as well as common LoF variants in nonessential genes. We describe functional and evolutionary differences between LoF-tolerant and recessive disease genes and a method for using these differences to prioritize candidate genes found in clinical sequencing studies.

    Funded by: British Heart Foundation: RG/09/012/28096; NHGRI NIH HHS: U54 HG003273; Wellcome Trust: 085532, 090532, 090532/Z/09/Z, 098051

    Science (New York, N.Y.) 2012;335;6070;823-8

  • Mapping copy number variation by population-scale genome sequencing.

    Mills RE, Walter K, Stewart C, Handsaker RE, Chen K, Alkan C, Abyzov A, Yoon SC, Ye K, Cheetham RK, Chinwalla A, Conrad DF, Fu Y, Grubert F, Hajirasouliha I, Hormozdiari F, Iakoucheva LM, Iqbal Z, Kang S, Kidd JM, Konkel MK, Korn J, Khurana E, Kural D, Lam HY, Leng J, Li R, Li Y, Lin CY, Luo R, Mu XJ, Nemesh J, Peckham HE, Rausch T, Scally A, Shi X, Stromberg MP, Stütz AM, Urban AE, Walker JA, Wu J, Zhang Y, Zhang ZD, Batzer MA, Ding L, Marth GT, McVean G, Sebat J, Snyder M, Wang J, Ye K, Eichler EE, Gerstein MB, Hurles ME, Lee C, McCarroll SA, Korbel JO and 1000 Genomes Project

    Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA.

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serves as a resource for sequencing-based association studies.

    Funded by: Howard Hughes Medical Institute; Medical Research Council: G0701805; NHGRI NIH HHS: P01 HG004120, P41 HG004221, P41 HG004221-01, P41 HG004221-02, P41 HG004221-03, P41 HG004221-03S1, P41 HG004221-03S2, P41 HG004221-03S3, R01 HG004719, R01 HG004719-01, R01 HG004719-02, R01 HG004719-02S1, R01 HG004719-03, R01 HG004719-04, RC2 HG005552, RC2 HG005552-01, RC2 HG005552-02, U01 HG005209, U01 HG005209-01, U01 HG005209-02, U54 HG003273; NIGMS NIH HHS: R01 GM059290, R01 GM081533, R01 GM081533-01A1, R01 GM081533-02, R01 GM081533-03, R01 GM081533-04, R01 GM59290; NIMH NIH HHS: R01 MH091350; Wellcome Trust: 062023, 077009, 077014, 077192, 085532

    Nature 2011;470;7332;59-65

  • A map of human genome variation from population-scale sequencing.

    1000 Genomes Project Consortium, Abecasis GR, Altshuler D, Auton A, Brooks LD, Durbin RM, Gibbs RA, Hurles ME and McVean GA

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10⁻⁸ per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.

    Funded by: British Heart Foundation: RG/09/012/28096; Howard Hughes Medical Institute; Medical Research Council: G0801823, G0801823(89305); NCRR NIH HHS: S10RR025056; NHGRI NIH HHS: 01HG3229, N01HG62088, P01 HG004120, P01HG4120, P41HG2371, P41HG4221, P41HG4222, P50HG2357, R01 HG003229, R01 HG003229-05, R01 HG004719, R01 HG004719-01, R01 HG004719-02, R01 HG004719-02S1, R01 HG004719-03, R01 HG004719-04, R01HG2651, R01HG3698, R01HG4333, R01HG4719, R01HG4960, RC2 HG005552, RC2 HG005552-01, RC2 HG005552-02, RC2HG5552, U01HG5208, U01HG5209, U01HG5210, U01HG5211, U01HG5214, U41HG4568, U54 HG003273, U54HG2750, U54HG2757, U54HG3067, U54HG3079, U54HG3273; NIGMS NIH HHS: R01GM59290, R01GM72861, T32 GM007753; NIMH NIH HHS: 01MH84698; Wellcome Trust: 075491, 077009, 077014, 077192, 081407, 085532, 086084, 089061, 089062, 089088, WT075491/Z/04, WT077009, WT081407/Z/06/Z, WT085532AIA, WT086084/Z/08/Z, WT089088/Z/09/Z

    Nature 2010;467;7319;1061-73

  • Accurate whole human genome sequencing using reversible terminator chemistry.

    Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, Hall KP, Evers DJ, Barnes CL, Bignell HR, Boutell JM, Bryant J, Carter RJ, Keira Cheetham R, Cox AJ, Ellis DJ, Flatbush MR, Gormley NA, Humphray SJ, Irving LJ, Karbelashvili MS, Kirk SM, Li H, Liu X, Maisinger KS, Murray LJ, Obradovic B, Ost T, Parkinson ML, Pratt MR, Rasolonjatovo IM, Reed MT, Rigatti R, Rodighiero C, Ross MT, Sabot A, Sankar SV, Scally A, Schroth GP, Smith ME, Smith VP, Spiridou A, Torrance PE, Tzonev SS, Vermaas EH, Walter K, Wu X, Zhang L, Alam MD, Anastasi C, Aniebo IC, Bailey DM, Bancarz IR, Banerjee S, Barbour SG, Baybayan PA, Benoit VA, Benson KF, Bevis C, Black PJ, Boodhun A, Brennan JS, Bridgham JA, Brown RC, Brown AA, Buermann DH, Bundu AA, Burrows JC, Carter NP, Castillo N, Chiara E Catenazzi M, Chang S, Neil Cooley R, Crake NR, Dada OO, Diakoumakos KD, Dominguez-Fernandez B, Earnshaw DJ, Egbujor UC, Elmore DW, Etchin SS, Ewan MR, Fedurco M, Fraser LJ, Fuentes Fajardo KV, Scott Furey W, George D, Gietzen KJ, Goddard CP, Golda GS, Granieri PA, Green DE, Gustafson DL, Hansen NF, Harnish K, Haudenschild CD, Heyer NI, Hims MM, Ho JT, Horgan AM, Hoschler K, Hurwitz S, Ivanov DV, Johnson MQ, James T, Huw Jones TA, Kang GD, Kerelska TH, Kersey AD, Khrebtukova I, Kindwall AP, Kingsbury Z, Kokko-Gonzales PI, Kumar A, Laurent MA, Lawley CT, Lee SE, Lee X, Liao AK, Loch JA, Lok M, Luo S, Mammen RM, Martin JW, McCauley PG, McNitt P, Mehta P, Moon KW, Mullens JW, Newington T, Ning Z, Ling Ng B, Novo SM, O'Neill MJ, Osborne MA, Osnowski A, Ostadan O, Paraschos LL, Pickering L, Pike AC, Pike AC, Chris Pinkard D, Pliskin DP, Podhasky J, Quijano VJ, Raczy C, Rae VH, Rawlings SR, Chiva Rodriguez A, Roe PM, Rogers J, Rogert Bacigalupo MC, Romanov N, Romieu A, Roth RK, Rourke NJ, Ruediger ST, Rusman E, Sanches-Kuiper RM, Schenker MR, Seoane JM, Shaw RJ, Shiver MK, Short SW, Sizto NL, Sluis JP, Smith MA, Ernest Sohna Sohna J, Spence EJ, Stevens K, Sutton N, Szajkowski L, Tregidgo CL, Turcatti G, Vandevondele S, Verhovsky Y, Virk SM, Wakelin S, Walcott GC, Wang J, Worsley GJ, Yan J, Yau L, Zuerlein M, Rogers J, Mullikin JC, Hurles ME, McCooke NJ, West JS, Oaks FL, Lundberg PL, Klenerman D, Durbin R and Smith AJ

    Illumina Cambridge Ltd. (Formerly Solexa Ltd), Chesterford Research Park, Little Chesterford, Nr Saffron Walden, Essex CB10 1XL, UK. dbentley@illumina.com

    DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation. Here we report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high-quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterize four million single-nucleotide polymorphisms and four hundred thousand structural variants, many of which were previously unknown. Our approach is effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications.

    Funded by: Biotechnology and Biological Sciences Research Council: B05823, MOL04534; Medical Research Council: G0701805; NHGRI NIH HHS: Z01 HG200330-03; Wellcome Trust

    Nature 2008;456;7218;53-9

  • Increased human IgE induced by killing Schistosoma mansoni in vivo is associated with pretreatment Th2 cytokine responsiveness to worm antigens.

    Walter K, Fulford AJ, McBeath R, Joseph S, Jones FM, Kariuki HC, Mwatha JK, Kimani G, Kabatereine NB, Vennervald BJ, Ouma JH and Dunne DW

    Division of Microbiology and Parasitology, Department of Pathology, University of Cambridge, Cambridge, United Kingdom. klaudia.walter@mrc-bsu.cam.ac.uk

    In schistosomiasis endemic areas, children are very susceptible to postchemotherapy reinfection, whereas adults are relatively resistant. Different studies have reported that schistosome-specific IL-4 and IL-5 responses, or posttreatment worm-IgE levels, correlate with subsequent low reinfection. Chemotherapy kills i.v. worms providing an in vivo Ag challenge. We measured anti-worm (soluble worm Ag (SWA) and recombinant tegumental Ag (rSm22.6)) and anti-egg (soluble egg Ag) Ab levels in 177 Ugandans (aged 7-50) in a high Schistosoma mansoni transmission area, both before and 7 wk posttreatment, and analyzed these data in relation to whole blood in vitro cytokine responses at the same time points. Soluble egg Ag-Ig levels were unaffected by treatment but worm-IgG1 and -IgG4 increased, whereas worm-IgE increased in many but not all individuals. An increase in worm-IgE was mainly seen in >15-year-olds and, unlike in children, was inversely correlated to pretreatment infection intensities, suggesting this response was associated both with resistance to pretreatment infection, as well as posttreatment reinfection. The increases in SWA-IgE and rSm22.6-IgE positively correlated with pretreatment Th2 cytokines, but not IFN-gamma, induced by SWA. These relationships remained significant after allowing for the confounding effects of pretreatment infection intensity, age, and pretreatment IgE levels, indicating a link between SWA-specific Th2 cytokine responsiveness and subsequent increases in worm-IgE. An exceptionally strong relationship between IL-5 and posttreatment worm-IgE levels in < 15-year-olds suggested that the failure of younger children to respond to in vivo Ag stimulation with increased levels of IgE, is related to their lack of pretreatment SWA Th2 cytokine responsiveness.

    Funded by: Medical Research Council: G7708609; Wellcome Trust

    Journal of immunology (Baltimore, Md. : 1950) 2006;177;8;5490-8

  • Striking nucleotide frequency pattern at the borders of highly conserved vertebrate non-coding sequences.

    Walter K, Abnizova I, Elgar G and Gilks WR

    MRC Rosalind Franklin Centre for Genomics Research, Hinxton, Cambridge CB10 1SB, UK.

    In a recent study, 1373 highly conserved non-coding elements (CNEs) were detected by aligning the human and Takifugu rubripes (Fugu) genomes. The remarkable degree of sequence conservation in CNEs compared with their surroundings suggested comparing the base composition within CNEs with their 5' and 3' flanking regions. The analysis reveals a novel, sharp and distinct signal of nucleotide frequency bias precisely at the border between CNEs and flanking regions.

    Trends in genetics : TIG 2005;21;8;436-40

  • Highly conserved non-coding sequences are associated with vertebrate development.

    Woolfe A, Goodson M, Goode DK, Snell P, McEwen GK, Vavouri T, Smith SF, North P, Callaway H, Kelly K, Walter K, Abnizova I, Gilks W, Edwards YJ, Cooke JE and Elgar G

    Medical Research Council Rosalind Franklin Centre for Genomics Research Hinxton, Cambridge, United Kingdom.

    In addition to protein coding sequence, the human genome contains a significant amount of regulatory DNA, the identification of which is proving somewhat recalcitrant to both in silico and functional methods. An approach that has been used with some success is comparative sequence analysis, whereby equivalent genomic regions from different organisms are compared in order to identify both similarities and differences. In general, similarities in sequence between highly divergent organisms imply functional constraint. We have used a whole-genome comparison between humans and the pufferfish, Fugu rubripes, to identify nearly 1,400 highly conserved non-coding sequences. Given the evolutionary divergence between these species, it is likely that these sequences are found in, and furthermore are essential to, all vertebrates. Most, and possibly all, of these sequences are located in and around genes that act as developmental regulators. Some of these sequences are over 90% identical across more than 500 bases, being more highly conserved than coding sequence between these two species. Despite this, we cannot find any similar sequences in invertebrate genomes. In order to begin to functionally test this set of sequences, we have used a rapid in vivo assay system using zebrafish embryos that allows tissue-specific enhancer activity to be identified. Functional data is presented for highly conserved non-coding sequences associated with four unrelated developmental regulators (SOX21, PAX6, HLXB9, and SHH), in order to demonstrate the suitability of this screen to a wide range of genes and expression patterns. Of 25 sequence elements tested around these four genes, 23 show significant enhancer activity in one or more tissues. We have identified a set of non-coding sequences that are highly conserved throughout vertebrates. They are found in clusters across the human genome, principally around genes that are implicated in the regulation of development, including many transcription factors. These highly conserved non-coding sequences are likely to form part of the genomic circuitry that uniquely defines vertebrate development.

    PLoS biology 2005;3;1;e7
