Cancer Genome Project

The Wellcome Trust Sanger Institute's Cancer Genome Project is led jointly by Professor Mike Stratton and Dr Peter Campbell. All cancers occur due to abnormalities in DNA sequence. Cancer affects people of all ages, with the risk for most types increasing with age.

One in three people in the Western world develop cancer and one in five die of the disease. Cancer is therefore the most common genetic disease.

[Anne Weston, Wellcome Images]

Background

Throughout life, the genome within cells of the human body is exposed to mutagens and suffers mistakes in replication. These corrosive influences result in progressive, subtle divergence of the DNA sequence in each cell from that originally constituted in the fertilised egg.

[Image: Lung cancer]

Occasionally, one of these somatic mutations alters the function of a critical gene, providing a growth advantage to the cell in which it has occurred and resulting in the emergence of an expanded clone derived from this cell. Acquisition of additional mutations and consequent waves of clonal expansion result in the evolution of the mutinous cells that invade surrounding tissues and metastasise.

The identification of genes that are mutated and hence drive oncogenesis has been a central aim of cancer research since the advent of recombinant DNA technology.

The Cancer Genome Project is using the human genome sequence and high-throughput mutation detection techniques to identify somatically acquired sequence variants/mutations and hence identify genes critical to the development of human cancers. This initiative will ultimately provide the paradigm for the detection of germline mutations in non-neoplastic human genetic diseases through genome-wide mutation detection approaches.

This is an ongoing project, and we will be adding further data in the future.

Resources

Data resources

Cancer Gene Census:
Mutated genes causally implicated in human cancer.

COSMIC:
Catalogue of somatic mutations in cancer.

Systematic Screens (Whole Genomes):
Somatic mutations from systematic large-scale screening of genes in human cancers.

Cancer Cell Line Project:
Resequencing of known cancer genes and other analyses of human cancer cell lines.

CGP copy number analysis in cancer:
Analysis of copy number variation in cancer.

CGP trace and genotype archive:
Archive of sequence traces and genotype data generated by the group.

Genomics of drug sensitivity in cancer:
Analysis of drug sensitivity data in human cancer cell lines.

CGP Software:
Algorithms and software developed by the CGP.

Selected Publications

  • The landscape of cancer genes and mutational processes in breast cancer.

    Stephens PJ, Tarpey PS, Davies H, Van Loo P, Greenman C, Wedge DC, Nik-Zainal S, Martin S, Varela I, Bignell GR, Yates LR, Papaemmanuil E, Beare D, Butler A, Cheverton A, Gamble J, Hinton J, Jia M, Jayakumar A, Jones D, Latimer C, Lau KW, McLaren S, McBride DJ, Menzies A, Mudie L, Raine K, Rad R, Chapman MS, Teague J, Easton D, Langerød A, Oslo Breast Cancer Consortium (OSBREAC), Lee MT, Shen CY, Tee BT, Huimin BW, Broeks A, Vargas AC, Turashvili G, Martens J, Fatima A, Miron P, Chin SF, Thomas G, Boyault S, Mariani O, Lakhani SR, van de Vijver M, van 't Veer L, Foekens J, Desmedt C, Sotiriou C, Tutt A, Caldas C, Reis-Filho JS, Aparicio SA, Salomon AV, Børresen-Dale AL, Richardson AL, Campbell PJ, Futreal PA and Stratton MR

    Nature 2012;486;7403;400-4

  • The life history of 21 breast cancers.

    Nik-Zainal S, Van Loo P, Wedge DC, Alexandrov LB, Greenman CD, Lau KW, Raine K, Jones D, Marshall J, Ramakrishna M, Shlien A, Cooke SL, Hinton J, Menzies A, Stebbings LA, Leroy C, Jia M, Rance R, Mudie LJ, Gamble SJ, Stephens PJ, McLaren S, Tarpey PS, Papaemmanuil E, Davies HR, Varela I, McBride DJ, Bignell GR, Leung K, Butler AP, Teague JW, Martin S, Jönsson G, Mariani O, Boyault S, Miron P, Fatima A, Langerød A, Aparicio SA, Tutt A, Sieuwerts AM, Borg Å, Thomas G, Salomon AV, Richardson AL, Børresen-Dale AL, Futreal PA, Stratton MR, Campbell PJ and Breast Cancer Working Group of the International Cancer Genome Consortium

    Cell 2012;149;5;994-1007

  • Mutational processes molding the genomes of 21 breast cancers.

    Nik-Zainal S, Alexandrov LB, Wedge DC, Van Loo P, Greenman CD, Raine K, Jones D, Hinton J, Marshall J, Stebbings LA, Menzies A, Martin S, Leung K, Chen L, Leroy C, Ramakrishna M, Rance R, Lau KW, Mudie LJ, Varela I, McBride DJ, Bignell GR, Cooke SL, Shlien A, Gamble J, Whitmore I, Maddison M, Tarpey PS, Davies HR, Papaemmanuil E, Stephens PJ, McLaren S, Butler AP, Teague JW, Jönsson G, Garber JE, Silver D, Miron P, Fatima A, Boyault S, Langerød A, Tutt A, Martens JW, Aparicio SA, Borg Å, Salomon AV, Thomas G, Børresen-Dale AL, Richardson AL, Neuberger MS, Futreal PA, Campbell PJ, Stratton MR and Breast Cancer Working Group of the International Cancer Genome Consortium

    Cell 2012;149;5;979-93

  • Systematic identification of genomic markers of drug sensitivity in cancer cells.

    Garnett MJ, Edelman EJ, Heidorn SJ, Greenman CD, Dastur A, Lau KW, Greninger P, Thompson IR, Luo X, Soares J, Liu Q, Iorio F, Surdez D, Chen L, Milano RJ, Bignell GR, Tam AT, Davies H, Stevenson JA, Barthorpe S, Lutz SR, Kogera F, Lawrence K, McLaren-Douglas A, Mitropoulos X, Mironenko T, Thi H, Richardson L, Zhou W, Jewitt F, Zhang T, O'Brien P, Boisvert JL, Price S, Hur W, Yang W, Deng X, Butler A, Choi HG, Chang JW, Baselga J, Stamenkovic I, Engelman JA, Sharma SV, Delattre O, Saez-Rodriguez J, Gray NS, Settleman J, Futreal PA, Haber DA, Stratton MR, Ramaswamy S, McDermott U and Benes CH

    Nature 2012;483;7391;570-5

  • Genome sequencing and analysis of the Tasmanian devil and its transmissible cancer.

    Murchison EP, Schulz-Trieglaff OB, Ning Z, Alexandrov LB, Bauer MJ, Fu B, Hims M, Ding Z, Ivakhno S, Stewart C, Ng BL, Wong W, Aken B, White S, Alsop A, Becq J, Bignell GR, Cheetham RK, Cheng W, Connor TR, Cox AJ, Feng ZP, Gu Y, Grocock RJ, Harris SR, Khrebtukova I, Kingsbury Z, Kowarsky M, Kreiss A, Luo S, Marshall J, McBride DJ, Murray L, Pearse AM, Raine K, Rasolonjatovo I, Shaw R, Tedder P, Tregidgo C, Vilella AJ, Wedge DC, Woods GM, Gormley N, Humphray S, Schroth G, Smith G, Hall K, Searle SM, Carter NP, Papenfuss AT, Futreal PA, Campbell PJ, Yang F, Bentley DR, Evers DJ and Stratton MR

    Cell 2012;148;4;780-91

  • Somatic SF3B1 mutation in myelodysplasia with ring sideroblasts.

    Papaemmanuil E, Cazzola M, Boultwood J, Malcovati L, Vyas P, Bowen D, Pellagatti A, Wainscoat JS, Hellstrom-Lindberg E, Gambacorti-Passerini C, Godfrey AL, Rapado I, Cvejic A, Rance R, McGee C, Ellis P, Mudie LJ, Stephens PJ, McLaren S, Massie CE, Tarpey PS, Varela I, Nik-Zainal S, Davies HR, Shlien A, Jones D, Raine K, Hinton J, Butler AP, Teague JW, Baxter EJ, Score J, Galli A, Della Porta MG, Travaglino E, Groves M, Tauro S, Munshi NC, Anderson KC, El-Naggar A, Fischer A, Mustonen V, Warren AJ, Cross NC, Green AR, Futreal PA, Stratton MR, Campbell PJ and Chronic Myeloid Disorders Working Group of the International Cancer Genome Consortium

    The New England journal of medicine 2011;365;15;1384-95

  • Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma.

    Varela I, Tarpey P, Raine K, Huang D, Ong CK, Stephens P, Davies H, Jones D, Lin ML, Teague J, Bignell G, Butler A, Cho J, Dalgliesh GL, Galappaththige D, Greenman C, Hardy C, Jia M, Latimer C, Lau KW, Marshall J, McLaren S, Menzies A, Mudie L, Stebbings L, Largaespada DA, Wessels LF, Richard S, Kahnoski RJ, Anema J, Tuveson DA, Perez-Mancera PA, Mustonen V, Fischer A, Adams DJ, Rust A, Chan-on W, Subimerb C, Dykema K, Furge K, Campbell PJ, Teh BT, Stratton MR and Futreal PA

    Nature 2011;469;7331;539-42

  • Massive genomic rearrangement acquired in a single catastrophic event during cancer development.

    Stephens PJ, Greenman CD, Fu B, Yang F, Bignell GR, Mudie LJ, Pleasance ED, Lau KW, Beare D, Stebbings LA, McLaren S, Lin ML, McBride DJ, Varela I, Nik-Zainal S, Leroy C, Jia M, Menzies A, Butler AP, Teague JW, Quail MA, Burton J, Swerdlow H, Carter NP, Morsberger LA, Iacobuzio-Donahue C, Follows GA, Green AR, Flanagan AM, Stratton MR, Futreal PA and Campbell PJ

    Cell 2011;144;1;27-40

  • The patterns and dynamics of genomic instability in metastatic pancreatic cancer.

    Campbell PJ, Yachida S, Mudie LJ, Stephens PJ, Pleasance ED, Stebbings LA, Morsberger LA, Latimer C, McLaren S, Lin ML, McBride DJ, Varela I, Nik-Zainal SA, Leroy C, Jia M, Menzies A, Butler AP, Teague JW, Griffin CA, Burton J, Swerdlow H, Quail MA, Stratton MR, Iacobuzio-Donahue C and Futreal PA

    Nature 2010;467;7319;1109-13

  • A comprehensive catalogue of somatic mutations from a human cancer genome.

    Pleasance ED, Cheetham RK, Stephens PJ, McBride DJ, Humphray SJ, Greenman CD, Varela I, Lin ML, Ordóñez GR, Bignell GR, Ye K, Alipaz J, Bauer MJ, Beare D, Butler A, Carter RJ, Chen L, Cox AJ, Edkins S, Kokko-Gonzales PI, Gormley NA, Grocock RJ, Haudenschild CD, Hims MM, James T, Jia M, Kingsbury Z, Leroy C, Marshall J, Menzies A, Mudie LJ, Ning Z, Royce T, Schulz-Trieglaff OB, Spiridou A, Stebbings LA, Szajkowski L, Teague J, Williamson D, Chin L, Ross MT, Campbell PJ, Bentley DR, Futreal PA and Stratton MR

    Nature 2010;463;7278;191-6

  • A small-cell lung cancer genome with complex signatures of tobacco exposure.

    Pleasance ED, Stephens PJ, O'Meara S, McBride DJ, Meynert A, Jones D, Lin ML, Beare D, Lau KW, Greenman C, Varela I, Nik-Zainal S, Davies HR, Ordoñez GR, Mudie LJ, Latimer C, Edkins S, Stebbings L, Chen L, Jia M, Leroy C, Marshall J, Menzies A, Butler A, Teague JW, Mangion J, Sun YA, McLaughlin SF, Peckham HE, Tsung EF, Costa GL, Lee CC, Minna JD, Gazdar A, Birney E, Rhodes MD, McKernan KJ, Stratton MR, Futreal PA and Campbell PJ

    Nature 2010;463;7278;184-90

  • Somatic mutations of the histone H3K27 demethylase gene UTX in human cancer.

    van Haaften G, Dalgliesh GL, Davies H, Chen L, Bignell G, Greenman C, Edkins S, Hardy C, O'Meara S, Teague J, Butler A, Hinton J, Latimer C, Andrews J, Barthorpe S, Beare D, Buck G, Campbell PJ, Cole J, Forbes S, Jia M, Jones D, Kok CY, Leroy C, Lin ML, McBride DJ, Maddison M, Maquire S, McLay K, Menzies A, Mironenko T, Mulderrig L, Mudie L, Pleasance E, Shepherd R, Smith R, Stebbings L, Stephens P, Tang G, Tarpey PS, Turner R, Turrell K, Varian J, West S, Widaa S, Wray P, Collins VP, Ichimura K, Law S, Wong J, Yuen ST, Leung SY, Tonon G, DePinho RA, Tai YT, Anderson KC, Kahnoski RJ, Massie A, Khoo SK, Teh BT, Stratton MR and Futreal PA

    Nature genetics 2009;41;5;521-3

  • Lung cancer: intragenic ERBB2 kinase mutations in tumours.

    Stephens P, Hunter C, Bignell G, Edkins S, Davies H, Teague J, Stevens C, O'Meara S, Smith R, Parker A, Barthorpe A, Blow M, Brackenbury L, Butler A, Clarke O, Cole J, Dicks E, Dike A, Drozd A, Edwards K, Forbes S, Foster R, Gray K, Greenman C, Halliday K, Hills K, Kosmidou V, Lugg R, Menzies A, Perry J, Petty R, Raine K, Ratford L, Shepherd R, Small A, Stephens Y, Tofts C, Varian J, West S, Widaa S, Yates A, Brasseur F, Cooper CS, Flanagan AM, Knowles M, Leung SY, Louis DN, Looijenga LH, Malkowicz B, Pierotti MA, Teh B, Chenevix-Trench G, Weber BL, Yuen ST, Harris G, Goldstraw P, Nicholson AG, Futreal PA, Wooster R and Stratton MR

    Nature 2004;431;7008;525-6

  • Mutations of the BRAF gene in human cancer.

    Davies H, Bignell GR, Cox C, Stephens P, Edkins S, Clegg S, Teague J, Woffendin H, Garnett MJ, Bottomley W, Davis N, Dicks E, Ewing R, Floyd Y, Gray K, Hall S, Hawes R, Hughes J, Kosmidou V, Menzies A, Mould C, Parker A, Stevens C, Watt S, Hooper S, Wilson R, Jayatilake H, Gusterson BA, Cooper C, Shipley J, Hargrave D, Pritchard-Jones K, Maitland N, Chenevix-Trench G, Riggins GJ, Bigner DD, Palmieri G, Cossu A, Flanagan A, Nicholson A, Ho JW, Leung SY, Yuen ST, Weber BL, Seigler HF, Darrow TL, Paterson H, Marais R, Marshall CJ, Wooster R, Stratton MR and Futreal PA

    Nature 2002;417;6892;949-54

Team

Team members

David Jones - Principal Bioinformatician
Angela Matchan - Senior Bioinformatician (am26@sanger.ac.uk)
Keiran Raine - Principal Bioinformatician

David Jones

- Principal Bioinformatician

BSc. (Hons) Molecular Cell Biology, University of Southampton 2003

MSc. Bioinformatics, University of Westminster 2008

I first joined the Cancer Genome Project (CGP) in 2004 as a research assistant, working in the high-throughput laboratory and analysing capillary sequencing data.

In 2006 I began a part-time MSc in Bioinformatics at the University of Westminster, sponsored by the Sanger Institute, graduating in 2008.

On completion of my MSc, I moved into the CGP informatics team, where I was the main developer of our SNV calling algorithm, CaVEMan (https://github.com/cancerit/CaVEMan).

Research

My current areas of focus are the development of a new analysis pipeline based around bpipe and the improvement and development of analysis algorithms (https://github.com/cancerit).

References

  • Analysis of the genetic phylogeny of multifocal prostate cancer identifies multiple independent clonal expansions in neoplastic and morphologically normal prostate tissue.

    Cooper CS, Eeles R, Wedge DC, Van Loo P, Gundem G, Alexandrov LB, Kremeyer B, Butler A, Lynch AG, Camacho N, Massie CE, Kay J, Luxton HJ, Edwards S, Kote-Jarai Z, Dennis N, Merson S, Leongamornlert D, Zamora J, Corbishley C, Thomas S, Nik-Zainal S, Ramakrishna M, O'Meara S, Matthews L, Clark J, Hurst R, Mithen R, Bristow RG, Boutros PC, Fraser M, Cooke S, Raine K, Jones D, Menzies A, Stebbings L, Hinton J, Teague J, McLaren S, Mudie L, Hardy C, Anderson E, Joseph O, Goody V, Robinson B, Maddison M, Gamble S, Greenman C, Berney D, Hazell S, Livni N, ICGC Prostate Group, Fisher C, Ogden C, Kumar P, Thompson A, Woodhouse C, Nicol D, Mayer E, Dudderidge T, Shah NC, Gnanapragasam V, Voet T, Campbell P, Futreal A, Easton D, Warren AY, Foster CS, Stratton MR, Whitaker HC, McDermott U, Brewer DS and Neal DE

    [1] Division of Genetics and Epidemiology, The Institute of Cancer Research, London, UK. [2] Department of Biological Sciences, University of East Anglia, Norwich, UK. [3] Norwich Medical School, University of East Anglia, Norwich, UK.

    Genome-wide DNA sequencing was used to decrypt the phylogeny of multiple samples from distinct areas of cancer and morphologically normal tissue taken from the prostates of three men. Mutations were present at high levels in morphologically normal tissue distant from the cancer, reflecting clonal expansions, and the underlying mutational processes at work in morphologically normal tissue were also at work in cancer. Our observations demonstrate the existence of ongoing abnormal mutational processes, consistent with field effects, underlying carcinogenesis. This mechanism gives rise to extensive branching evolution and cancer clone mixing, as exemplified by the coexistence of multiple cancer lineages harboring distinct ERG fusions within a single cancer nodule. Subsets of mutations were shared either by morphologically normal and malignant tissues or between different ERG lineages, indicating earlier or separate clonal cell expansions. Our observations inform on the origin of multifocal disease and have implications for prostate cancer therapy in individual cases.

    Funded by: Cancer Research UK: 14835, C5047/A14835; Wellcome Trust

    Nature genetics 2015;47;4;367-72

  • Polygenic in vivo validation of cancer mutations using transposons.

    Chew SK, Lu D, Campos LS, Scott KL, Saci A, Wang J, Collinson A, Raine K, Hinton J, Teague JW, Jones D, Menzies A, Butler AP, Gamble J, O'Meara S, McLaren S, Chin L, Liu P and Futreal PA

    The in vivo validation of cancer mutations and genes identified in cancer genomics is resource-intensive because of the low throughput of animal experiments. We describe a mouse model that allows multiple cancer mutations to be validated in each animal line. Animal lines are generated with multiple candidate cancer mutations using transposons. The candidate cancer genes are tagged and randomly expressed in somatic cells, allowing easy identification of the cancer genes involved in the generated tumours. This system presents a useful, generalised and efficient means for animal validation of cancer genes.

    Funded by: Wellcome Trust

    Genome biology 2014;15;9;455

  • Aging as accelerated accumulation of somatic variants: whole-genome sequencing of centenarian and middle-aged monozygotic twin pairs.

    Ye K, Beekman M, Lameijer EW, Zhang Y, Moed MH, van den Akker EB, Deelen J, Houwing-Duistermaat JJ, Kremer D, Anvar SY, Laros JF, Jones D, Raine K, Blackburne B, Potluri S, Long Q, Guryev V, van der Breggen R, Westendorp RG, 't Hoen PA, den Dunnen J, van Ommen GJ, Willemsen G, Pitts SJ, Cox DR, Ning Z, Boomsma DI and Slagboom PE

    Molecular Epidemiology, Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, the Netherlands.

    It has been postulated that aging is the consequence of an accelerated accumulation of somatic DNA mutations and that subsequent errors in the primary structure of proteins ultimately reach levels sufficient to affect organismal functions. The technical limitations of detecting somatic changes and the lack of insight about the minimum level of erroneous proteins to cause an error catastrophe hampered any firm conclusions on these theories. In this study, we sequenced the whole genome of DNA in whole blood of two pairs of monozygotic (MZ) twins, 40 and 100 years old, by two independent next-generation sequencing (NGS) platforms (Illumina and Complete Genomics). Potentially discordant single-base substitutions supported by both platforms were validated extensively by Sanger, Roche 454, and Ion Torrent sequencing. We demonstrate that the genomes of the two twin pairs are germ-line identical between co-twins, and that the genomes of the 100-year-old MZ twins are discerned by eight confirmed somatic single-base substitutions, five of which are within introns. Putative somatic variation between the 40-year-old twins was not confirmed in the validation phase. We conclude from this systematic effort that by using two independent NGS platforms, somatic single nucleotide substitutions can be detected, and that a century of life did not result in a large number of detectable somatic mutations in blood. The low number of somatic variants observed by using two NGS platforms might provide a framework for detecting disease-related somatic variants in phenotypically discordant MZ twins.

    Twin research and human genetics : the official journal of the International Society for Twin Studies 2013;16;6;1026-32

  • Signatures of mutational processes in human cancer.

    Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SA, Behjati S, Biankin AV, Bignell GR, Bolli N, Borg A, Børresen-Dale AL, Boyault S, Burkhardt B, Butler AP, Caldas C, Davies HR, Desmedt C, Eils R, Eyfjörd JE, Foekens JA, Greaves M, Hosoda F, Hutter B, Ilicic T, Imbeaud S, Imielinski M, Jäger N, Jones DT, Jones D, Knappskog S, Kool M, Lakhani SR, López-Otín C, Martin S, Munshi NC, Nakamura H, Northcott PA, Pajic M, Papaemmanuil E, Paradiso A, Pearson JV, Puente XS, Raine K, Ramakrishna M, Richardson AL, Richter J, Rosenstiel P, Schlesner M, Schumacher TN, Span PN, Teague JW, Totoki Y, Tutt AN, Valdés-Mas R, van Buuren MM, van 't Veer L, Vincent-Salomon A, Waddell N, Yates LR, Australian Pancreatic Cancer Genome Initiative, ICGC Breast Cancer Consortium, ICGC MMML-Seq Consortium, ICGC PedBrain, Zucman-Rossi J, Futreal PA, McDermott U, Lichter P, Meyerson M, Grimmond SM, Siebert R, Campo E, Shibata T, Pfister SM, Campbell PJ and Stratton MR

    Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    All cancers are caused by somatic mutations; however, understanding of the biological processes generating these mutations is limited. The catalogue of somatic mutations from a cancer genome bears the signatures of the mutational processes that have been operative. Here we analysed 4,938,362 mutations from 7,042 cancers and extracted more than 20 distinct mutational signatures. Some are present in many cancer types, notably a signature attributed to the APOBEC family of cytidine deaminases, whereas others are confined to a single cancer class. Certain signatures are associated with age of the patient at cancer diagnosis, known mutagenic exposures or defects in DNA maintenance, but many are of cryptic origin. In addition to these genome-wide mutational signatures, hypermutation localized to small genomic regions, 'kataegis', is found in many cancer types. The results reveal the diversity of mutational processes underlying the development of cancer, with potential implications for understanding of cancer aetiology, prevention and therapy.

    Funded by: NCI NIH HHS: T32 CA009216; Wellcome Trust: 088340, 093867, 098051

    Nature 2013;500;7463;415-21

  • Frequent mutation of the major cartilage collagen gene COL2A1 in chondrosarcoma.

    Tarpey PS, Behjati S, Cooke SL, Van Loo P, Wedge DC, Pillay N, Marshall J, O'Meara S, Davies H, Nik-Zainal S, Beare D, Butler A, Gamble J, Hardy C, Hinton J, Jia MM, Jayakumar A, Jones D, Latimer C, Maddison M, Martin S, McLaren S, Menzies A, Mudie L, Raine K, Teague JW, Tubio JM, Halai D, Tirabosco R, Amary F, Campbell PJ, Stratton MR, Flanagan AM and Futreal PA

    Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Chondrosarcoma is a heterogeneous collection of malignant bone tumors and is the second most common primary malignancy of bone after osteosarcoma. Recent work has identified frequent, recurrent mutations in IDH1 or IDH2 in nearly half of central chondrosarcomas. However, there has been little systematic genomic analysis of this tumor type, and, thus, the contribution of other genes is unclear. Here we report comprehensive genomic analyses of 49 individuals with chondrosarcoma (cases). We identified hypermutability of the major cartilage collagen gene COL2A1, with insertions, deletions and rearrangements identified in 37% of cases. The patterns of mutation were consistent with selection for variants likely to impair normal collagen biosynthesis. In addition, we identified mutations in IDH1 or IDH2 (59%), TP53 (20%), the RB1 pathway (33%) and Hedgehog signaling (18%).

    Funded by: Wellcome Trust: 077012/Z/05/Z, 088340, 093867, WT088340MA

    Nature genetics 2013;45;8;923-6

  • Whole exome sequencing of adenoid cystic carcinoma.

    Stephens PJ, Davies HR, Mitani Y, Van Loo P, Shlien A, Tarpey PS, Papaemmanuil E, Cheverton A, Bignell GR, Butler AP, Gamble J, Gamble S, Hardy C, Hinton J, Jia M, Jayakumar A, Jones D, Latimer C, McLaren S, McBride DJ, Menzies A, Mudie L, Maddison M, Raine K, Nik-Zainal S, O'Meara S, Teague JW, Varela I, Wedge DC, Whitmore I, Lippman SM, McDermott U, Stratton MR, Campbell PJ, El-Naggar AK and Futreal PA

    Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, United Kingdom.

    Adenoid cystic carcinoma (ACC) is a rare malignancy that can occur in multiple organ sites and is primarily found in the salivary gland. While the identification of recurrent fusions of the MYB-NFIB genes have begun to shed light on the molecular underpinnings, little else is known about the molecular genetics of this frequently fatal cancer. We have undertaken exome sequencing in a series of 24 ACC to further delineate the genetics of the disease. We identified multiple mutated genes that, combined, implicate chromatin deregulation in half of cases. Further, mutations were identified in known cancer genes, including PIK3CA, ATM, CDKN2A, SF3B1, SUFU, TSC1, and CYLD. Mutations in NOTCH1/2 were identified in 3 cases, and we identify the negative NOTCH signaling regulator, SPEN, as a new cancer gene in ACC with mutations in 5 cases. Finally, the identification of 3 likely activating mutations in the tyrosine kinase receptor FGFR2, analogous to those reported in ovarian and endometrial carcinoma, point to potential therapeutic avenues for a subset of cases.

    Funded by: NCI NIH HHS: P50 CA097007; NIDCR NIH HHS: U01DE019765; WETP NIH HHS: WT088340MA; Wellcome Trust: 077012/Z/05/Z, 088340, 093867, 098051

    The Journal of clinical investigation 2013;123;7;2965-8

  • The genetic heterogeneity and mutational burden of engineered melanomas in zebrafish models.

    Yen J, White RM, Wedge DC, Van Loo P, de Ridder J, Capper A, Richardson J, Jones D, Raine K, Watson IR, Wu CJ, Cheng J, Martincorena I, Nik-Zainal S, Mudie L, Moreau Y, Marshall J, Ramakrishna M, Tarpey P, Shlien A, Whitmore I, Gamble S, Latimer C, Langdon E, Kaufman C, Dovey M, Taylor A, Menzies A, McLaren S, O'Meara S, Butler A, Teague J, Lister J, Chin L, Campbell P, Adams DJ, Zon LI, Patton EE, Stemple DL and Futreal PA

    Background: Melanoma is the most deadly form of skin cancer. Expression of oncogenic BRAF or NRAS, which are frequently mutated in human melanomas, promote the formation of nevi but are not sufficient for tumorigenesis. Even with germline mutated p53, these engineered melanomas present with variable onset and pathology, implicating additional somatic mutations in a multi-hit tumorigenic process.

    Results: To decipher the genetics of these melanomas, we sequence the protein coding exons of 53 primary melanomas generated from several BRAF(V600E) or NRAS(Q61K) driven transgenic zebrafish lines. We find that engineered zebrafish melanomas show an overall low mutation burden, which has a strong, inverse association with the number of initiating germline drivers. Although tumors reveal distinct mutation spectrums, they show mostly C > T transitions without UV light exposure, and enrichment of mutations in melanogenesis, p53 and MAPK signaling. Importantly, a recurrent amplification occurring with pre-configured drivers BRAF(V600E) and p53-/- suggests a novel path of BRAF cooperativity through the protein kinase A pathway.

    Conclusion: This is the first analysis of a melanoma mutational landscape in the absence of UV light, where tumors manifest with remarkably low mutation burden and high heterogeneity. Genotype specific amplification of protein kinase A in cooperation with BRAF and p53 mutation suggests the involvement of melanogenesis in these tumors. This work is important for defining the spectrum of events in BRAF or NRAS driven melanoma in the absence of UV light, and for informed exploitation of models such as transgenic zebrafish to better understand mechanisms leading to human melanoma formation.

    Funded by: Cancer Research UK: 13031; Medical Research Council: G120/875, MC_PC_U127585840, MC_U127585840; NCI NIH HHS: R01 CA103846; NIAMS NIH HHS: K08 AR061071, K08AR61071; NIDDK NIH HHS: P30 DK049216; Wellcome Trust

    Genome biology 2013;14;10;R113

  • The landscape of cancer genes and mutational processes in breast cancer.

    Stephens PJ, Tarpey PS, Davies H, Van Loo P, Greenman C, Wedge DC, Nik-Zainal S, Martin S, Varela I, Bignell GR, Yates LR, Papaemmanuil E, Beare D, Butler A, Cheverton A, Gamble J, Hinton J, Jia M, Jayakumar A, Jones D, Latimer C, Lau KW, McLaren S, McBride DJ, Menzies A, Mudie L, Raine K, Rad R, Chapman MS, Teague J, Easton D, Langerød A, Oslo Breast Cancer Consortium (OSBREAC), Lee MT, Shen CY, Tee BT, Huimin BW, Broeks A, Vargas AC, Turashvili G, Martens J, Fatima A, Miron P, Chin SF, Thomas G, Boyault S, Mariani O, Lakhani SR, van de Vijver M, van 't Veer L, Foekens J, Desmedt C, Sotiriou C, Tutt A, Caldas C, Reis-Filho JS, Aparicio SA, Salomon AV, Børresen-Dale AL, Richardson AL, Campbell PJ, Futreal PA and Stratton MR

    Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK.

    All cancers carry somatic mutations in their genomes. A subset, known as driver mutations, confer clonal selective advantage on cancer cells and are causally implicated in oncogenesis, and the remainder are passenger mutations. The driver mutations and mutational processes operative in breast cancer have not yet been comprehensively explored. Here we examine the genomes of 100 tumours for somatic copy number changes and mutations in the coding exons of protein-coding genes. The number of somatic mutations varied markedly between individual tumours. We found strong correlations between mutation number, age at which cancer was diagnosed and cancer histological grade, and observed multiple mutational signatures, including one present in about ten per cent of tumours characterized by numerous mutations of cytosine at TpC dinucleotides. Driver mutations were identified in several new cancer genes including AKT2, ARID1B, CASP8, CDKN1B, MAP3K1, MAP3K13, NCOR1, SMARCD1 and TBX3. Among the 100 tumours, we found driver mutations in at least 40 cancer genes and 73 different combinations of mutated cancer genes. The results highlight the substantial genetic diversity underlying this common disease.

    Funded by: Cancer Research UK: 10118; Chief Scientist Office; Department of Health; NCI NIH HHS: CA089393, P30 CA016672; Wellcome Trust: 077012/Z/05/Z, 088340, 093867, WT088340MA

    Nature 2012;486;7403;400-4

  • The life history of 21 breast cancers.

    Nik-Zainal S, Van Loo P, Wedge DC, Alexandrov LB, Greenman CD, Lau KW, Raine K, Jones D, Marshall J, Ramakrishna M, Shlien A, Cooke SL, Hinton J, Menzies A, Stebbings LA, Leroy C, Jia M, Rance R, Mudie LJ, Gamble SJ, Stephens PJ, McLaren S, Tarpey PS, Papaemmanuil E, Davies HR, Varela I, McBride DJ, Bignell GR, Leung K, Butler AP, Teague JW, Martin S, Jönsson G, Mariani O, Boyault S, Miron P, Fatima A, Langerød A, Aparicio SA, Tutt A, Sieuwerts AM, Borg Å, Thomas G, Salomon AV, Richardson AL, Børresen-Dale AL, Futreal PA, Stratton MR, Campbell PJ and Breast Cancer Working Group of the International Cancer Genome Consortium

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    Cancer evolves dynamically as clonal expansions supersede one another driven by shifting selective pressures, mutational processes, and disrupted cancer genes. These processes mark the genome, such that a cancer's life history is encrypted in the somatic mutations present. We developed algorithms to decipher this narrative and applied them to 21 breast cancers. Mutational processes evolve across a cancer's lifespan, with many emerging late but contributing extensive genetic variation. Subclonal diversification is prominent, and most mutations are found in just a fraction of tumor cells. Every tumor has a dominant subclonal lineage, representing more than 50% of tumor cells. Minimal expansion of these subclones occurs until many hundreds to thousands of mutations have accumulated, implying the existence of long-lived, quiescent cell lineages capable of substantial proliferation upon acquisition of enabling genomic changes. Expansion of the dominant subclone to an appreciable mass may therefore represent the final rate-limiting step in a breast cancer's development, triggering diagnosis.

    Funded by: Department of Health; NCI NIH HHS: CA089393; Wellcome Trust: 088340, 093867, 098051

    Cell 2012;149;5;994-1007

  • Mutational processes molding the genomes of 21 breast cancers.

    Nik-Zainal S, Alexandrov LB, Wedge DC, Van Loo P, Greenman CD, Raine K, Jones D, Hinton J, Marshall J, Stebbings LA, Menzies A, Martin S, Leung K, Chen L, Leroy C, Ramakrishna M, Rance R, Lau KW, Mudie LJ, Varela I, McBride DJ, Bignell GR, Cooke SL, Shlien A, Gamble J, Whitmore I, Maddison M, Tarpey PS, Davies HR, Papaemmanuil E, Stephens PJ, McLaren S, Butler AP, Teague JW, Jönsson G, Garber JE, Silver D, Miron P, Fatima A, Boyault S, Langerød A, Tutt A, Martens JW, Aparicio SA, Borg Å, Salomon AV, Thomas G, Børresen-Dale AL, Richardson AL, Neuberger MS, Futreal PA, Campbell PJ, Stratton MR and Breast Cancer Working Group of the International Cancer Genome Consortium

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    All cancers carry somatic mutations. The patterns of mutation in cancer genomes reflect the DNA damage and repair processes to which cancer cells and their precursors have been exposed. To explore these mechanisms further, we generated catalogs of somatic mutation from 21 breast cancers and applied mathematical methods to extract mutational signatures of the underlying processes. Multiple distinct single- and double-nucleotide substitution signatures were discernible. Cancers with BRCA1 or BRCA2 mutations exhibited a characteristic combination of substitution mutation signatures and a distinctive profile of deletions. Complex relationships between somatic mutation prevalence and transcription were detected. A remarkable phenomenon of localized hypermutation, termed "kataegis," was observed. Regions of kataegis differed between cancers but usually colocalized with somatic rearrangements. Base substitutions in these regions were almost exclusively of cytosine at TpC dinucleotides. The mechanisms underlying most of these mutational signatures are unknown. However, a role for the APOBEC family of cytidine deaminases is proposed.

    Funded by: Department of Health; Medical Research Council: MC_U105178806; NCI NIH HHS: CA089393; Wellcome Trust: 088340, 098051, WT088340MA

    Cell 2012;149;5;979-93

Angela Matchan

am26@sanger.ac.uk - Senior Bioinformatician

BSc Biology (specialising in genetics), The University of Sheffield 1999

MSc Applied Bioinformatics, The University of Cranfield 2008

Prior to joining Sanger, I worked as an IT consultant and software programmer, mainly in the financial services sector. I moved into bioinformatics in 2010 as a microarray (gene expression and miRNA) and next-generation sequencing (NGS) data analyst for a private company in Oxfordshire. I joined the Zeggini group at Sanger in November 2012, providing informatics and data management support on a number of GWAS and sequencing studies in the area of complex disease research.

Research

I am currently a Senior Bioinformatician based in the Cancer Genome Project IT Group developing an RNA-Seq pipeline for tumour-normal analysis. The pipeline is also being developed as part of the Centre for Therapeutic Target Validation (CTTV) project and will be used to analyse cell line data from The Cancer Genome Atlas. I am collaborating with colleagues at the EBI and GSK as part of this project.

References

  • Genetic characterization of Greek population isolates reveals strong genetic drift at missense and trait-associated variants.

    Panoutsopoulou K, Hatzikotoulas K, Xifara DK, Colonna V, Farmaki AE, Ritchie GR, Southam L, Gilly A, Tachmazidou I, Fatumo S, Matchan A, Rayner NW, Ntalla I, Mezzavilla M, Chen Y, Kiagiadaki C, Zengini E, Mamakou V, Athanasiadis A, Giannakopoulou M, Kariakli VE, Nsubuga RN, Karabarinde A, Sandhu M, McVean G, Tyler-Smith C, Tsafantakis E, Karaleftheri M, Xue Y, Dedoussis G and Zeggini E

    Department of Human Genetics, Wellcome Trust Sanger Institute, Hinxton CB10 1HH, UK.

    Isolated populations are emerging as a powerful study design in the search for low-frequency and rare variant associations with complex phenotypes. Here we genotype 2,296 samples from two isolated Greek populations, the Pomak villages (HELIC-Pomak) in the North of Greece and the Mylopotamos villages (HELIC-MANOLIS) in Crete. We compare their genomic characteristics to the general Greek population and establish them as genetic isolates. In the MANOLIS cohort, we observe an enrichment of missense variants among the variants that have drifted up in frequency by more than fivefold. In the Pomak cohort, we find novel associations at variants on chr11p15.4 showing large allele frequency increases (from 0.2% in the general Greek population to 4.6% in the isolate) with haematological traits, for example, with mean corpuscular volume (rs7116019, P=2.3 × 10⁻²⁶). We replicate this association in a second set of Pomak samples (combined P=2.0 × 10⁻³⁶). We demonstrate significant power gains in detecting medical trait associations.

    Funded by: European Research Council: 280559; NHGRI NIH HHS: U41 HG006941, U41HG006941; Wellcome Trust: 098051

    Nature communications 2014;5;5345

  • A rare functional cardioprotective APOC3 variant has risen in frequency in distinct population isolates.

    Tachmazidou I, Dedoussis G, Southam L, Farmaki AE, Ritchie GR, Xifara DK, Matchan A, Hatzikotoulas K, Rayner NW, Chen Y, Pollin TI, O'Connell JR, Yerges-Armstrong LM, Kiagiadaki C, Panoutsopoulou K, Schwartzentruber J, Moutsianas L, UK10K consortium, Tsafantakis E, Tyler-Smith C, McVean G, Xue Y and Zeggini E

    Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    Isolated populations can empower the identification of rare variation associated with complex traits through next generation association studies, but the generalizability of such findings remains unknown. Here we genotype 1,267 individuals from a Greek population isolate on the Illumina HumanExome Beadchip, in search of functional coding variants associated with lipid traits. We find genome-wide significant evidence for association between R19X, a functional variant in APOC3, with increased high-density lipoprotein and decreased triglyceride levels. Approximately 3.8% of individuals are heterozygous for this cardioprotective variant, which was previously thought to be private to the Amish founder population. R19X is rare (<0.05% frequency) in outbred European populations. The increased frequency of R19X enables discovery of this lipid traits signal at genome-wide significance in a small sample size. This work exemplifies the value of isolated populations in successfully detecting transferable rare variant associations of high medical relevance.

    Funded by: NHLBI NIH HHS: K01 HL116770, R01 HL104193, U01 HL072515, U01 HL105198; NIDDK NIH HHS: P30 DK072488; Wellcome Trust: 090532, 098051, WT091310

    Nature communications 2013;4;2872

Keiran Raine

Principal Bioinformatician

BSc. (Hons) Biomedical Sciences, The University of Durham 2001

MSc. Bioinformatics, The University of Manchester 2002

Initially, my career within the Cancer Genome Project focused on the development of laboratory management systems for high-throughput capillary sequencing. I subsequently developed tools to automate analysis of the resulting sequence data.

Between 2007 and 2009 I worked in the pharmaceutical industry as a clinical programmer.

I returned to CGP in 2009 to develop a mapping and analysis system for NGS data.

Research

I am primarily focused on development and coordination of infrastructure to support the scientific staff within the group.

Recently, this has involved improving our external interactions through code releases and substantial involvement in the technical working group of the ICGC/TCGA Pancancer projects.

References

  • Analysis of the genetic phylogeny of multifocal prostate cancer identifies multiple independent clonal expansions in neoplastic and morphologically normal prostate tissue.

    Cooper CS, Eeles R, Wedge DC, Van Loo P, Gundem G, Alexandrov LB, Kremeyer B, Butler A, Lynch AG, Camacho N, Massie CE, Kay J, Luxton HJ, Edwards S, Kote-Jarai Z, Dennis N, Merson S, Leongamornlert D, Zamora J, Corbishley C, Thomas S, Nik-Zainal S, Ramakrishna M, O'Meara S, Matthews L, Clark J, Hurst R, Mithen R, Bristow RG, Boutros PC, Fraser M, Cooke S, Raine K, Jones D, Menzies A, Stebbings L, Hinton J, Teague J, McLaren S, Mudie L, Hardy C, Anderson E, Joseph O, Goody V, Robinson B, Maddison M, Gamble S, Greenman C, Berney D, Hazell S, Livni N, ICGC Prostate Group, Fisher C, Ogden C, Kumar P, Thompson A, Woodhouse C, Nicol D, Mayer E, Dudderidge T, Shah NC, Gnanapragasam V, Voet T, Campbell P, Futreal A, Easton D, Warren AY, Foster CS, Stratton MR, Whitaker HC, McDermott U, Brewer DS and Neal DE

    [1] Division of Genetics and Epidemiology, The Institute of Cancer Research, London, UK. [2] Department of Biological Sciences, University of East Anglia, Norwich, UK. [3] Norwich Medical School, University of East Anglia, Norwich, UK.

    Genome-wide DNA sequencing was used to decrypt the phylogeny of multiple samples from distinct areas of cancer and morphologically normal tissue taken from the prostates of three men. Mutations were present at high levels in morphologically normal tissue distant from the cancer, reflecting clonal expansions, and the underlying mutational processes at work in morphologically normal tissue were also at work in cancer. Our observations demonstrate the existence of ongoing abnormal mutational processes, consistent with field effects, underlying carcinogenesis. This mechanism gives rise to extensive branching evolution and cancer clone mixing, as exemplified by the coexistence of multiple cancer lineages harboring distinct ERG fusions within a single cancer nodule. Subsets of mutations were shared either by morphologically normal and malignant tissues or between different ERG lineages, indicating earlier or separate clonal cell expansions. Our observations inform on the origin of multifocal disease and have implications for prostate cancer therapy in individual cases.

    Funded by: Cancer Research UK: 14835, C5047/A14835; Wellcome Trust

    Nature genetics 2015;47;4;367-72

  • Mobile DNA in cancer. Extensive transduction of nonrepetitive DNA mediated by L1 retrotransposition in cancer genomes.

    Tubio JM, Li Y, Ju YS, Martincorena I, Cooke SL, Tojo M, Gundem G, Pipinikas CP, Zamora J, Raine K, Menzies A, Roman-Garcia P, Fullam A, Gerstung M, Shlien A, Tarpey PS, Papaemmanuil E, Knappskog S, Van Loo P, Ramakrishna M, Davies HR, Marshall J, Wedge DC, Teague JW, Butler AP, Nik-Zainal S, Alexandrov L, Behjati S, Yates LR, Bolli N, Mudie L, Hardy C, Martin S, McLaren S, O'Meara S, Anderson E, Maddison M, Gamble S, ICGC Breast Cancer Group, ICGC Bone Cancer Group, ICGC Prostate Cancer Group, Foster C, Warren AY, Whitaker H, Brewer D, Eeles R, Cooper C, Neal D, Lynch AG, Visakorpi T, Isaacs WB, van't Veer L, Caldas C, Desmedt C, Sotiriou C, Aparicio S, Foekens JA, Eyfjörd JE, Lakhani SR, Thomas G, Myklebost O, Span PN, Børresen-Dale AL, Richardson AL, Van de Vijver M, Vincent-Salomon A, Van den Eynden GG, Flanagan AM, Futreal PA, Janes SM, Bova GS, Stratton MR, McDermott U and Campbell PJ

    Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, UK.

    Long interspersed nuclear element-1 (L1) retrotransposons are mobile repetitive elements that are abundant in the human genome. L1 elements propagate through RNA intermediates. In the germ line, neighboring, nonrepetitive sequences are occasionally mobilized by the L1 machinery, a process called 3' transduction. Because 3' transductions are potentially mutagenic, we explored the extent to which they occur somatically during tumorigenesis. Studying cancer genomes from 244 patients, we found that tumors from 53% of the patients had somatic retrotranspositions, of which 24% were 3' transductions. Fingerprinting of donor L1s revealed that a handful of source L1 elements in a tumor can spawn from tens to hundreds of 3' transductions, which can themselves seed further retrotranspositions. The activity of individual L1 elements fluctuated during tumor evolution and correlated with L1 promoter hypomethylation. The 3' transductions disseminated genes, exons, and regulatory elements to new locations, most often to heterochromatic regions of the genome.

    Funded by: Cancer Research UK: 14835, C5047/A14835; Department of Health; Medical Research Council: G0900871; NCI NIH HHS: P30 CA006973; Wellcome Trust: 088340, 091730, WT100183MA

    Science (New York, N.Y.) 2014;345;6196;1251343

  • Polygenic in vivo validation of cancer mutations using transposons.

    Chew SK, Lu D, Campos LS, Scott KL, Saci A, Wang J, Collinson A, Raine K, Hinton J, Teague JW, Jones D, Menzies A, Butler AP, Gamble J, O'Meara S, McLaren S, Chin L, Liu P and Futreal PA

    The in vivo validation of cancer mutations and genes identified in cancer genomics is resource-intensive because of the low throughput of animal experiments. We describe a mouse model that allows multiple cancer mutations to be validated in each animal line. Animal lines are generated with multiple candidate cancer mutations using transposons. The candidate cancer genes are tagged and randomly expressed in somatic cells, allowing easy identification of the cancer genes involved in the generated tumours. This system presents a useful, generalised and efficient means for animal validation of cancer genes.

    Funded by: Wellcome Trust

    Genome biology 2014;15;9;455

  • Aging as accelerated accumulation of somatic variants: whole-genome sequencing of centenarian and middle-aged monozygotic twin pairs.

    Ye K, Beekman M, Lameijer EW, Zhang Y, Moed MH, van den Akker EB, Deelen J, Houwing-Duistermaat JJ, Kremer D, Anvar SY, Laros JF, Jones D, Raine K, Blackburne B, Potluri S, Long Q, Guryev V, van der Breggen R, Westendorp RG, 't Hoen PA, den Dunnen J, van Ommen GJ, Willemsen G, Pitts SJ, Cox DR, Ning Z, Boomsma DI and Slagboom PE

    Molecular Epidemiology, Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, the Netherlands.

    It has been postulated that aging is the consequence of an accelerated accumulation of somatic DNA mutations and that subsequent errors in the primary structure of proteins ultimately reach levels sufficient to affect organismal functions. The technical limitations of detecting somatic changes and the lack of insight about the minimum level of erroneous proteins to cause an error catastrophe hampered any firm conclusions on these theories. In this study, we sequenced the whole genome of DNA in whole blood of two pairs of monozygotic (MZ) twins, 40 and 100 years old, by two independent next-generation sequencing (NGS) platforms (Illumina and Complete Genomics). Potentially discordant single-base substitutions supported by both platforms were validated extensively by Sanger, Roche 454, and Ion Torrent sequencing. We demonstrate that the genomes of the two twin pairs are germ-line identical between co-twins, and that the genomes of the 100-year-old MZ twins are discerned by eight confirmed somatic single-base substitutions, five of which are within introns. Putative somatic variation between the 40-year-old twins was not confirmed in the validation phase. We conclude from this systematic effort that by using two independent NGS platforms, somatic single nucleotide substitutions can be detected, and that a century of life did not result in a large number of detectable somatic mutations in blood. The low number of somatic variants observed by using two NGS platforms might provide a framework for detecting disease-related somatic variants in phenotypically discordant MZ twins.

    Twin research and human genetics : the official journal of the International Society for Twin Studies 2013;16;6;1026-32

  • The life history of 21 breast cancers.

    Nik-Zainal S, Van Loo P, Wedge DC, Alexandrov LB, Greenman CD, Lau KW, Raine K, Jones D, Marshall J, Ramakrishna M, Shlien A, Cooke SL, Hinton J, Menzies A, Stebbings LA, Leroy C, Jia M, Rance R, Mudie LJ, Gamble SJ, Stephens PJ, McLaren S, Tarpey PS, Papaemmanuil E, Davies HR, Varela I, McBride DJ, Bignell GR, Leung K, Butler AP, Teague JW, Martin S, Jönsson G, Mariani O, Boyault S, Miron P, Fatima A, Langerød A, Aparicio SA, Tutt A, Sieuwerts AM, Borg Å, Thomas G, Salomon AV, Richardson AL, Børresen-Dale AL, Futreal PA, Stratton MR, Campbell PJ and Breast Cancer Working Group of the International Cancer Genome Consortium

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    Cancer evolves dynamically as clonal expansions supersede one another driven by shifting selective pressures, mutational processes, and disrupted cancer genes. These processes mark the genome, such that a cancer's life history is encrypted in the somatic mutations present. We developed algorithms to decipher this narrative and applied them to 21 breast cancers. Mutational processes evolve across a cancer's lifespan, with many emerging late but contributing extensive genetic variation. Subclonal diversification is prominent, and most mutations are found in just a fraction of tumor cells. Every tumor has a dominant subclonal lineage, representing more than 50% of tumor cells. Minimal expansion of these subclones occurs until many hundreds to thousands of mutations have accumulated, implying the existence of long-lived, quiescent cell lineages capable of substantial proliferation upon acquisition of enabling genomic changes. Expansion of the dominant subclone to an appreciable mass may therefore represent the final rate-limiting step in a breast cancer's development, triggering diagnosis.

    Funded by: Department of Health; NCI NIH HHS: CA089393; Wellcome Trust: 088340, 093867, 098051

    Cell 2012;149;5;994-1007

  • Mutational processes molding the genomes of 21 breast cancers.

    Nik-Zainal S, Alexandrov LB, Wedge DC, Van Loo P, Greenman CD, Raine K, Jones D, Hinton J, Marshall J, Stebbings LA, Menzies A, Martin S, Leung K, Chen L, Leroy C, Ramakrishna M, Rance R, Lau KW, Mudie LJ, Varela I, McBride DJ, Bignell GR, Cooke SL, Shlien A, Gamble J, Whitmore I, Maddison M, Tarpey PS, Davies HR, Papaemmanuil E, Stephens PJ, McLaren S, Butler AP, Teague JW, Jönsson G, Garber JE, Silver D, Miron P, Fatima A, Boyault S, Langerød A, Tutt A, Martens JW, Aparicio SA, Borg Å, Salomon AV, Thomas G, Børresen-Dale AL, Richardson AL, Neuberger MS, Futreal PA, Campbell PJ, Stratton MR and Breast Cancer Working Group of the International Cancer Genome Consortium

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

    All cancers carry somatic mutations. The patterns of mutation in cancer genomes reflect the DNA damage and repair processes to which cancer cells and their precursors have been exposed. To explore these mechanisms further, we generated catalogs of somatic mutation from 21 breast cancers and applied mathematical methods to extract mutational signatures of the underlying processes. Multiple distinct single- and double-nucleotide substitution signatures were discernible. Cancers with BRCA1 or BRCA2 mutations exhibited a characteristic combination of substitution mutation signatures and a distinctive profile of deletions. Complex relationships between somatic mutation prevalence and transcription were detected. A remarkable phenomenon of localized hypermutation, termed "kataegis," was observed. Regions of kataegis differed between cancers but usually colocalized with somatic rearrangements. Base substitutions in these regions were almost exclusively of cytosine at TpC dinucleotides. The mechanisms underlying most of these mutational signatures are unknown. However, a role for the APOBEC family of cytidine deaminases is proposed.

    Funded by: Department of Health; Medical Research Council: MC_U105178806; NCI NIH HHS: CA089393; Wellcome Trust: 088340, 098051, WT088340MA

    Cell 2012;149;5;979-93

  • Genome sequencing and analysis of the Tasmanian devil and its transmissible cancer.

    Murchison EP, Schulz-Trieglaff OB, Ning Z, Alexandrov LB, Bauer MJ, Fu B, Hims M, Ding Z, Ivakhno S, Stewart C, Ng BL, Wong W, Aken B, White S, Alsop A, Becq J, Bignell GR, Cheetham RK, Cheng W, Connor TR, Cox AJ, Feng ZP, Gu Y, Grocock RJ, Harris SR, Khrebtukova I, Kingsbury Z, Kowarsky M, Kreiss A, Luo S, Marshall J, McBride DJ, Murray L, Pearse AM, Raine K, Rasolonjatovo I, Shaw R, Tedder P, Tregidgo C, Vilella AJ, Wedge DC, Woods GM, Gormley N, Humphray S, Schroth G, Smith G, Hall K, Searle SM, Carter NP, Papenfuss AT, Futreal PA, Campbell PJ, Yang F, Bentley DR, Evers DJ and Stratton MR

    Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, UK. elizabeth.murchison@sanger.ac.uk

    The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations.

    Funded by: Wellcome Trust: 077012/Z/05/Z, 088340, 095908

    Cell 2012;148;4;780-91

  • AutoCSA, an algorithm for high throughput DNA sequence variant detection in cancer genomes.

    Dicks E, Teague JW, Stephens P, Raine K, Yates A, Mattocks C, Tarpey P, Butler A, Menzies A, Richardson D, Jenkinson A, Davies H, Edkins S, Forbes S, Gray K, Greenman C, Shepherd R, Stratton MR, Futreal PA and Wooster R

    Cancer Genome Project, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    The undertaking of large-scale DNA sequencing screens for somatic variants in human cancers requires accurate and rapid processing of traces for variants. Due to their often aneuploid nature and admixed normal tissue, heterozygous variants found in primary cancers are often subtle and difficult to detect. To address these issues, we have developed a mutation detection algorithm, AutoCSA, specifically optimized for the high throughput screening of cancer samples.

    Availability: http://www.sanger.ac.uk/genetics/CGP/Software/AutoCSA.

    Funded by: Wellcome Trust

    Bioinformatics (Oxford, England) 2007;23;13;1689-91

* quick link - http://q.sanger.ac.uk/cgp