Catalogue Of Somatic Mutations In Cancer - (COSMIC)

8th Mar 2010 COSMIC v46 Release

The second full-genome resequencing study from the CGP at the Sanger Institute, UK is now available, together with the curation of Parsons et al (2008), a systematic candidate gene screen of Glioblastomas. In addition, the published literature has been fully curated for fusion mutations between seven new gene pairs.

Full Genome resequencing of NCI-H209

The recent Pleasance et al (2010) publication "A small-cell lung cancer genome with complex signatures of tobacco exposure" (Nature 463, 184-190) is now available within COSMIC; please click here.

Systematic Screen Curation

The largest published candidate gene screen of glioblastomas, Parsons et al (2008), is now curated in COSMIC; please click here:

An integrated genomic analysis of human glioblastoma multiforme. Parsons DW, Jones S, Zhang X, Lin JC, Leary RJ, Angenendt P, Mankoo P, Carter H, Siu IM, Gallia GL, Olivi A, McLendon R, Rasheed BA, Keir S, Nikolskaya T, Nikolsky Y, Busam DA, Tekleab H, Diaz LA, Hartigan J, Smith DR, Strausberg RL, Marie SK, Shinjo SM, Yan H, Riggins GJ, Bigner DD, Karchin R, Papadopoulos N, Parmigiani G, Vogelstein B, Velculescu VE, Kinzler KW Science. 2008;321;1807-12. PMID: 18772396 DOI: 10.1126/science.1164382

Statistics

Samples 105
Mutations 2449

Fusion mutations between 7 new gene pairs have been curated from the literature for this release.

FUS-ERG, FUS-FEV, FUS-ATF1 Both FUS-ERG and FUS-FEV fusions have been identified as alternatives to EWSR1-ETS transcription factor fusions in Ewing's sarcoma, and FUS-ERG also occurs in t(16;21) myeloid leukaemia as well as in these solid tumours. FUS-ATF1 is found in angiomatoid fibrous histiocytoma, where the fusion of the N-terminus of FUS and the DNA binding domain of ATF1 is similar to the EWSR1-ATF1 fusion found in clear cell sarcoma.

SS18-SSX1 This fusion is characteristic for synovial sarcoma along with SS18-SSX2 and more rarely, SS18-SSX4 fusions. Through its N-terminal SNH domain SS18 protein is involved in the remodelling of chromatin structures and functions as a transcriptional activator whereas SSX proteins have 2 putative transcription-repressor domains, one of which, an SSXRD domain in the C-terminal region, is preserved in the fusion protein.

SRGAP3-RAF1 This oncogenic fusion has been identified in paediatric pilocytic astrocytoma as an alternative to the previously described KIAA1549-BRAF fusion. It also activates the ERK/MAPK pathway, with the auto-inhibitory domain of RAF1 replaced by SRGAP3.

COL1A1-PDGFB This recurrent fusion characterizes dermatofibrosarcoma protuberans and its juvenile form, giant cell fibroblastoma. The fusion consistently deletes exon 1 of PDGFB, releasing this growth factor from its normal regulation. The breakpoints in COL1A1, which encodes an extracellular matrix protein, occur in various exons in the alpha-helical domain.

JAZF1-SUZ12 A fusion involving these two genes is common but not universal in endometrial stromal sarcomas, occurring less frequently in high-grade tumours. The genes encode novel proteins with zinc finger motifs and these are retained in the fusion.

The following curated genes have been updated in this release

ABL1, ACVR1B, AKT1, ALK, APC, ASXL1, ATM, BRAF, BRCA1, BRCA2, CBL, CDC73, CDH1, CDKN2A, CEBPA, CSF1R, CTNNA1, CTNNB1, CYLD, EGFR, EML4, ERBB2, ERG, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK2, JAK3, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, PDGFRA, PHOX2B, PIK3CA, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TNFAIP3, TSHR, VHL, WT1

COSMIC v46 Total Statistics


Experiments 2077858
Tumours 449676
Samples 451972
Mutant Samples 108773
Mutations 112256
Unique Mutations 19239
Papers curated 8911
Genes 18478
Fusions 4657
Structural Variants 2307



21st Jan 2010 COSMIC v45 Release

The first full-genome resequencing study is now available, together with the genome-wide rearrangement screens of 24 breast tumours. In addition, five new cancer genes have been curated from the literature.

To make the data easier to investigate in depth, the website has been upgraded with new specialisation features, together with new views on mutation spectrum and distribution. Finally, we are introducing a new COSMIC Biomart, where all COSMIC's information will be available in this industry-standard data mining tool.

Full Genome resequencing of COLO-829

The recent Pleasance et al (2010) publication "A comprehensive catalogue of somatic mutations from a human cancer genome" (Nature 463, 191-196) is now available within COSMIC; please click here.

Whole-genome rearrangement screen of 24 Breast tumour samples:

Also, the CGP Stephens et al (2009) paper "Complex landscapes of somatic rearrangement in human breast cancer genomes" (Nature 462, 1005-1010) is now available in COSMIC; please click here. A paired-end genome-wide Illumina sequencing strategy revealed numerous rearrangements in very diverse patterns between the samples examined.

New genes curated from the scientific literature

GNAQ is the alpha subunit of one of the heterotrimeric GTP-binding proteins that mediate stimulation of protein kinase C signalling. Mutations in GNAQ, occurring at codon 209 in the catalytic domain, have been found as common and early mutational events in uveal melanomas.

TNFAIP3 is a negative regulator of the NF-kappa B pathway functioning through the removal of activating Lys63-linked ubiquitins and the Lys48-linked ubiquitination of receptor-interacting proteins. TNFAIP3 has been shown to be a genetic target in B-lineage lymphomas such as mucosa-associated lymphoma and Hodgkin's lymphoma of nodular sclerosing histology.

CBL encodes a protein with multiadaptor function and E3 ubiquitin ligase activity that targets a variety of tyrosine kinases for degradation. Mutations in CBL have been identified in myeloid malignancies, occurring in the critical linker and ring finger domains of the protein.

JAK3 is a member of the non-receptor tyrosine kinase family which includes JAK2. Rare but significant JAK3 activating mutations located in the JH2 (pseudokinase) and JH6 (receptor binding) domains have been found in Down syndrome and Non-DS acute megakaryoblastic leukaemia (AML-M7). Mutations have also been found in various myeloproliferative neoplasms, lymphomas and carcinomas.

NOTCH2 is a Type 1 transmembrane protein with an extracellular domain consisting of multiple epidermal growth factor-like (EGF) repeats, and an intracellular domain consisting of multiple different domain types. The Notch2 receptor and its 5 ligands, which include Jagged1, Jagged2, and Delta-like 1, 3 and 4, send signals that are important for development before birth. After birth, Notch2 signaling is involved in tissue repair. Mutations in the NOTCH2 gene have been identified in a small percentage of people with Alagille syndrome and malformations in the kidneys, especially in filtering structures. NOTCH2 is also preferentially expressed in mature B cells, is essential for marginal zone B-cell generation, and mutations are evident in a subset of individuals with diffuse large B-cell lymphomas.

Web site enhancements

The main histogram page of the COSMIC website has been improved to provide better ways of selecting and viewing subsets of data. In the navigation bar on the left side, new options are now available to redraw the histogram and associated tables based on four parameters: mutation type (e.g. deletion, nonsense substitution), sample source (cultured or tissue sample), somatic status (confirmed somatic or unknown) and systematic screen (genome-wide screen). In addition to redrawing the histogram and tables, a new "Distribution" button displays pie charts of relevant information about the data selected.

The sample summary page has also been upgraded, with every CGP sample (examined through numerous genes) receiving a mutation spectrum diagram. This comprises a histogram showing the relative frequencies of each substitution type, together with a count of insertion/deletion mutations. This is highly useful when looking for mutation signatures which may show characteristics of, for instance, tobacco or UV light exposure.
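The relative substitution frequencies behind a mutation spectrum diagram can be sketched as follows. This is a minimal illustration, not the code used by the COSMIC website: the function names and the example data are hypothetical, and collapsing complementary changes (for example G>T into C>A) into six classes is an assumption about how a spectrum is typically summarised.

```python
from collections import Counter

# Complement table used to fold purine-reference changes onto the pyrimidine strand.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Assumed convention: six strand-collapsed substitution classes.
SUBSTITUTION_CLASSES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")


def substitution_class(ref: str, alt: str) -> str:
    """Return the strand-collapsed class for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in _COMPLEMENT or alt not in _COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Report the change as seen on the complementary strand.
        ref, alt = _COMPLEMENT[ref], _COMPLEMENT[alt]
    return f"{ref}>{alt}"


def mutation_spectrum(substitutions):
    """Map each substitution class to its relative frequency (0.0 if no data)."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in substitutions)
    total = sum(counts.values())
    return {
        cls: (counts[cls] / total if total else 0.0)
        for cls in SUBSTITUTION_CLASSES
    }


# Hypothetical (reference, mutant) base pairs for one sample.
example = [("G", "T"), ("C", "A"), ("C", "T"), ("G", "A"), ("T", "C")]
spectrum = mutation_spectrum(example)
for cls in SUBSTITUTION_CLASSES:
    print(cls, round(spectrum[cls], 2))
```

Plotting these six values as a histogram gives the kind of per-sample view described above; insertion/deletion mutations would be counted separately.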

Biomart

The new COSMIC Biomart is now available; please click here. This system allows much more specialised selection of data in COSMIC and is very useful for data mining. In addition, it can be directly linked to Ensembl for federated querying across both databases.

The following curated genes have been updated in this release

JAK2, JAK3, MAP2K4, GNAS, MPL, SOCS1, WT1, CYLD, FBXW7, MEN1, NF1, RUNX1, ASXL1, NOTCH2, IDH1, IDH2, APC, CDH1, VHL, GNAQ, BRAF, HRAS, CEBPA, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RB1, RET, SMARCB1, AKT1, EGFR, ERBB2, CDKN2A, CBL, GATA1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, MSH6, PRKAR1A, KRAS, PIK3CA, MET, TNFAIP3

COSMIC v45 Total Statistics

Experiments 1654274
Tumours 434364
Samples 436577
Mutant Samples 101860
Mutations 105171
Unique Mutations 16788
Papers curated 8624
Genes 13634
Fusions 3635
Structural Variants 2249



4th Nov 2009 COSMIC v44 Release

This release of COSMIC includes 4 new curated genes, 8 new curated fusion pairs and the TCGA systematic screen publication of 91 Glioblastoma tumour samples. In addition, a new CGP study is available (Adenoid cystic carcinoma) together with substantial updates to existing data.

New curated genes

IDH2 encodes a mitochondrial NADP(+)-dependent isocitrate dehydrogenase which catalyzes oxidative decarboxylation of isocitrate to alpha-ketoglutarate. It is now implicated in the pathogenesis of malignant gliomas and some secondary glioblastomas lacking IDH1 mutations have IDH2 mutations at the analogous amino acid (R172).

AKT1 encodes a serine-threonine protein kinase which is activated by phosphorylated phosphoinositides and is a central mediator of the PI3kinase signalling pathway. A common mutation (E17K) has been identified in the pleckstrin homology domain in cancers of the colon, breast, lung and ovary.

ASXL1 belongs to a family of proteins regulating chromatin remodelling. It was originally implicated via aCGH on MDS/AML samples. Mutations are mainly frameshifts; the predicted truncated proteins lack the PHD finger domain, potentially compromising the function of the associated chromatin modifiers.

FOXL2 (forkhead box L2) is a winged helix/forkhead transcription factor gene, encoding a nuclear protein that is specifically expressed in eyelids and in fetal and adult ovarian follicular cells. Germline mutations in FOXL2 are responsible for BPES - blepharophimosis ptosis epicanthus inversus syndrome - an autosomal dominant disorder consisting of eyelid abnormalities with ovarian failure (Type I) or eyelid abnormalities alone (Type II). Somatic mutations have recently been described in ovarian granulosa cell tumours.

New curated gene fusion pairs:

The following gene fusions have been curated from the scientific literature:
EML4 / ALK
MSN / ALK
NPM1 / ALK
CLTC / ALK
SEC31A / ALK
RANBP2 / ALK
SS18 / SSX2
SS18 / SSX4

Systematic screen curation:

The first systematic screen from the Cancer Genome Atlas Research Network (PMID 18772890) is now curated in COSMIC.

Comprehensive genomic characterization defines human glioblastoma genes and core pathways.
Cancer Genome Atlas Research Network
Nature. 2008;455;1061-8. PMID: 18772890 DOI: 10.1038/nature07385

Statistics


Glioblastoma Samples 91
Genes 599
Sequencing Experiments 54509
Mutations 662

New CGP resequencing study: Adenoid Cystic Carcinoma Candidate Gene Screen

Adenoid cystic carcinoma is a slow growing tumour of the secretory glands, arising most commonly in the salivary glands but also occurring in other parts of the body. As part of an ongoing research effort funded by the Adenoid Cystic Carcinoma Research Fund (www.accrf.org), 400 candidate genes (including genes implicated in cancer, cell signaling and growth control) were sequenced for small point mutations. This work was carried out on 25 samples (provided by ACCRF collaborative research group member Dr. Adel El-Naggar) utilising an approach of PCR product generation for the entire set of PCR amplimers followed by individual concatenation of all amplimers for each tumour and matching normal DNA sample, then sequencing this material utilising next generation sequencing. In total 8 somatic point mutations were identified in 8 genes. No highly prevalent point mutation was identified in this set of genes.

These curated genes have been updated this release

KRAS, PIK3CA, FGFR2, MET, ABL1, FGFR1, JAK2, MAP2K4, GNAS, EML4, FOXL2, PTCH1, MPL, SOCS1, HNF1A, WT1, NF2, CYLD, FBXW7, MEN1, NF1, RUNX1, IDH1, IDH2, ASXL1, FAM123B, APC, CDH1, SMAD4, VHL, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RB1, RET, SMARCB1, SUFU, ACVR1B, AKT1, ALK, ATM, EGFR, ERBB2, SRC, STK11, CDKN2A, GATA1, SMO, NOTCH1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, BRCA2, MLH1, MSH2, MSH6, PRKAR1A

COSMIC v44 Total Statistics

Experiments 1631186
Tumours 419018
Samples 421193
Mutant Samples 97932
Mutations 101138
Unique Mutations 16072
Papers curated 8336
Genes 13501
Fusions 3521
Structural Variants 40



26th Aug 2009 COSMIC v43 Release

The COSMIC curation systems have been extended to encompass the entry of large-scale systematic screen papers. For this release, we have entered the first such paper, the Sjoblom et al (2006) screen of human breast and colorectal cancers. This release also contains two new genes successfully curated from the scientific literature (IDH1, SMARCA4) and the finalisation of two of the Cancer Genome Project's current resequencing studies.

Systematic Screen Papers Curated in COSMIC

For this release of COSMIC we have entered the Sjoblom et al (2006) systematic screen paper of human breast and colorectal cancers. An additional 8,648 genes have been added to COSMIC along with the 1,672 mutations from the paper. The COSMIC reference overview page for this publication is available here.

The consensus coding sequences of human breast and colorectal cancers. Sjoblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D, Leary RJ, Ptak J, Silliman N, Szabo S, Buckhaults P, Farrell C, Meeh P, Markowitz SD, Willis J, Dawson D, Willson JK, Gazdar AF, Hartigan J, Wu L, Liu C, Parmigiani G, Park BH, Bachman KE, Papadopoulos N, Vogelstein B, Kinzler KW, Velculescu VE. Science. 2006 Oct 13;314(5797):268-74. Epub 2006 Sep 7. PMID: 16959974

CGP resequencing studies completed

The resequencing of candidate genes in Pilot and Renal tumour sets has now been completed. The finalised studies examined 2978 samples through 4766 genes, discovering a total of 5437 mutations. All of these can be found in COSMIC's CGP Resequencing Studies Site.

New curated genes

IDH1 is a catalytic enzyme causing NADP+ dependent oxidative decarboxylation of isocitric acid. It plays an important role in the control of glucose-stimulated insulin secretion and the cholesterol and fatty acid biosynthetic pathways. Originally implicated in human cancer in genome-wide sequencing scans, when mutated it is an indicator for the longer survival of these patients.

SMARCA4 is a scaffold protein, forming a functional part of the SWI/SNF complex involved in the control of transcription.

These curated genes have been updated this release

FBXW7, MEN1, NF1, BRAF, HRAS, CSF1R, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RET, SMARCB1, SUFU, ACVR1B, ATM, EGFR, ERBB2, SRC, CDKN2A, FAM123B, GATA1, SMO, NOTCH1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, BRCA2, APC, CDH1, SMAD4, VHL, TSHR, MLH1, MSH2, MSH6, SMARCA4, RUNX1, PHOX2B, GNAS, KRAS, PIK3CA, FGFR2, FGFR1, IDH1, JAK2, JAK3, MAP2K4, TET2, PRKAR1A, CDC73, PTCH1, MPL, CTNNA1, SOCS1, HNF1A, WT1, ERG, NF2

COSMIC v43 Total Statistics


Experiments 1506545
Tumours 366477
Samples 368592
Mutant Samples 85749
Mutations 88727
Unique Mutations 14971
Papers curated 7797
Genes 13423
Fusions 2770
Structural Variants 40



28th May 2009 COSMIC v42 Release

For this release of COSMIC two known cancer genes (GNAS and ALK) and 3 gene fusions (FCHSD1 / BRAF, KIAA1549 / BRAF, EWSR1 / NR4A3) have been successfully curated from the scientific literature. The Cancer Cell Line Project has also been updated with the addition of 80 mutations.


Cancer Cell Line Project Update

The Cancer Cell Line Project data has been updated with the addition of 80 mutations. The project has also published a further set of variants identified by the screen which have been classified as Tentatively Oncogenic Variant (TOV) or Unknown Variant (UV). These variants are currently available from our website as an Excel file.

Curation of known cancer genes ALK and GNAS

Two further cancer genes have been curated with the addition of 95 mutations for ALK and 235 mutations for GNAS.

Curation of gene fusions

The following gene fusions have been curated from the scientific literature:
FCHSD1 / BRAF
KIAA1549 / BRAF
EWSR1 / NR4A3

Genes updated: KRAS, PIK3CA, ABL1, FGFR1, JAK2, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, RB1, SUFU, ERBB2, SRC, STK11, CDKN2A, GATA1, SMO, NPM1, PTPN11, NRAS, BRCA2, MLH1, MSH2, MSH6, APC, CDH1, SMAD4, MET, EGFR, FLT3, PTCH1, MPL, WT1, CYLD, FBXW7, NF1, ALK, FGFR3, RET, NOTCH1, NF2, GNAS

COSMIC v42 Total Statistics


Experiments 1111579
Tumours 339481
Samples 341522
Mutant Samples 76132
Mutations 78933
Unique Mutations 12905
Papers curated 7386
Genes 4775
Fusions 2424
Structural Variants 40



4th Mar 2009 COSMIC v41 release

This release of COSMIC comprises an update of published data in which 44 genes have been updated with the addition of 22516 samples and a further 7387 mutations.


Gene Update

STK11, CDKN2A, GATA1, SMO, NPM1, PTPN11, NRAS, BRCA2, MSH2, KRAS, PIK3CA, JAK2, MAP2K4, BRAF, HRAS, CEBPA, CTNNB1, KIT, PDGFRA, PTEN, RB1, ATM, ERBB2, FBXW7, NF1, FAM123B, APC, CDH1, VHL, MET, EGFR, FLT3, PTCH1, MPL, SOCS1, HNF1A, WT1, CYLD, FGFR3, RET, RUNX1, TSHR, PHOX2B, NOTCH1.


COSMIC v41 Total Statistics

Experiments 1078748
Tumours 313780
Samples 315778
Mutant Samples 70086
Mutations 72718
Unique Mutations 12349
Papers curated 6876
Genes 4773
Fusions 2266
Structural Variants 40



26th Nov 2008 COSMIC release 40

This release of COSMIC comprises an update of the existing genes totalling almost 3000 new mutations.

Gene Update
2947 new mutations have been added in release 40; the following curated genes have been updated: BRAF, HRAS, CEBPA, CTNNB1, KIT, PDGFRA, PTEN, ERBB2, MLH1, MSH2, KRAS, PIK3CA, JAK2, CDKN2A, GATA1, NPM1, PTPN11, NRAS, FAM123B, APC, VHL, MET, EGFR, FLT3, PTCH1, MPL, HNF1A, FBXW7, PRKAR1A, RUNX1, FGFR3, RET, TSHR, NOTCH1

Cancer Gene Census

On the 5th November, the Cancer Gene Census was brought up to date, with the addition of three genes newly identified in the causation of cancer, IDH1, MDM2, KIAA1549.

COSMIC v40 Total Statistics

Experiments 1050480
Tumours 291551
Samples 293262
Mutant Samples 62930
Mutations 65331
Unique Mutations 11789
Papers curated 6486
Genes 4773
Fusions 2266
Structural Variants 40



15th Oct 2008 COSMIC release 39, Annotating Cancer Genomes

For this release of COSMIC the database and web interfaces have been upgraded to handle Next Generation Sequencing Data. This is part of ongoing work to allow COSMIC to handle the increased volumes and complexity of somatic data that is anticipated from Next Generation Sequencers. In particular, for this release we have concentrated on adapting COSMIC to handle large-scale structural variants (including translocations, large insertions/deletions, inversions, and duplications).

The structural variants from the Campbell et al. 2008 paper, which comprehensively characterizes 2 lung cancer cell lines, have been entered into COSMIC (click here for study overview). Sample Summary pages are available for both cancer cell lines (NCI-H2171 and NCI-H1770).

Circular plots (Circos plots developed by Martin Krzywinski) have been added to the sample overview page, giving a clear overview of all the structural variants along with copy number changes and COSMIC point mutations for a particular sample (Figure 1). More detailed views of complex rearrangements are available on the mutation details page.


Figure 1. Circos Plot showing structural variants in relation to copy number and COSMIC Point Mutations.

Tabular views and exports are also available for these data (Figure 2). Due to the complexity of these rearrangements, where possible, a short descriptive term for the variant is given (e.g. deletion, tandem duplication, translocation). The variant is also fully described using HGVS mutation nomenclature. For example, in chr11:g.36585230_76606619del, chr11: denotes the chromosome involved, g. indicates genomic coordinates, 36585230 is the deletion start point, 76606619 is the deletion end point and del indicates a deletion event.
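A description such as chr11:g.36585230_76606619del can be split into its parts with a short parser. The sketch below is illustrative only: the class and function names are hypothetical, the pattern covers just the simple "chromosome, start_end, event" form shown in the example (with del, dup and inv as assumed event suffixes), and it is not a full HGVS implementation. The span calculation assumes both coordinates are inclusive.

```python
import re
from typing import NamedTuple


class GenomicVariant(NamedTuple):
    chromosome: str
    start: int
    end: int
    event: str


# Hypothetical pattern: handles only "chrN:g.START_ENDevent" descriptions.
_PATTERN = re.compile(
    r"^chr(?P<chrom>[0-9]{1,2}|X|Y):g\."
    r"(?P<start>\d+)_(?P<end>\d+)"
    r"(?P<event>del|dup|inv)$"
)


def parse_variant(description: str) -> GenomicVariant:
    """Parse a simple genomic range description into its components."""
    match = _PATTERN.match(description)
    if match is None:
        raise ValueError(f"unsupported description: {description!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError("start coordinate is greater than end coordinate")
    return GenomicVariant(match["chrom"], start, end, match["event"])


variant = parse_variant("chr11:g.36585230_76606619del")
# Assumes both coordinates are inclusive, so the span is end - start + 1.
span = variant.end - variant.start + 1
print(variant, span)
```

Complex rearrangements that involve two chromosomes would need a richer pattern than this single-range form.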


Figure 2. Summary Structural Variants Table


Bioinformatics Primer on COSMIC published

An NCI/Nature Pathway Interaction Database primer on COSMIC has been published and is available from here.


Update of the Cancer Gene Census

The Cancer Gene Census was updated on 11th August 2008. The Census now contains information on 379 genes, of which 343 harbour somatic alterations and 70 harbour germline alterations.
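Since 343 and 70 sum to more than 379, some genes must appear in both categories. The sketch below works out that overlap by inclusion-exclusion, under the assumption (not stated in the release notes) that every Census gene has at least one of the two alteration types; the function name is hypothetical.

```python
def overlap_count(total: int, somatic: int, germline: int) -> int:
    """Genes in both categories, assuming every gene is in at least one."""
    overlap = somatic + germline - total
    if overlap < 0 or overlap > min(somatic, germline):
        raise ValueError("counts are inconsistent with the stated assumption")
    return overlap


both = overlap_count(total=379, somatic=343, germline=70)
somatic_only = 343 - both
germline_only = 70 - both
print(both, somatic_only, germline_only)
```

If some Census genes fell into neither category, the true overlap would be larger than this figure.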


COSMIC v39 General Statistics

Experiments 1035943
Tumours 281307
Samples 282777
Mutant Samples 60007
Mutations 62352
Unique Mutations 11642
Papers curated 6168
Genes 4773
Fusions 2266
Structural Variants 40



3rd Jul 2008 COSMIC release 38

For this release of COSMIC we have concentrated our efforts on significantly updating the following genes: BRAF, HRAS, CTNNB1, KIT, PDGFRA, PTEN, RB1, ERBB2, MAP2K4, CDKN2A, GATA1, SMO, NPM1, NRAS, MLH1, MSH2, KRAS, PIK3CA, JAK2, APC, SMAD4, EGFR, FLT3, PTCH1, MPL, HNF1A, FBXW7, NF1, FGFR3, RET, NF2, NOTCH1.

External links

In collaboration with the HUGO Gene Nomenclature Committee (HGNC) and the Atlas of Genetics and Cytogenetics in Oncology and Haematology (Atlas Genetics Oncology), links are now available from COSMIC's gene summary page to further information at these resources.



A Current Protocol for COSMIC

An article describing COSMIC, its contents and usage, has been published in Current Protocols in Human Genetics, unit 10.11. The article, which describes in detail how the website and exported datasheets may be used and interpreted, is available at the Wiley Interscience website.



COSMIC v38 total statistics


Experiments 1019304
Tumours 268938
Samples 270095
Mutant Samples 56918
Mutations 59187
Unique Mutations 11400
Papers curated 5902
Genes 4773
Fusions 2266



7th May 2008 COSMIC release 37

This month's release extends our complete curation of oncogenic EWSR1 fusion partners, together with two new curated genes, PHOX2B and PRKAR1A. CGP's resequencing studies and cell line projects are also significantly updated, each receiving over 100 new mutations. In total, over 1200 new mutations have been added to COSMIC this release.


Curated genes

PHOX2B This gene encodes a highly conserved homeobox transcription factor known to cause congenital central hypoventilation syndrome with associated neuroblastoma.

PRKAR1A This is a regulatory subunit of the cAMP dependent protein kinase holoenzyme. An apparent tumour suppressor gene, it has also been observed to be oncogenic in fusions with RET and RARA.


Gene     Unique Samples  Samples  Experiments  Mutants  Papers  Mutations  Unique Mutations
PHOX2B   410             410      411          6        4       6          5
PRKAR1A  232             232      233          7        5       7          7


Curated EWSR1 fusions

EWSR1/ETV4; EWSR1/FEV; EWSR1/PATZ1; EWSR1/PBX1; EWSR1/POU5F1; EWSR1/ZNF384

EWSR1 has been observed in oncogenic gene fusions with over 15 partners. This month we release our curation of the literature describing its fusion with a further six partners, bringing the total to 14.

Genes           Unique Breakpoints  Mutations  Unique Fusions  Papers  Mutant Samples
EWSR1 / ETV4    3                   6          4               2       3
EWSR1 / FEV     5                   10         4               4       5
EWSR1 / PATZ1   1                   2          2               1       1
EWSR1 / PBX1    1                   3          3               1       1
EWSR1 / POU5F1  5                   10         4               2       5
EWSR1 / ZNF384  2                   4          4               1       2

The following curated genes have received significant updates: BRAF, HRAS, KIT, PTEN, RB1, SMARCB1, ERBB2, STK11, CDKN2A, PTPN11, NRAS, BRCA2, MLH1, MSH2, KRAS, PIK3CA, JAK2, APC, VHL, MSH6, MET, EGFR, MPL, FBXW7, PRKAR1A, RET, RUNX1, NOTCH1, NF2, PHOX2B.



COSMIC v37 total statistics

Experiments 1006553
Tumours 258584
Samples 259684
Mutant Samples 53569
Mutations 55779
Unique Mutations 11207
Papers curated 5706
Genes 4773
Fusions 2249



5th Mar 2008 COSMIC release 36

The March 2008 release of COSMIC contains full curation of the TSHR gene together with a further 6 EWSR1 gene fusion pairs.

Curated genes

TSHR - Thyroid stimulating hormone receptor is a 7-TM cell surface receptor expressed in follicular thyroid cells. Upon binding of its ligand, thyrotropin, a signalling cascade is commenced resulting in a range of transcriptional alterations. Somatic mutations in this gene have been described in thyroid adenomas and carcinomas.

Gene  Samples  Experiments  Mutants  Papers  Mutations  Unique Mutations
TSHR  665      669          210      36      210        61

Curated Fusions

EWSR1/ATF1 ; EWSR1/CREB1 ; EWSR1/DDIT3 ; EWSR1/ETV1 ; EWSR1/SP3 ; EWSR1/WT1
EWSR1 is fused to multiple partner genes via recurrent chromosomal translocation, primarily in Ewing sarcoma. We are currently curating the complete mutation data for this gene, which has so far been fused with over 10 partners. Having released our curation of EWSR1 with ERG and FLI1, we now release the data for six more gene partners.


Genes          Mutant Samples  Mutations  Unique fusions  Papers
EWSR1 / ATF1   72              175        16              17
EWSR1 / CREB1  24              36         5               3
EWSR1 / DDIT3  11              22         7               6
EWSR1 / ETV1   4               7          3               3
EWSR1 / SP3    1               3          3               1
EWSR1 / WT1    102             198        22              28

The following curated genes have received significant updates:
BRAF, BRCA1, BRCA2, CDH1, CDKN2A, CEBPA, EGFR, ERBB2, FLT3, HRAS, KRAS, MLH1, MSH2, MSH6, NF2, NRAS, PDGFRA, PTEN, SMARCB1, STK11, TSHR, VHL


COSMIC v36 total statistics

Experiments 1000842
Tumours 254673
Samples 255767
Mutant Samples 52343
Mutations 54519
Unique Mutations 10995
Papers curated 5614
Genes 4772
Fusions 2174



16th Jan 2008 COSMIC release 35

This release of COSMIC contains the curation of four new tumour suppressor genes, and further curation of EWSR1/FLI1 gene fusions in Ewing's sarcoma. We also announce a significant upgrade to the CGP Trace Archive, which is now updated daily with our latest sequencing results.



Literature Curation



MLH1

MLH1 is a tumour suppressor gene, involved in mismatch repair. The encoded protein is a subunit of the large 'BRCA1-associated genome surveillance complex' (BASC) involved in DNA damage detection and repair. This particular subunit dimerises with PMS2 to provide endonuclease capacity within the complex. MLH1 germline mutations give rise to HNPCC (hereditary non-polyposis colorectal cancer). Somatic mutations in this gene are important in sporadic colorectal cancers. Mutations of MLH1 lead to a mutator phenotype often manifested by microsatellite instability.



MSH2

MSH2 is a tumour suppressor gene, also involved in mismatch repair. It resides within the 'BRCA1-associated genome surveillance complex' (BASC) which detects and repairs DNA damage. MSH2, in complex with MSH6, forms a sliding clamp which traverses the DNA backbone detecting mismatched bases. MSH2 germline mutations also give rise to HNPCC. Similar to MLH1, somatic mutations in MSH2 are found predominantly in colorectal cancers. Mutations of MSH2 lead to a mutator phenotype often manifested by microsatellite instability.



CDC73

CDC73 (HRPT2) is a tumour suppressor forming part of the PAF protein complex, which is associated with RNA polymerase II and may therefore be involved in both initiation of RNA synthesis and RNA elongation. Mutations in this gene have been identified in tumours of the parathyroid, most often causing the endocrine disorder hyperparathyroidism (with or without jaw tumour).



MAP2K4

MAP2K4 is one part of the mitogen-activated protein kinase (MAPK) pathway, a signal transduction cascade which mediates certain extracellular signals via RAS/RAF resulting in transcriptional control of a wide range of genes. The MAP2K family of peptides regulate MAPK activity by phosphorylation. MAP2K4 mutations appear involved in many tumour types.



Gene    Samples  Experiments  Mutations  Unique Mutations  Papers
MLH1    1328     1325         44         38                25
MSH2    1306     1304         36         33                23
CDC73   278      272          39         32                11
MAP2K4  1557     1559         22         19                9


EWSR1/FLI1 Gene fusions

Ewing's sarcoma is a rare bone tumour, infrequently of extraskeletal origin, most frequently occurring in teenage children. The majority of these tumours contain a t(11;22)(q24;q12) translocation which fuses the EWSR1 gene on chromosome 22 with the FLI1 gene on chromosome 11. We have now curated the existing literature describing fusions between this gene pair.



Genes       Mutant Samples  Papers  Unique Mutations
EWSR1/FLI1  1133            115     28


The following curated genes have been updated for this release: CDKN2A, PTPN11, NRAS, MLH1, MSH2, KRAS, JAK2, MAP2K4, BRAF, HRAS, CTNNB1, MEN1, NF1, APC, VHL, EGFR, FLT3, PTCH, MPL, WT1, RET, CDC73, RUNX1, EWSR1, FLI1.



Web site upgrade

Genomic co-ordinates for individual mutations are now available in the data export section, together with the datasheets in the FTP site.



CGP Trace Archive

The CGP trace archive has been updated to contain all the sequencing traces used in our analysis of the samples and genes presented in the CGP Resequencing project (COSMIC red pages). The number of traces available for download is now approaching 9.5 million. The Archive itself has also been upgraded, so that it receives daily updates of CGP sequencing traces as they pass through our sequencing pipeline. Daily updates are available as separate files; these will be integrated into the main download files once per week.

Samples with trace data  Total number of traces available
276                      9465645


COSMIC Statistics


Experiments 991743
Tumours 250869
Samples 251847
Mutant Samples 50949
Mutations 53098
Unique Mutations 10779
Papers curated 5449
Genes 4763
Fusions 1957




8th Nov 2007 COSMIC 34

This release of COSMIC includes the addition of BRCA1, BRCA2, and the EWSR1/ERG gene fusion from the scientific literature. The website has been enhanced with an update of old gene names and the addition of further links (NCBI Entrez Gene, CCDS, Swiss-Prot and TrEMBL). The CGP Trace and Genotype Archive, holding the group's sequence traces and genotype data, is also now available.

Literature Curation



BRCA1 and BRCA2

BRCA1 and BRCA2 are tumour suppressor genes initially identified as inherited cancer susceptibility genes for breast and ovarian cancer. Both proteins have been shown to have roles in genome surveillance, detection of DNA damage and its subsequent repair. However, they associate with different DNA repair complexes and generate different tumour histologies and spectra. Somatic mutations of either gene are rare, with BRCA2 being more frequently found to have somatic mutations, particularly in ovarian and pancreatic carcinomas.

We report that mutations in these two genes have been discovered at fairly low frequencies (2-3%), with BRCA2 mutated in a wider tissue range than BRCA1.

Gene   Unique Samples  Samples  Experiments  Mutant Samples  Papers  Mutations  Unique Mutations
BRCA1  1106            1106     1114         25              22      25         23
BRCA2  1142            1146     1145         29              16      33         29


EWSR1/ERG fusion

Fusions of EWSR1 and ERG are common events in skeletal (and the rarer extraskeletal) Ewing's Sarcoma. These fusions, found at a frequency of approximately 10% in bone tumours, result from complex rearrangements, since the two partner genes are not transcribed in the same chromosomal direction.

Genes      Mutant Samples  Papers  Unique Mutations
EWSR1/ERG  77              49      11


COSMIC Data Updates

The CGP Resequencing screens and the following curated genes have received updates: BRAF, HRAS, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, ACVR1B, ATM, ERBB2, BRCA1, BRCA2, KRAS, PIK3CA, FGFR2, ABL1, FGFR1, JAK2, SRC, STK11, CDKN2A, PTPN11, NRAS, FAM123B, APC, SMAD4, VHL, MSH6, MET, EGFR, FLT3, FBXW7, MEN1, NF1, RUNX1, FGFR3, RET.



CGP Trace and Genotype Archive

The group's sequence traces and genotype data are now available from the CGP Trace and Genotype Archive site. To access the data, a Data Transfer Agreement must be completed and approved; a unique username and password will then be provided for this resource.

Samples with trace data 276
Samples with genotyping data 1,135
Total number of traces 7,254,445


Gene Name Update

244 genes had their names updated (5.2%). It is still possible to search by the old gene name.



Website Upgrades

Several external gene links have been added to the gene summary page: NCBI Entrez Gene, CCDS, Swiss-Prot and TrEMBL.

The sample summary page now also contains sample source information.



General Statistics


Experiments 984673
Tumours 246369
Mutant Samples 50032
Mutations 52146
Unique Mutations 10533
Papers curated 5271
Genes 4762
Fusions 685



5th Sep 2007 COSMIC 33: Improved CGP data release

The WTSI Cancer Genome Project (CGP) announces an updated data release policy. We will now release confirmed somatic mutations every two months: confirmed and annotated somatic mutations identified in the previous two months will be released in COSMIC at two-monthly intervals. Data will still appear within the current COSMIC architecture of gene family/gene set and under appropriate studies. This new policy will result in expedited pre-publication release of curated somatic mutations as they are identified.

This new data will be available in the COSMIC blue pages, but will be most noticeable in COSMIC's CGP resequencing studies site (red pages), as this distinguishes CGP data from the literature curation.

CGP resequencing data is broadly divided (in the red pages) into 3 categories, 'Kinase', 'Pilot' and a new project, 'Renal'. Whilst the Kinase data is completed and published, the other two studies are much larger and still in progress. A collection of approximately 4000 genes has been selected for resequencing in a set of 40 matched pair cell lines ('Pilot' project) and 96 primary clear cell renal cancers. Each tumour sample in these projects has a matched normal sample, which allows the distinction of somatic mutations from germline variants. The pilot project currently comprises 1865 somatic sequence changes, whilst the Renal project, although less advanced than the Pilot, has identified 84 mutations to date. These will be automatically updated with all our confirmed data every bimonthly release.
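
The role of the matched normal sample can be illustrated with a minimal sketch. It assumes each variant is represented as a (chromosome, position, reference, alternate) tuple; this representation, the function name and the example coordinates are all hypothetical and are not taken from the CGP pipeline.

```python
# Minimal sketch: separating somatic from germline variants using a matched normal.
# The variant tuples below are made-up illustrative values, not CGP data.

def split_somatic_germline(tumour_variants, normal_variants):
    """Return (somatic, germline) sets for one matched tumour/normal pair.

    A variant seen in the tumour but not in the matched normal is treated as
    somatic; a variant seen in both is treated as germline.
    """
    tumour = set(tumour_variants)
    normal = set(normal_variants)
    somatic = tumour - normal   # present only in the tumour
    germline = tumour & normal  # shared with the matched normal
    return somatic, germline


tumour_calls = {
    ("7", 1000, "A", "T"),
    ("17", 2000, "C", "T"),
    ("1", 3000, "T", "C"),
}
normal_calls = {("17", 2000, "C", "T")}

somatic, germline = split_somatic_germline(tumour_calls, normal_calls)
print("somatic:", sorted(somatic))
print("germline:", sorted(germline))
```

Without a matched normal, the two sets cannot be separated in this way, which is why every tumour sample in these projects is paired with one.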



Literature curation


RUNX1 (AML1) has been fully curated

RUNX1 is one subunit of the PEBP2 transcription factor, binding to DNA at enhancer sequences. This gene is one of the most frequent targets of chromosome translocations associated with leukaemia. Small somatic mutations have also been observed, most frequently in myeloblastic leukaemia types (acute myeloblastic leukaemia, myelodysplastic syndrome), and it is these that we have curated in COSMIC. Our data suggest a somatic mutation rate of approximately 10% in this phenotype.


Curated Gene Update

The following curated genes have received updates from the literature: APC, ATM, BRAF, CDH1, CDKN2A, CTNNA1, CTNNB1, CYLD, EGFR, ERBB2, ERG, ETV1, FBXW7, FGFR3, FLT3, GATA1, HRAS, JAK2, KIT, KRAS, MADH4, MPL, MSH6, NF1, NF2, NOTCH1, NPM1, NRAS, PIK3CA, PTCH, PTEN, PTPN11, RB1, RET, SMARCB1, SMO, SOCS1, STK11, SUFU, TMPRSS2, VHL, WT1, WTX.


General Statistics

This release includes 1563 new mutations identified in the set of 4799 genes; 1495 genes are new this month.

Experiments 968416
Tumours 239766
Mutant Samples 48959
Mutations 51054
Unique Mutations 10390
Papers curated 5103
Genes 4799
Fusions 445
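
The "new this month" figures quoted for this release can be cross-checked by subtracting the COSMIC v32 general statistics from those above. The sketch below is only an illustration; the dictionary layout and function name are assumptions, and the numbers are copied from the two statistics lists on this page.

```python
# Sketch: difference between two releases' general statistics.
# Values are copied from the v32 and v33 statistics listed on this page.

v32 = {"Experiments": 521624, "Tumours": 235207, "Mutant Samples": 47470,
       "Mutations": 49491, "Unique Mutations": 9699, "Papers curated": 5053,
       "Genes": 3304, "Fusions": 445}
v33 = {"Experiments": 968416, "Tumours": 239766, "Mutant Samples": 48959,
       "Mutations": 51054, "Unique Mutations": 10390, "Papers curated": 5103,
       "Genes": 4799, "Fusions": 445}


def release_delta(previous, current):
    """Return current minus previous for every statistic present in both."""
    return {key: current[key] - previous[key] for key in current if key in previous}


delta = release_delta(v32, v33)
print(delta["Mutations"], delta["Genes"])
```

The two printed values correspond to the 1563 new mutations and 1495 new genes stated for this release.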



8th Aug 2007 COSMIC v32

This release includes four new tumour suppressor genes and improved availability in Ensembl.

New external integration: Ensembl

We are continually striving to improve the utility of the data in COSMIC by integrating it closely with external resources. In this release, we provide a much closer integration with the Ensembl genome browser than previously. All our gene & mutation data now have location coordinates on the NCBI36 genome sequence, allowing us to use Ensembl "DAS" technology to display this information within their genome browser, aligned with their standard genome annotations. We have made this easily available, via a single link from our pages.



[Image: BRAF front page]



Literature curation

Four new tumour suppressor genes have been introduced to COSMIC this month, all receiving full literature curation of their somatic mutation data.

NF1

Neurofibromatosis is a familial disease with a complex phenotype including tumours of the central nervous system, caused by mutations in the NF1 tumour suppressor gene. Somatic mutations in tumours have also been identified in this gene, and it is these that we have fully curated.

NF2

The central form of neurofibromatosis is a similar familial central nervous system tumour syndrome, caused by mutations in the NF2 tumour suppressor gene. Somatic mutations in tumours have also been identified in this gene, and it is these that we have fully curated.

SOCS1

SOCS1 downregulates cellular cytokine signalling by its direct interaction with JAK1. It was first implicated in cancer after aberrant methylation was observed to inactivate it, causing hepatocellular carcinoma. Somatic mutations which inactivate this tumour suppressor have also been observed, and these have been curated.

TCF1

TCF1 binds to the promoters of several (largely liver-specific) genes, to enhance their expression. Somatic and germline mutations in this gene have been found which cause liver adenomas, and we have curated the somatic component.



The following curated genes have received updates from the scientific literature: KRAS, PIK3CA, JAK2, BRAF, HRAS, KIT, PDGFRA, PTEN, CDKN2A, VHL, EGFR, FBXW7, MEN1, RET



General Statistics for this release


Experiments 521624
Tumours 235207
Mutant Samples 47470
Mutations 49491
Unique Mutations 9699
Papers curated 5053
Genes 3304
Fusions 445



27th Jul 2007 COSMIC (v31) now includes Gene Fusion Data

The CGP COSMIC team is pleased to announce the addition of gene fusion/translocation somatic mutation data from the literature to the database. Currently, the census of known cancer genes is dominated by somatically generated fusion genes that have been identified primarily in leukaemias, lymphomas and soft tissue tumours. Until now, we have concentrated on curating somatically point mutated cancer genes for COSMIC. Almost all known cancer genes that have somatic point mutations are, however, now curated in COSMIC. In the coming months we will therefore be searching the scientific literature and annotating genes involved in gene fusions and their partners for addition into the COSMIC database.

We have launched this new facility, complete with new views for this data type, with the curation of TMPRSS2, a gene frequently found to be fused to ETS family transcription factors in adenocarcinoma of the prostate. These mutations have served to spur increased investigation into the potential role of fusion genes in adult solid tumours. The move to curate fusion genes is an important addition and will further enhance COSMIC as the most comprehensive source for somatic mutation data from human cancers.



Fusion Gene Pairs



TMPRSS2/ETV1

TMPRSS2/ERG



Website Upgrades

The fusion data has been integrated into existing pages and overviewed in new pages: Translocations Overview and Translocations Summary.

This new data can be viewed graphically and textually.

[Image: FusionImage1]

The image above shows the table of inferred breakpoints (determined from a sample's observed fusion mRNA spectrum) for a fusion gene pair.

[Image: FusionImage2]

The image above shows a graphical representation of the observed mRNA transcripts from which the inferred breakpoints are calculated.

Further information on the new gene fusion website features is available in the help pages.



Genes from Literature Curation

A new homepage has been created for genes which have received full curation of the scientific literature. It distinguishes these genes from those in CGP's data release, for which no literature has been curated.

Curated Gene Update

The following curated genes have also received updates from the scientific literature: CDKN2A, GATA1, NOTCH1, NPM1, NRAS, JAK2, KRAS, PIK3CA, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, RB1, MET, EGFR, FLT3, WT1, APC, MADH4, FBXW7, FGFR3.



General Statistics for this release


Experiments 515535
Tumours 230057
Mutant Samples 46978
Mutations 48911
Unique Mutations 9014
Papers curated 4938
Genes 3302
Fusions 438


6th Jun 2007 COSMIC v30

Today we release full literature curations of five tumour suppressor genes: MEN1, ATM, CYLD, FBXW7 and WTX. In total, 4712 samples were examined in 112 papers, recording 468 mutations. Additionally, we release two new CGP resequencing studies which add a further 91 new genes to COSMIC.



Literature Curation

Curation of the scientific literature has been completed for five new genes from the cancer census. All five genes are tumour suppressors, causing phenotypes via their inactivation:



MEN1 (Multiple Endocrine Neoplasia Type 1)

Somatic mutations in this gene have been found in tumours from several endocrine sites, including the pituitary, pancreas and parathyroid, recapitulating those seen in patients carrying germline mutations. MEN1 encodes a nuclear protein thought to be a transcriptional regulator.



CYLD (Cylindromatosis)

This gene has been found to have mutations in sporadic cylindromas, tumours arising from skin adnexal structures (such as hair follicles and glands), principally on the face and scalp. CYLD encodes a deubiquitinating enzyme regulating cell signalling including the NF-kappaB pathway.



FBXW7 (CDC4)

Mutations inactivating FBXW7 have been found in a range of cancer types including colorectal, ovarian and T-ALL. The protein is involved in targeting a number of key proteins, including NOTCH1 and MYC, for ubiquitin-mediated degradation.



ATM

This gene encodes a protein kinase involved in cell cycle checkpoint control. Amongst other key cell cycle components, it has been shown to phosphorylate TP53 and CHEK2 in response to DNA damage. Germline mutations cause ataxia-telangiectasia (AT), a recessive disorder characterized by cerebellar ataxia, telangiectases, immune defects, and a predisposition to malignancy, primarily lymphoid in origin.



WTX (FAM123B)

Recently discovered, WTX is inactivated in approximately 30% of Wilms Tumours. Located on the X chromosome, this tumour suppressor only requires a 'single-hit' for tumourigenic inactivation.



New tumour suppressor gene statistics:


Gene Samples Experiments Mutations Papers
MEN1 1680 1683 196 66
ATM 1714 1692 198 33
FBXW7 1207 1204 60 10
WTX 82 82 7 1
CYLD 29 29 7 2
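
As a quick consistency check, the per-gene rows above can be summed and compared with the totals quoted in the release summary (4712 samples, 112 papers, 468 mutations). The sketch below is illustrative only; the data structure is an assumption, and the values are copied from the table.

```python
# Sketch: column totals for the new tumour suppressor gene statistics.
# Each entry is (samples, experiments, mutations, papers), copied from the table above.

gene_stats = {
    "MEN1":  (1680, 1683, 196, 66),
    "ATM":   (1714, 1692, 198, 33),
    "FBXW7": (1207, 1204, 60, 10),
    "WTX":   (82, 82, 7, 1),
    "CYLD":  (29, 29, 7, 2),
}

columns = ("samples", "experiments", "mutations", "papers")
# zip(*values) transposes the rows so that each column can be summed in turn.
totals = {name: sum(values) for name, values in zip(columns, zip(*gene_stats.values()))}
print(totals)
```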


The following curated genes have also received updates: BRAF, HRAS, CEBPA, CTNNB1, KIT, PDGFRA, PTEN, SMARCB1, ERBB2, JAK2, CDKN2A, PTPN11, NRAS, KRAS, PIK3CA, APC, CDH1, MADH4, EGFR, FLT3, MPL, WT1, FGFR3.



CGP resequencing studies

91 new genes have been examined in our pilot set of matched pair cell lines, resulting in the discovery of 22 new mutations:

Study                                                    Genes  Experiments  Samples  Mutations
Integrin alpha family                                    16     640          40       11
Miscellaneous genes of interest from literature sources  75     3000         40       11
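
The experiment counts in this table are consistent with one experiment per gene per sample. That interpretation is an assumption (the definition of an "experiment" is not spelled out here), but the arithmetic can be sketched as follows, using values copied from the table.

```python
# Sketch: experiments as genes x samples, assuming one experiment per gene per sample.
# Study names and counts are copied from the table above.

PILOT_SAMPLES = 40  # matched pair cell lines in the pilot set

studies = {
    "Integrin alpha family": {"genes": 16, "experiments": 640, "mutations": 11},
    "Miscellaneous genes of interest from literature sources":
        {"genes": 75, "experiments": 3000, "mutations": 11},
}

for name, study in studies.items():
    expected = study["genes"] * PILOT_SAMPLES
    print(name, expected, expected == study["experiments"])

total_genes = sum(study["genes"] for study in studies.values())
total_mutations = sum(study["mutations"] for study in studies.values())
print(total_genes, total_mutations)
```

The totals match the 91 new genes and 22 new mutations reported for these two studies.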


General Statistics for this release


Experiments 499958
Tumours 217944
Mutant Samples 44491
Mutations 46364
Unique Mutations 8855
Papers curated 4794
Genes 3302


9th May 2007 COSMIC v29 released

COSMIC release 29 includes 22 new CGP resequencing studies, comprising 567 new genes within which 192 new mutations have been identified. Additional updates to our curation of the scientific literature have also been included, adding a total of 1041 mutations to this release.



CGP Resequencing Studies



567 genes have been examined in our pilot set of matched pair cell lines:



Study Genes Mutations
PAX transcription factor family 11 5
Tripartite motif-containing protein family 56 28
Genes on APC/CTNNB1 pathway 59 25
FK506/rapamycin binding protein family 26 5
Diacylglycerol kinases and other lipid kinases 18 17
SMAD protein family 24 7
Histone acetyltransferase 7 2
Dual specificity phosphatases 23 2
Genes associated with ERB family of RTKs 8 3
Genes associated with MYC proteins 21 12
Ubiquitin specific peptidase family 50 16
C-X-C/C-C motif chemokine receptor genes 19 8
Essential For Cell Division - derived from a siRNA screen in human cells 21 5
Genes from RNAi TSG gene screen 2 5
Glycolysis associated genes 23 4
Integrin beta family 8 8
Small ubiquitin-like modifier (SUMO) protein family 14 2
14_3_3 family of scaffold protein 8 1
STAT and SOCS gene families 43 7
Serpin/TIMP peptidase inhibitor families 46 17
Sorting NeXin family 27 3
Genes associated with RAS proteins 53 13




Literature curation



89 new publications have been curated, updating the information for the following genes: JAK2, HRAS, CEBPA, PTEN, RB1, RET, ERBB2, CDKN2A, GATA1, NRAS, KRAS, PIK3CA, EGFR, CTNNA1, APC, CDH1, MADH4.



General Statistics for this release


Experiments 482902
Tumours 206972
Mutant Samples 42266
Mutations 44062
Unique Mutations 8420
Papers curated 4515
Genes 3220



4th Apr 2007 COSMIC v28 Released

This month's COSMIC release comprises a substantial increase in the CGP resequencing data, adding 1033 new genes to the system, together with updates to the scientific literature curation.

CGP Resequencing Studies


26 new studies have been included in this release, containing 1033 new genes which have been examined through the pilot matched pair cell line set.

Study Genes Mutations
ADAM metallopeptidase family 40 27
Cyclins and Genes associated with RB 68 20
NF-kappaB signalling family 58 14
Phospholipase C Family 13 16
Protein Kinase anchor proteins 32 15
Ral Guanine nucleotide dissociation factors 6 3
Hypoxia inducible factor pathway 23 11
Ser/Thr Phosphatases (PPP) 69 17
Integrin Binding proteins 27 1
K homology RNA-binding domain, type I 25 9
Cytochrome C oxidase family 24 3
DNA methylation and histone deacetylation 38 10
Heat shock proteins 81 20
Ets transcription factor family 28 5
High Mobility Group proteins 24 2
Immediate early/regulator of G-protein signalling family 25 5
Kallikrein protease family 16 5
Matrix metallopeptidase family 21 7
Genes implicated in stem cell regulation 63 12
TCA cycle genes 56 16
Forkhead transcription factor family 43 11
TP53 responsive genes 76 22
Ubiquitination pathway genes 63 21
Ubiquitin Ligases 72 36
DEAD Box proteins 60 25
Genes associated with TP53 and targets 47 16


Curated Gene Update


The following fully curated genes also received minor updates: APC, BRAF, CDKN2A, CTNNB1, EGFR, ERBB2, HRAS, KIT, KRAS, NOTCH1, NPM1, NRAS, PDGFRA, PTEN, PTPN11, RB1, WT1.

General Statistics for this Release

Experiments 455765
Tumours 204457
Mutant Samples 41259
Mutations 43021
Unique Mutations 8122
Papers curated 4426
Genes 2671


14th Mar 2007 COSMIC v27 released

This month's release of COSMIC comprises upgrades to both the web site (which now allows searching by gene/sample name or keyword) and data, with new CGP resequencing studies and curated genes. COSMIC now contains data on over 200,000 tumour samples and 400,000 individual experiments. Of these 202109 tumours, 40331 were found to contain one or more mutations (19.9%).



CGP Resequencing Studies

Two new studies examine our pilot data set comprising 40 cancer cell lines that have a matched normal cell line, allowing all of the mutations to be confirmed as somatic.



Notch signalling proteins
This group comprises the Notch receptors and other proteins which are involved in Notch signalling. The Notch signalling pathway allows cells to communicate with each other and plays a crucial role in developmental regulation. NOTCH1 mutations have been associated with T-cell acute lymphoblastic leukaemia.



Phosphatidylinositol metabolism
This gene set includes proteins which control the synthesis and turnover of phosphatidylinositol, which is synthesised in the endoplasmic reticulum before translocating to cytosolic membrane surfaces, where it plays an important role in many cellular processes including cell signalling. Mutations in the phosphatidylinositol-3-kinase PIK3CA and the lipid phosphatase PTEN are associated with many types of cancer.



Literature Curation



STK11 (LKB1)
STK11 is a tumour suppressor, physically associating with p53 to effect growth suppression via p53-dependent apoptosis pathways; restoring gene activity into cancer cell lines defective for its expression results in a G1 cell cycle arrest. It has been identified as the cause of Peutz-Jeghers syndrome, an autosomal dominant disorder inducing an increased risk of melanocytic macules, gastrointestinal polyps and various neoplasms.



STK11 Statistics
Samples 2344
Mutations 92
Unique Sequence Changes 63




WT1
Wilms tumour is a solid cancer usually occurring in childhood, caused by malignant transformation of renal stem cells retaining embryonic differentiation potential. Several tumour suppressor genes have been associated with the development of WT, most classically the WT1 zinc finger DNA binding protein located at chromosome 11p13. A number of isoforms of the transcription factor WT1 exist, unusually exerting control over expression of target genes during both their transcription and splicing.



WT1 Statistics
Samples 1710
Mutations 106
Unique Sequence Changes 68




Website Upgrades



Search Facility
A major update to the COSMIC website this month is the Exalead search facility, allowing for easier navigation of the site. In the 'Text Search' field on the home page, you can search for a gene name or accession number, a sample name or id, or a tumour primary site or sub-site. There is a help page for more advanced searches, which can be accessed by clicking on the question mark in the search box, or the help button in the sidebar.



General Statistics for this release


Experiments 408164
Tumours 202109
Mutant Samples 40331
Mutations 42057
Unique Mutations 7736
Papers curated 4348
Genes 1638



14th Feb 2007 COSMIC third anniversary release (v26)

This release comprises a significant increase in the number of CGP resequencing studies. The five new studies all examine our pilot sample set comprising 40 cancer cell lines that all have a matched normal cell line, allowing all of the mutations to be confirmed as somatic.

Nuclear receptors and cofactors

A related but diverse array of transcription factors interacting with a wide range of coregulatory proteins to form a complex network of multicomponent assemblies serving as coactivators or corepressors of transcription.

SCF and APC cell cycle control complex components

Skp1-cullin-F-box-protein complex (SCF) and the anaphase-promoting complex/cyclosome (APC) are ubiquitination complexes regulating progression through the cell cycle.

Nucleocytoplasmic transport components

Factors involved in both import and export of proteins from the nucleus, including nuclear pore components.

Human homologues of putative target "cancer" genes from transposon screens in the mouse

Human orthologues of genes targeted by insertions in transposon insertion screens for cancer genes in the mouse.

Protein Tyrosine Phosphatases (PTP)

Critical regulators of signal transduction, effecting the reversible phosphorylation of tyrosine residues in cell signalling proteins.

Curated Gene Update

The following fully curated genes also received minor updates: BRAF, CDKN2A, EGFR, ERBB2, FLT3, KIT, KRAS, PDGFRA, PTEN, PTPN11.

General Statistics for this release

Experiments 394675
Tumours 194928
Mutant Samples 39520
Mutations 41228
Unique Mutations 7505
Papers curated 4180
Genes 1516


10th Jan 2007 COSMIC v25 released

This month's COSMIC release comprises significant updates to CGP resequencing studies and curation of the scientific literature.

Update to CGP Resequencing Studies

The six non-kinase CGP resequencing studies have received substantial updates to the number of genes included and the number of mutations found (the kinase studies were updated in November 2006). Fifty-two new genes have been added to the DNA repair study, together with three in the Apoptosis study and two in the GAP-GEF study. The number of mutations discovered in each of the six studies has increased as shown below:

Study Mutation Count (v24) Mutation Count (v25)
Inositol Polyphosphate Phosphatases 8 12
Heterotrimeric G-Proteins 5 6
DNA repair genes 114 194
Apoptosis genes 38 82
Small monomeric GTPases 8 28
GAP-GEF genes 48 91


Literature Curation

In addition to the CGP resequencing studies, significant updates have been made to those genes which have received complete scientific literature curation. Three genes have been extensively updated: BRAF (19.1%, increased to 19224 samples), JAK2 (25.1%, increased to 11190 samples) and NOTCH1 (75.4%, increased to 488 samples). Eighteen other genes have received minor updates (less than 10% increase in sample number): ABL1, APC, CDKN2A, CEBPA, CTNNB1, EGFR, ERBB2, FLT3, KRAS, MET, NRAS, PDGFRA, PIK3CA, PTEN, PTPN11, RET, SRC, VHL.

General Statistics for this release

Experiments 387254
Tumours 193513
Mutant Samples 39003
Mutations 40672
Unique Mutations 7272
Papers curated 4082
Genes 1398


14th Dec 2006 COSMIC v24 released

This month's release of Cosmic includes the curation of NPM1 and CDH1.

Curation details for NPM1

NPM1 (Nucleophosmin) is a nucleocytoplasmic shuttling protein and critical regulator of TP53. Frequent mutations have been found in both childhood and adult AML. 20 papers have been manually curated for this gene, resulting in the addition of 45 unique mutations (exon 12).


NPM1 statistics

Samples 3870
Experiments 3875
Mutants 1171
Papers 20
Unique Mutations 45

Curation details for CDH1

CDH1 (E-cadherin) is a calcium ion-dependent cell adhesion molecule; loss of function of this gene is implicated in cancer invasion and metastasis. In particular, somatic mutations of this gene have been reported in gastric and lobular breast cancer. 181 mutations have been added to Cosmic for this gene from the curation of 46 papers.


CDH1 statistics

Samples 1958
Experiments 1970
Mutants 205
Papers 46
Unique Mutations 181

General Statistics for this release

Experiments 380741
Tumours 190358
Mutant Samples 38206
Mutations 39839
Unique Mutations 7032
Papers curated 4023
Genes 1343


30th Nov 2006 COSMIC v23 released

This month's release of Cosmic includes a major update to the protein kinase screens.

Protein Kinase Somatic Data Information

The Cancer Genome Project is pleased to release the full set of protein kinase somatic mutation data resulting from the screening of over 200 human cancers through the full set of 518 annotated genes. Over 1000 mutations have been identified in a combined total of 247 megabases sequenced. This dataset is intended to serve as a catalyst for further biological investigation of mutated kinases and pathways, hopefully leading to new insights and therapeutic opportunities in human cancer.


http://www.sanger.ac.uk/genetics/CGP/Studies/Kinases/

Copy number data update

Oligo array CGH data (using the Affymetrix 10K SNP array) for a further 233 cancer cell lines and 70 primary tumours has been made available increasing the total available from 834 to 1136 samples.



General Statistics for this release

Experiments 374169
Tumours 184092
Mutant Samples 36252
Mutations 37857
Unique Mutations 6758
Papers curated 3945
Genes 1342


11th Oct 2006 COSMIC v22 released

This month's release of Cosmic includes the curation from the scientific literature of the APC tumour suppressor gene. In addition, information on the similarity between cell lines is now recorded and displayed in Cosmic.

Curation of APC literature


Mutations in the APC gene are one of the initiating events in colorectal tumorigenesis, both familial and sporadic. Our curation confirms that the majority of mutations occur in the central portion of the gene (the mutation cluster region, 'MCR') where mutations are associated with the most severe phenotype of huge numbers of polyps at a young age, often with extracolonic manifestations. Mutations outside the MCR cause a much milder and late onset phenotype, generating few polyps. We curated 206 papers for APC, finding 1420 mutated samples out of 7115 (almost 20%). As expected, there were no major hotspots; the most frequent mutation was p.R1450*, found in 87 samples (6%).
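
The two percentages quoted above follow directly from the curated counts. As a small worked example (the variable names are illustrative only):

```python
# Sketch: the APC proportions quoted above, computed from the curated counts.

samples_screened = 7115    # samples examined across the 206 curated papers
mutated_samples = 1420     # samples carrying an APC mutation
r1450_stop_samples = 87    # samples carrying p.R1450*

mutated_pct = 100 * mutated_samples / samples_screened
hotspot_pct = 100 * r1450_stop_samples / mutated_samples

print(f"Mutated samples: {mutated_pct:.1f}% of samples screened")
print(f"p.R1450*: {hotspot_pct:.1f}% of mutated samples")
```

Note that the 6% figure is a proportion of the mutated samples, not of all samples screened.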

Web site


On the web site, we have begun to include information in Cosmic on samples found to be significantly similar by their genotype as assessed using Affymetrix SNP arrays. To date, 241 cell lines have been found to have genotypes which are greater than 80% identical with at least one other. This data is now displayed in the Cosmic Sample page under the heading of "Other Cancer Samples from the Same Individual"; here is an example: http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=sample&id=687448
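
The exact comparison method used for the SNP array genotypes is not described here. As a hedged illustration only, the sketch below computes the fraction of identical genotype calls between two samples over the SNPs called in both, and applies the "greater than 80%" threshold mentioned above. The function names, SNP identifiers and genotype encoding are all hypothetical.

```python
# Sketch: pairwise genotype identity between two samples.
# Genotypes are hypothetical calls keyed by made-up SNP identifiers;
# None marks a SNP with no call in that sample.

SIMILARITY_THRESHOLD = 0.80  # "greater than 80% identical"


def genotype_identity(sample_a, sample_b):
    """Fraction of SNPs called in both samples that have the same genotype."""
    shared = [snp for snp in sample_a
              if sample_a[snp] is not None and sample_b.get(snp) is not None]
    if not shared:
        return 0.0
    matches = sum(sample_a[snp] == sample_b[snp] for snp in shared)
    return matches / len(shared)


def same_individual_candidate(sample_a, sample_b):
    """True if the two samples exceed the similarity threshold."""
    return genotype_identity(sample_a, sample_b) > SIMILARITY_THRESHOLD


line_a = {"snp1": "AA", "snp2": "AB", "snp3": "BB", "snp4": "AB", "snp5": "AA"}
line_b = {"snp1": "AA", "snp2": "AB", "snp3": "BB", "snp4": "AB", "snp5": "AB"}
line_c = {"snp1": "AA", "snp2": "AB", "snp3": "BB", "snp4": "AB", "snp5": None}

print(genotype_identity(line_a, line_b), same_individual_candidate(line_a, line_b))
print(genotype_identity(line_a, line_c), same_individual_candidate(line_a, line_c))
```

In this toy example, `line_b` matches `line_a` at four of five SNPs (exactly 80%, so it is not flagged), whereas `line_c` matches at every SNP called in both samples and is flagged.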

General Statistics for this release

Experiments 290332
Tumours 173929
Mutant Samples 32891
Mutations 34390
Papers curated 3781
Genes 1342


14th Sep 2006 COSMIC v21 released

This month's release of Cosmic includes major updates to the Cancer Cell Line Project and microsatellite instability status data sets. In addition, published somatic mutation data from two additional genes, MPL and FGFR1, have been added to Cosmic.

Cancer Cell Line Project Major Update


The Cancer Cell Line Project aims to systematically screen a large panel of cancer cell lines for mutations in known cancer genes, thus empowering these cell lines as biological reagents for further work in anti-cancer agent development and further work on cancer molecular and cellular biology.

For this release of Cosmic, a further 137 cell lines have been added to the working set and 78 duplicate cell lines have been removed. This brings the total number of samples to 787. A further 98 mutations have also been added (See: http://www.sanger.ac.uk/genetics/CGP/CellLines/).

Statistics for the CGP Cancer Cell Line Project

Experiments 12887
Samples 787
Mutant samples 1087
Mutations 1144
Unique Mutations 3519
Genes 21

MSI Data Update


The microsatellite instability (MSI) status for CGP samples under study has been updated, bringing the total number of samples with MSI status to 1,530. (See: http://www.sanger.ac.uk/genetics/CGP/MSI/msi_page.shtml).

Curation of MPL and FGFR1


Somatic mutations reported in the published scientific literature for the FGFR1 and MPL genes have now been added to Cosmic.
MPL: http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=gene&ln=MPL
FGFR1: http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=gene&ln=FGFR1
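
The gene and sample links quoted on this page share one query-string pattern (`action=gene&ln=<gene>` and `action=sample&id=<id>`). A minimal sketch of building such links is shown below; the helper names are hypothetical, and the pattern is inferred only from the example URLs on this page, which may no longer resolve.

```python
# Sketch: building COSMIC gene and sample page links from the URL pattern
# seen in the examples on this page. Helper names are illustrative only.
from urllib.parse import urlencode

COSMIC_CGI = "http://www.sanger.ac.uk/perl/genetics/CGP/cosmic"


def gene_url(gene_symbol):
    """Link to a gene page, e.g. gene_url("MPL")."""
    return f"{COSMIC_CGI}?{urlencode({'action': 'gene', 'ln': gene_symbol})}"


def sample_url(sample_id):
    """Link to a sample page, e.g. sample_url(687448)."""
    return f"{COSMIC_CGI}?{urlencode({'action': 'sample', 'id': sample_id})}"


print(gene_url("MPL"))
print(gene_url("FGFR1"))
print(sample_url(687448))
```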

General Statistics for this release

Experiments 278786
Tumours 164905
Mutant samples 30566
Mutations 31933
Papers Curated 3611
Genes 1339


5th Jul 2006 COSMIC v20 released

This month's release includes NCI-60 updates and mutation data from the scientific literature for VHL.

NCI-60 update


The CGP is pleased to release mutation data for 24 known cancer genes on the NCI-60 series of cancer cell lines. These data should allow for greater power in interpretation of biological data using the lines as well as providing a genetic framework for evaluating response to the large series of compounds screened against this reference cell line set.


Microsatellite instability


Microsatellite instability occurs due to a defect in mismatch repair. This is usually a result of inactivation of MSH2, MLH1 or MSH6 due to a mutation or to reduced expression associated with promoter methylation. Analysis of microsatellite instability was carried out using the BAT markers as described by Rodriguez-Bigas et al. All samples were screened using the markers BAT25, BAT26, D5S346, D2S123 and D17S250. The results, when available, are posted on the sample overview page; an example can be seen at http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=sample&id=905950


Curation of VHL


VHL mutation data from the literature is now available. We have curated 93 papers covering 3412 experiments. These experiments used 3386 samples, in which 879 mutations were recorded.


General Statistics for this release

Experiments 270623
Tumours 160594
Mutant samples 29154
Mutations 30439
Papers curated 3519
Genes 1338


7th Jun 2006 COSMIC v19 released

This month's release of COSMIC includes the Cancer Genome Project screen of the GAP-GEF gene set and new information displays.

GAP-GEF Screen


This gene set consists of 173 genes encoding proteins that regulate the activity of proteins with GTPase activity. GTPase activating proteins (GAPs) promote hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) promote GDP/GTP exchange. Both classes modulate the function of the small monomeric GTPases (including the RAS oncogene family) and other key signalling proteins that use the conversion of GTP to GDP as a molecular switch to regulate function. This system of GTPase/GAP/GEFs regulates a wide variety of cellular processes including growth, differentiation, survival and motility.


Web Improvements


Zygosity and somatic/germline status information are now available for mutations in the COSMIC, CGP Resequencing and Cancer Cell Line Project websites. The somatic/germline status is listed on the sample detail page and in the export function, with the following statuses:


  • Not specified
  • Confirmed somatic mutant
  • Reported elsewhere as a somatic variant
  • Confirmed germline variant
  • Reported elsewhere as a germline variant
  • Variant of unknown origin

Zygosity information is available on the mutation detail page with the following statuses:


  • Unknown
  • Homozygous
  • Heterozygous
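
For anyone handling exported data, the two status vocabularies above can be modelled as enumerations. This is only a sketch: the class, member and helper names are hypothetical, and only the string values are taken from the lists above.

```python
# Sketch: the somatic/germline and zygosity statuses as enumerations.
# Only the string values come from the lists above; all names are illustrative.
from enum import Enum


class MutationOrigin(Enum):
    NOT_SPECIFIED = "Not specified"
    CONFIRMED_SOMATIC = "Confirmed somatic mutant"
    REPORTED_SOMATIC = "Reported elsewhere as a somatic variant"
    CONFIRMED_GERMLINE = "Confirmed germline variant"
    REPORTED_GERMLINE = "Reported elsewhere as a germline variant"
    UNKNOWN_ORIGIN = "Variant of unknown origin"


class Zygosity(Enum):
    UNKNOWN = "Unknown"
    HOMOZYGOUS = "Homozygous"
    HETEROZYGOUS = "Heterozygous"


def is_somatic(origin):
    """True for either of the two somatic statuses."""
    return origin in (MutationOrigin.CONFIRMED_SOMATIC, MutationOrigin.REPORTED_SOMATIC)


# Look up a status from the text shown on the sample detail page.
status = MutationOrigin("Confirmed somatic mutant")
print(status.name, is_somatic(status))
print(Zygosity("Heterozygous").name)
```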

General Statistics for this release

Experiments 264296
Tumours 155902
Mutant samples 27732
Mutations 28859
Papers curated 3419
Genes 1337


4th May 2006 COSMIC v18 released

The CGP Resequencing Studies Website is released this month. It will act as a repository for data from CGP resequencing efforts to identify novel somatic mutations in human cancer, and its pages have a distinctive red colour scheme to denote this. Data on sets of genes/samples systematically screened for mutations were previously integrated into the "blue" COSMIC pages. This will continue, with data now also being submitted, prepublication, to the new site and held there, allowing users to browse, search and evaluate these data more effectively. The web resources that are now available are detailed below:

  • COSMIC (Blue): All data screened from literature and CGP based projects.
  • CGP Resequencing Studies (Red): Somatic mutations from systematic large scale resequencing of genes in human cancers.
  • CGP Cancer Cell Line Project (Green): Resequencing of known cancer genes and other analyses of human cancer cell lines.

Curation of PTCH

37 papers from the scientific literature have been curated for the PTCH gene in this release, adding 897 experiments and 168 mutations.


General Statistics for this release (COSMIC)
Experiments 254544
Tumours 153528
Mutant samples 27426
Mutations 28534
Papers curated 3393
Genes 1176

COSMIC DAS track

Ensembl has recently moved to the NCBI 36 assembly of the human genome, whilst COSMIC genes and mutations are currently mapped to build 35. This has caused some disparity with the COSMIC DAS track. We therefore suggest using the COSMIC DAS track only on the most recent Ensembl archive site (http://feb2006.archive.ensembl.org/index.html). The link below will open the appropriate website with the DAS source attached:

http://feb2006.archive.ensembl.org/Homo_sapiens/contigview?conf_script=contigview;c=7:139949999.5:1;w=200000;h=7;add_das_source=(name=COSMIC+url=http://das.ensembl.org/das+dsn=cosmic_ncbi_35+type=ensembl_location+color=blue+strand=b+labelflag=n+stylesheet=y+group=n+depth=0+score=n+active=1)
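The link above is a ContigView address followed by an `add_das_source` parameter whose settings are joined with `+`. As a rough illustration of how such a link is put together, the sketch below rebuilds it from its parts. The function names are hypothetical, every setting is copied from the link itself, and the meaning of the individual settings (and of the trailing `:1` and `h=7`) is not documented here, so they are passed through unchanged.

```python
# Settings copied verbatim from the link above; dict order is preserved.
COSMIC_DAS_SETTINGS = {
    "name": "COSMIC",
    "url": "http://das.ensembl.org/das",
    "dsn": "cosmic_ncbi_35",
    "type": "ensembl_location",
    "color": "blue",
    "strand": "b",
    "labelflag": "n",
    "stylesheet": "y",
    "group": "n",
    "depth": "0",
    "score": "n",
    "active": "1",
}


def das_source_parameter(settings):
    # Join key=value pairs with '+', wrapped in parentheses as in the link.
    pairs = "+".join(f"{key}={value}" for key, value in settings.items())
    return f"add_das_source=({pairs})"


def contigview_link(archive_host, location, width, settings):
    # 'location' and 'width' are inserted as given; 'h=7' is copied from
    # the original link without interpretation.
    return (
        f"http://{archive_host}/Homo_sapiens/contigview"
        f"?conf_script=contigview;c={location};w={width};h=7;"
        + das_source_parameter(settings)
    )


link = contigview_link(
    "feb2006.archive.ensembl.org", "7:139949999.5:1", 200000, COSMIC_DAS_SETTINGS
)
print(link)
```

Changing `location` and `width` should open the same archive site on a different region with the COSMIC track still attached, assuming the archive accepts the same parameters for other regions.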


4th Apr 2006 Cosmic v17 release

Cosmic v17 release

This month's release of COSMIC includes the Cancer Genome Project screen of small monomeric GTPases and mutation data from the scientific literature for MADH4.

CGP Small Monomeric GTPase Screen

The small monomeric GTPases function as key molecular switches impacting a large variety of cellular functions such as motility, cell signalling, transcription and the binding, hydrolysis and exchange of GTP/GDP. The RAS subfamily (HRAS, NRAS, KRAS) of small monomeric GTPases were amongst the first identified human oncogenes and are mutationally activated in a wide variety of human cancers.

Curation of MADH4

70 papers from the scientific literature have been curated for the MADH4 gene in this release, adding 2275 experiments and 259 mutations.

Gene Updates

Further data from the scientific literature for 9 genes, including KRAS and NRAS, has been added for this release. A detailed breakdown for each gene can be seen below.

Gene Name   Additional experiments   Additional mutations
CDKN2A      668                      71
CTNNB1      140                      8
EGFR        150                      4
ERBB2       84                       1
HRAS        54                       1
KRAS        3050                     647
NRAS        1611                     203
PTEN        37                       7
PTPN11      309                      17

General Statistics for this release
Experiments 249331
Tumours 149251
Mutant samples 26574
Mutations 27637
Papers curated 3324
Genes 1175


8th Mar 2006 COSMIC v16 released

COSMIC v16 released

Released for March are data from a kinase domain screen of malignant gliomas. These data cover approximately 400 kb of sequence in each of 9 tumours, including data from recurrent/resistant tumours.

We have recently completed a screen for somatic mutations in the kinase domain encoding exons of the entire protein kinase family in a series of human malignant gliomas. The results are presented in this release of COSMIC. No commonly mutated kinase domain was found in these studies. However, as is the case with our other work in this area, deep sequencing data from human tumours are informative about the processes that have contributed to oncogenesis in the patient. Two gliomas recurrent after temozolomide (alkylator) chemotherapy, but not a third recurrent after XRT alone, had the highest mutation prevalence of any tumours we have analysed to date. These data suggest a link between mutation prevalence and recurrent/resistant brain tumours treated with alkylator chemotherapy.

Statistics
Experiments 235213
Tumours 143427
Mutant samples 25360
Mutations 26388
Papers curated 3207
Genes 1035


7th Feb 2006 A COSMIC Expansion

A COSMIC Expansion

The Catalogue Of Somatic Mutations In Cancer is two years old and has mutation data for over 1,000 genes, curated from over 3,000 published papers and unpublished data from the Cancer Genome Project.

The original aim of COSMIC continues with the curation of somatic mutation information from the literature for known cancer genes. During 2005 data for 9 genes was collected: ABL1, CDKN2A, EGFR, GATA1, JAK2, MSH6, NOTCH1, PTPN11 and SMO. In addition, genes that were curated in 2004 were updated as new data was published.

The number of genes in COSMIC expanded rapidly when the Cancer Genome Project at the Wellcome Trust Sanger Institute published 3 studies of somatic mutations in the protein kinase gene family (518 genes in total). This data provides a unique insight into the somatic mutations in breast, lung and testicular cancers.

More recently the Cancer Genome Project has been submitting unpublished somatic mutation data to COSMIC. The data comes from genes involved in apoptosis, from genes involved in DNA repair, maintenance and metabolism, and from the Inositol Polyphosphate Phosphatase and Heterotrimeric G-Protein families.

In another new departure the COSMIC software was used to create a new web site, the Cancer Cell Line Project. This separate site, with its own 'mint' colour scheme, contains the results from the sequence analysis of 14 known cancer genes in over 700 cancer cell lines. Initial sequence data for 4 genes analysed in the NCI-60 is also available. This work is in progress and more results will be posted in the coming months. What is more, the number of genes in this project will continue to increase, providing genetic data for this wide set of cancer cell lines.

There have been many enhancements to the web site over the past 12 months. A tissue overview provides a summary of mutations reported in a selected tissue. New pages were created to show more details of mutations and samples and give greater depth to the data. There are also links to other data such as genome copy number information.

COSMIC has been summarised in The British Journal of Cancer (Forbes et al, 2006).

This month sees the update of BRAF, CDKN2A, EGFR, ERBB2, HRAS, KRAS, NRAS, PTEN, PTPN11 and SMARCB1. In addition, the Cancer Genome Project has submitted unpublished data for genes involved in apoptosis.

There are plans to continue the development of COSMIC in terms of data content and data presentation. We are always happy to receive feedback and suggestions (email: cosmic@sanger.ac.uk).

Statistics
Experiments 228,669
Tumours 142,569
Mutant samples 25,176
Mutations 26,194
Papers curated 3,013
Genes 1,035


10th Jan 2006 COSMIC v14 released

COSMIC v14 released

The COSMIC team is proud to announce the release of COSMIC-14 with data for CDKN2A (p16) and more unpublished data from the CGP.

DNA REPAIR, MAINTENANCE AND METABOLISM

The Cancer Genome Project has released further unpublished somatic mutation data from a screen of 41 cancer cell lines. The 302 genes in this release are involved in or associated with DNA repair, maintenance and metabolism. The genes can be viewed together or in 5 subgroups: Telomerase Complex, SWI/SNF, DNA replication, Nucleotide Metabolism, and DNA Damage Response and Repair. In total 119 somatic mutations were identified in this study.

CURATION OF CDKN2A

CDKN2A (also known as p16) is a tumour suppressor. It induces cell cycle arrest by inhibiting the phosphorylation of Rb by the cyclin-dependent kinases CDK4 and CDK6. So far 453 papers have been curated for this gene with 2,591 mutations recorded from 16,883 samples.

STATISTICS

Experiments 219,037
Tumours 140,212
Mutant samples 24,817
Mutations 2,637
Papers curated 3,379
Genes 870


13th Dec 2005 COSMIC v13 released

COSMIC v13 released

Somatic mutation data from new gene families

In a major new departure the Cancer Genome Project is proud to release further somatic mutation data. The results from the sequencing of two gene families, Inositol Polyphosphate Phosphatases and Heterotrimeric G-Proteins, have been added to the data for the Protein Kinase genes. This data will be expanded in the future with the addition of further gene sets.

Updates to existing genes

Nine genes in COSMIC have been updated with further data: NRAS, RB1, ERBB2, HRAS, PTEN, TP53, KRAS, APC and CDKN2A.

New DAS data source

The Cancer Genome Project is pleased to announce the release of a DAS source devoted to the genes and mutations within COSMIC. Using this source you will be able to view the genes and mutations from COSMIC within a genome browser or the DAS client of your choice.

All 587 genes in COSMIC are exported as features. Each of these features displays the genomic 'footprint', which encompasses both exonic and intronic sequence between the start and end points of the CDS sequence. A link is attached to each feature, providing a mechanism for the client to link back directly to the gene entry on the COSMIC website.

In addition to the gene footprints, there are also a large number of unique mutations. These are also displayed as features, with links back to the mutation summary page in COSMIC. The database currently holds 2812 unique mutations, of which 1035 are currently exported. This subset comprises all the single nucleotide substitutions. More complex mutations will be included as their genomic coordinates are mapped.

The DAS source can be found at the following URI:

http://das.ensembl.org/das/cosmic_genomic

The easiest way to view this source is to place the following URI in your browser:

http://www.sanger.ac.uk/turl/6d8

This will attach the DAS source and display some of the mutations found in BRAF. Additional configuration can be performed on the track, by clicking on the track name. For more information, see the help pages on the Ensembl website.
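For readers using a DAS client of their own rather than Ensembl, a request for the features in a region can be sketched as below. This assumes the source answers the standard DAS 1.x `features` command with a `segment=reference:start,stop` argument. The helper name is hypothetical and the chromosome 7 window is purely illustrative, not a coordinate taken from this release.

```python
DAS_SOURCE = "http://das.ensembl.org/das/cosmic_genomic"


def das_features_url(source, chromosome, start, end):
    """Build a DAS 1.x 'features' request for one genomic segment."""
    if start < 1 or end < start:
        raise ValueError("segment must satisfy 1 <= start <= end")
    # DAS 1.x segments are written as reference:start,stop
    return f"{source}/features?segment={chromosome}:{start},{end}"


# Illustrative window only; substitute the region you are interested in.
print(das_features_url(DAS_SOURCE, "7", 139850000, 140050000))
```

A DAS server normally replies to this kind of request with an XML document listing the features in the segment, which here would be the gene footprints and exported mutations described above.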

COSMIC statistics
Experiments 190,576
Tumours 124,381
Mutant samples 23,232
Mutations 2,228
Papers curated 2,812
Genes 587


1st Nov 2005 COSMIC version 12 released

COSMIC version 12 released

The November release of COSMIC has further data on 9 known cancer genes.

GENE UPDATES

The genes with additional data are BRAF, PTEN, RB1, EGFR, TP53, CDKN2A, NRAS, KRAS and PIK3CA.

VERSION

We have implemented a versioning system for the data in COSMIC. The current release is version 12 with a plan to release a new version every month.

CANCER CELL LINE PROJECT

There are additional mutations for the known cancer genes being sequenced through the cancer cell lines. Notably there is data for homozygous deletions in the CDKN2A gene.

COPY NUMBER DATA

The Cancer Genome Project has released more copy number data derived from the analysis of cancer cell lines and primary tumours using Affymetrix SNP microarrays. So far a total of 834 samples have been analysed consisting of 161 primary tumours and 673 cancer cell lines. This data is freely available from the CGP website. The primary tumours overlap with those being sequenced by the CGP while the cancer cell lines include those being sequenced in the Cancer Cell Line Project.

COSMIC STATISTICS
Tumours 124,367
Experiments 188,529
Mutations 23,157
Papers 2,224
Genes 538


3rd Oct 2005 COSMIC Update

COSMIC Update

COSMIC has been updated with the addition of 2 new curated genes and new mutation descriptions.

MUTATION DESCRIPTIONS

COSMIC has adopted the Human Genome Variation Society sequence variation/mutation nomenclature for the bulk of the mutations in COSMIC. This represents a major upgrade with the aim of improving clarity and enables the listing of intronic variants for the first time.

GENE UPDATES

Two genes have further data in COSMIC: EGFR and PTEN.

PROTEIN KINASE MUTATIONS IN TESTICULAR CANCER

The sequence analysis of the protein kinase gene family in human testicular germ-cell tumours of adolescents and adults has been published. The mutation data from this work was previously available in COSMIC and is now joined by the published analysis of the data.

CANCER CELL LINE PROJECT

There are additional mutations from the screening of known cancer genes through an extensive set of cancer cell lines.

STATISTICS FOR COSMIC
Tumours 123,197
Experiments 186,181
Mutations 22,711
Papers 2,157
Genes 537


6th Sep 2005 COSMIC Update

COSMIC Update

COSMIC has been updated with the addition of 3 new curated genes: MSH6, NOTCH1 and PTPN11.

There is a new member of the COSMIC family: the Cancer Cell Line Project. This portal uses the COSMIC code to serve mutation data from the cancer cell lines being sequenced by the Cancer Genome Project at the Wellcome Trust Sanger Institute. The cell line data is presented in the same style as the COSMIC data, with a unique colour scheme. There are links to jump from the Cancer Cell Line Project pages to view all of the data in COSMIC. At present there is data from 12 known cancer genes in the Cancer Cell Line Project database.

In addition, the results from the screen of all 518 protein kinase genes in lung cancer, which were available in the previous release of COSMIC, have been published in Cancer Research.

NEW GENES IN COSMIC

  • MSH6 - is a member of the MutS homolog family and is required for DNA mismatch specific binding. Almost one third of tumours of the large intestine have somatic mutations in this gene.
  • NOTCH1 - has somatic small intragenic mutations in 60% of haematopoietic and lymphoid tumours.
  • PTPN11 - is a nontransmembrane protein-tyrosine phosphatase. Approximately 6% of haematopoietic and lymphoid tumours have mutations in this gene.

COSMIC STATISTICS

Experiments 186014
Tumours 123039
Mutations 22598
References 2153
Genes 537


1st Sep 2005 Protein kinase mutations in lung cancer

Protein kinase mutations in lung cancer

The Cancer Genome Project has sequenced all protein kinase genes in lung cancer - the most common cause of cancer deaths worldwide

There are over 27,000 new cases of lung cancer in the United Kingdom each year. Protein kinases are frequently mutated in human cancer and inhibitors of mutant protein kinases have proven to be effective anticancer drugs. The Cancer Genome Project has screened the complete coding sequence of all 518 protein kinase genes in 33 lung cancers. This study, published in Cancer Research, is the largest survey reported to date of somatic mutations in lung cancer.

The Cancer Genome Project at the Wellcome Trust Sanger Institute was established in 2000. Its goal is to identify mutations that occur in cancer cells to enable the development of new diagnostics and new treatments and advance our understanding of the biology of cancer.

The Wellcome Trust Sanger Institute Cancer Genome Project and their collaborators have published the latest results of their survey of genes that might be implicated in cancer. The report is published in Cancer Research on Thursday 1st September 2005 and is also available through COSMIC.

The gene set chosen was a class called protein kinases, key controllers of cell growth and death. Members of this family have been shown to be important in cancer. However, the whole set has never been sequenced in a single set of lung tumours. The study generated over 40 million bases of DNA sequence (1.3 million for each sample).

This work identified 188 somatic mutations in 141 protein kinase genes. There was considerable variation in the number of mutations found in each tumour. The results indicate that several mutated protein kinases may be contributing to lung cancer development, but that mutations in each one are infrequent. Larger studies are warranted to further explore these initial findings. Cancer is a complex set of diseases that will affect 1 in 3 people. This work in the CGP is but one part of a global effort to further understanding of cancer and move towards better diagnosis and treatment.


3rd Aug 2005 COSMIC Website Update

COSMIC Website Update

The COSMIC web site has been updated with additional data from the literature and unpublished data from the Cancer Genome Project.

SOMATIC MUTATION DATA FOR KNOWN CANCER GENES

Data for 3 genes has been curated from the literature and included in COSMIC: ABL1, GATA1 and SMO.

SOMATIC MUTATIONS OF THE PROTEIN KINASE GENE FAMILY

The screen of the protein kinase gene family by the Cancer Genome Project now includes two new tumour types; lung cancer and testicular germ-cell tumours. There are marked differences in the mutation prevalence between these two tumour types.

CANCER CELL LINE PROJECT

The mutation data for 9 further genes has been included on the web site giving a total of 550 mutations. The genes are APC, CDH1, CTNNB1, HRAS, MADH4, PIK3CA, PTEN, RB1 and STK11. The sequencing of these genes is not necessarily complete but the cell lines with mutations have been confirmed and the experiments will continue to finish this work.

COSMIC STATISTICS

179,563 Experiments
118,134 Tumours
22,005 Mutations
2,090 References
534 Genes


23rd May 2005 Cosmic Update

Cosmic Update

COSMIC now includes data from a screen of all protein kinase genes in breast cancer and an update of mutation data from the literature.

New Data

The data in COSMIC has expanded to include a new data type and the number of known cancer genes has been extended with updates on some of the existing cancer genes.

A screen of the coding sequence of the protein kinase genes in breast cancer.

The Wellcome Trust Sanger Institute Cancer Genome Project and their collaborators have published the latest results of their survey of genes and their mutations in cancer. The report was published online in Nature Genetics on Sunday 22 May 2005. This data has been integrated with the existing data in COSMIC and made available through the web site.

New cancer genes in COSMIC

The mutation data for two further cancer genes has been curated from the scientific literature and added to COSMIC.

  • EGFR - mutations in the epidermal growth factor receptor (EGFR) have been reported in lung cancer and have been associated with the tumour response of patients receiving gefitinib.

  • JAK2 - A single somatic point mutation (V617F) has been identified in JAK2 in patients with polycythaemia vera. The mutation alters a highly conserved valine present in the negative regulatory JH2 domain, and is predicted to dysregulate kinase activity.

Updates to existing COSMIC genes

Further published data has been curated for 5 genes in COSMIC: BRAF, ERBB2, FGFR2, PDGFRA and PIK3CA.

Website

Home Page
  • We have created a new mailing list: COSMIC-announce, with a subscription link located at the bottom of the home page. As a subscriber to this list you will receive announcements about the latest COSMIC news and website releases.

Browsing by Gene

More improvements have been made to the gene selection pages. The alphabetical lists have been separated into 3 groups to reduce the amount of guesswork involved in finding your gene of interest.

  • Genes from the Cancer Gene Census: - This list contains genes that have been included in the Cancer Gene Census. All of the genes in previous releases of COSMIC are included in this census.

  • Other Genes with Mutations: - These genes are not in the census, but have been found during the curation of the literature and so are included in the database. All these genes have a documented mutation which is thought to be linked to cancer.

  • Other Genes without Mutations: - The final list contains all the other genes that have been recorded during curation. These do not have a documented mutation in the references found in COSMIC.

The karyotype has also been updated. Genes from the census can be located quickly by clicking on the red triangles. All other genes are indicated by blue lines across the chromosome.



Mutation Overview Page

Each mutation in COSMIC now has its own overview page containing information about the type of mutation and the samples/tissues containing the mutation. This page can be reached from links on the following pages:

  • Main Histogram
  • Main Mutation table
  • Sample Overview Page


The overview page is divided into 8 main sections:

  • Mutation Id: This id is used to identify a mutation within the COSMIC database and is assigned as the mutation is curated.

  • Mutation type: The mutation type is used to describe the type of mutation that has occurred. This can be anything from a single base inframe substitution, to a frameshift deletion.

  • Mutation Location: Here, an image displays the location of the mutation within the peptide sequence.

    • The grey bar at the top of this section shows the full length sequence. Below this can be found a red box, which indicates the area around the mutation. At the bottom of the image, the red box has been expanded and the peptide sequence around the mutation is shown. Here you will find a red triangle which indicates the starting point of the mutation. Clicking on the triangle will produce a pop-up window showing the mutation at both the peptide and nucleotide level.


    • Additionally there is a link, 'Show all mutations in area', to the main histogram page for the gene. This link will show the gene histogram zoomed into the area displayed on this page. This allows you to see any other mutations that have been identified in the surrounding area.


  • Gene: The name of the gene in which the mutation was found. Clicking on the gene name will link to the summary page for that gene.

  • AA Mutation: This section details the change that has occurred in the peptide sequence as a result of the mutation. Formatting is as follows:
    • Substitutions - X(Y)Z
      Where X is the amino acid found in the wildtype sequence. Y is a number representing the position, within the peptide sequence, at which the mutation occurred. Finally, Z is the amino acid found in the mutant sequence.

    • Deletions - delY(Z)
      Where Y is a number representing the position at which the deletion starts and Z is the amino acid sequence which has been deleted.

    • Insertions - insY(Z)
      Where Y is a number representing the position at which the insertion begins and Z is the amino acid sequence that is inserted.


  • CDS Mutation: This section details the change that has occurred in the nucleotide sequence as a result of the mutation. Formatting is identical to the method used for the peptide sequence.

  • Tissue Distribution (Top 5): The top five tissues in which this mutation has been identified are described in the following bar chart.

    • Each bar represents the number of samples, for a specific tissue type, that have exhibited the selected mutation. A label indicating the name of the tissue type and the number of samples is located below each bar.

    • Clicking on one of the bars will take you to the tissue overview page for the selected tissue.


  • Associated Samples: A list showing all the samples, including their primary tissue types, that have the selected mutation. Clicking on a sample name will take you to the sample summary page for the selected sample. Clicking on the primary tissue type will take you to the tissue overview page.
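The AA Mutation formats described above (X(Y)Z, delY(Z) and insY(Z)) are regular enough to be read by a small parser. The sketch below is illustrative only and is not the code used by the website. It assumes single-letter amino acid codes, and it accepts the position with or without parentheses, since mutations are also commonly written in the shorter form (for example V617F). The deletion and insertion strings in the usage lines are made-up examples.

```python
import re
from typing import NamedTuple, Optional


class AAMutation(NamedTuple):
    kind: str                 # "substitution", "deletion" or "insertion"
    position: int             # Y: position within the peptide sequence
    wildtype: Optional[str]   # residue(s) present in the wildtype sequence
    mutant: Optional[str]     # residue(s) present in the mutant sequence


# Parentheses around the position are optional in this sketch (an assumption).
_SUBSTITUTION = re.compile(r"^([A-Z*])\(?(\d+)\)?([A-Z*])$")
_DELETION = re.compile(r"^del\(?(\d+)\)?\(([A-Z*]+)\)$")
_INSERTION = re.compile(r"^ins\(?(\d+)\)?\(([A-Z*]+)\)$")


def parse_aa_mutation(text: str) -> AAMutation:
    """Parse one AA mutation description into its components."""
    text = text.strip()
    match = _SUBSTITUTION.match(text)
    if match:
        wildtype, position, mutant = match.groups()
        return AAMutation("substitution", int(position), wildtype, mutant)
    match = _DELETION.match(text)
    if match:
        position, deleted = match.groups()
        return AAMutation("deletion", int(position), deleted, None)
    match = _INSERTION.match(text)
    if match:
        position, inserted = match.groups()
        return AAMutation("insertion", int(position), None, inserted)
    raise ValueError(f"Unrecognised AA mutation description: {text!r}")


print(parse_aa_mutation("V(617)F"))     # substitution at position 617
print(parse_aa_mutation("del100(LR)"))  # made-up deletion example
print(parse_aa_mutation("ins100(LR)"))  # made-up insertion example
```

Because the CDS Mutation section uses the same layout at the nucleotide level, a similar parser with nucleotide letters in place of amino acid codes should apply there as well.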

Sample Overview

Two new sections have been added to this page:

  • Tumour Features: In this section details about the tumour, from which the sample was obtained, are listed whenever they have been supplied by the reference source.

  • External Data Sources: Additional data sources, with information about the sample, are listed here when available. This includes information from some of the studies within the Cancer Genome Project.

References

COSMIC now includes review papers. There is a review section that can be found at the bottom of the reference overview page for each gene. This section includes references that review other works. As the data from these references has already been added to the data from the original sources, this data is not added again.

Statistics
529 Genes
114300 Tumours
20536 Mutations
1894 References


4th Mar 2005 COSMIC Website Update

COSMIC Website Update

COSMIC presents 'Tissue Overview', another way to view somatic mutation data. The Tissue Overview page details the Top 5 Genes for any tissue / histology selection, ranked by mutation frequency and data volume. In addition it lists other genes with and without mutations for the selection. From the Tissue Overview page you can click through to the specific details of the listed genes.

Website

Home Page
  • We have updated the entry point system.
    • Detailed Search - This has been the standard search pathway and has not changed from previous releases. Please continue to use this pathway to build complex queries if you are interested in specific subtissues or histologies.
    • Quick Search - This is a new pathway greatly reducing the number of steps required to access the tissue overview page and any subsequent pages. This increase in speed does however reduce the complexity of the available search to just primary tissues.
Tissue Overview

As stated above, this new page details all the genes that have samples for the tissues / histologies selected. It is split into three major sections, with the first section detailing what we feel are the most important genes, based on mutation frequency and data volume.

  • Section One: Top Genes With Sample Data
  • This section provides an interactive bar chart and table showing data for the highest ranked genes containing samples from the chosen tissues / histologies

    The coloured bars in the image represent:

    • All Samples
    • Samples With Mutations
    • Clicking on any portion of the bar or name associated with a particular gene will reveal a pop-up menu.

      • Sample Number: (count) - This indicates the number of samples that have been found with the selected tissue/histology type
      • Mutated Samples: (count) - This is the number of the above samples that have shown mutations.
      • Go to Full Gene Display - Clicking on this link will take you to the histogram display for the selected gene.
  • Below the bar chart image is a table that displays all the information found in the image, in a tabular format.

  • Section Two: Other genes with mutations
    • This section contains a list of additional genes with mutated samples that didn't make it into the top 5. Each gene name is linked to the full histogram image.
  • Section Three: Other genes without mutations
    • This section contains a list of additional genes without mutated samples that didn't make it into the top 5. Again, each gene name is linked to the full histogram image.
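The three sections above can be thought of as one ranked list that is then split. The sketch below shows one possible way to do this. The exact ranking used by the Tissue Overview page is not specified here, so ordering by mutation frequency first and sample count second is an assumption, and the gene names and counts in the usage example are invented.

```python
from typing import List, NamedTuple, Tuple


class GeneCounts(NamedTuple):
    gene: str
    samples: int   # all samples for the selected tissue / histology
    mutated: int   # how many of those samples have mutations


def tissue_overview(
    genes: List[GeneCounts], top_n: int = 5
) -> Tuple[List[str], List[str], List[str]]:
    """Split genes into top genes, other genes with and without mutations."""
    def rank_key(g: GeneCounts):
        frequency = g.mutated / g.samples if g.samples else 0.0
        # Assumed ranking: mutation frequency, then data volume.
        return (frequency, g.samples)

    ranked = sorted(genes, key=rank_key, reverse=True)
    top, rest = ranked[:top_n], ranked[top_n:]
    with_mutations = [g.gene for g in rest if g.mutated > 0]
    without_mutations = [g.gene for g in rest if g.mutated == 0]
    return [g.gene for g in top], with_mutations, without_mutations


# Invented counts, for illustration only.
example = [
    GeneCounts("GENE_A", 100, 50),
    GeneCounts("GENE_B", 200, 100),
    GeneCounts("GENE_C", 10, 9),
    GeneCounts("GENE_D", 50, 0),
    GeneCounts("GENE_E", 1000, 10),
    GeneCounts("GENE_F", 20, 1),
    GeneCounts("GENE_G", 30, 0),
    GeneCounts("GENE_H", 500, 1),
]
top, others_with, others_without = tissue_overview(example)
print(top, others_with, others_without)
```

A ranking based on frequency alone favours genes with very few samples, which is presumably why data volume is also taken into account; a different weighting of the two would give a different top 5.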


4th Feb 2005 COSMIC's first anniversary

COSMIC's first anniversary

The COSMIC database and web site have been updated and now have somatic mutation data from 21 genes.

New Data
  • CEBPA is mutated in 7% of haematopoietic and lymphoid tissue tumours. It arrests cell proliferation by inhibiting the kinases CDK2 and CDK4.
  • CTNNB1 or beta-catenin is mutated in a variety of tumours. The gene encodes an adherens junction protein that is critical for the establishment and maintenance of epithelial layers.
  • KIT is characterised by two clusters of mutations in and around the kinase domain of the gene with frequent mutations in haematopoietic and lymphoid tissue tumours (19%) and soft tissue tumours (32%).
  • PTEN has mutations through the whole coding sequence with a hot spot at codon 130. Tumours of the central nervous system and endometrium frequently have mutations in this gene (19% and 34% respectively).
  • SRC is homologous to the v-src gene of the Rous sarcoma virus and has one mutation that has been found in 10 samples.
  • SUFU encodes a component of the sonic hedgehog/patched signaling pathway and is mutated in central nervous system tumours.
Statistics
21 genes
104,682 tumours
18,478 samples have mutations
1,755 unique mutations
1,672 papers have been curated


17th Dec 2004 COSMIC Update

COSMIC Update

The COSMIC team is proud to release somatic mutation data for CSF1R, RB1, RET and SMARCB1. This information has been curated from the scientific literature. Somatic mutation data from 15 genes can be queried and viewed through the COSMIC web site.

Data
  • The 4 new genes in COSMIC give data on specific tumour types and increase the breadth of information that can be queried and displayed.
  • CSF1R, also known as the oncogene FMS, is a receptor kinase that is mutated in ~5% of myelodysplastic syndrome cases. Mutations in this gene have been associated with a predisposition to myeloid malignancy.
  • RB1 is mutated in more than 11% of the tumours that have been studied. It is frequently somatically mutated in cases of retinoblastoma (47%) while germline mutations predispose to the same disease.
  • RET, a tyrosine kinase receptor, is somatically mutated in 38% of thyroid medullary carcinomas. Germline mutations in the RET gene are associated with multiple endocrine neoplasia, type IIA and type IIB, medullary thyroid carcinoma and Hirschsprung disease.
  • SMARCB1, also known as SNF5/INI1, is frequently somatically mutated in soft tissue rhabdoid tumours (41%). These are highly malignant cancers that usually occur in young children.
Statistics
15 genes
73,767 tumours
13,420 samples have mutations
536 unique mutations
1,104 papers have been curated


12th Nov 2004 COSMIC Update

COSMIC Update

The COSMIC team are proud to include somatic mutation data for FGFR2, FGFR3, FLT3, MET, PDGFRA and PIK3CA on the COSMIC web site.

Data

The number of genes with data in COSMIC has more than doubled in this release of the database. The additional data represents a set of genes that have a lower, but nevertheless important, mutation frequency in human cancer as a whole. In specific malignancies, genes such as FLT3 do have a significant role, as can be seen from the data collected in COSMIC.

Gene     Number of analysed samples   Number of samples with mutations
BRAF     5158                         736
ERBB2    714                          8
FGFR2    30                           2
FGFR3    1735                         481
FLT3     7610                         1499
HRAS     11876                        477
KRAS2    35716                        8302
MET      1081                         59
NRAS     13884                        1132
PDGFRA   146                          25
PIK3CA   396                          89
TOTAL    78346                        12810
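The table gives raw counts only. The mutation frequency for each gene is the number of samples with mutations divided by the number of analysed samples, which can be worked out as in the sketch below. The counts are copied from the table; the variable and function names are illustrative.

```python
# (analysed samples, samples with mutations), copied from the table above.
COUNTS = {
    "BRAF": (5158, 736),
    "ERBB2": (714, 8),
    "FGFR2": (30, 2),
    "FGFR3": (1735, 481),
    "FLT3": (7610, 1499),
    "HRAS": (11876, 477),
    "KRAS2": (35716, 8302),
    "MET": (1081, 59),
    "NRAS": (13884, 1132),
    "PDGFRA": (146, 25),
    "PIK3CA": (396, 89),
}


def mutation_frequency(analysed: int, mutated: int) -> float:
    """Percentage of analysed samples that carry a mutation."""
    return 100.0 * mutated / analysed


total_analysed = sum(analysed for analysed, _ in COUNTS.values())
total_mutated = sum(mutated for _, mutated in COUNTS.values())

for gene, (analysed, mutated) in COUNTS.items():
    print(f"{gene:<8}{mutation_frequency(analysed, mutated):6.1f}%")
print(f"{'TOTAL':<8}{mutation_frequency(total_analysed, total_mutated):6.1f}%")
```

The column sums computed here agree with the TOTAL row of the table (78346 analysed samples, 12810 with mutations).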

Number of unique mutations 307

Number of curated papers 976

Website Changes

Home Page
  • We have added a link to an ATOM feed for those people with ATOM enabled news feed readers. Adding this link to your feeds list will allow you to see the latest news from the COSMIC site as and when it is available.
Distribution View
  • A totals column has been added to the 'Details' table to show the total number of mutated samples that are listed.
  • Links to show only negative data have been added to the 'More Details' links in the 'Details' table.
  • The Insertions and deletions table has been split to show different information for the two types of mutation.
Gene Selection
  • Genes can now be selected by chromosome, from the karyotype graphic, or as always from an alphabetical list.
References Page
  • The complete list of references for a specific gene can now be exported in a variety of formats including Excel.
Mutation Data Page
  • All pages containing more than 100 samples have been split to reduce their size. However, the export function will still export all the samples as selected.


29th Sep 2004 COSMIC Update

COSMIC Update

We are pleased to announce an update to the COSMIC website. To coincide with the Nature paper on ERBB2 we have added all the data for this gene to COSMIC. There have also been a number of improvements to the interface that we hope you will find useful.

New Data

ERBB2

Today, Nature publish our recent findings, the first description of small intragenic ERBB2 mutations in human cancer. Primarily found in non-small cell lung adenocarcinomas, the mutations identified are suggestive of inappropriate activation of ERBB2 kinase activity.

This addition brings 8 new mutations and 714 new samples to the database, increasing the total number of mutant samples to 10655 and the total number of samples to 58032.

Website Changes

Distribution View
  • The summary table has been removed in favour of a new gene summary page, which contains all the data from this table plus much more.
  • The mutations tables have been expanded to show insertions, deletions and complex mutations.
  • Information about the negative samples is now available and can be viewed by clicking on the 'More Details' link in the Details table. Like the positive samples, this data can also be exported in various formats.
  • A new insertions and deletions track has been added to the main image. This will allow us to display a larger number of genes with more complex mutation sets.
  • A complex mutations track has also been added to display those mutations (multiple base substitutions) which don't quite fit into any of the other categories.
Gene Summary Page

This has grown from the original four-row summary table on the distribution page into a full-page overview of the information stored about a specific gene.

  • Mutation hot spots: The mutation summary shows those areas of the transcript that have a high density of mutations. This can be used to go directly to the area of interest on the mutation distribution view.
  • References: A quick glance will show the most recently published paper that was analysed by the COSMIC staff.
Sample Summary Page

Here you will find a page containing all the information about a particular sample. Some of the previously unavailable information, such as details about the individual, has been made available.

  • Genes Tested: Quickly identify all the genes in COSMIC that have been tested against the selected sample.
  • References: Locate all the references that have included the sample.
Reference Summary Page

For the first time in COSMIC you can see all the samples from one paper in one location. In addition to this there are also details about the genes screened and the mutations that were found.




28th Jun 2004 COSMIC Update

COSMIC Update

We are pleased to announce a minor update to the COSMIC website. The user interface has been updated to include new features that we hope will make your experience with the site more productive and enjoyable.

Web Site Changes

  • Shorter URLs - URLs have been shortened to reduce the amount of text required to link to a specific page within the site. The old style identifiers have been replaced with shortened initials. For example, 'locus_name' has been replaced with 'ln'. The old style links still work and any existing bookmarks should not be affected by this change.
  • Nucleotide Tracks - All mutations can now be viewed with respect to the changes they would cause to the nucleotide sequence, in addition to the already present amino acid changes. The nucleotide views can be accessed by selecting 'cDNA' in the navigation menu on the main display pages. An example of this view can be seen here.
  • Navigation & Selection Improvements - The selection process has been updated to allow users to select sub types across a range of up to five tissue types, adding a new level of refinement to the search process.
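
As an illustration of the 'Shorter URLs' item above, the following is a minimal Python sketch of how an old style link could be rewritten to the shortened form. It is not the site's own code. Only the 'locus_name' to 'ln' pair comes from this announcement; the function name `shorten_url` and the example address are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Only the 'locus_name' -> 'ln' pair is documented in the announcement.
# Any further pairs would have to be taken from the site itself.
SHORT_NAMES = {"locus_name": "ln"}


def shorten_url(url: str) -> str:
    """Rewrite old style query parameter names to their shortened form."""
    parts = urlsplit(url)
    # Rename known parameters and leave every other parameter untouched.
    query = [
        (SHORT_NAMES.get(key, key), value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical address, used for illustration only.
print(shorten_url("http://example.org/cosmic/gene?locus_name=BRAF"))
```

Because unknown parameter names pass through unchanged, a link that already uses the short form is returned as it was given.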




23rd Jun 2004 COSMIC Detailed

COSMIC Detailed

The British Journal of Cancer has released an advance online version of an article describing COSMIC. Detailed information is provided about the curation and structure of the database, followed by a description of the facilities provided by the website.




6th May 2004 COSMIC Website Unavailable 8th May

COSMIC Website Unavailable 8th May

On Saturday 8th May the COSMIC website, as part of the Sanger website, will be unavailable whilst major network upgrades and essential maintenance work are carried out. We apologise in advance for this loss of service.




20th Feb 2004 Nucleotide data available

Nucleotide data available

COSMIC displays mutations at the amino acid level to show the potential implication of the mutations on the protein sequence. In addition, COSMIC holds the mutations at the nucleotide level. This data is available through the Export function, which can be found at the top of the Distribution figure (example) or at the bottom of the expanded Mutation Data tables (example).




4th Feb 2004 COSMIC version 1 released

COSMIC version 1 released

Wellcome Trust Sanger Institute launches Catalogue Of Somatic Mutations In Cancer. In the quest to develop rational approaches to treating cancer, researchers need efficient access to existing knowledge. COSMIC (Catalogue Of Somatic Mutations In Cancer), launched today by the Cancer Genome Project at The Wellcome Trust Sanger Institute, is a new tool that provides integrated genetic data from cancer genes, and will make research faster and easier.




3rd Feb 2004 BRAF V599E becomes V600E

BRAF V599E becomes V600E

The original BRAF mutations reported by Davies et al were mapped to the DNA sequence NM_004333[gi;4757867], with the common BRAF mutation being V599E. On the 24th July 2003 this sequence was updated to NM_004333[gi;33188458] with the insertion of 3bp in the coding sequence. The net effect of this update was to increase the length of the BRAF protein by one amino acid and to increase the position of all published mutations by one amino acid. The beginnings of both versions of the protein are:


MAALSGGGGGGAEPGQALFNGDMEPEAGAGR PAASSAADP	NM_004333[gi;4757867]
||||||||||||||||||||||||||||||  |||||||||
MAALSGGGGGGAEPGQALFNGDMEPEAGAGAGAAASSAADP	NM_004333[gi;33188458]


The BRAF mutations in COSMIC are mapped to the latest version of the cDNA and V599E has become V600E.
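
The renumbering described above can be sketched in a few lines of Python. This is an illustrative sketch, not the procedure used by COSMIC. The function name `renumber` and the constant `FIRST_SHIFTED_POSITION` are hypothetical, and the value 32 is an assumption read off the gap in the alignment shown above.

```python
import re

# Assumption read off the alignment above: the gap falls at column 32, so
# residues numbered 32 or higher in the old sequence move up by one.
FIRST_SHIFTED_POSITION = 32


def renumber(mutation: str, shift: int = 1) -> str:
    """Map a substitution such as 'V599E' from old to new BRAF numbering."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z*])", mutation)
    if match is None:
        raise ValueError(f"not a simple amino acid substitution: {mutation!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    # Positions before the inserted residue keep their original number.
    if pos >= FIRST_SHIFTED_POSITION:
        pos += shift
    return f"{ref}{pos}{alt}"


print(renumber("V599E"))  # V600E
```

The sketch only handles single amino acid substitutions written in the short form used in this note; other mutation types would need their own parsing.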




webmaster@sanger.ac.uk

Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK  Tel:+44 (0)1223 834244

Last Modified Tue Mar 20 14:26:52 2007

Genome Research Limited is a charity registered in England with number 1021457
