Catalogue Of Somatic Mutations In Cancer - (COSMIC)

25th Jul 2013 COSMIC v66 Release

COSMIC v66 contains curation of cancer genes TSC1 & TSC2, four new fusion gene pairs, 17 whole-genome sequencing publications, and extensive updates from ICGC & TCGA.


CENSUS update

We have reviewed the current knowledge on genes involved in human cancer and as a result, added 17 new genes to the Cancer Gene Census (a list of genes with significant proof they contribute to human cancer). We will be adding more genes to this census in future releases, when we consider their involvement proved in the literature.


COSMIC releases scheduled every 3 months

This July release is the last performed on a bimonthly schedule. COSMIC's next release will be in October, when the schedule will become one release every three months.


Literature curation: Point-mutated cancer genes

TSC1 and TSC2

Somatic alterations in tumour suppressor gene TSC1 have been detected in sporadic tumours such as bladder cancer, renal cell carcinoma and hepatocellular carcinoma, and TSC2 alterations in sporadic pulmonary LAM, renal angiomyolipoma and head and neck cancers. TSC1 and TSC2 gene products, hamartin and tuberin, form a protein complex that plays a critical role in growth control as a primary regulator of the mammalian target of Rapamycin (mTOR) pathway. Germline mutations in these genes cause tuberous sclerosis complex (TSC), a neurocutaneous syndrome characterized by seizures, mental retardation, and benign tumours of many organs.


Literature Curation - Gene fusion mutations

EWSR1-YY1

A subgroup of mesotheliomas is characterised by EWSR1-YY1 fusions. The EWSR1 breakpoint is similar to that found in other fusions involving EWSR1 such as EWSR1-FLI1 and EWSR1-DDIT3. YY1 encodes a ubiquitously distributed transcription factor belonging to the GLI-Kruppel class of zinc finger proteins. The EWSR1-YY1 fusion protein includes the transactivation domain of EWSR1 and the DNA-binding domain of YY1.

EWSR1-NFATC1

Further broadening the range of tumours associated with EWSR1 fusions, EWSR1-NFATC1 has been identified in haemangioma of the bone. NFATC1 encodes a component of the nuclear factor of activated T cells DNA-binding transcription complex which is involved primarily in immune response. The transactivation domain of EWSR1 is retained in the fusion where it's fused to the DNA-binding domain, the REL-homology region, of NFATC1.

IRF2BP2-CDX1

IRF2BP2-CDX1 has been identified as an alternative fusion to HEY1-NCOA2 in mesenchymal chondrosarcoma. IRF2BP2 encodes an interferon regulatory factor-2 (IRF2) binding protein that interacts with the C-terminal transcriptional repression domain of IRF2. CDX1 belongs to the homeobox gene family. For the fusion protein, the N terminal includes the IRF2BP2 zinc finger motif and the C terminal includes the CDX1 homeodomain.

STRN-ALK

STRN, encoding a calmodulin-binding protein, has been identified as a novel ALK fusion partner in lung adenocarcinoma. As in most ALK fusions the kinase domain of ALK is preserved in the fusion protein.


Literature Curation - Systematic Screens:

Tarpey et al (2013). Frequent mutation of the major cartilage collagen gene COL2A1 in chondrosarcoma. Nature genetics (epub)

Zhang et al (2013). Genetic heterogeneity of diffuse large B-cell lymphoma. Proceedings of the National Academy of Sciences of the United States of America 110:1398

Reuss et al (2013). Secretory meningiomas are defined by combined KLF4 K409Q and TRAF7 mutations. Acta neuropathologica 125:351

Ho et al (2013). The mutational landscape of adenoid cystic carcinoma. Nature genetics 45:791

Lui et al (2013). Frequent mutation of the PI3K pathway in head and neck cancer defines predictive biomarkers. Cancer discovery 3:761

Murtaza et al (2013). Non-invasive analysis of acquired resistance to cancer therapy by sequencing of plasma DNA. Nature 497:108

Chen et al (2013). Next-generation-sequencing-based risk stratification and identification of new genes involved in structural and sequence variations in near haploid lymphoblastic leukemia. Genes, chromosomes & cancer 52:564

Han SW et al (2013). Targeted sequencing of cancer-related genes in colorectal cancer using next-generation sequencing. PloS one 8:e64271

Bettegowda et al (2013). Exomic Sequencing of Four Rare Central Nervous System Tumor Types. Oncotarget 4:572

Clark et al (2013). Genomic Analysis of Non-NF2 Meningiomas Reveals Mutations in TRAF7, KLF4, AKT1, and SMO. Science 339:1077

Yost SE et al (2013). High-resolution mutational profiling suggests the genetic validity of glioblastoma patient-derived pre-clinical models. PloS one 8:e56185

Zang et al (2012). Exome sequencing of gastric adenocarcinoma identifies recurrent somatic mutations in cell adhesion and chromatin remodeling genes. Nature genetics 44:570

Iyer et al (2012). Genome sequencing identifies a basis for everolimus sensitivity. Science 338:221

Kridel R et al (2012). Whole transcriptome sequencing reveals recurrent NOTCH1 mutations in mantle cell lymphoma. Blood 119:1963

Totoki et al (2011). High-resolution characterization of a hepatocellular carcinoma genome. Nature genetics 43:464

Wang et al (2011). Exome sequencing identifies frequent mutation of ARID1A in molecular subtypes of gastric cancer. Nature genetics 43:1219

Lee et al (2010). The mutation spectrum revealed by paired genome sequences from a lung cancer patient. Nature 465:473


These Cancer Genome Consortium studies have been updated in this release

TCGA

Ovarian Serous Cystadenocarcinoma

Rectum Adenocarcinoma

Colon Adenocarcinoma

Acute Myeloid Leukemia

Glioblastoma Multiforme

Prostate Adenocarcinoma

Bladder Urothelial Carcinoma

Breast Invasive Carcinoma

Cervical Squamous Cell Carcinoma

Kidney Renal Clear Cell Carcinoma

Lung Adenocarcinoma

Lung Squamous Cell Carcinoma

Uterine Corpus Endometrioid Carcinoma

Head and Neck Squamous Cell Carcinoma


ICGC (Riken)

Japanese liver cancer

ICGC (NCC)

Japanese liver cancer


COSMIC v66 Total Statistics

Genes 25563
Samples 908687
Mutations 1524954
Unique Variants 1216270
Papers 17157
Fusions 9054
Genomic Rearrangements 7584
Whole Genomes 7677
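
The growth between releases can be read off by subtracting one statistics table from the other. The sketch below is only an illustration: the figures are copied from the v65 and v66 "Total Statistics" tables on this page, while the names `V65`, `V66` and `release_delta` are hypothetical and not part of any COSMIC tool.

```python
# Totals copied from the v65 and v66 "Total Statistics" tables on this page.
V65 = {
    "Genes": 24715,
    "Samples": 885051,
    "Mutations": 1146761,
    "Unique Variants": 873677,
    "Papers": 16514,
    "Fusions": 9014,
    "Genomic Rearrangements": 7584,
    "Whole Genomes": 6989,
}
V66 = {
    "Genes": 25563,
    "Samples": 908687,
    "Mutations": 1524954,
    "Unique Variants": 1216270,
    "Papers": 17157,
    "Fusions": 9054,
    "Genomic Rearrangements": 7584,
    "Whole Genomes": 7677,
}


def release_delta(old, new):
    """Return {category: (absolute change, percent change)} between two releases."""
    return {
        name: (new[name] - old[name], 100.0 * (new[name] - old[name]) / old[name])
        for name in new
    }


deltas = release_delta(V65, V66)
for name, (diff, pct) in deltas.items():
    # One row per category: signed absolute change and signed percent change.
    print(f"{name:24s} {diff:+10d} {pct:+7.2f}%")
```

The same function can be applied to any pair of statistics tables listed for earlier releases.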


Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Cosmic Whole Genomes, all tumours with genome-wide somatic annotations

Genomics of Drug Sensitivity, analysis of drug sensitivity data in human cancer cell lines

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content


28th May 2013 COSMIC v65 Release

COSMIC v65 includes full curation of the genes SH2B3, MAP2K1 and MAP2K2, recently identified as causing blood and epithelial cancers, together with 5 new USP6 gene fusions, found in aneurysmal bone cysts. In addition, substantial updates have been made to studies curated from the ICGC and TCGA. 6989 samples in COSMIC now have full-genome annotations.


COSMIC release schedule

In recent years, COSMIC has released a new version every two months, six times a year. However, in order to focus on maximising the content of each release and to reduce the workload for those integrating the COSMIC database into their own resources, we will be changing to a three-monthly schedule from July. After July, the next COSMIC release will be in October 2013, with further releases every three months after that.


Literature curation: Point-mutated cancer genes

SH2B3

SH2B3 (LNK, 12q24.12) is a plasma membrane-bound adapter protein and a negative regulator of cytokine signalling involved in normal haematopoiesis. Its functions include inhibition of wild type and mutant JAK2 signalling, and it is overexpressed in myeloproliferative neoplasms (MPN) as well as in myelodysplastic syndrome and leukaemic cells; growth of some transformed cells is inhibited by overexpression of SH2B3, and loss of LNK in murine models enhances development of MPNs. Mutations have mainly been found in MPNs and idiopathic erythrocytosis, predominantly heterozygous PH2 domain mutations (hotspot E208_D234) in wild type JAK2/MPL blast phase MPNs, implicating SH2B3 in MPN progression. However, mutations have also been found in chronic phase MPNs, and can occur in JAK2/MPL mutated tumours and in other SH2B3 domains. Mutations have also been seen in small numbers of early T-cell precursor acute lymphoblastic leukaemia samples and solid tumours. The loss of inhibition of JAK-STAT activation may be related to haploinsufficiency of SH2B3 or to a dominant-negative effect of the mutant protein.

MAP2K1 and MAP2K2

MAP2K1 and MAP2K2 code for dual specificity protein kinases which belong to the mitogen-activated protein (MAP) kinase kinase family. These proteins phosphorylate both a threonine and a tyrosine residue in the activation loops of extracellular signal-regulated kinases (ERK1/2). MAP2K1 and MAP2K2 are activated by point mutations at low prevalence in epithelial cancers such as non-small cell lung cancer. MAP2K1 mutations cluster in 2 hot spots at a hinge region (Q56P, K57N, D67N) between the coiled-coil and catalytic domains, and at the activation loop of the kinase domain (E203K, I204T). Mutations have also been identified in malignant melanomas where they often occur with BRAF or NRAS mutations. There is also evidence that some MAP2K1 exon 3 mutations might confer resistance in melanomas to MEK/RAF inhibitors.


Literature Curation - Gene fusion mutations

CDH11-USP6, THRAP3-USP6, OMD-USP6, CNBP-USP6, COL1A1-USP6

(CDH11-USP6_ENST00000250066, THRAP3-USP6_ENST00000250066, OMD-USP6_ENST00000250066, CNBP-USP6_ENST00000250066, COL1A1-USP6_ENST00000250066)

Fusions involving USP6, partnered with one of five different genes (CDH11, THRAP3, OMD, CNBP or COL1A1), are found in aneurysmal bone cyst, a locally aggressive bone lesion with a propensity to recur. In each translocation the entire ubiquitin-specific protease coding sequence is fused downstream to the promoter region of the partner gene.
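
The bracketed fusion labels used in these release notes pair two gene symbols with a hyphen and optionally attach an Ensembl transcript identifier to either partner with an underscore. As a rough sketch only, such labels could be split as follows. The pattern assumes gene symbols contain just letters and digits (a symbol that itself contains a hyphen would need different handling), and `Fusion` and `parse_fusion` are hypothetical names, not part of any COSMIC software.

```python
import re
from typing import NamedTuple, Optional


class Fusion(NamedTuple):
    first_gene: str
    first_transcript: Optional[str]
    second_gene: str
    second_transcript: Optional[str]


# One partner: a gene symbol, optionally followed by "_ENST<digits>".
# Assumption: symbols are alphanumeric only, so "-" always separates partners.
_PARTNER = r"([A-Za-z0-9]+)(?:_(ENST\d+))?"
_FUSION_RE = re.compile(rf"^{_PARTNER}-{_PARTNER}$")


def parse_fusion(label: str) -> Fusion:
    """Split a label such as 'CDH11-USP6_ENST00000250066' into its parts."""
    match = _FUSION_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised fusion label: {label!r}")
    # groups() yields None for a partner that carries no transcript suffix.
    return Fusion(*match.groups())


for label in ("CDH11-USP6_ENST00000250066", "COL1A1-USP6_ENST00000250066"):
    print(parse_fusion(label))
```

Returning a named tuple keeps each partner and its optional transcript together, so labels from different releases can be compared field by field.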


Literature Curation - Whole genome studies:

TCGA Cervical Squamous cell

TCGA Bladder Urothelial

TCGA Glioblastoma

TCGA Breast Carcinoma

TCGA Colon Cancer

TCGA Lung Adenocarcinoma

ISC/MICINN Chronic Lymphocytic Leukaemia

TCGA Acute Myeloid Leukaemia

TCGA Ovarian Serous Cystadenocarcinoma

TCGA Prostate Adenocarcinoma

TCGA Rectum Adenocarcinoma

TCGA Lung Squamous cell


Literature Curation - Systematic Screens:

Leich E et al (2013). Multiple myeloma is affected by multiple and heterogeneous somatic mutations in adhesion- and receptor tyrosine kinase signaling molecules. Blood cancer journal 3:e102

Sausen M et al (2013). Integrated genomic analyses identify ARID1A and ARID1B alterations in the childhood cancer neuroblastoma. Nature genetics 45(1):12-7

Agrawal N et al (2013). Exomic Sequencing of Medullary Thyroid Cancer Reveals Dominant and Mutually Exclusive Oncogenic Mutations in RET and RAS. The Journal of clinical endocrinology and metabolism 98(2):E364-9

Dulak AM et al (2013). Exome and whole-genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity. Nature genetics 45(5):478-86

Landau DA et al (2013). Evolution and impact of subclonal mutations in chronic lymphocytic leukemia. Cell 152(4):714-26

Pugh TJ et al (2013). The genetic landscape of high-risk neuroblastoma. Nature genetics 45(3):279-84

Green MR et al (2013). Hierarchy in somatic mutations arising during genomic evolution and progression of follicular lymphoma. Blood 121(9):1604-11

Streppel MM et al (2013). Next-generation sequencing of endoscopic biopsies identifies ARID1A as a tumor-suppressor gene in Barrett's esophagus. Oncogene 1476-5594

Zhou D et al (2013). Exome capture sequencing of adenoma reveals genetic alterations in multiple cellular pathways at the early stage of colorectal tumorigenesis. PloS one 8(1):e53310

Kim SC et al (2013). A high-dimensional, deep-sequencing study of lung adenocarcinoma in female never-smokers. PloS one 8(2):e55596

Demeure MJ et al (2012). Cancer of the ampulla of Vater: analysis of the whole genome sequence exposes a potential therapeutic vulnerability. Genome medicine 4(7):56

Newey PJ et al (2012). Whole-Exome Sequencing Studies of Nonhereditary (Sporadic) Parathyroid Adenomas. The Journal of clinical endocrinology and metabolism 1538-7445


COSMIC v65 Total Statistics

Genes 24715
Samples 885051
Mutations 1146761
Unique Variants 873677
Papers 16514
Fusions 9014
Genomic Rearrangements 7584
Whole Genomes 6989


26th Mar 2013 COSMIC v64 Release

COSMIC v64 contains full curation of gene fusions CIC-DUX4 and ACTB-GLI1 in solid tumours. Twelve additional genome-wide sequencing publications bring the numbers of WGS samples in COSMIC to 5023.

Website update

The new COSMIC website (http://cancer.sanger.ac.uk) now replaces the old one (www.sanger.ac.uk/cosmic) which is no longer available. Existing links and bookmarks will still work, but will redirect to the appropriate page on the new website. Substantial help is available to navigate the new system here, but if you have any questions or comments about the new website, please contact us at cosmic@sanger.ac.uk

Literature Curation - Gene fusion mutations

CIC-DUX4

The recurrent CIC-DUX4 fusion is found in a subset of paediatric and young adult primitive round cell undifferentiated soft tissue sarcomas, distinct from the Ewing's sarcoma family of tumours. CIC is a member of the HMG-box superfamily of transcription factors and DUX4 is a double-homeobox gene belonging to the family of double homeo-domain transcription activators. The CIC-DUX4 fusion preserves most of the functional regions of the CIC gene, including the DNA-binding HMG-box and most of the MAPK phosphorylation sites, but both DUX4 homeobox domains are lost.

ACTB-GLI1

This novel fusion has been found in a discrete set of soft tissue sarcomas with distinctive pericytic features. The DNA-binding zinc finger domains of GLI1 are retained in the fusion and the GLI1 promoter region is replaced with that of the ubiquitously expressed ACTB gene.

Literature Curation - Systematic screens

Horn S et al (2013). TERT Promoter Mutations in Familial and Sporadic Melanoma. Science 339:959

Roberts KG et al (2012). Genetic alterations activating kinase and cytokine receptor signaling in high-risk acute lymphoblastic leukemia. Cancer cell 22:153

Le Gallo M et al (2012). Exome sequencing of serous endometrial tumors identifies recurrent somatic mutations in chromatin-remodeling and ubiquitin ligase complex genes. Nature genetics 44:1310

Agrawal N et al (2012). Comparative genomic analysis of esophageal adenocarcinoma and squamous cell carcinoma. Cancer discovery 2:899

Kannan K et al (2012). Whole-exome sequencing identifies ATRX mutation as a key molecular determinant in lower-grade glioma. Oncotarget 3:1194

Lindberg J et al (2012). The Mitochondrial and Autosomal Mutation Landscapes of Prostate Cancer. Eur Urol. 63:702

Nichols AC et al (2012). A Pilot Study Comparing HPV-Positive and HPV-Negative Head and Neck Squamous Cell Carcinomas by Whole Exome Sequencing. ISRN Oncol. 2012:809370

Seshagiri S et al (2012). Recurrent R-spondin fusions in colon cancer. Nature 488:660

Seo JS et al (2012). The transcriptional landscape and mutational profile of lung adenocarcinoma. Genome research 22:2109

Liu J et al (2012). Genome and transcriptome sequencing of lung cancers reveal diverse mutational and splicing events. Genome research 22:2315

Piazza R et al (2012). Recurrent SETBP1 mutations in atypical chronic myeloid leukemia. Nature Genetics 45:18

Rossi D et al (2012). The coding genome of splenic marginal zone lymphoma: activation of NOTCH2 and other pathways regulating marginal zone development. The Journal of experimental medicine 209:1537


COSMIC v64 Total Statistics

Genes 24394
Samples 847698
Mutations 913166
Unique Variants 682654
Papers 16123
Fusions 8945
Genomic Rearrangements 7584
Whole Genomes 5023

Genomics of Drug Sensitivity in Cancer release 4 (March 2013)

This release features improvements increasing functionality of the GDSC website to facilitate analysis and interpretation of results.

Drug overview pages

These pages provide a visual summary of the screening results for each drug. Cell line IC50 values including confidence intervals are plotted as well as summary statistics for each drug. The overview page also contains separate plots for cell line IC90, IC75, IC25 and AUC (area under the curve) values for each drug.

IC50 scatter plots filtered by mutation and tissue type

The scatter plots of cell line IC50 values for significant drug-gene associations have been improved so that they can be filtered by mutation type (coding mutation, amplification or deletion) or by tissue type. A non-parametric test is performed for each resulting scatter plot to assess the significance of each association.
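
The release note does not say which non-parametric test is applied. Purely as an illustration, the sketch below uses a Mann-Whitney U test with a normal approximation (no tie or continuity correction) to compare IC50 values between mutated and wild-type cell lines. The choice of test is an assumption, the function name `mann_whitney_u` is hypothetical, and the IC50 values are invented for the example rather than taken from GDSC.

```python
import math


def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, two-sided p-value from a normal approximation)."""
    # U counts, over all cross-group pairs, how often a value from group_a
    # exceeds a value from group_b; ties contribute one half.
    u = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in group_a
        for y in group_b
    )
    n1, n2 = len(group_a), len(group_b)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_value


# Invented IC50 values (micromolar), for illustration only.
mutant_ic50 = [0.2, 0.5, 0.9]
wild_type_ic50 = [1.4, 2.2, 3.1, 4.0]

u_stat, p = mann_whitney_u(mutant_ic50, wild_type_ic50)
print(f"U = {u_stat}, p = {p:.4f}")
```

With groups this small the normal approximation is crude; an exact test would be preferable in practice.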

To reach the Genomics of Drug Sensitivity in Cancer Team (http://www.cancerRxgene.org), please contact us at cancerrxgene@sanger.ac.uk


30th Jan 2013 COSMIC v63 Release

COSMIC v63 includes full curation of cancer genes STAT3 and TNFRSF14, together with further FGFR and EWSR1 fusion gene pairs. Nine additional systematic screen papers from 2012 ensure our curation of cancer genome analysis remains very current.

Website update

The new COSMIC website (cancer.sanger.ac.uk) will replace the old one (www.sanger.ac.uk/cosmic) in March 2013, with our v64 release. Existing links and bookmarks will still work, but will redirect to the appropriate page on the new website. If you have any questions or comments about the new website, please contact us at cosmic@sanger.ac.uk.

CCDS gene upgrade

COSMIC has annotated cancer mutations across more than 24,000 genes over the last eleven years, and these have been collected from a variety of sources. In order to better standardise our gene information, we are now updating our gene sequences onto the better CCDS standard of human transcripts, where the sequences have been agreed by consensus between several genome annotation projects. In this release, we include the update of the first 19584 gene transcripts to CCDS standard.

A COSMIC job opportunity

We're planning an expansion of the COSMIC project to include more useful somatic datatypes and further analytical software/webpages. We're based in Cambridge, UK, and our bioinformatics development work is focused on the Perl programming language, making much use of relational databases (Oracle, PostgreSQL). If you have expertise in these areas and would enjoy working on this challenging project, please reply to our job advert below, or email cosmic@sanger.ac.uk for more details by 15th February 2013. https://jobs.sanger.ac.uk/wd/plsql/wd_portal.show_job?p_web_site_id=1764&p_web_page_id=161595

Literature curation: Point-mutated cancer genes

TNFRSF14

TNFRSF14 (tumour necrosis factor receptor superfamily member 14) maps to 1p36. LIGHT-mediated triggering of non-mutated TNFRSF14 renders B-cell lymphomas more immunogenic and more sensitive to FAS-induced apoptosis, and non-mutated TNFRSF14 can inhibit proliferation of adenocarcinoma cells, suggesting a tumour suppressor role. Somatic mutations have been found in follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBCL), the majority being nonsense and missense mutations, but also including frame-shift, splice site and insertion mutations distributed across the gene, consistent with loss of function of a tumour suppressor. Mutated DLBCL is associated with higher-risk clinical features and a worse response to rituximab. FL is often associated with del 1p36; individuals carrying a TNFRSF14 mutation have a worse prognosis than those carrying a 1p36 deletion alone, and patients with both alterations have the worst prognosis.

STAT3

STAT3 (signal transducer and activator of transcription 3 (acute-phase response factor); 17q21) is activated through phosphorylation and, following dimerization, acts as a transcription activator, playing a role in many cell processes including cell growth and apoptosis. It is persistently phosphorylated in cancer cell lines and primary tumours including several haematological malignancies and hepatocellular tumours. Heterozygous somatic mutations have been found particularly in large granular lymphocytic leukaemia (T-cell LGL and chronic lymphoproliferative disorders of NK cells) and inflammatory hepatocellular adenomas lacking IL-6 mutations. Mutations are often found in the SRC homology (SH2) domain, responsible for dimerization and activation of STAT3. STAT3 has been shown to be phosphorylated in patients with SH2 mutations, and the frequently reported Y640F and D661 substitution mutations have been shown to increase transcriptional activity of STAT3.

Literature Curation - Gene fusion mutations

EWSR1-ZNF444

(EWSR1-ZNF444_ENST00000337080)

Another novel fusion gene involving EWSR1 has been identified in a small subset of myoepithelial tumours of soft tissue. ZNF444 encodes a zinc finger protein which activates transcription of a scavenger receptor gene involved in the degradation of acetylated low density lipoprotein. The EWSR1 breakpoint in this fusion is in a position frequently found in other EWSR1 fusions.

FGFR3-BAIAP2L1, FGFR3-TACC3 and FGFR1-TACC1

(FGFR3-BAIAP2L1, FGFR3-TACC3, FGFR1_ENST00000447712-TACC1)

FGFR3 fusions involving 2 different partners, which generate constitutively activated fusion proteins, have been identified in urothelial carcinoma. The FGFR3 component of the fusion is the same in all cases; the tyrosine kinase coding domains are retained but the final exon that includes the PLCgamma1 binding site is lost. The TACC3 fusion component retains the transforming acidic coiled-coil domain that mediates microtubule binding and the BAIAP2L1 component retains the IRSp53/MIM domain that mediates actin binding and Rac interaction. Recurrent FGFR-TACC fusions have also been found in a small subset of glioblastoma multiforme.

Literature Curation - Systematic screens

Hodis et al (2012). A landscape of driver mutations in melanoma. Cell 150:251

Imielinski et al (2012). Mapping the hallmarks of lung adenocarcinoma with massively parallel sequencing. Cell 150:1107

Castellarin et al (2012). Clonal evolution of high-grade serous ovarian carcinoma from primary to recurrent disease. J Pathol (epub)

Biankin et al (2012). Pancreatic cancer genomes reveal aberrations in axon guidance pathway genes. Nature 491:399

Love et al (2012). The genetic landscape of mutations in Burkitt lymphoma. Nat Genet 44:1321

Nikolaev et al (2012). A Single-Nucleotide Substitution Mutator Phenotype Revealed by Exome Sequencing of Human Colon Adenomas. Cancer Res 72:6279

Wang et al (2011). Whole-exome sequencing of human pancreatic cancers and characterization of genomic instability caused by MLH1 haploinsufficiency and complete deficiency. Genome Res 22:208

Gerlinger et al (2012). Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. NEJM 366:883

Dolnik et al (2012). Commonly altered genomic regions in acute myeloid leukemia are enriched for somatic mutations involved in chromatin remodeling and splicing. Blood 120:e83

Curation of the ICGC v11 data portal:

Malignant Lymphoma

COSMIC v63 Total Statistics

Genes 24517
Samples 821455
Mutations 834571
Unique Variants 620857
Papers 15613
Fusions 8860
Genomic Rearrangements 7584
Whole Genomes 4677


29th Nov 2012 COSMIC v62 Release

COSMIC v62 includes full curation of genes H3F3A, BCOR and HIST1H3B, together with RSPO2/3 fusions in colon cancer and NTRK1 fusions in thyroid cancer. In addition, 1068 whole-genome screens are included from recent TCGA releases, with many more from 13 newly curated systematic screen publications.

Website update

The new COSMIC website (cancer.sanger.ac.uk) will replace the old one (www.sanger.ac.uk/cosmic) in March 2013, with our v64 release. Existing links and bookmarks will still work, but will redirect to the appropriate page on the new website. If you have any questions or comments about the new website, please contact us at cosmic@sanger.ac.uk.

Literature curation: Point-mutated cancer genes

H3F3A and HIST1H3B

Mutations in H3F3A (encoding histone H3.3) or in the related HIST1H3B (encoding H3.1) have been identified as molecular drivers in diffuse intrinsic pontine glioma, and paediatric and young adult glioblastoma. Mutations consistently occur at 2 key regulatory sites within the highly conserved N-terminal histone tail which influences the dynamic regulation of chromatin structure and accessibility. These hotspot mutations appear linked to tumour location.

BCOR

BCOR, a gene encoding a transcriptional corepressor, has been identified as a tumour suppressor gene in acute myeloid leukaemia. Somatic mutations are more frequent in patients with normal karyotype compared to those with abnormal cytogenetics.

Literature Curation - Gene fusion mutations

EIF3E-RSPO2 and PTPRK-RSPO3

The R-spondin family members RSPO2 and RSPO3 have been identified in recurrent fusions in microsatellite-stable colon adenocarcinoma at a frequency of 10%. R-spondins encode secreted proteins that can potentiate canonical WNT signalling. In the EIF3E-RSPO2 fusion, EIF3E exon 1 fuses to RSPO2 to produce a functional RSPO2 protein driven by the EIF3E promoter. In the most commonly identified PTPRK-RSPO3 fusion, PTPRK exon 1 fuses to RSPO3 exon 2, preserving the coding sequence of RSPO3 and replacing its secretion signal sequence with that of PTPRK.

TPM3-NTRK1_ENST00000392302, TPR_ENST00000367478-NTRK1_ENST00000392302 and TFG-NTRK1_ENST00000392302

TRK oncogenes, fusions involving NTRK1, are found in a subset of papillary thyroid carcinomas. NTRK1 encodes a cell-surface transmembrane tyrosine kinase protein acting as receptor for nerve growth factor. In TRKs the 3' terminal sequence of the tyrosine kinase domain of NTRK1 fuses with the 5' terminal sequence of one of 3 activating genes, TPM3, TPR or TFG, all of which contain coiled-coil domains that mediate protein dimerization and consequent tyrosine kinase activation.

Literature Curation - Systematic screens

Huang et al (2012). Exome sequencing of hepatitis B virus-associated hepatocellular carcinoma. Nature Genetics 44:1117-1121

Xu et al (2012). Single-cell exome sequencing reveals single-nucleotide mutation characteristics of a kidney tumor. Cell 148:886-895

Greif et al (2012). GATA2 zinc finger 1 mutations associated with biallelic CEBPA mutations define a unique genetic entity of acute myeloid leukemia. Blood 120:395-403

Dahlman et al (2012). BRAF L597 mutations in melanoma are associated with sensitivity to MEK inhibitors. Cancer Discovery 2:791-797

Northcott et al (2012). Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 488:49-56

Wang et al (2012). Mutations in isocitrate dehydrogenase 1 and 2 occur frequently in intrahepatic cholangiocarcinomas and share hypermethylation targets with glioblastomas. Oncogene (epub)

Walker et al (2012). Intraclonal heterogeneity and distinct molecular mechanisms characterize the development of t(4;14) and t(11;14) myeloma. Blood 120:1077-1086

Koo et al (2012). Janus Kinase 3-Activating Mutations Identified in Natural Killer/T-cell Lymphoma. Cancer Discovery 2:591-597

Peifer et al (2012). Integrative genome analyses identify key somatic driver mutations of small-cell lung cancer. Nature genetics 44:1104-1110

Kalender Atak et al (2012). High accuracy mutation detection in leukemia on a selected panel of cancer genes. PLoS One 7:e38463

Kuhn et al (2012). Identification of Molecular Pathway Aberrations in Uterine Serous Carcinoma by Genome-wide Analyses. J Natl Cancer Inst 104:1503-1513

Rudin et al (2012). Comprehensive genomic analysis identifies SOX2 as a frequently amplified gene in small-cell lung cancer. Nature Genetics 44:1111-1116

Barber et al (2011). Comprehensive Genomic Analysis of a BRCA2 Deficient Human Pancreatic Cancer. PLoS One 6:e21639

Curation of TCGA genome screens from the ICGC v10 data portal:

Bladder Urothelial Carcinoma (TCGA, US).

Breast Invasive Carcinoma (TCGA, US).

Cervical Squamous Cell Carcinoma (TCGA, US).

Kidney Renal Clear Cell Carcinoma (TCGA, US).

Lung Adenocarcinoma (TCGA, US).

Lung Squamous Cell Carcinoma (TCGA, US).

Uterine Corpus Endometrioid Carcinoma (TCGA, US).

Prostate Adenocarcinoma (TCGA, US).

COSMIC v62 Total Statistics

Genes 24691
Samples 803415
Mutations 745924
Unique Variants 540795
Papers 15365
Fusions 8789
Genomic Rearrangements 7584
Whole Genomes 4172

Genomics of Drug Sensitivity in Cancer release 3

New data

This release sees the addition of 4,901 new IC50 values including data for 4 new anti-cancer drugs as well as new data for previously released compounds.

Number of new drugs: 4
Total number of drugs: 142

Number of new IC50 values: 4,901
Total number of IC50 values: 78,070

The therapeutic target(s) of drugs in this release are:

PI3Kbeta (TXG221)
IGF1R (GSK-1904529A)
HDAC (LAQ824)
PDPK1 (OSU-03012)

Enhanced cell line IC50 scatter plots

Cell lines are now colour coded according to whether they have a coding mutation, amplification or homozygous deletion in a given cancer gene. This makes it simple to determine what types of mutations occur in a specific cancer gene and whether mutation type influences drug response.

To reach the Genomics of Drug Sensitivity in Cancer Team (http://www.cancerRxgene.org), please contact us at cancerrxgene@sanger.ac.uk

Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Cosmic Whole Genomes, all tumours with genome-wide somatic annotations

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content




26th Sep 2012 COSMIC v61 Release

COSMIC v61 Release

COSMIC v61 focuses on whole genome screen publications with information from 17 major new reports, including the new TCGA Colon & Rectal cancer studies. In addition, the full literature on point mutations in PHF6 has been curated, along with 4 new gene fusion pairs.

Literature curation:

Curated cancer genes

PHF6, encoding a plant homeodomain (PHD) factor containing 4 nuclear localization signals and 2 PHD-type finger domains, and with a proposed role in transcriptional regulation, has been identified as an X-linked tumour suppressor gene in T-cell acute lymphoblastic leukaemia and acute myeloid leukaemia. Mutations are evenly distributed throughout the gene with recurrent missense mutations in the second zinc finger domain. Mutation prevalence is greater in male than in female patients.

Newly curated gene fusions

FN1-ALK

A novel ALK fusion involving FN1 which encodes fibronectin, a ubiquitous component of extracellular matrix and plasma, has been found in ovarian malignant stromal sarcoma. The resultant fusion protein contains the amino-terminal 1201 amino acids of FN1 and the carboxyl-terminal 598 amino acids of ALK which include the transmembrane and cytoplasmic regions.

KLC1-ALK

An additional ALK fusion partner has been identified in lung carcinoma. KLC1, encoding a member of the kinesin light chain family, fuses to the canonical ALK exon 20 recombination site in bronchioloalveolar carcinoma.

FAM131B-BRAF

A recurrent oncogenic BRAF fusion involving FAM131B, a currently uncharacterized gene on chromosome 7q34, has been shown to be an alternative mechanism of MAPK pathway activation in pilocytic astrocytoma. In common with other BRAF and RAF1 fusions, the FAM131B-BRAF fusion product lacks the RAF auto-inhibitory domain. Of note is the small number of FAM131B exons, consisting mostly of 5' UTR, included in the fusion.

UBE2L3-KRAS

An oncogenic KRAS fusion has been identified in a metastatic prostate cancer cell line. UBE2L3-KRAS encodes a protein encompassing most of the UBE2L3 protein, a member of the E2 ubiquitin-conjugating enzyme family, and full length KRAS.

Systematic screen curation

Focus on recent high-impact systematic screens:

Guichard et al (2012). Integrated analysis of somatic mutations and focal copy-number changes identifies key genes and pathways in hepatocellular carcinoma. Nat Genet. 44:694-8.

Jones et al (2012). Low-grade serous carcinomas of the ovary contain very few point mutations. J Pathol. 226:413-20.

Gui et al (2011). Frequent mutations of chromatin remodeling genes in transitional cell carcinoma of the bladder. Nat Genet. 43:875-8.

Galante et al (2011). Distinct patterns of somatic alterations in a lymphoblastoid and a tumor genome derived from the same individual. Nucleic Acids Res. 39:6056-68.

Robinson et al (2012). Novel mutations target distinct subgroups of medulloblastoma. Nature. 488:43-8.

The Cancer Genome Atlas Network (2012). Comprehensive molecular characterization of human colon and rectal cancer. Nature. 487:330-7.

Pugh et al (2012). Medulloblastoma exome sequencing uncovers subtype-specific somatic mutations. Nature. 488:106-10.

Zhang et al (2011). Key pathways are frequently mutated in high-risk childhood acute lymphoblastic leukemia: a report from the Children's Oncology Group. Blood. 118:3080-7.

Yip et al (2012). Concurrent CIC mutations, IDH mutations, and 1p/19q loss distinguish oligodendrogliomas from other cancers. J Pathol. 226:7-16.

Lee et al (2012). A remarkably simple genome underlies highly malignant pediatric rhabdoid cancers. J Clin Invest. 122:2983-8.

Bass et al (2011). Genomic sequencing of colorectal adenocarcinomas identifies a recurrent VTI1A-TCF7L2 fusion. Nat Genet. 43:964-8.

Zhang et al (2012). The genetic basis of early T-cell precursor acute lymphoblastic leukaemia. Nature. 481:157-63.

Jones et al (2012). Dissecting the genomic complexity underlying medulloblastoma. Nature. 488:100-5.

Fujimoto et al (2012). Whole-genome sequencing of liver cancers identifies etiological influences on mutation patterns and recurrent mutations in chromatin regulators. Nat Genet. 44:760-4.

Peña-Llopis et al (2012). BAP1 loss defines a new class of renal cell carcinoma. Nat Genet. 44:751-9.

Ong et al (2012). Exome sequencing of liver fluke-associated cholangiocarcinoma. Nat Genet. 44:690-3.

Duns et al (2012). Targeted exome sequencing in clear cell renal cell carcinoma tumors suggests aberrant chromatin regulation as a crucial step in ccRCC development. Hum Mutat. 33:1059-62.

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, ATRX, AXIN1, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NFE2L2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PBRM1, PDGFRA, PHF6, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, SRC, SRSF2, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, U2AF1, VHL, WT1, ZRSR2

COSMIC v61 Total Statistics

Genes 22170
Samples 773098
Mutations 405271
Unique Variants 224649
Papers 14819
Fusions 8931
Genomic Rearrangements 7503
Whole Genomes 2556





19th Jul 2012 COSMIC v60; Drug Sensitivity v2 Release

COSMIC v60; Drug Sensitivity v2 Release

The 60th release of COSMIC includes the first full version of our new website, 9 new systematic screens and significant updates to the mutation spectra of known cancer genes.

Full release of new COSMIC website.

Following the recent successful trial of our new website, we now present its first full release. As before, there are three parallel web systems. The main COSMIC website provides access to the full COSMIC database, all somatic mutation data collected over the last eight years. The Cell Lines Project site focuses on the analysis of a set of 770 commonly used cell lines, unchanged from the previous release. Finally, the previous CGP studies website has been retired in favour of a new Genomes site. This exclusively presents data from genome-wide screens, including full genome resequencing, exome resequencing and low-coverage rearrangement screens (currently comprising a large majority of exome screens). Please send us any comments about your experiences with the new system (cosmic@sanger.ac.uk), which will help us make it as good as possible.

Literature curation:

Curated cancer genes

For this release and the next, we are focusing on updating existing curated genes to bring these up-to-date with recent publications.

Systematic screen curation

Focus on recent high-impact systematic screens:

Berger et al (2012). Melanoma genome sequencing reveals frequent PREX2 mutations. Nature 485:502-6.

Molenaar et al (2012). Sequencing of neuroblastoma identifies chromothripsis and defects in neuritogenesis genes. Nature 483:589-93.

Grasso et al (2012). The mutational landscape of lethal castration-resistant prostate cancer. Nature 487:239-43.

Barbieri et al (2012). Exome sequencing identifies recurrent SPOP, FOXA1 and MED12 mutations in prostate cancer. Nat Genet. 44:685-9.

Berger et al (2011). The genomic complexity of primary human prostate cancer. Nature 470:214-20.

Morin et al (2011). Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma. Nature 476:298-303.

Nikolaev et al (2011). Exome sequencing identifies recurrent somatic MAP2K1 and MAP2K2 mutations in melanoma. Nat Genet. 44:133-9.

Wu et al (2011). Whole-exome sequencing of neoplastic cysts of the pancreas reveals recurrent mutations in components of ubiquitin-dependent pathways. Proc Natl Acad Sci U S A. 108:21188-93.

Guo et al (2011). Frequent mutations of genes encoding ubiquitin-mediated proteolysis pathway components in clear cell renal cell carcinoma. Nat Genet. 44:17-9.

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, ATRX, AXIN1, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NFE2L2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PBRM1, PDGFRA, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, U2AF1, VHL, WT1

COSMIC v60 Total Statistics

Genes 21850
Samples 743601
Mutations 340585
Unique Variants 170263
Papers 14310
Fusions 8004
Genomic Rearrangements 5494
Whole Genomes 1894

Genomics of Drug Sensitivity in Cancer v2 release

New data
This release sees the addition of 25,421 new IC50 values including data for 9 new anti-cancer drugs as well as new data for previously released compounds.

Number of new drugs: 9
Total number of drugs: 138

Number of new IC50 values: 25,421
Total number of IC50 values: 73,169

Number of new cell lines: 76
Total number of cell lines: 714

Drug sensitivity predictions with elastic net modeling.

We have enhanced the analysis of drug sensitivity data by including elastic net (EN) modeling. This approach is able to scan across the genome to identify genomic, transcriptomic and tissue-type features associated with drug sensitivity or resistance. EN modeling results are presented as heatmaps and the results of this analysis are freely downloadable from the website.

EN modeling is available in addition to the multivariate ANOVA of drug sensitivity.
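
To illustrate the idea behind elastic net modeling, here is a minimal Python sketch that fits elastic net weights by cyclic coordinate descent on a toy dataset. It is not the pipeline used for this release: the feature names (`GENE_A_mut`, `GENE_B_mut`, `tissue_X`), the log IC50 values and the penalty settings are all invented for illustration. The penalty combines an L1 term, which drives weak associations to exactly zero, with an L2 term, which stabilises the remaining weights.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def centre(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def elastic_net(X, y, alpha=0.2, l1_ratio=0.5, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*(l1_ratio*|w|_1 + 0.5*(1 - l1_ratio)*|w|_2^2).

    X (rows = cell lines, columns = features) and y are assumed to be centred.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, and w starts at zero
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            # Covariance of feature j with the residual, adding back j's own contribution.
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, residual)) / n
            z = sum(c * c for c in col) / n
            new_w = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1 - l1_ratio))
            delta = new_w - w[j]
            if delta:
                residual = [r - delta * c for r, c in zip(residual, col)]
            w[j] = new_w
    return w


# Hypothetical toy data: 8 cell lines, 3 binary features, invented log IC50 values.
features = ["GENE_A_mut", "GENE_B_mut", "tissue_X"]
X_raw = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [1, 1, 1],
         [0, 0, 1], [0, 1, 0], [0, 0, 0], [0, 1, 1]]
y_raw = [-2.0, -1.8, -2.1, -1.9, 0.0, 0.2, -0.1, 0.1]

columns = [centre([row[j] for row in X_raw]) for j in range(len(features))]
X = [list(row) for row in zip(*columns)]
y = centre(y_raw)

weights = elastic_net(X, y)
for name, weight in zip(features, weights):
    print(f"{name}: {weight:+.3f}")
```

In this toy example only `GENE_A_mut` keeps a non-zero (negative) weight, meaning that cell lines carrying that hypothetical mutation have lower IC50 values, while the two weakly associated features are shrunk to zero.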

Scatter plots of IC50 values

This new functionality allows users to generate scatter plots of cell line IC50 values for significant drug-gene associations. Users are able to select which data to plot depending on their drug or gene of interest. Scatter plot images are downloadable and cell line IC50 values are directly linked to the COSMIC database facilitating integration of drug sensitivity data with detailed cell line information.





29th Jun 2012 New COSMIC website

New COSMIC website

A new, improved COSMIC website (http://cancer.sanger.ac.uk/)

We are pleased to announce the development of a new website for the COSMIC database, designed to improve the exploration of this increasingly complex data. This more modern system will additionally form a better platform to extend COSMIC with new types of data and additional analysis functions. This new website is now available to the scientific community, presenting the current v59 COSMIC release. We invite all comments and feedback by email: cosmic@sanger.ac.uk.


(Source: http://cancer.sanger.ac.uk/)

Entry into the system has been kept as simple as possible, focusing on a single search box, which is much more helpful in finding the correct information. Behind this, the 'By Gene' and 'By Tissue' options have also been enhanced. The tissue browser has been significantly redesigned, showing all available site/histology options with counts of mutated samples; once a phenotype has been selected, press 'Go' and substantial details will immediately appear beneath, with links deeper into the new website. The gene browser requests a search string, as little as one letter, and will search all gene names and synonyms, then present a list of available gene options.
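
The gene browser behaviour described above can be sketched in a few lines of Python. This is only an illustration: the small synonym table, the function name `search_genes` and the choice of case-insensitive substring matching are assumptions, not the website's actual implementation or data model.

```python
# Illustrative table of gene names and a few of their synonyms (not the COSMIC schema).
GENE_SYNONYMS = {
    "KRAS": ["KRAS2"],
    "ERBB2": ["HER2", "NEU"],
    "CDKN2A": ["P16", "INK4A"],
}


def search_genes(query, table=GENE_SYNONYMS):
    """Return gene names whose name or any synonym contains the search string.

    Matching is case-insensitive, and a query of a single letter is accepted.
    """
    needle = query.strip().upper()
    if not needle:
        return []
    hits = []
    for name, synonyms in table.items():
        if any(needle in candidate.upper() for candidate in [name, *synonyms]):
            hits.append(name)
    return sorted(hits)


print(search_genes("her"))
print(search_genes("k"))
```

Searching on a synonym such as "her" returns the primary gene name (ERBB2), which is the list of options a user would then choose from.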

The website content has been broadly reorganised in line with the current Sanger style. This allows much information to be presented on one webpage, organised into separate tabs along the top of the main panel, rather than as a series of boxes on a long deep web page. This should reduce the scrolling needed to navigate each page and presents the information in more logical tab-formatted units. For instance, on the Gene Overview page:


(Source: http://cancer.sanger.ac.uk/cosmic/gene/overview?ln=KRAS)

Amongst numerous improvements, particular new tools include a better zoomable mutations histogram and searchable, exportable data tables. The new mutations histogram is zoomable in a click-and-drag style (instead of +/- 5aa as before). Additional filter options are also available, and these are more noticeably presented; all tabs on the Gene Analysis webpage reflect the filters selected.


(Source: http://cancer.sanger.ac.uk/cosmic/gene/analysis?ln=KRAS#histo)

All tables of data in COSMIC are now presented in a new style which allows sorting by selected columns, searching of the table contents, and exporting of this information, as displayed on the screen.


(Source: http://cancer.sanger.ac.uk/cosmic/gene/analysis?ln=KRAS#ts)
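
The page addresses quoted above follow a regular pattern: a page name (`overview` or `analysis`), the gene passed as the `ln` query parameter, and an optional tab anchor such as `histo` or `ts`. A minimal Python sketch for building such links is shown below; the function name `cosmic_gene_url` is hypothetical, and the pattern is inferred only from the addresses quoted in this announcement, so it may not hold for later versions of the website.

```python
from urllib.parse import urlencode, urlunsplit


def cosmic_gene_url(gene, page="overview", tab=""):
    """Build a gene page address in the form quoted in this announcement.

    page: "overview" or "analysis"; tab: optional anchor such as "histo" or "ts".
    """
    query = urlencode({"ln": gene})
    return urlunsplit(("http", "cancer.sanger.ac.uk", f"/cosmic/gene/{page}", query, tab))


print(cosmic_gene_url("KRAS"))
print(cosmic_gene_url("KRAS", "analysis", "histo"))
print(cosmic_gene_url("KRAS", "analysis", "ts"))
```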

This new system will form a major release in the near future, and the two web systems will run in parallel from the same release database until early 2013, allowing everyone time to adjust to the new system. We invite you to send us any comments on the new website, including what you like or dislike, or any difficulties you experience so that we can make this system the best possible for all of our users. Please contact us at: cosmic@sanger.ac.uk

Good luck with your research,

The COSMIC Team

(http://cancer.sanger.ac.uk/)

cosmic@sanger.ac.uk




23rd May 2012 COSMIC v59 Release

COSMIC v59 Release

This latest release of COSMIC includes full mutation spectra across three new cancer genes and 9 new gene fusions. Also included are updates from the latest TCGA and ICGC releases, together with 4 recent whole-genome publications.

Genes with fully curated mutation spectra.

SRSF2, U2AF1, ZRSR2
In addition to SF3B1, somatic mutations have been identified in other genes coding for components of the splicing machinery responsible for the processing of pre-mRNA to mature RNA. SRSF2, U2AF1 and ZRSR2 are mutated in haematopoietic neoplasms where there's further evidence for an association between clinical phenotype and the different splice gene mutations. Mutated SRSF2 is more frequently associated with leukaemic transformation and has a negative prognostic impact. For SRSF2 the recurrent mutations cluster at P95, while for U2AF1 the mutations cluster at 2 hotspots at S34 and Q157. ZRSR2 mutations are least frequent.

Curated fusion gene pairs:

YWHAE-FAM22A, YWHAE-FAM22B
Recurrent fusions of YWHAE and one of two FAM22 family members (FAM22A, FAM22B) have been identified as specific to high-grade endometrial stromal sarcoma (ESS), a clinically aggressive form of uterine sarcoma and distinct from the classic JAZF1-rearranged ESS. YWHAE encodes a member of the 14-3-3 protein family which plays a role in various signal transduction pathways and YWHAE-FAM22 fusions maintain a structurally and functionally intact YWHAE protein-binding domain. Identical fusions involving YWHAE-FAM22 have also been identified in clear cell sarcoma of kidney.

KIF5B-RET and CCDC6-RET
Fusions involving the kinesin family 5B gene and RET have been found recurrently in lung adenocarcinomas. Several transcript variants have been identified in which the KIF5B portion differs, but all retain the KIF5B coiled-coil domain that mediates homodimerization and all retain the full RET kinase domain. An additional RET fusion, involving CCDC6, has also been identified in lung adenocarcinoma.

EZR-ROS1, LRIG3-ROS1, SDC4-ROS1 and TPM3-ROS1
Four new fusion partners for ROS1, a receptor tyrosine kinase, have been found in non-small cell lung cancer, specifically adenocarcinoma. All of the breakpoints in ROS1 enable the resulting fusion to retain the ROS1 kinase domain and in the SDC4-ROS1 fusion, with a breakpoint at exon 32, the ROS1 transmembrane domain is also retained.

C2orf44-ALK
A novel ALK fusion involving C2orf44 has been identified in colorectal cancer. C2orf44, containing a coiled-coil domain, fuses to the canonical ALK exon 20 recombination site.

Whole genome resequencing data:

TCGA
The TCGA recently released large mutation datasets on three cancer types, Rectal, Colorectal and AML. This data has been curated from the TCGA data portal.

ICGC
The March release of the ICGC DCC (version 8) included four major new datasets, on liver, paediatric brain and two independent sets of pancreatic cancers (QCMG, Aus) and (OICR, Canada). This information has been curated from the ICGC data release portal.

CGP
The complete genome sequences of 21 breast tumours have allowed the elucidation of the mutational process moulding these tumours. This data has been curated from pre-publication datasets here.

Shah, et al. (2012). The clonal and mutational evolution spectrum of primary triple-negative breast cancers. Nature (epub, pre-publication).

Focus on Blood tumours

Ding, et al. (2012). Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature 481:506-10.

Graubert, et al. (2012). Recurrent mutations in the U2AF1 splicing factor in myelodysplastic syndromes. Nature Genetics 44:53-7.

Yoshida, et al. (2011). Frequent pathway mutations of splicing machinery in myelodysplasia. Nature 478:64-9.

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, ATRX, AXIN1, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, SRC, SRSF2, STK11, TET2, TNFAIP3, TP53, TSHR, U2AF1, VHL, WT1, ZRSR2

COSMIC v59 Total Statistics

Experiments 4799855
Tumours 713382
Samples 716988
Mutant Samples 282779
Mutations 298517
Unique Mutations 140641
Papers curated 13728
Genes 21425
Fusions 7732
Structural Variants 2882





29th Mar 2012 Genomics of Drug Sensitivity in Cancer v1 release

Genomics of Drug Sensitivity in Cancer v1 release

We have launched a new website to present genomic markers of sensitivity to anti-cancer compounds screened across our >1000 cancer cell line resource.

New website

The website (www.cancerRxgene.org) has enhanced user interfaces making it simple to access our data and analyses. All of our genetic and drug sensitivity data, including results from future screening, are freely available and can be downloaded. The website will be regularly updated as new data becomes available.

Drug sensitivity data

We have released sensitivity data for 130 anti-cancer drugs screened across a large subset of our cell line resource. Drug sensitivity data for more than 600 cell lines have been correlated with extensive genetic data to identify genomic events associated with sensitivity and resistance. This is the largest publicly available dataset of its type, representing >48,000 drug-cell line combinations, and provides a comprehensive view of the genomics of drug sensitivity in cancer.

Cancer cell line resource

Our >1000 cell line resource represents the spectrum of common and rare types of adult and childhood cancers of epithelial, mesenchymal and haematopoietic origin. Cell lines have been subjected to sequencing of the full coding exons of 64 commonly mutated cancer genes, genome-wide analysis of copy number gain and loss using Affymetrix SNP6.0 microarrays, and expression profiling of 14,500 genes using Affymetrix HT-U133A microarrays. The presence of seven commonly rearranged cancer genes and of microsatellite instability (MSI) has also been investigated. Genetic data for the cell lines are available through our website and the Cancer Genome Project webpages. The cell lines have been submitted for whole-exome sequencing and this data will soon be available.

Integration with COSMIC

Our analyses have been integrated with the Catalogue of Somatic Mutations in Cancer (COSMIC) database providing a comprehensive resource linking somatic mutations and other information related to cancer with drug sensitivity information.

Data for the following drugs are included in this release:

681640, 17-AAG, A-443654, A-769662, A-770041, ABT-263, ABT-888, AICAR, AKT inhibitor VIII, AMG-706, AP-24534, AS601245, ATRA, AUY922, Axitinib, AZ628, AZD-0530, AZD-2281, AZD-6244, AZD-6482, AZD-7762, AZD-8055, BAY613606, Bexarotene, BI-2536, BI-D1870, BIBW2992, Bicalutamide, BIRB 0796, Bleomycin, BMS-509744, BMS-536924, BMS-754807, Bortezomib, Bosutinib, Bryostatin 1, BX-795, Camptothecin, CEP-701, CGP-082996, CGP-60474, CHIR-99021, CI-1040, Cisplatin, CMK, Cyclopamine, Cytarabine, Dasatinib, DMOG, Docetaxel, Doxorubicin, Elesclomol, Embelin, Epothilone B, Erlotinib, Etoposide, FH535, FTI-277, GDC-0449, GDC0941, Gefitinib, Gemcitabine, GNF-2, GSK-650394, GSK269962A, GW 441756, GW843682X, Imatinib, IPA-3, JNK Inhibitor VIII, JNK-9L, JW-7-52-1, KIN001-135, KU-55933, Lapatinib, Lenalidomide, LFM-A13, Metformin, Methotrexate, MG-132, Midostaurin, Mitomycin C, MK-2206, MS-275, Nilotinib, NSC-87877, NU-7441, Nutlin-3a, NVP-BEZ235, NVP-TAE684, Obatoclax Mesylate, OSI-906, PAC-1, Paclitaxel, Parthenolide, Pazopanib, PD-0325901, PD-0332991, PD-173074, PF-02341066, PF-562271, PHA-665752, PLX4720, Pyrimethamine, QS11, Rapamycin, RDEA119, RO-3306, Roscovitine, S-Trityl-L-cysteine, Salubrinal, SB216763, SB590885, Shikonin, SL 0101-1, Sorafenib, Sunitinib, Temsirolimus, Thapsigargin, Tipifarnib, Vinblastine, Vinorelbine, Vorinostat, VX-680, VX-702, WH-4-023, WZ-1-84, XMD8-85, Z-LLNle-CHO, ZM-447439





15th Mar 2012 COSMIC v58 Release

COSMIC v58 Release

Five new fusion gene pairs and one new cancer gene are curated in this 58th release of COSMIC, together with 6 recent genome-wide mutation screens.

Cancer Gene Census Update

Our curations have highlighted 13 new cancer genes (KIF5B, C2orf44, CCDC6, SDC4, SLC34A2, EZR, LRIG3, H3F3A, DNM2, ECT2L, PHF6, WWTR1 and CAMTA1), which have been added to the cancer gene census, bringing the total number of genes implicated in cancer to 487.

New curated genes:

AXIN1 functions as a scaffold protein regulating a variety of signalling pathways and biological functions. It is a key component of Wnt signalling binding to several of its members including APC, GSK3 and β-catenin. Mutations in the AXIN1 gene, a tumour suppressor, lead to stabilization of β-catenin and activation of target genes. AXIN1 mutations have been found in several cancers including colorectal, endometrial, prostate and hepatocellular as well as in hepatoblastoma and sporadic medulloblastoma. Mutations are found throughout the whole gene including the APC, GSK3 and β-catenin binding domains. Biochemical and functional studies have shown that these mutations interfere with AXIN1 binding to GSK3 and interaction with two upstream activators Frat1 and DVL.

New Curated Gene Fusions:

HEY1-NCOA2, NUP107-LGR5
The novel, recurrent fusion HEY1-NCOA2 has been identified in mesenchymal chondrosarcoma as a potential molecular diagnostic marker. HEY1 is a downstream effector of Notch signalling and NCOA2 is a member of the p160 nuclear hormone receptor transcriptional co-activation family. In the HEY1-NCOA2 fusion, the N-terminal basic helix-loop-helix DNA-binding/protein dimerization domain from HEY1 is retained while the C-terminal portion is replaced by the NCOA2 transcriptional activation domains AD1/CID and AD2. NUP107-LGR5 has also been identified, in dedifferentiated liposarcoma, as a novel but not recurrent translocation event.

SLC34A2-ROS1 and CD74-ROS1
Two novel translocations have been identified in non-small cell carcinoma of the lung. Fusion of ROS1, a receptor tyrosine kinase of the insulin receptor family, to the transmembrane solute carrier protein SLC34A2, where the N-terminal region of the latter is fused to the transmembrane region of ROS1, results in a protein with 2 transmembrane domains. Fusion of ROS1 with CD74, a type II transmembrane protein with high affinity for the MIF immune cytokine, similarly results in a fusion protein with 2 transmembrane domains. Another ROS1 fusion, GOPC-ROS1, previously found in a glioblastoma cell line, has also been identified as a recurrent fusion in cholangiocarcinoma.

PAX8-PPARG
A subset of thyroid follicular carcinomas, and some follicular adenomas, carry the PAX8-PPARG fusion. The DNA binding domains of PAX8, a thyroid cell transcription factor essential for the differentiation of follicular cells and the regulation of thyroid-specific genes, are fused to domains A-F of the peroxisome proliferator-activated receptor PPARG. Several splice variants have been identified in affected tumours, with multiple transcripts expressed in some.

Systematic screen curations:

Yan XJ, et al. (2011). Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia. Nat Genet. 43:309-15.

Pasqualucci L, et al. (2011). Analysis of the coding genome of diffuse large B-cell lymphoma. Nat Genet. 43:830-7.

Quesada V, et al. (2012). Exome sequencing identifies recurrent mutations of the splicing factor SF3B1 gene in chronic lymphocytic leukemia. Nat Genet. 44:47-52.

Jiao X, et al. (2012). Somatic mutations in the NOTCH, NF-KB, PIK3CA, and hedgehog pathways in human breast cancers. Genes Chromosomes Cancer.

The mutation data from exome sequencing of 10 gastric tumours has been available in ICGC release 7 and is now incorporated into this release of COSMIC, here.

Stephens P and Tarpey P, et al. (2012). 100 breast exomes have been sequenced by the CGP and are presented here as a pre-publication release. Nature (In Press).

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, AXIN1, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MYD88, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCB1, SRC, STK11, TET2, TNFAIP3, TP53, TSHR, VHL, WT1

COSMIC v58 Total Statistics

Experiments 4720191
Tumours 698444
Samples 701963
Mutant Samples 233349
Mutations 242608
Unique Mutations 89393
Papers curated 13457
Genes 20948
Fusions 7633
Structural Variants 2752





1st Feb 2012 Census update

Census update

Four genes have been newly implicated in cancer: YWHAE, FAM22A and FAM22B translocations are associated with endometrial stromal sarcoma, whilst BCOR mutations are associated with retinoblastoma, AML (acute myeloid leukaemia) and APL (acute promyelocytic leukaemia). The census has been updated to reflect these new findings.




18th Jan 2012 COSMIC v57 Release

COSMIC v57 Release

Twelve new fusion gene pairs and one new cancer gene are curated in this 57th release of COSMIC, together with 8 recent genome-wide mutation screens.

Cancer Gene Census Update

Our curations have highlighted 6 new cancer genes (FAM46C, FBXO11, HEY1, SRSF2, U2AF1 and ZRSR2), which have been added to the cancer gene census, bringing the total number of genes implicated in cancer to 473.

New curated genes:

NFE2L2 encodes a transcription factor that induces expression of cytoprotective proteins upon oxidative stress. Oncogenic NFE2L2 mutations have been identified in lung, head-neck, oesophageal and skin cancers where they occur within or near 2 amino terminal motifs, DLG and ETGE, both of which are important in the interaction of NFE2L2 with KEAP1, an E3 ubiquitin ligase involved in the negative regulation of NFE2L2 expression.

New Curated Gene Fusions:

EWSR1-NFATc2, EWSR1-SMARCA5
NFATc2, a transcription factor which is not a member of the ETS family and is involved in T-cell differentiation and immune response, is a recurrent novel 3' partner to EWSR1 in a histological variant of Ewing sarcoma. The resultant EWSR1-NFATc2 fusion protein lacks the COOH-terminal RNA binding domain of EWSR1, similar to other Ewing sarcoma-specific translocations, and the NH2-terminal transactivation domain and regulatory domain of NFATc2. Another novel 3' partner to EWSR1 in Ewing sarcoma is SMARCA5, a member of the WSTF-SNF2h chromatin-remodelling complex family of genes. The fusion protein, in addition to the SNYG N-terminal of EWSR1, contains 5 conserved domains and motifs of SMARCA5.

ZNF700-MAST1, TADA2A-MAST1, NFIX-MAST1, ARID1A-MAST2, GPBP1L1-MAST2, SEC16A-NOTCH1, NOTCH1-GABBR2
Gene fusions involving members of the MAST kinase family have been identified in breast cancer at a frequency of 3-5%. All 5 MAST fusions encode contiguous ORFs, some of which retain the MAST serine-threonine kinase domain and all of which retain the PDZ domain and the 3' kinase-like domain. Additionally, gene fusions involving NOTCH1 have also been found in breast cancer where the exons that encode the Notch intracellular domain, responsible for inducing the transcriptional programme following Notch activation, are retained.

TCF12-NR4A3, TFG-NR4A3
TCF12 can replace EWSR1 or TAF15 as a fusion partner to NR4A3 in extraskeletal myxoid chondrosarcoma. The resulting fusion transcript encodes a protein in which the NH2-terminal domain of the basic helix-loop-helix protein TCF12 is fused to the whole NR4A3 protein. Also in extraskeletal myxoid chondrosarcoma, a fusion has been identified where TFG is a novel 5' partner to NR4A3.

DDX5-ETV4
In prostate carcinoma, DDX5 is an additional 5' partner to ETS transcription factor ETV4.

Systematic screen curations:

Focus on Melanoma

Prickett TD, et al. (2011). Exon capture analysis of G protein-coupled receptors identifies activating mutations in GRM3 in melanoma. Nat Genet. [Epub ahead of print]

Wei X, et al. (2011). Analysis of the disintegrin-metalloproteinases family reveals ADAM29 and ADAM7 are often mutated in melanoma. Hum Mutat. 32:E2148-75.

Wei X, et al. (2010). Mutational and functional analysis reveals ADAMTS18 metalloproteinase as a novel driver in melanoma. Mol Cancer Res. 8:1513-25.

Cárdenas-Navia LI, et al. (2010). Novel somatic mutations in heterotrimeric G proteins in melanoma. Cancer Biol Ther. 10:33-7.

Cronin JC, et al. (2009). Frequent mutations in the MITF pathway in melanoma. Pigment Cell Melanoma Res. 22:435-44.

Palavalli LH, et al. (2009). Analysis of the matrix metalloproteinase family reveals that MMP8 is often mutated in melanoma. Nat Genet. 41:518-20.

Solomon DA, et al. (2008). Mutational inactivation of PTPRD in glioblastoma multiforme and malignant melanoma. Cancer Res. 68:10300-6.

Focus on Squamous Cell Carcinoma

Durinck S, et al. (2011). Temporal dissection of tumorigenesis in primary cancers. Cancer Discov. 1:137-143.

   (with related data from Wang NJ et al (2011). Loss-of-function mutations in Notch receptors in cutaneous and lung squamous cell carcinoma. PNAS 108:17761-6.)

The following genes have been updated in this release:

ABL1, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BRAF, BRCA1, CARD11, CBL, CDC73, CDH1, CDKN2A, CREBBP, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, EZH2, FAM123B, FGFR3, FLT3, FOXL2, HRAS, IDH1, IDH2, IL7R, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MLH1, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NFE2L2, NOTCH1, NOTCH2, NPM1, NRAS, PDGFRA, PIK3CA, PIK3R1, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, TET2, TNFAIP3, TP53, VHL, WT1

COSMIC v57 Total Statistics

Experiments 4671089
Tumours 683702
Samples 687082
Mutant Samples 217031
Mutations 224634
Unique Mutations 75109
Papers curated 13121
Genes 20259
Fusions 7428
Structural Variants 2752

Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content




15th Nov 2011 COSMIC v56 Release

COSMIC v56 Release

Three new cancer genes together with 13 new fusion gene pairs and 7 recent genome-wide screens have been fully curated into COSMIC for this latest release.

Cancer Gene Census Update

Our curations have highlighted 4 new cancer genes (IL7R, VTI1A, TCF7L2 and NDRG1), which have been added to the Cancer Gene Census, bringing the total number of genes implicated in cancer to 468.

New curated genes:

IL7R
Interleukin-7 receptor alpha (IL7R) is required for normal lymphoid development and has been shown to carry somatic, heterozygous, gain-of-function mutations in 8-10% of B and T cell leukaemias. To date, mutations have commonly fallen into 2 classes: an S185C substitution in the extracellular domain (B-ALL) or in-frame insertions/deletions in the extracellular juxtamembrane-transmembrane interface region (T- and B-ALL), the majority of which also introduce an unpaired cysteine residue. Biochemical and functional assays have demonstrated that the mutations are activating.

MED12
Oncogenic somatic mutations in MED12, an X chromosome gene encoding a subunit of the Mediator Complex, have been found in uterine leiomyomas (fibroids) with a frequency of 70%. Mutations are clustered in exon 2.

SF3B1, a gene encoding a core component of the RNA splicing machinery, is oncogenic in myelodysplastic syndromes (MDS), particularly those associated with increased ringed sideroblasts. With high mutation frequency and specificity, SF3B1 mutations are strongly implicated in the pathogenesis of these MDS subtypes. Recurrent SF3B1 mutations have also been identified in chronic lymphocytic leukaemia with higher frequency in chemorefractory cases.

New Curated Gene Fusions:

VTI1A-TCF7L2
Recurrent VTI1A-TCF7L2 fusions have been found in colorectal adenocarcinomas. VTI1A encodes a v-SNARE protein while TCF7L2 encodes the transcription factor TCF4, which dimerizes with beta-catenin. The fusion results in the omission of the TCF4 beta-catenin-binding domain.

GOPC-ROS1
A fusion involving the amino-terminal portion of GOPC and the carboxy-terminal kinase domain of ROS1, a receptor tyrosine kinase, occurs in glioblastoma multiforme. The resulting fusion protein is a constitutively activated tyrosine kinase.

HNRNPA2B1-ETV1, SLC45A3-ETV1, ACSL3-ETV1, KLK2-ETV1, KLK2-ETV4, CANT1-ETV4, SLC45A3-ERG, NDRG1-ERG
In prostate carcinoma, minor novel 5' partners, apart from TMPRSS2, have been identified in recurrent fusions with ETS transcription factors ETV1, ETV4 and ERG.

SLC45A3-ELK4, SLC45A3-ETV5, TMPRSS2-ETV5
Both ELK4 and ETV5 have been identified as other ETS transcription factors involved as 3' partners in recurrent fusions in prostate carcinoma.

Systematic screen curations:

TCGA (2011). Integrated genomic analyses of ovarian carcinoma. Nature 474:609-15

Stransky et al (2011). The mutational landscape of head and neck squamous cell carcinoma. Science 333:1157-60

Li et al (2011). Inactivating mutations of the chromatin remodeling gene ARID2 in hepatocellular carcinoma. Nature Genetics 43:828-9

Prickett et al (2009). Analysis of the tyrosine kinome in melanoma reveals recurrent mutations in ERBB4. Nature Genetics 41:1127-32

Bettegowda et al (2011). Mutations in CIC and FUBP1 contribute to human oligodendroglioma. Science 333:1453-5

Papaemmanuil et al (2011). Somatic SF3B1 mutation in myelodysplasia with ring sideroblasts. N Engl J Med. 365:1384-95

Malcovati et al (2011). Clinical significance of SF3B1 mutations in myelodysplastic syndromes and myelodysplastic/myeloproliferative neoplasms. Blood (epub).

The following genes have been updated in this release:

ABL1, AKT1, ALK, APC, ARID1A, ASXL1, ATM, ATRX, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CREBBP, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EP300, ERBB2, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GATA1, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PIK3CA, PIK3R1, PPP2R1A, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, SRC, STK11, TET2, TNFAIP3, TP53, TSHR, WT1

COSMIC v56 Total Statistics

Experiments 4626046
Tumours 662467
Samples 665763
Mutant Samples 206711
Mutations 213615
Unique Mutations 67405
Papers curated 12818
Genes 20242
Fusions 7224
Structural Variants 2752





13th Sep 2011 COSMIC v55 Release

COSMIC v55 Release

New curations include the tumour suppressor gene PRDM1, fusions of CRTC1/CRTC3-MAML2 and three new systematic screen publications. A new release of the Cancer Gene Census raises the number of known cancer genes to 464.

Cancer Gene Census Update

A number of genes have recently been implicated in oncogenesis and when this is confirmed, they are added to our Census of known cancer genes. The latest release details 464 known cancer genes, recently including SF3B1, ARID2, CCNE1, CDK12, FUBP1, XPO1, MED12.

New curated genes:

PRDM1 belongs to the PRDM gene family of transcriptional repressors characterized by DNA-binding Kruppel-type zinc fingers and the PR domain at the amino terminus. The encoded protein acts as a master regulator of B cell differentiation. PRDM1 has been identified as a tumour suppressor gene in the activated B cell-like subtype of diffuse large B cell lymphoma.

New Curated Gene Fusions:

CRTC1-MAML2 and CRTC3-MAML2
CRTC1, a member of a family of CREB coactivators, is fused with MAML2, a gene belonging to a family of Mastermind-like genes, in mucoepidermoid carcinoma (MEC) of the salivary gland and lung. The recurrent fusion consistently has breakpoints in intron 1 of both gene partners resulting in the N terminal basic domain of MAML2 being replaced by the CREB-binding coiled-coil domain of CRTC1. The same CRTC1-MAML2 fusion is also found in a subset of Warthin's tumours of the salivary gland and in some hidradenomas. CRTC3 is occasionally an alternative fusion partner for MAML2 in MEC.

Systematic screen curations:

Agrawal et al (2011). Exome sequencing of head and neck squamous cell carcinoma reveals inactivating mutations in NOTCH1. Science 333:1154-7

Wei et al (2011). Exome sequencing identifies GRIN2A as frequently mutated in melanoma. Nature Genetics. 43:442-6.

Zang et al (2010). Genetic and structural variation in the gastric cancer kinome revealed through targeted deep sequencing. Cancer Research 71:29-39.

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BAP1, BRAF, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CSF1R, CTNNB1, DNMT3A, EGFR, EML4, EP300, ERBB2, ERG, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MET, MLL3, MPL, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SMAD4, SMARCA4, SMARCB1, SMO, SRC, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, VHL, WT1

COSMIC v55 Total Statistics

Experiments 4577043
Tumours 639580
Samples 642744
Mutant Samples 186431
Mutations 192795
Unique Mutations 52524
Papers curated 12441
Genes 19885
Fusions 7062
Structural Variants 2752





12th Jul 2011 COSMIC v54 Release

COSMIC v54 Release


Five new cancer genes have received full curation of their mutation spectrum, together with seven new fusion gene pairs focusing on the zinc finger protein PLAG1. Four new systematic screens are included, covering a range of cancer phenotypes. This extensive literature curation brings the total number of publications in COSMIC to over 12,000.

New curated genes:

MLL2 and MLL3
MLL2 and MLL3, members of the mixed-lineage leukaemia family, have been confirmed as tumour suppressor genes in childhood medulloblastoma. They encode histone methyl-transferases that methylate key lysine residues of histone tails mediated through their SET domain activities. Both of these proteins are members of large transcriptional regulatory complexes involved in regulating chromatin structure and transcriptional co-activation.

NTRK3
NTRK3 encodes 1 of 3 high-affinity neurotrophin receptors that regulate growth, differentiation and apoptosis of neurons. It has been implicated in several tumour types including medulloblastoma, breast, colorectal, pancreas and large cell neuroendocrine tumours of the lung.

CREBBP and EP300
The histone acetyltransferase genes CREBBP and EP300 encode proteins which act as transcriptional co-activators in multiple signalling pathways. Both genes have been identified as tumour suppressors, with evidence of haploinsufficiency, in diffuse large B-cell lymphoma and follicular lymphoma where mutations affect the HAT coding domain. Mutations in CREBBP and EP300 are uncommon in epithelial cancers but have been found at high frequency in relapsed acute lymphoblastic leukaemia.

New Curated Gene Fusions:

CTNNB1-PLAG1, LIFR-PLAG1, CHCHD7-PLAG1, TCEA1-PLAG1, FGFR1-PLAG1
A subset of pleomorphic adenomas of the salivary gland has PLAG1 fusions and 5 partner genes have so far been identified. Consistent breakpoints occur in the 5' non-coding region of PLAG1 leading to promoter swapping with the fusion partner gene. For 3 of the partners the breakpoint also occurs in the non-coding region while for TCEA1 and FGFR1 it interrupts the coding sequence. The PLAG1 protein contains an N-terminal zinc-finger DNA binding domain and a C-terminal transactivation domain.

HAS2-PLAG1 and COL1A2-PLAG1
PLAG1 fusions have also been detected in lipoblastomas where again the promoter region of the partner gene, in these cases HAS2 or COL1A2, is fused to the coding sequence of PLAG1.

Systematic screen curations:

Parsons DW, et al (2011). The genetic landscape of the childhood cancer medulloblastoma. Science 331:435-9. A large genome-wide candidate gene screen assessing childhood medulloblastoma, this study identified two new chromatin remodelling genes as cancer genes, MLL2 and MLL3.

Kan Z, et al (2010). Diverse somatic mutation patterns and pathway alterations in human cancers. Nature 466:869-73. This study examined 441 tumours from a variety of sites and morphologies, through over 1500 known or candidate cancer genes, defining roles for over 100 in oncogenesis.

Jones S, et al (2010). Frequent Mutations of Chromatin Remodeling Gene ARID1A in Ovarian Clear Cell Carcinoma. Science 330:228-31. A whole-exome resequencing study of 8 ovarian clear cell carcinomas further implicates chromatin remodelling genes (PPP2R1A and ARID1A) in cancer.

Puente, et al (2011). Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia. Nature 475:101-5. Full-genome resequencing of four Chronic Lymphocytic Leukaemia patients suggests significant roles for at least four known cancer genes (NOTCH1, XPO1, MYD88 and KLHL6). The data in this publication have been extended with extra annotations from their submission to the ICGC DCC (R5 release).

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, VHL, WT1

COSMIC v54 Total Statistics

Experiments 4531163
Tumours 619320
Samples 622464
Mutant Samples 177322
Mutations 183107
Unique Mutations 46615
Papers curated 12026
Genes 19737
Fusions 6365
Structural Variants 2752





18th May 2011 COSMIC v53 Release

COSMIC v53 Release

COSMIC v53 release includes full curation of IKZF1, PIK3R1, PAX5 and 11 new gene fusion pairs.

This fifty-third release of COSMIC brings the number of fully curated cancer genes to 98, together with a total of 81 curated fusion gene pairs. With the inclusion of 10 systematic screen publications, together with substantial output from the ICGC, TCGA and CGP studies, over 3 million curated gene screening experiments are now available in COSMIC, covering 19439 genes, with mutations identified in 11111 of these.

New curated genes:

IKZF1
IKZF1 encodes the zinc finger transcription regulator IKAROS, which is involved in normal lymphoid development, and is an important cancer gene in human B-cell progenitor acute lymphoblastic leukaemia (B-ALL). Whole gene deletions resulting in haploinsufficiency, focal somatic IKZF1 deletions (usually including exons encoding the DNA binding domain or regions immediately upstream of IKZF1) and somatic missense and frameshift mutations are observed in B-ALL. Focal deletions have been reported to result in truncated transcripts that can encode dominant-negative proteins, as have some missense/frameshift mutations.

PIK3R1
PIK3R1 encodes p85alpha, the regulatory subunit of the phosphatidylinositol 3-kinase. It has an N-terminal SH3 domain, a domain homologous to the Rho GTPase-activating protein domain of the BCR gene product (BCR domain) and 2 SH2 domains that flank an intervening anti-parallel coiled-coil domain (iSH2) which mediates binding to p110alpha (PIK3CA) catalytic subunit, also a known cancer gene. PIK3R1 mutations have been identified in both colorectal and endometrial cancers where most are localized to the iSH2 domain.

PAX5
PAX5, a gene within the B-cell development pathway, encodes a transcription factor belonging to the family of paired-box domain transcription factors. Somatic mutations in PAX5, including partial and whole deletions, and point mutations which mostly cluster in the DNA-binding or transcriptional regulatory domains, have been identified in B-cell progenitor acute lymphoblastic leukaemia.

New Curated Gene Fusions:

BCR-JAK2, PCM1-JAK2, ETV6-JAK2, SSBP2-JAK2, SEC31A-JAK2, PAX5-JAK2
JAK2 fusions involving multiple partner genes have been identified in haematological malignancies. Whereas the fusion proteins BCR-JAK2, ETV6-JAK2 and PAX5-JAK2 include the protein tyrosine kinase domain (JH1) of JAK2, PCM1-JAK2 includes JH1 and JH2 domains. In the SSBP2 fusion protein the N-terminal LisH domain may be analogous to the dimerization and DNA binding domains of other JAK2 fusion partners. SEC31A-JAK2 has been identified in classic Hodgkin's lymphoma while PAX5-JAK2 has been found in childhood acute lymphoblastic leukaemia.

KIF5B-ALK, PPFIBP1-ALK, SQSTM1-ALK, VCL-ALK
Novel fusion genes involving ALK have been found in a variety of cancers. In non-small cell lung cancer (NSCLC) the occurrence of KIF5B as an alternative to EML4 in ALK fusions strengthens the role of ALK signalling in the pathogenesis of NSCLC. The KIF5B-ALK fusion protein comprises the motor domain and coiled-coil domain of KIF5B and the juxtamembrane intracellular region of ALK, including the entire tyrosine kinase domain. Other novel ALK fusions include VCL-ALK which has been identified as a recurrent oncogenic mechanism in renal medullary carcinoma, a highly malignant sickle cell trait-associated cancer; PPFIBP1-ALK which has been found in pulmonary inflammatory myofibroblastic tumour; and SQSTM1-ALK which has been identified in ALK-positive large B cell lymphoma (ALK+ LBCL) where SQSTM1, a ubiquitin binding protein, replaces the more common ALK partners in ALK+ LBCL i.e. NPM1 and CLTC.

AKAP9-BRAF
Fusions between AKAP9 and BRAF have been identified in papillary thyroid carcinoma, where they are rare in sporadic tumours and more common following radiation exposure. The fusion protein lacks the N-terminal regulatory domain of BRAF and has an intact catalytic domain. AKAP9 belongs to the group of A-kinase anchor proteins which have the common function of binding to the regulatory subunit of protein kinase A.

Systematic screen curations - Focus on pancreatic cancers:

Jones S, et al (2008). Core signalling pathways in human pancreatic cancers revealed by global genomic analyses. Science 321:1801-6. Analysis of over 20,000 candidate genes for DNA mutations through 114 tumours revealed high mutation rates in 12 cell signalling pathways. The curation of this publication describes the mutations from this study not included in the recent ICGC release r3, detailing 337 additional mutations.

Jiao Y, et al (2011). DAXX/ATRX, MEN1, and mTOR pathway genes are frequently altered in pancreatic neuroendocrine tumors. Science 331:1199-203. This exome resequencing analysis of pancreatic neuroendocrine tumours examined the link between gene mutations and clinical prognosis, finding genes in the DAXX/ATRX, MEN1, and mTOR pathways particularly important. Full exomes were sequenced in 10 tumours and key genes were followed up in a further 58; 218 mutations were described.

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, JAK1, JAK2, KDR, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, PAX5, PBRM1, PDGFRA, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, VHL, WT1

COSMIC v53 Total Statistics

Experiments 3432750
Tumours 604950
Samples 608042
Mutant Samples 171209
Mutations 176856
Unique Mutations 43182
Papers curated 11680
Genes 19439
Fusions 6307
Structural Variants 2752





23rd Mar 2011 COSMIC v52 integrates Genomics of Drug Sensitivity

COSMIC v52 integrates Genomics of Drug Sensitivity

Also, 4 new genes and 16 new fusion pairs are curated from the literature, TCGA and ICGC portal data are updated and some improvements are made to COSMIC web pages.

Genomics of Drug Sensitivity in Cancer integration

The Genomics of Drug Sensitivity in Cancer, a collaborative project between the Sanger Institute and Massachusetts General Hospital, is screening a range of anti-cancer therapeutics against a large number of genetically characterized human cancer cell lines, generating drug sensitivity correlations. This release of COSMIC includes references to this work, detailing drugs where mutant gene/drug interactions have been shown to modify cell growth responses. For example, the recently described drug PLX4720 has been shown to have a significant growth modifying effect on cells containing mutant BRAF.

Cancer Gene Census update

21 new genes have been added to the cancer gene census. With the rise of large-scale genomic sequencing, the number of novel genes implicated in cancer is increasing. We aim to keep the Census up-to-date with the latest publications.

Website improvements

The front page of COSMIC has been redesigned to make it much easier to find data and sub-projects that were previously difficult to identify. For instance, the Cell Line Project, characterising the cancer genotypes of 800 common tumour cell lines, has always been a significant COSMIC sub-project, and is now appropriately highlighted. Also, the FTP site, which comprises export files of each release, is now much easier to find.

In addition, we have made the Tissue Overview page more informative (e.g. the summary page for Skin). The graphic has been extended to include the top 20 genes mutated in the tissues/phenotypes selected, followed by a much simpler summary of the mutation load in the selection.

Data Curation

This release (v52) of COSMIC contains full curations of 4 new cancer genes together with 16 new fusion gene pairs. In addition, our curation of TCGA data, output via the TCGA portal, has been updated with further new Ovarian serous carcinoma mutations. Also, we have completed our curation of the validated mutations in the third release of the ICGC, bringing in structural rearrangements for two Japanese Liver cancer screens, HX5T & RK-003-C.

New curated genes:

DAXX and ATRX
DAXX and ATRX have been established as tumour suppressor genes in pancreatic neuroendocrine tumours. The protein product of ATRX has an ADD domain at the amino-terminus and a carboxy-terminal helicase domain, and forms a heterodimer with DAXX. The latter gene is an H3.3-specific histone chaperone.

MYD88
Highly recurrent oncogenic mutations in MYD88 have been identified in activated B-cell-like subtype of diffuse large B-cell lymphoma. MYD88 encodes an adaptor protein which mediates toll and interleukin-1 receptor signalling. The most common mutation is MYD88 L265P, which is also detected in mucosa-associated lymphoid tissue lymphoma.

CARD11
CARD11 has been identified as an oncogene in diffuse large B-cell lymphoma, particularly the activated B-cell-like subtype. The mutations commonly occur in the coiled-coil domain which mediates CARD11 oligomerization and NF-kappaB pathway activation.

New Curated Gene Fusions:

BRD4-C15orf55, BRD3-C15orf55
BRD4, from the BET family of nuclear proteins that carry 2 bromodomains and an additional extra terminal domain, functions in the regulation of cell cycle progression. It is fused to C15orf55 (NUT) in poorly differentiated (midline) carcinomas, a clinically aggressive form of carcinoma. The oncogenic fusion protein contains the N-terminal BRD4 sequence up to the serine-rich region followed by almost the entire NUT sequence. In a few cases BRD4 is replaced by the highly homologous BRD3.

PAX3-FOXO1, PAX7-FOXO1, PAX3-NCOA1, PAX3-NCOA2
FOXO1, a member of the fork head family of transcription factors, fused to either PAX3 or PAX7 is characteristic of alveolar rhabdomyosarcoma, although the former is more frequently detected. A consistent fusion protein results where the PAX DNA binding domain is fused to the fork head domain and C-terminal region of FOXO1. The novel variant translocations PAX3-NCOA1 and PAX3-NCOA2 have also been detected in rhabdomyosarcoma.

HMGA2-LPP, HMGA2-RAD51L1, HMGA2-LHFP, HMGA2-EBF1, HMGA2-CCNB1IP1, HMGA2-COX6C, HMGA2-ALDH2, HMGA2-NFIB, HMGA2-FHIT, HMGA2-WIF1
HMGA2, which encodes a protein belonging to the non-histone chromosomal high mobility group protein family, is fused to multiple partners in lipoma, pulmonary chondroid hamartoma, chondroma and uterine leiomyoma. The most common fusion is HMGA2-LPP which results in a protein consisting of the N-terminal DNA-binding domain of HMGA2 and the C-terminal LIM domain of LPP. HMGA2 fused to NFIB, FHIT and WIF1 has been detected in salivary gland pleomorphic adenomas.

The following genes have been updated in this release:

ABL1, AKT1, APC, ASXL1, ATRX, BRAF, BRCA1, BRCA2, CARD11, CBL, CDKN2A, CTNNB1, DAXX, EGFR, EML4, ERBB2, EZH2, FAM123B, FGFR3, FLT3, GATA1, GATA3, HNF1A, HRAS, IDH1, IDH2, JAK2, KIT, KRAS, MAP2K4, MEN1, MET, MPL, MYD88, NF2, NOTCH1, NPM1, NRAS, PDGFRA, PIK3CA, PPP2R1A, PTCH1, PTEN, PTPN11, RB1, RET, SMAD4, SMARCB1, STK11, TET2, TNFAIP3, TP53, VHL

v52 Statistics

Experiments 2968661
Tumours 589549
Samples 592626
Mutant Samples 166206
Mutations 171651
Unique Mutations 42491
Papers curated 11437
Genes 19389
Fusions 6261
Structural Variants 2752




27th Jan 2011 COSMIC v51 Release

COSMIC v51 Release

COSMIC release v51 includes the curation of 3 newly identified cancer genes, 4 gene fusions and updates from the ICGC and TCGA international consortia. In addition, upgrades to the website are now available, to improve the analysis of mutation distribution within a gene, and to navigate the increasing number of curated large-scale systematic screens. Both the Genomics of Drug Sensitivity in Cancer website and the Cancer Gene Census Listing were also recently updated (in December 2010).

New genes curated from the literature:

DNMT3A
DNMT3A is a member of a family of methyltransferase enzymes which catalyse addition of methyl groups to sequences containing CpG dinucleotides. Somatic mutations have been found in 22% of adult acute myeloid leukaemia patients and are associated with poor overall survival. Over half of the cases examined have a recurrent missense mutation at arginine 882. The mutations have been shown to impair normal enzymatic function and are heterozygous. These data suggest potential dominant negative activity of the mutations. Additional missense, nonsense and frameshift mutations have also been found throughout the latter half of the gene.

BAP1
Mutational inactivation of BAP1, a tumour suppressor gene, has been identified in uveal melanomas where it coincides with metastasis. BAP1 encodes a nuclear protein containing a ubiquitin carboxy-terminal hydrolase (UCH) domain as well as a UCH37-like domain, and binding domains for BRCA1, BARD1 and HCFC1.

GNA11
As previously described for GNAQ, mutations affecting Q209 have been found in GNA11 in melanocytic tumours. The frequency of mutations increases progressively from blue nevus to primary melanomas to uveal melanoma metastases; an inverse pattern to that seen with GNAQ Q209 mutations. Activation of this GTPase pathway appears to be a predominant route to the development of uveal melanoma.

New curated fusion pairs:

TAF15-NR4A3
TAF15 can replace EWSR1 as a fusion partner to NR4A3 in extraskeletal myxoid chondrosarcoma. The resulting transcript (TAF15-NR4A3) is structurally and functionally similar to the EWSR1-NR4A3 fusion.

FUS-CREB3L2, FUS-CREB3L1
FUS-CREB3L2 is specific to low-grade fibromyxoid sarcoma (LGFMS) and so enables the accurate diagnosis of a sarcoma with sometimes indistinct histological features. In these fusions there is a diversity of genomic breakpoints, which are often exonic rather than intronic. The rare variant FUS-CREB3L1 is occasionally detected in LGFMS.

FUS-DDIT3
Myxoid/round cell liposarcoma is characterized by the recurrent fusion FUS-DDIT3 where the 5' half of the FUS gene is fused to the entire reading frame of DDIT3, which encodes a leucine-zipper transcription factor belonging to the CCAAT/enhancer-binding protein family.

Curations of large international systematic screens:

TCGA - Full exome resequencing of Ovarian tumours
The Cancer Genome Atlas (TCGA) has recently released somatic mutation data from the exon screening of 325 serous ovarian cystadenocarcinoma tumours. We have now curated the majority of this information into COSMIC, comprising over 13,000 mutations. This can now be viewed here.

Study                        Samples   Mutations
Ovarian Cystadenocarcinoma   325       13650

ICGC - Curation of third ICGC release 'Simple Mutations'
The International Cancer Genome Consortium (ICGC) has recently completed its third data release. We have curated into COSMIC all the Validated Somatic Simple Mutation data from this ICGC release, which can be viewed in the following studies :

Study                             Samples   Mutations
Japanese liver cancer (Riken)     1         30
Japanese liver cancer (NCC)       1         59
Breast cancer (JHU)               18        28
Colorectal cancer (JHU)           13        29
Glioblastoma Multiforme (JHU)     23        209
Pancreatic cancer (JHU)           71        1450
Glioblastoma Multiforme (TCGA)    20        29
Lung Adenocarcinoma (TSP)         21        26

Website improvements:

We have improved the Distribution section of the Histogram page, providing much more analytical pie charts. Once a gene is selected, the mutation spectrum can be explored in a number of different ways. Nucleotide substitution breakdowns are presented as pie charts, and lengths of insertions and deletions are presented as histograms. The same filters as usual can be applied to this Distribution section, including tumour phenotype, mutation type and sample source. Links are also provided to view the mutation data in tabulated detail form, and to export it for external analysis. An example of this new system, presenting the data from the KIT oncogene can be seen here.
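
For readers who export the mutation data for external analysis, the kind of breakdown described above can be approximated offline. The following is only a rough sketch and not the COSMIC website's own implementation: it assumes each mutation is available as an HGVS-style coding-sequence description (for example "c.35G>A"), and the function name summarise and the example strings are hypothetical illustrations rather than COSMIC records.

```python
import re
from collections import Counter

# Assumed input format: HGVS-style coding-sequence strings (hypothetical examples only).
SUBSTITUTION = re.compile(r"^c\.\d+([ACGT])>([ACGT])$")
INSERTION = re.compile(r"^c\.\d+_\d+ins([ACGT]+)$")
# Simple deletions only; complex "delins" changes are deliberately skipped.
DELETION = re.compile(r"^c\.(\d+)(?:_(\d+))?del(?!ins)")


def summarise(mutations):
    """Count substitution classes and insertion/deletion lengths.

    Returns two Counters: one keyed by substitution class such as "G>A",
    one keyed by (kind, length) such as ("del", 3).
    """
    substitutions = Counter()
    indel_lengths = Counter()
    for description in mutations:
        sub = SUBSTITUTION.match(description)
        if sub:
            substitutions[f"{sub.group(1)}>{sub.group(2)}"] += 1
            continue
        ins = INSERTION.match(description)
        if ins:
            indel_lengths[("ins", len(ins.group(1)))] += 1
            continue
        deletion = DELETION.match(description)
        if deletion:
            start = int(deletion.group(1))
            # A single-base deletion has no end coordinate.
            end = int(deletion.group(2)) if deletion.group(2) else start
            indel_lengths[("del", end - start + 1)] += 1
    return substitutions, indel_lengths


if __name__ == "__main__":
    example = ["c.35G>A", "c.35G>T", "c.1799T>A", "c.35G>A",
               "c.100_102del", "c.50del", "c.10_11insAC"]
    subs, indels = summarise(example)
    print(dict(subs))
    print(dict(indels))
```

The substitution counts correspond to the data behind a pie chart and the length counts to the data behind a histogram; any plotting library could then be used to draw them. Descriptions that match none of the three patterns are ignored rather than guessed at.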

Systematic screens:

In the last few years, cancer genome screens have been growing substantially in size, and both whole-genome and large candidate gene screens are being curated into COSMIC. As well as curating publications, we are also collecting somatic mutation information from the data portals of large international consortia, beginning with the TCGA and ICGC. A new page now makes these easier to identify and navigate; please click here.

Related CGP Resources

Genomics of Drug Sensitivity in Cancer
The Genomics of Drug Sensitivity Website was updated on 23rd December 2010 with cell line sensitivity data and genomic correlates of sensitivity for Docetaxel, Gefitinib, CI-1040, BIBW 2992 and PLX4720. This large-scale project is a joint collaboration between the Wellcome Trust Sanger Institute (WTSI) and Massachusetts General Hospital Cancer Centre to correlate genomics with response to cancer drugs in 1,000 cancer cell lines. Within the next year we plan to integrate the data from this project into the COSMIC database and develop new web-based tools to browse and mine these data. Click here to visit the site.

Cancer Cell Line Project
The Cancer Cell Line Project website holds the data from the Cancer Genome Project's large-scale systematic characterization of a panel of 770 cancer cell lines across 64 known cancer genes. Click here to visit the site.

Cancer Gene Census
The Cancer Gene Census was updated at the beginning of December 2010. The Census is an up-to-date listing of genes causally implicated in cancer and now stands at 436 genes. Click here to view the listing.

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BAP1, BRAF, BRCA1, BRCA2, CBL, CDC73, CDH1, CDKN2A, CEBPA, CTNNA1, CTNNB1, DNMT3A, EGFR, EML4, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MET, MLH1, MPL, MSH2, MSH6, MYB_ENST00000341911, NF1, NF2, NOTCH2, NPM1, NRAS, PBRM1, PDGFRA, PHOX2B, PIK3CA, PPP2R1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SMAD4, SMARCA4, SMARCB1, TET2, TNFAIP3, TP53, TSHR, VHL, WT1

COSMIC v51 Total Statistics
Experiments 2946792
Tumours 577304
Samples 580306
Mutant Samples 161787
Mutations 167193
Unique Mutations 41405
Papers curated 11062
Genes 19001
Fusions 5573
Structural Variants 2729
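
As a rough illustration of how these totals can be compared between releases, the following Python sketch computes the change from v50 to v51 using the figures in the two statistics tables on this page. The dictionary layout and the helper name release_delta are assumptions made for this example and are not part of COSMIC.

```python
# Totals copied from the v50 and v51 statistics tables on this page.
V50 = {
    "Experiments": 2908403,
    "Tumours": 562843,
    "Samples": 565823,
    "Mutant Samples": 142586,
    "Mutations": 147613,
    "Unique Mutations": 25601,
    "Papers curated": 10819,
    "Genes": 18660,
    "Fusions": 5112,
    "Structural Variants": 2729,
}
V51 = {
    "Experiments": 2946792,
    "Tumours": 577304,
    "Samples": 580306,
    "Mutant Samples": 161787,
    "Mutations": 167193,
    "Unique Mutations": 41405,
    "Papers curated": 11062,
    "Genes": 19001,
    "Fusions": 5573,
    "Structural Variants": 2729,
}


def release_delta(old, new):
    """Return {category: (absolute change, percent change)}.

    Hypothetical helper written for this illustration only.
    """
    return {
        key: (new[key] - old[key], 100.0 * (new[key] - old[key]) / old[key])
        for key in old
    }


if __name__ == "__main__":
    for key, (diff, pct) in release_delta(V50, V51).items():
        print(f"{key:20s} {diff:+9d} ({pct:+.1f}%)")
```

Keeping each release as a plain mapping means the same helper can be reused for any pair of the statistics tables listed on this page.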




30th Nov 2010 COSMIC v50 Release

COSMIC v50 Release

Our 50th release significantly enhances the genomic focus of the COSMIC system, including a full genome browser linked directly into the COSMIC websystem. Also included are genome-wide examinations of a further 24 tumours, comprising rearrangement screens of 17 pancreatic tumours and full exome analyses of 7 renal tumours. Two new cancer genes are curated from the scientific literature, together with a further systematic screen publication.

GBrowse

With the inclusion of increasing quantities of genomic data in COSMIC, new methods to navigate and visualise this information are now required. We have implemented a version of GMOD GBrowse and linked it into COSMIC where genomic co-ordinates are available, including coding and non-coding mutations, gene footprints, structural rearrangements and copy number variants (from CONAN). Full genome annotations have been imported from Ensembl, so that COSMIC data can be examined in their context. Throughout the COSMIC websystem, links to GBrowse are available either as links in descriptive text or via the GBrowse icon, which will present the selected data in the local genomic context.

Genome rearrangement screens of 17 Pancreatic tumours

The recent Campbell et al (2010) publication "The patterns and dynamics of genomic instability in metastatic pancreatic cancer", describing genome structure rearrangements as early events in this cancer type, can be viewed in COSMIC here.

Full-exome sequencing of 7 renal tumours; identification of PBRM1 gene as a tumour suppressor

The Cancer Genome Project has sequenced the full exomes of 7 renal tumours. These data, soon to be published, are available in COSMIC here.

Curation of Ding et al (2010) full-genome resequencing of 3 related breast tumours

In the Ding et al (2010) paper, 3 basal-like breast tumours were taken from one individual and their genomes were fully resequenced and comparatively analysed. We have curated the coding, non-coding and structural rearrangement mutations described in their study, available here.

New Curated genes:

ARID1A has been identified as a tumour suppressor gene in ovarian clear cell and endometrioid carcinomas. It encodes AT-rich interactive domain-containing protein 1A, which is a component of the ATP-dependent chromatin remodelling complex SWI/SNF.

PPP2R1A has been identified as an oncogene in ovarian clear cell carcinoma and in breast and lung carcinomas. It encodes a regulatory subunit of serine-threonine phosphatase 2 and this subunit forms the scaffold of the holoenzyme.

These curated genes have been updated in this release:

ABL1, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BRAF, CDC73, CDH1, CEBPA, CSF1R, CTNNB1, CYLD, EGFR, EML4, ERBB2, EZH2, FBXW7, FGFR3, FLT3, FOXL2, GATA1, GNAQ, HRAS, IDH1, IDH2, JAK2, JAK3, KIT, KRAS, MEN1, MET, MLH1, MPL, NF2, NPM1, NRAS, PDGFRA, PIK3CA, PPP2R1A, PTEN, PTPN11, RUNX1, SETD2, SMARCA4, SMARCB1, STK11, TET2, TP53, TSHR, VHL, WT1


COSMIC v50 Total Statistics

Experiments 2908403
Tumours 562843
Samples 565823
Mutant Samples 142586
Mutations 147613
Unique Mutations 25601
Papers curated 10819
Genes 18660
Fusions 5112
Structural Variants 2729




29th Sep 2010 COSMIC v49 Release

COSMIC v49 Release

This release of COSMIC focuses on curation of data from the scientific literature. 57 curated genes have received updates; an additional systematic candidate gene screen has added 1121 mutations, and a further full-genome screen has contributed 45 confirmed coding mutations.

Systematic screen curations:

Shah et al (2009) describes the full-genome examination of a metastatic ER+ lobular breast cancer and compares the mutation spectrum to the primary tumour, finding 32 somatic non-synonymous coding mutations in the metastasis, but only 13 in the primary:
Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution. Shah SP, Morin RD, Khattra J, Prentice L, Pugh T, Burleigh A, Delaney A, Gelmon K, Guliany R, Senz J, Steidl C, Holt RA, Jones S, Sun M, Leung G, Moore R, Severson T, Taylor GA, Teschendorff AE, Tse K, Turashvili G, Varhol R, Warren RL, Watson P, Zhao Y, Caldas C, Huntsman D, Hirst M, Marra MA, Aparicio S. (2009) Nature 461:809-13.

Wood et al (2007) expands the analysis of Sjoblom et al (2006), examining the same tumour set through a set of RefSeq genes additional to the CCDS genes analysed earlier. 1121 mutations were identified, expanding the mutation spectrum defined in the earlier publication:
The genomic landscapes of human breast and colorectal cancers. Wood LD, Parsons DW, Jones S, Lin J, Sjoblom T, Leary RJ, Shen D, Boca SM, Barber T, Ptak J, Silliman N, Szabo S, Dezso Z, Ustyanksky V, Nikolskaya T, Nikolsky Y, Karchin R, Wilson PA, Kaminker JS, Zhang Z, Croshaw R, Willis J, Dawson D, Shipitsin M, Willson JK, Sukumar S, Polyak K, Park BH, Pethiyagoda CL, Pant PV, Ballinger DG, Sparks AB, Hartigan J, Smith DR, Suh E, Papadopoulos N, Buckhaults P, Markowitz SD, Parmigiani G, Kinzler KW, Velculescu VE, Vogelstein B. (2007) Science. 318:1108-13.

These curated genes have been updated in this release:

ABL1, AKT1, ALK, APC, ASXL1, ATM, BRAF, BRCA1, CBL, CDKN2A, CEBPA, CRLF2, CTNNA1, CTNNB1, CYLD, EGFR, EML4, ERBB2, EZH2, FAM123B, FBXW7, FGFR2, FLT3, FOXL2, GATA1, GNAQ, GNAS, HRAS, IDH1, IDH2, JAK1, JAK2, KIT, KRAS, MEN1, MET, MPL, NOTCH1, NPM1, NRAS, PDGFRA, PHOX2B, PIK3CA, PTCH1, PTEN, PTPN11, RB1, RUNX1, SETD2, SMARCA4, STK11, TET2, TNFAIP3, TP53, TSHR, VHL, WT1


COSMIC v49 Total Statistics

Experiments 2888511
Tumours 548399
Samples 551325
Mutant Samples 138836
Mutations 143772
Unique Mutations 25079
Papers curated 10578
Genes 18647
Fusions 5050
Structural Variants 2306




27th Jul 2010 COSMIC v48 Release

COSMIC v48 Release

This release brings the majority of curated p53 mutation data into COSMIC in collaboration with IARC, significantly improving our coverage of the key cancer genes. Other new curated genes are TET2 & SETD2, together with ten new fusion pairs. Two new systematic screens are also included. The system has been updated to provide genomic coordinates on both NCBI36 and the later GRCh37 genome builds.

TP53

In collaboration with the p53 group at IARC (http://www-p53.iarc.fr/), we have imported the majority of p53 mutation data into COSMIC. The system previously lacked substantial coverage of this gene, since it has been fully curated at IARC. However, our new collaboration has brought these two datasets together in COSMIC. Over 73% of the IARC Somatic dataset is now present in COSMIC, comprising a total of 20129 mutated samples out of 66242 samples analysed. The remainder of IARC's curated p53 data will become available in a later release. This p53 data can be viewed here.

Full mutation details have been curated from the scientific literature for two new genes and are now available:

SETD2, a tumour suppressor gene, encodes a histone H3 lysine 36 methyltransferase and is inactivated in clear cell renal carcinoma.

TET2, The tumour suppressor gene TET2 (ten-eleven-translocation gene, 4q24) was found to be heterozygously deleted in MDS and leukaemia patients whose remaining copy carried a somatic point mutation. Subsequently a wide spectrum of somatic mutations, often nonsense or frameshift, has been found in a variety of myeloproliferative neoplasms, leukaemias (AML, sAML, CMML) and mastocytosis. These mutations are thought likely to be an early event in the pathology of these diseases and there is some evidence that TET2 is both a tumour suppressor and haematopoietic regulator.

10 new fusion gene pairs have been curated from the scientific literature:

ASPSCR1-TFE3, A characteristic translocation in alveolar soft part sarcoma results in the fusion of the N-terminal region of ASPSCR1 to the C-terminal region of TFE3. Two alternative fusion breakpoints are observed in TFE3 resulting in expression of 2 distinct fusion transcripts. ASPSCR1-TFE3 is also found in a subset of renal cell carcinomas.

PRCC-TFE3, SFPQ-TFE3, NONO-TFE3, CLTC-TFE3 Papillary renal carcinomas have fusions involving the TFE3 transcription factor gene. Most commonly the fusions are ASPSCR1-TFE3 or PRCC-TFE3 but variant translocations have also been identified in which TFE3 is fused to SFPQ, NONO or CLTC.

ETV6-NTRK3, A recurrent rearrangement in congenital (infantile) fibrosarcoma fuses the helix-loop-helix protein dimerization domain of ETV6 with the protein tyrosine kinase domain of NTRK3. ETV6-NTRK3 fusions are also found in congenital mesoblastic nephroma, a pathogenetically related tumour, and in a rare form of breast cancer, secretory carcinoma.

SLC45A3-BRAF, ESRP1-RAF1, AGTRAP-BRAF While SLC45A3-BRAF and ESRP1-RAF1 fusions have been identified in ETS rearrangement-negative prostate cancers, AGTRAP-BRAF has been found in stomach cancer; all highlighting the role of RAF pathway fusions in solid tumours.

MYB-NFIB, A recurrent fusion of the MYB oncogene to NFIB, a member of the human nuclear factor I gene family of transcription factors, has been identified in adenoid cystic carcinomas of the breast and the head and neck. The common feature of multiple splice variations is the deletion of exon 15 of MYB and its 3'-UTR.

Systematic screen curations:

Mardis et al (2009) describes the full-genome sequencing of a single AML tumour, resulting in 12 coding mutations and 52 confirmed high quality non-coding mutations, with a follow-up study examining a set of 187 AML tumours through the genes mutated in the primary sample.

Ding et al (2008) examines 188 lung adenocarcinomas through 623 genes known to be involved with cancer. 1013 mutations were described, the most mutated genes being TP53, KRAS, STK11 and EGFR.

Genome co-ordinate update to GRCh37

All genes and mutations with NCBI36 genome co-ordinates have now been updated to GRCh37. New DAS tracks have been created, and the website and data exports have been altered to include both co-ordinate systems.

These curated genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ATM, BRAF, BRCA1, BRCA2, CBL, CDC73, CDH1, CDKN2A, CEBPA, CSF1R, CTNNB1, CYLD, EGFR, EML4, ERBB2, ERG, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, PDGFRA, PHOX2B, PIK3CA, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SMAD4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TET2, TP53, TSHR, VHL, WT1


COSMIC v48 Total Statistics

Experiments 2760220
Tumours 541928
Samples 544809
Mutant Samples 136326
Mutations 141212
Unique Mutations 23907
Papers curated 10383
Genes 18490
Fusions 4946
Structural Variants 2306




24th May 2010 COSMIC v47 Release

COSMIC v47 Release

This latest release of COSMIC includes six new point-mutated genes with full literature curation, together with curation of five new fusion genes involving ALK. Additionally, recent updates to the public TCGA Glioblastoma dataset have been included, as have a number of whole gene deletions interpreted from SNP6.0 microarray data. Recent publications have been curated to update sixty one other curated genes.

New curated genes

GATA2, This gene encodes a member of the GATA family of zinc-finger transcription factors and the encoded protein plays an essential role in regulating transcription of genes involved in normal haematopoietic cell differentiation and survival. Somatic mutations in GATA2 are associated with acute myeloid transformation in a subset of chronic myelogenous leukaemia (CML) patients.

GATA3, This gene encodes a protein which belongs to the GATA family of transcription factors. The protein contains two GATA-type zinc fingers; it is a regulator of T-cell development and plays an important role in endothelial cell biology. Germline mutations in this gene are found in individuals with HDR syndrome (hypoparathyroidism with sensorineural deafness and renal dysplasia). These mutations cluster in the region of the highly conserved second zinc finger. Somatic mutations in the same region have been identified in tumour tissue from both familial and sporadic breast cancer patients.

EZH2, This gene encodes a member of the Polycomb-group (PcG) family; a histone methyltransferase responsible for trimethylating Lys27 of histone H3 (H3K27). Somatic mutations have been found within the catalytic SET domain in diffuse large B-cell lymphoma and follicular lymphoma.

KDR, KDR encodes the kinase insert domain receptor, one of the two receptors of the VEGF (Vascular endothelial growth factor) - a major growth factor for endothelial cells. It is a vascular specific, type III receptor tyrosine kinase. Germline mutations of this gene are implicated in infantile capillary haemangiomas. Somatic mutations have been found in samples derived from angiosarcomas of the breast/chest wall.

CRLF2, The type I cytokine receptor subunit CRLF2 (thymic stromal lymphopoietin receptor, TSLPR) has been identified as a proto-oncogene in adult and high risk paediatric B-ALL. Over-expression is common in cases lacking rearrangements of TEL, MLL, TCF3 and BCR-ABL and can result from CRLF2 rearrangement or mutually exclusive somatic gain of function point mutations including those in CRLF2, JAK1 or JAK2. The predominant CRLF2 mutation, Phe232Cys, promotes constitutive dimerization and cytokine independent growth.

JAK1, A member of the Janus kinase family, which comprises JAK1, JAK2 and JAK3 as well as Tyk2. Activating JAK1 somatic mutations have been found in the SH, pseudokinase and kinase domains in T-ALL, B-ALL and, more rarely, AML patients. Mutations have also been found in a variety of non-haematopoietic cancers. Mutations include JAK1 V658F, corresponding to the JAK2 V617F mutation commonly found in PV and ET as well as other MPNs; and R724H, corresponding to JAK2 R683 and JAK3 R657, mutations of which have been found in DS-ALL/B-ALL and DS-AMKL respectively.

New curated ALK fusion genes:

The following gene fusions have been curated from the scientific literature:
ATIC-ALK, CARS-ALK, TFG-ALK, TPM3-ALK, TPM4-ALK Variant ALK fusions, including ATIC-ALK, TPM3-ALK, TPM4-ALK and TFG-ALK, have been identified in ALK-positive anaplastic large cell lymphoma. Each translocation product retains the ALK kinase domain. ALK activation is also a recurrent oncogenic event in inflammatory myofibroblastic tumours, where this is sometimes achieved through fusion with ATIC, TPM3, TPM4 or CARS.

TCGA Glioblastoma update

We have updated our recent curation of the TCGA somatic Glioblastoma mutation data, now including phase II data direct from the public TCGA data portal. The combined data can be browsed here.

Statistics

Glioblastoma Samples 424
Genes 1,212
Sequencing Experiments 513,888
Mutations 1,243

Whole gene deletions

55 whole gene or whole-exon deletions have been defined in the core cell lines by interpretation of SNP6.0 microarray data. While many of these have been confirmed, the few unverified mutations are presented with links to the originating data for independent examination. Mutations involving the deletion of whole gene sequences have been reannotated "p.0?" in line with current HGVS recommendations.
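
The reannotation rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used by COSMIC: it assumes that the gene footprint and the deleted interval are available as 1-based, inclusive co-ordinates on the same chromosome and genome build, and the function name annotate_whole_gene_deletion is hypothetical.

```python
def annotate_whole_gene_deletion(gene_start, gene_end, del_start, del_end):
    """Return "p.0?" if the deleted interval spans the whole gene, else None.

    Co-ordinates are assumed to be 1-based, inclusive, and on the same
    chromosome and genome build (an assumption of this sketch).
    """
    if gene_start > gene_end or del_start > del_end:
        raise ValueError("start must not exceed end")
    if del_start <= gene_start and del_end >= gene_end:
        return "p.0?"
    return None  # partial deletions need a different description


if __name__ == "__main__":
    # Made-up co-ordinates, for illustration only.
    print(annotate_whole_gene_deletion(1000, 5000, 500, 6000))   # whole gene lost
    print(annotate_whole_gene_deletion(1000, 5000, 2000, 6000))  # partial deletion
```

Returning None for partial deletions keeps the sketch from guessing at an annotation that would need exon-level information.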

New search system

COSMIC is now combined into the new Sanger-wide search system. This allows much richer searching using multiple terms, such as "BRAF melanoma", making it much easier to find data without complex navigation through the website. The system additionally searches other Sanger genomic data, giving indications where compatible data might be found elsewhere on the Institute website.

These curated genes have been updated in this release:

ABL1, AKT1, ALK, APC, ASXL1, ATM, BRAF, BRCA1, BRCA2, CBL, CDH1, CDKN2A, CEBPA, CSF1R, CTNNB1, CYLD, EGFR, ERBB2, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GATA1, GATA2, GATA3, HNF1A, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NPM1, NRAS, PDGFRA, PIK3CA, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SMAD4, SMARCA4, SMO, SOCS1, SRC, STK11, SUFU, VHL, WT1


COSMIC v47 Total Statistics

Experiments 2564761
Tumours 464139
Samples 466851
Mutant Samples 113286
Mutations 116977
Unique Mutations 20090
Papers curated 9202
Genes 18485
Fusions 4722
Structural Variants 2306




8th Mar 2010 COSMIC v46 Release

COSMIC v46 Release

The second full-genome resequencing study from the CGP at the Sanger Institute, UK is now available, together with the curation of Parsons et al (2008), a systematic candidate gene screen of Glioblastomas. In addition, the published literature has been fully curated for fusion mutations between seven new gene pairs.

Full Genome resequencing of NCI-H209

The recent Pleasance et al (2010) publication "A small-cell lung cancer genome with complex signatures of tobacco exposure" (Nature 463, 184-190) is now available within COSMIC; please click here.

Systematic Screen Curation

The largest published candidate gene screen of Glioblastomas, Parsons et al (2008), is now curated in COSMIC; please click here:

An integrated genomic analysis of human glioblastoma multiforme. Parsons DW, Jones S, Zhang X, Lin JC, Leary RJ, Angenendt P, Mankoo P, Carter H, Siu IM, Gallia GL, Olivi A, McLendon R, Rasheed BA, Keir S, Nikolskaya T, Nikolsky Y, Busam DA, Tekleab H, Diaz LA, Hartigan J, Smith DR, Strausberg RL, Marie SK, Shinjo SM, Yan H, Riggins GJ, Bigner DD, Karchin R, Papadopoulos N, Parmigiani G, Vogelstein B, Velculescu VE, Kinzler KW Science. 2008;321;1807-12. PMID: 18772396 DOI: 10.1126/science.1164382

Statistics

Samples 105
Mutations 2449

Fusion mutations between 7 new gene pairs have been curated from the literature for this release.

FUS-ERG, FUS-FEV, FUS-ATF1 Both FUS-ERG and FUS-FEV fusions have been identified as alternatives to EWSR1-ETS transcription factor fusions in Ewing's sarcoma, and FUS-ERG also occurs in t(16;21) myeloid leukaemia as well as in these solid tumours. FUS-ATF1 is found in angiomatoid fibrous histiocytoma, where the fusion of the N-terminus of FUS and the DNA binding domain of ATF1 is similar to the EWSR1-ATF1 fusion found in clear cell sarcoma.

SS18-SSX1 This fusion is characteristic for synovial sarcoma along with SS18-SSX2 and more rarely, SS18-SSX4 fusions. Through its N-terminal SNH domain SS18 protein is involved in the remodelling of chromatin structures and functions as a transcriptional activator whereas SSX proteins have 2 putative transcription-repressor domains, one of which, an SSXRD domain in the C-terminal region, is preserved in the fusion protein.

SRGAP3-RAF1 This oncogenic fusion has been identified in paediatric pilocytic astrocytoma as an alternative to the previously described KIAA1549-BRAF fusion. It also activates the ERK/MAPK pathway; the auto-inhibitory domain of RAF1 being replaced by SRGAP3.

COL1A1-PDGFB This recurrent fusion characterizes dermatofibrosarcoma protuberans and its juvenile form, giant cell fibroblastoma. The fusion consistently deletes exon 1 of PDGFB, releasing this growth factor from its normal regulation. The breakpoints in COL1A1, which encodes an extracellular matrix protein, occur in various exons in the alpha-helical domain.

JAZF1-SUZ12 A fusion involving these two genes is common but not universal in endometrial stromal sarcomas, occurring less frequently in high-grade tumours. The genes encode novel proteins with zinc finger motifs and these are retained in the fusion.

The following curated genes have been updated in this release

ABL1, ACVR1B, AKT1, ALK, APC, ASXL1, ATM, BRAF, BRCA1, BRCA2, CBL, CDC73, CDH1, CDKN2A, CEBPA, CSF1R, CTNNA1, CTNNB1, CYLD, EGFR, EML4, ERBB2, ERG, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK2, JAK3, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, PDGFRA, PHOX2B, PIK3CA, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TNFAIP3, TSHR, VHL, WT1

COSMIC v46 Total Statistics


Experiments 2077858
Tumours 449676
Samples 451972
Mutant Samples 108773
Mutations 112256
Unique Mutations 19239
Papers curated 8911
Genes 18478
Fusions 4657
Structural Variants 2307




21st Jan 2010 COSMIC v45 Release

COSMIC v45 Release

The first full-genome resequencing study is now available, together with the genome-wide rearrangement screens of 24 breast tumours. In addition, five new cancer genes have been curated from the literature.

To make the data easier to investigate in depth, the website has been upgraded with new specialisation features, together with new views on mutation spectrum and distribution. Finally, we are introducing a new COSMIC Biomart, where all COSMIC's information will be available in this industry-standard data mining tool.

Full Genome resequencing of COLO-829

The recent Pleasance et al (2010) publication "A comprehensive catalogue of somatic mutations from a human cancer genome" (Nature 463, 191-196) is now available within COSMIC; please click here.

Whole-genome rearrangement screen of 24 Breast tumour samples:

Also, the CGP Stephens et al (2009) paper "Complex landscapes of somatic rearrangement in human breast cancer genomes" (Nature 462, 1005-1010) is now available in COSMIC; please click here. A paired-end genome-wide Illumina sequencing strategy revealed numerous rearrangements in very diverse patterns between the samples examined.

New genes curated from the scientific literature

GNAQ is the alpha subunit of one of the heterotrimeric GTP-binding proteins that mediate stimulation of protein kinase C signalling. Mutations in GNAQ, occurring at codon 209 in the catalytic domain, have been found as common and early mutational events in uveal melanomas.

TNFAIP3 is a negative regulator of the NF-kappa B pathway functioning through the removal of activating Lys63-linked ubiquitins and the Lys48-linked ubiquitination of receptor-interacting proteins. TNFAIP3 has been shown to be a genetic target in B-lineage lymphomas such as mucosa-associated lymphoma and Hodgkin's lymphoma of nodular sclerosing histology.

CBL encodes a protein with multiadaptor function and E3 ubiquitin ligase activity that targets a variety of tyrosine kinases for degradation. Mutations in CBL have been identified in myeloid malignancies, occurring in the critical linker and ring finger domains of the protein.

JAK3 is a member of the non-receptor tyrosine kinase family which includes JAK2. Rare but significant JAK3 activating mutations located in the JH2 (pseudokinase) and JH6 (receptor binding) domains have been found in Down syndrome and Non-DS acute megakaryoblastic leukaemia (AML-M7). Mutations have also been found in various myeloproliferative neoplasms, lymphomas and carcinomas.

NOTCH2 is a Type 1 transmembrane protein with an extracellular domain consisting of multiple epidermal growth factor-like (EGF) repeats, and an intracellular domain consisting of multiple different domain types. The Notch2 receptor and its 5 ligands, which include Jagged1, Jagged2, and Delta-like 1, 3 and 4, send signals that are important for development before birth. After birth, Notch2 signaling is involved in tissue repair. Mutations in the NOTCH2 gene have been identified in a small percentage of people with Alagille syndrome and malformations in the kidneys, especially in filtering structures. NOTCH2 is also preferentially expressed in mature B cells, is essential for marginal zone B-cell generation, and mutations are evident in a subset of individuals with diffuse large B-cell lymphomas.

Web site enhancements

The main histogram page of the COSMIC website has been improved to provide better ways of selecting and viewing subsets of data. In the navigation bar on the left side, new options are now available to redraw the histogram and associated tables based on four parameters: mutation type (e.g. deletion, nonsense substitution, etc.), sample source (cultured or tissue sample), somatic status (confirmed somatic or unknown) and systematic screen (genome-wide screen). In addition to redrawing the histogram and tables, a new "Distribution" button displays pie charts of relevant information about the data selected.

The sample summary page has also been upgraded, with every CGP sample (examined through numerous genes) receiving a mutation spectrum diagram. This comprises a histogram showing the relative frequencies of each substitution type, together with a count of insertion/deletion mutations. This is highly useful when looking for mutation signatures which may show characteristics of, for instance, tobacco or UV light exposure.
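
A substitution spectrum of this kind can be sketched as follows. This is a minimal illustration and not the code behind the COSMIC website: it assumes that substitutions are supplied as (reference base, mutant base) pairs and folds each one onto the pyrimidine strand, a common convention that yields six classes. Whether the COSMIC diagram uses these six classes is not stated here, and the function name substitution_spectrum and the input format are hypothetical.

```python
from collections import Counter

# Complement of each base, used to fold purine-reference changes
# (e.g. G>T) onto the equivalent pyrimidine-reference change (C>A).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
CLASSES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def substitution_spectrum(substitutions):
    """Return the relative frequency of each of the six substitution classes.

    `substitutions` is an iterable of (ref, alt) single-base pairs.
    """
    counts = Counter()
    for ref, alt in substitutions:
        ref, alt = ref.upper(), alt.upper()
        if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
            raise ValueError(f"not a single-base substitution: {ref}>{alt}")
        if ref in "AG":  # purine reference: report the complementary change
            ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
        counts[f"{ref}>{alt}"] += 1
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in CLASSES}


if __name__ == "__main__":
    example = [("C", "T"), ("G", "A"), ("C", "A"), ("T", "C")]  # made-up input
    for cls, freq in substitution_spectrum(example).items():
        print(f"{cls}: {freq:.2f}")
```

Folding complementary changes together means that a G>A change on one strand and a C>T change on the other are counted as the same event, which is what makes spectra comparable between samples.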

Biomart

The new COSMIC Biomart is now available; please click here. This system allows much more specialised selection of data in COSMIC and is very useful for data mining. In addition, it can be directly linked to Ensembl for federated querying across both databases.

The following curated genes have been updated in this release

JAK2, JAK3, MAP2K4, GNAS, MPL, SOCS1, WT1, CYLD, FBXW7, MEN1, NF1, RUNX1, ASXL1, NOTCH2, IDH1, IDH2, APC, CDH1, VHL, GNAQ, BRAF, HRAS, CEBPA, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RB1, RET, SMARCB1, AKT1, EGFR, ERBB2, CDKN2A, CBL, GATA1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, MSH6, PRKAR1A, KRAS, PIK3CA, MET, TNFAIP3

COSMIC v45 Total Statistics

Experiments 1654274
Tumours 434364
Samples 436577
Mutant Samples 101860
Mutations 105171
Unique Mutations 16788
Papers curated 8624
Genes 13634
Fusions 3635
Structural Variants 2249




4th Nov 2009 COSMIC v44 Release

COSMIC v44 Release

This release of COSMIC includes 4 new curated genes, 8 new curated fusion pairs and the TCGA systematic screen publication of 91 Glioblastoma tumour samples. In addition, a new CGP study is available (Adenoid cystic carcinoma) together with substantial updates to existing data.

New curated genes

IDH2 encodes a mitochondrial NADP(+)-dependent isocitrate dehydrogenase which catalyzes oxidative decarboxylation of isocitrate to alpha-ketoglutarate. It is now implicated in the pathogenesis of malignant gliomas and some secondary glioblastomas lacking IDH1 mutations have IDH2 mutations at the analogous amino acid (R172).

AKT1 encodes a serine-threonine protein kinase which is activated by phosphorylated phosphoinositides and is a central mediator of the PI3kinase signalling pathway. A common mutation (E17K) has been identified in the pleckstrin homology domain in cancers of the colon, breast, lung and ovary.

ASXL1 belongs to a family of proteins regulating chromatin remodelling. Originally implicated via aCGH on MDS/AML samples, its mutations are mainly frameshifts; the predicted truncated proteins lack the PHD finger domain, potentially compromising the function of the associated chromatin modifiers.

FOXL2, forkhead box L2 is a winged helix/forkhead transcription factor gene, encoding a nuclear protein that is specifically expressed in eyelids and in fetal and adult ovarian follicular cells. Germline mutations in FOXL2 are responsible for BPES - blepharophimosis ptosis epicanthus inversus syndrome - an autosomal dominant disorder consisting of eyelid abnormalities (only, in Type II) and ovarian failure (Type I). Somatic mutations have recently been described in ovarian granulosa cell tumours.

New curated gene fusion pairs:

The following gene fusions have been curated from the scientific literature:
EML4 / ALK
MSN / ALK
NPM1 / ALK
CLTC / ALK
SEC31A / ALK
RANBP2 / ALK
SS18 / SSX2
SS18 / SSX4

Systematic screen curation:

Comprehensive genomic characterization defines human glioblastoma genes and core pathways. The first systematic screen of the Cancer Genome Atlas Research Network (PMID 18772890) is now curated in COSMIC.

Comprehensive genomic characterization defines human glioblastoma genes and core pathways.
Cancer Genome Atlas Research Network
Nature. 2008;455;1061-8. PMID: 18772890 DOI: 10.1038/nature07385

Statistics


Glioblastoma Samples 91
Genes 599
Sequencing Experiments 54509
Mutations 662

New CGP resequencing study: Adenoid Cystic Carcinoma Candidate Gene Screen

Adenoid cystic carcinoma is a slow growing tumour of the secretory glands, arising most commonly in the salivary glands but also occurring in other parts of the body. As part of an ongoing research effort funded by the Adenoid Cystic Carcinoma Research Fund (www.accrf.org), 400 candidate genes (including genes implicated in cancer, cell signaling and growth control) were sequenced for small point mutations. This work was carried out on 25 samples (provided by ACCRF collaborative research group member Dr. Adel El-Naggar) utilising an approach of PCR product generation for the entire set of PCR amplimers followed by individual concatenation of all amplimers for each tumour and matching normal DNA sample, then sequencing this material utilising next generation sequencing. In total 8 somatic point mutations were identified in 8 genes. No highly prevalent point mutation was identified in this set of genes.

These curated genes have been updated in this release:

KRAS, PIK3CA, FGFR2, MET, ABL1, FGFR1, JAK2, MAP2K4, GNAS, EML4, FOXL2, PTCH1, MPL, SOCS1, HNF1A, WT1, NF2, CYLD, FBXW7, MEN1, NF1, RUNX1, IDH1, IDH2, ASXL1, FAM123B, APC, CDH1, SMAD4, VHL, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RB1, RET, SMARCB1, SUFU, ACVR1B, AKT1, ALK, ATM, EGFR, ERBB2, SRC, STK11, CDKN2A, GATA1, SMO, NOTCH1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, BRCA2, MLH1, MSH2, MSH6, PRKAR1A

COSMIC v44 Total Statistics

Experiments 1631186
Tumours 419018
Samples 421193
Mutant Samples 97932
Mutations 101138
Unique Mutations 16072
Papers curated 8336
Genes 13501
Fusions 3521
Structural Variants 40




26th Aug 2009 COSMIC v43 Release

COSMIC v43 Release

The COSMIC curation systems have been extended to encompass the entry of large-scale systematic screen papers. For this release, we have entered the first such paper, the Sjoblom et al (2006) screen of human breast and colorectal cancers. This release also contains two new genes successfully curated from the scientific literature (IDH1, SMARCA4) and the finalisation of two of the Cancer Genome Project's current resequencing studies.

Systematic Screen Papers Curated in COSMIC

For this release of COSMIC we have entered the Sjoblom et al (2006) systematic screen paper of human breast and colorectal cancers. An additional 8,648 genes have been added to COSMIC along with the 1,672 mutations from the paper. The COSMIC reference overview page for this publication is available here.

The consensus coding sequences of human breast and colorectal cancers. Sjoblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D, Leary RJ, Ptak J, Silliman N, Szabo S, Buckhaults P, Farrell C, Meeh P, Markowitz SD, Willis J, Dawson D, Willson JK, Gazdar AF, Hartigan J, Wu L, Liu C, Parmigiani G, Park BH, Bachman KE, Papadopoulos N, Vogelstein B, Kinzler KW, Velculescu VE. Science. 2006 Oct 13;314(5797):268-74. Epub 2006 Sep 7. PMID: 16959974

CGP resequencing studies completed

The resequencing of candidate genes in Pilot and Renal tumour sets has now been completed. The finalised studies examined 2978 samples through 4766 genes, discovering a total of 5437 mutations. All of these can be found in COSMIC's CGP Resequencing Studies Site.

New curated genes

IDH1 is an enzyme that catalyses the NADP+-dependent oxidative decarboxylation of isocitric acid. It plays an important role in the control of glucose-stimulated insulin secretion and in the cholesterol and fatty acid biosynthetic pathways. It was originally implicated in human cancer in genome-wide sequencing scans; when mutated, it is an indicator of longer patient survival.

SMARCA4 is a scaffold protein that forms a functional part of the SWI/SNF complex, which is involved in the control of transcription.

These curated genes have been updated this release

FBXW7, MEN1, NF1, BRAF, HRAS, CSF1R, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RET, SMARCB1, SUFU, ACVR1B, ATM, EGFR, ERBB2, SRC, CDKN2A, FAM123B, GATA1, SMO, NOTCH1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, BRCA2, APC, CDH1, SMAD4, VHL, TSHR, MLH1, MSH2, MSH6, SMARCA4, RUNX1, PHOX2B, GNAS, KRAS, PIK3CA, FGFR2, FGFR1, IDH1, JAK2, JAK3, MAP2K4, TET2, PRKAR1A, CDC73, PTCH1, MPL, CTNNA1, SOCS1, HNF1A, WT1, ERG, NF2

COSMIC v43 Total Statistics


Experiments 1506545
Tumours 366477
Samples 368592
Mutant Samples 85749
Mutations 88727
Unique Mutations 14971
Papers curated 7797
Genes 13423
Fusions 2770
Structural Variants 40




28th May 2009 COSMIC v42 Release

COSMIC v42 Release

For this release of COSMIC two known cancer genes (GNAS and ALK) and 3 gene fusions (FCHSD1 / BRAF, KIAA1549 / BRAF, EWSR1 / NR4A3) have been successfully curated from the scientific literature. The Cancer Cell Line Project has also been updated with the addition of 80 mutations.


Cancer Cell Line Project Update

The Cancer Cell Line Data has been updated with the addition of 80 mutations. The project has also published a further set of variants identified by the screen, which have been classified as Tentatively Oncogenic Variants (TOV) or Unknown Variants (UV). These variants are currently available from our website as an Excel file.

Curation of known cancer genes ALK and GNAS

Two further cancer genes have been curated with the addition of 95 mutations for ALK and 235 mutations for GNAS.

Curation of gene fusions

The following gene fusions have been curated from the scientific literature:
FCHSD1 / BRAF
KIAA1549 / BRAF
EWSR1 / NR4A3

Genes updated: KRAS, PIK3CA, ABL1, FGFR1, JAK2, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, RB1, SUFU, ERBB2, SRC, STK11, CDKN2A, GATA1, SMO, NPM1, PTPN11, NRAS, BRCA2, MLH1, MSH2, MSH6, APC, CDH1, SMAD4, MET, EGFR, FLT3, PTCH1, MPL, WT1, CYLD, FBXW7, NF1, ALK, FGFR3, RET, NOTCH1, NF2, GNAS

COSMIC v42 Total Statistics


Experiments 1111579
Tumours 339481
Samples 341522
Mutant Samples 76132
Mutations 78933
Unique Mutations 12905
Papers curated 7386
Genes 4775
Fusions 2424
Structural Variants 40




4th Mar 2009 COSMIC v41 release

COSMIC v41 release

This release of COSMIC comprises an update of published data in which 44 genes have been updated with the addition of 22516 samples and a further 7387 mutations.


Gene Update

STK11, CDKN2A, GATA1, SMO, NPM1, PTPN11, NRAS, BRCA2, MSH2, KRAS, PIK3CA, JAK2, MAP2K4, BRAF, HRAS, CEBPA, CTNNB1, KIT, PDGFRA, PTEN, RB1, ATM, ERBB2, FBXW7, NF1, FAM123B, APC, CDH1, VHL, MET, EGFR, FLT3, PTCH1, MPL, SOCS1, HNF1A, WT1, CYLD, FGFR3, RET, RUNX1, TSHR, PHOX2B, NOTCH1.


COSMIC v41 Total Statistics

Experiments 1078748
Tumours 313780
Samples 315778
Mutant Samples 70086
Mutations 72718
Unique Mutations 12349
Papers curated 6876
Genes 4773
Fusions 2266
Structural Variants 40




26th Nov 2008 COSMIC release 40

COSMIC release 40

This release of COSMIC comprises an update of the existing genes totalling almost 3000 new mutations.

Gene Update
2947 new mutations have been added in release 40; the following curated genes have been updated: BRAF, HRAS, CEBPA, CTNNB1, KIT, PDGFRA, PTEN, ERBB2, MLH1, MSH2, KRAS, PIK3CA, JAK2, CDKN2A, GATA1, NPM1, PTPN11, NRAS, FAM123B, APC, VHL, MET, EGFR, FLT3, PTCH1, MPL, HNF1A, FBXW7, PRKAR1A, RUNX1, FGFR3, RET, TSHR, NOTCH1

Cancer Gene Census

On 5th November, the Cancer Gene Census was brought up to date with the addition of three genes newly identified in the causation of cancer: IDH1, MDM2 and KIAA1549.

COSMIC v40 Total Statistics

Experiments 1050480
Tumours 291551
Samples 293262
Mutant Samples 62930
Mutations 65331
Unique Mutations 11789
Papers curated 6486
Genes 4773
Fusions 2266
Structural Variants 40




15th Oct 2008 COSMIC release 39, Annotating Cancer Genomes

COSMIC release 39, Annotating Cancer Genomes

For this release of COSMIC the database and web interfaces have been upgraded to handle Next Generation Sequencing Data. This is part of ongoing work to allow COSMIC to handle the increased volumes and complexity of somatic data that is anticipated from Next Generation Sequencers. In particular, for this release we have concentrated on adapting COSMIC to handle large-scale structural variants (including translocations, large insertions/deletions, inversions, and duplications).

The structural variants from the Campbell et al. 2008 paper, which comprehensively characterizes 2 lung cancer cell lines, have been entered into COSMIC (click here for study overview). Sample Summary pages are available for both cancer cell lines (NCI-H2171 and NCI-H1770).

Circular plots (Circos plots, developed by Martin Krzywinski) have been added to the sample overview page. These give a clear overview of all the structural variants, along with copy number changes and COSMIC point mutations, for a particular sample (Figure 1). More detailed views of complex rearrangements are available on the mutation details page.


Figure 1. Circos Plot showing structural variants in relation to copy number and COSMIC Point Mutations.

Tabular views and exports are also available for these data (Figure 2). Due to the complexity of these rearrangements, a short descriptive term for the variant is given where possible (e.g. deletion, tandem duplication, translocation). The variant is also fully described using HGVS mutation nomenclature. For example, in chr11:g.36585230_76606619del, chr11: denotes the chromosome involved, g. indicates genomic coordinates, 36585230 is the deletion start point, 76606619 is the deletion end point, and del indicates a deletion event.
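As an illustration of how such a description breaks down into its parts, the following is a minimal Python sketch that parses only the simple deletion form shown in the example. The names GenomicDeletion, parse_genomic_deletion and deleted_length are hypothetical and are not part of COSMIC or of any HGVS library; the sketch also assumes that the start and end points are both included in the deleted interval.

```python
import re
from typing import NamedTuple


class GenomicDeletion(NamedTuple):
    """Hypothetical container for the parts of a genomic deletion description."""
    chromosome: str
    start: int
    end: int


# Matches only the simple form used in the example: chr<N>:g.<start>_<end>del
_DELETION_PATTERN = re.compile(
    r"^chr(?P<chrom>[0-9]+|[XY]):g\.(?P<start>[0-9]+)_(?P<end>[0-9]+)del$"
)


def parse_genomic_deletion(description: str) -> GenomicDeletion:
    """Split a description such as chr11:g.36585230_76606619del into its parts."""
    match = _DELETION_PATTERN.match(description)
    if match is None:
        raise ValueError(f"not a simple genomic deletion: {description!r}")
    start = int(match["start"])
    end = int(match["end"])
    if start > end:
        raise ValueError("deletion start point lies after its end point")
    return GenomicDeletion(match["chrom"], start, end)


def deleted_length(deletion: GenomicDeletion) -> int:
    """Number of bases removed, assuming both end points are inclusive."""
    return deletion.end - deletion.start + 1


if __name__ == "__main__":
    deletion = parse_genomic_deletion("chr11:g.36585230_76606619del")
    print(deletion.chromosome, deletion.start, deletion.end)
    print(deleted_length(deletion))
```

Other variant types (tandem duplications, translocations, inversions) use different description forms and are deliberately rejected by this sketch rather than guessed at.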


Figure 2. Summary Structural Variants Table


Bioinformatics Primer on COSMIC published

The NCI/Nature Pathway Interaction Database Primer on COSMIC has been published and is available from here.


Update of the Cancer Gene Census

The Cancer Gene Census was updated on 11th August 2008. The Census now contains information on 379 genes, of which 343 harbour somatic alterations and 70 harbour germline alterations.


COSMIC v39 General Statistics

Experiments 1035943
Tumours 281307
Samples 282777
Mutant Samples 60007
Mutations 62352
Unique Mutations 11642
Papers curated 6168
Genes 4773
Fusions 2266
Structural Variants 40




3rd Jul 2008 COSMIC release 38

COSMIC release 38

For this release of COSMIC we have concentrated our efforts on significantly updating the following genes: BRAF, HRAS, CTNNB1, KIT, PDGFRA, PTEN, RB1, ERBB2, MAP2K4, CDKN2A, GATA1, SMO, NPM1, NRAS, MLH1, MSH2, KRAS, PIK3CA, JAK2, APC, SMAD4, EGFR, FLT3, PTCH1, MPL, HNF1A, FBXW7, NF1, FGFR3, RET, NF2, NOTCH1.

External links

In collaboration with the HUGO Gene Nomenclature Committee (HGNC) and the Atlas of Genetics and Cytogenetics in Oncology and Haematology (Atlas Genetics Oncology), links are now available from COSMIC's gene summary page to further information at these resources.



A Current Protocol for COSMIC

An article describing COSMIC, its contents and usage, has been published in Current Protocols in Human Genetics, unit 10.11. Describing in detail how the website and exported datasheets may be used and interpreted, this is available at the Wiley Interscience website.



COSMIC v38 total statistics


Experiments 1019304
Tumours 268938
Samples 270095
Mutant Samples 56918
Mutations 59187
Unique Mutations 11400
Papers curated 5902
Genes 4773
Fusions 2266




7th May 2008 COSMIC release 37

COSMIC release 37

This month's release extends our complete curation of oncogenic EWSR1 fusion partners, together with two new curated genes, PHOX2B & PRKAR1A. CGP's resequencing studies and cell line projects are also significantly updated, each receiving over 100 new mutations. In total, over 1200 new mutations have been added to COSMIC this release.


Curated genes

PHOX2B This gene encodes a highly conserved homeobox transcription factor known to cause congenital central hypoventilation syndrome with associated neuroblastoma.

PRKAR1A This is a regulatory subunit of the cAMP dependent protein kinase holoenzyme. An apparent tumour suppressor gene, it has also been observed to be oncogenic in fusions with RET and RARA.


Gene      Unique Samples   Samples   Experiments   Mutants   Papers   Mutations   Unique Mutations
PHOX2B    410              410       411           6         4        6           5
PRKAR1A   232              232       233           7         5        7           7


Curated EWSR1 fusions

EWSR1/ETV4; EWSR1/FEV; EWSR1/PATZ1; EWSR1/PBX1; EWSR1/POU5F1; EWSR1/ZNF384

EWSR1 has been observed in oncogenic gene fusions with over 15 partners. This month we release our curation of the literature describing its fusion with a further six partners, bringing the total to 14.

Genes            Unique Breakpoints   Mutations   Unique Fusions   Papers   Mutant Samples
EWSR1 / ETV4     3                    6           4                2        3
EWSR1 / FEV      5                    10          4                4        5
EWSR1 / PATZ1    1                    2           2                1        1
EWSR1 / PBX1     1                    3           3                1        1
EWSR1 / POU5F1   5                    10          4                2        5
EWSR1 / ZNF384   2                    4           4                1        2

The following curated genes have received significant updates: BRAF, HRAS, KIT, PTEN, RB1, SMARCB1, ERBB2, STK11, CDKN2A, PTPN11, NRAS, BRCA2, MLH1, MSH2, KRAS, PIK3CA, JAK2, APC, VHL, MSH6, MET, EGFR, MPL, FBXW7, PRKAR1A, RET, RUNX1, NOTCH1, NF2, PHOX2B.



COSMIC v37 total statistics

Experiments 1006553
Tumours 258584
Samples 259684
Mutant Samples 53569
Mutations 55779
Unique Mutations 11207
Papers curated 5706
Genes 4773
Fusions 2249




5th Mar 2008 COSMIC release 36

COSMIC release 36

The March 2008 release of COSMIC contains full curation of the TSHR gene together with a further 6 EWSR1 gene fusion pairs.

Curated genes

TSHR - Thyroid stimulating hormone receptor is a 7-TM cell surface receptor expressed in follicular thyroid cells. Upon binding of its ligand, thyrotropin, a signalling cascade is commenced resulting in a range of transcriptional alterations. Somatic mutations in this gene have been described in thyroid adenomas and carcinomas.

Gene   Samples   Experiments   Mutants   Papers   Mutations   Unique Mutations
TSHR   665       669           210       36       210         61

Curated Fusions

EWSR1/ATF1 ; EWSR1/CREB1 ; EWSR1/DDIT3 ; EWSR1/ETV1 ; EWSR1/SP3 ; EWSR1/WT1
EWSR1 is fused to multiple partner genes via recurrent chromosomal translocation, primarily in Ewing sarcoma. We are currently curating the complete mutation data for this gene, which has so far been found fused with over 10 partners. Having already released our curation of EWSR1 with ERG & FLI1, we now release the data for six more gene partners.


Genes           Mutant Samples   Mutations   Unique fusions   Papers
EWSR1 / ATF1    72               175         16               17
EWSR1 / CREB1   24               36          5                3
EWSR1 / DDIT3   11               22          7                6
EWSR1 / ETV1    4                7           3                3
EWSR1 / SP3     1                3           3                1
EWSR1 / WT1     102              198         22               28

The following curated genes have received significant updates:
BRAF, BRCA1, BRCA2, CDH1, CDKN2A, CEBPA, EGFR, ERBB2, FLT3, HRAS, KRAS, MLH1, MSH2, MSH6, NF2, NRAS, PDGFRA, PTEN, SMARCB1, STK11, TSHR, VHL


COSMIC v36 total statistics

Experiments 1000842
Tumours 254673
Samples 255767
Mutant Samples 52343
Mutations 54519
Unique Mutations 10995
Papers curated 5614
Genes 4772
Fusions 2174




16th Jan 2008 COSMIC release 35

COSMIC release 35

This release of COSMIC contains the curation of four new tumour suppressor genes, and further curation of EWSR1/FLI1 gene fusions in Ewing's sarcoma. We also announce a significant upgrade to the CGP Trace Archive, which is now updated daily with our latest sequencing results.



Literature Curation



MLH1

MLH1 is a tumour suppressor gene, involved in mismatch repair. The encoded protein is a subunit of the large 'BRCA1-associated genome surveillance complex' (BASC) involved in DNA damage detection and repair. This particular subunit dimerises with PMS2 to provide endonuclease capacity within the complex. MLH1 germline mutations give rise to HNPCC (hereditary non-polyposis colorectal cancer). Somatic mutations in this gene are important in sporadic colorectal cancers. Mutations of MLH1 lead to a mutator phenotype often manifested by microsatellite instability.



MSH2

MSH2 is a tumour suppressor gene, also involved in mismatch repair. It resides within the 'BRCA1-associated genome surveillance complex' (BASC) which detects and repairs DNA damage. MSH2, in complex with MSH6, forms a sliding clamp which traverses the DNA backbone detecting mismatched bases. MSH2 germline mutations also give rise to HNPCC. Similar to MLH1, somatic mutations in MSH2 are found predominantly in colorectal cancers. Mutations of MSH2 lead to a mutator phenotype often manifested by microsatellite instability.



CDC73

CDC73 (HRPT2) is a tumour suppressor forming part of the PAF protein complex, which is associated with RNA polymerase II and may therefore be involved in both initiation of RNA synthesis and RNA elongation. Mutations in this gene have been identified in tumours of the parathyroid, most often causing the endocrine disorder hyperparathyroidism (with or without jaw tumour).



MAP2K4

MAP2K4 is one part of the mitogen-activated protein kinase (MAPK) pathway, a signal transduction cascade which mediates certain extracellular signals via RAS/RAF resulting in transcriptional control of a wide range of genes. The MAP2K family of peptides regulate MAPK activity by phosphorylation. MAP2K4 mutations appear involved in many tumour types.



Gene     Samples   Experiments   Mutations   Unique Mutations   Papers
MLH1     1328      1325          44          38                 25
MSH2     1306      1304          36          33                 23
CDC73    278       272           39          32                 11
MAP2K4   1557      1559          22          19                 9


EWSR1/FLI1 Gene fusions

Ewing's sarcoma is a rare bone tumour, infrequently of extraskeletal origin, most frequently occurring in teenage children. The majority of these tumours contain a t(11;22)(q24;q12) translocation which fuses the EWSR1 gene on chromosome 22 with the FLI1 gene on chromosome 11. We have now curated the existing literature describing fusions between this gene pair.



Genes        Mutant Samples   Papers   Unique Mutations
EWSR1/FLI1   1133             115      28


The following curated genes have been updated for this release: CDKN2A, PTPN11, NRAS, MLH1, MSH2, KRAS, JAK2, MAP2K4, BRAF, HRAS, CTNNB1, MEN1, NF1, APC, VHL, EGFR, FLT3, PTCH, MPL, WT1, RET, CDC73, RUNX1, EWSR1, FLI1.



Web site upgrade

Genomic co-ordinates for individual mutations are now available in the data export section, together with the datasheets in the FTP site.



CGP Trace Archive

The CGP trace archive has been updated to contain all the sequencing traces used in our analysis of the samples and genes presented in the CGP Resequencing project (COSMIC red pages). The number of traces available for download is now approaching 9.5 million. The Archive itself has also been upgraded, so that it receives daily updates of CGP sequencing traces as they pass through our sequencing pipeline. Daily updates are available as separate files; these will be integrated into the main download files once per week.

Samples with trace data 276
Total number of traces available 9465645


COSMIC Statistics


Experiments 991743
Tumours 250869
Samples 251847
Mutant Samples 50949
Mutations 53098
Unique Mutations 10779
Papers curated 5449
Genes 4763
Fusions 1957





8th Nov 2007 COSMIC 34

COSMIC 34

This release of COSMIC includes the addition of BRCA1, BRCA2 and the EWSR1/ERG gene fusion from the scientific literature. The website has been enhanced with an update of old gene names and the addition of further links (NCBI Entrez Gene, CCDS, Swiss-Prot and TrEMBL). The CGP Trace and Genotype Archive, holding the group's sequence traces and genotype data, is also now available.

Literature Curation



BRCA1 and BRCA2

BRCA1 and BRCA2 are tumour suppressor genes initially identified as inherited cancer susceptibility genes for breast and ovarian cancer. Both proteins have been shown to have roles in genome surveillance, detection of DNA damage and its subsequent repair. However, they associate with different DNA repair complexes and generate different tumour histologies and spectra. Somatic mutations of either gene are rare, with BRCA2 being more frequently found to have somatic mutations, particularly in ovarian and pancreatic carcinomas.

We report that mutations in these two genes have been discovered at fairly low frequencies (2-3%), with BRCA2 mutated in a wider tissue range than BRCA1.

Gene    Unique Samples   Samples   Experiments   Mutant Samples   Papers   Mutations   Unique Mutations
BRCA1   1106             1106      1114          25               22       25          23
BRCA2   1142             1146      1145          29               16       33          29


EWSR1/ERG fusion

Fusions of EWSR1 and ERG are common events in skeletal (and the rarer extraskeletal) Ewing's Sarcoma. These fusions, found at a frequency of approximately 10% in bone tumours, result from complex rearrangements, since the two partner genes are not transcribed in the same chromosomal direction.

Genes       Mutant Samples   Papers   Unique Mutations
EWSR1/ERG   77               49       11


COSMIC Data Updates

The CGP Resequencing screens and the following curated genes have received updates: BRAF, HRAS, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, ACVR1B, ATM, ERBB2, BRCA1, BRCA2, KRAS, PIK3CA, FGFR2, ABL1, FGFR1, JAK2, SRC, STK11, CDKN2A, PTPN11, NRAS, FAM123B, APC, SMAD4, VHL, MSH6, MET, EGFR, FLT3, FBXW7, MEN1, NF1, RUNX1, FGFR3, RET.



CGP Trace and Genotype Archive

The group's sequence traces and genotype data are now available from the CGP Trace and Genotype Archive site. In order to access the data, a Data Transfer Agreement must be completed and approved. A unique username and password will then be provided to access this resource.

Samples with trace data 276
Samples with genotyping data 1,135
Total number of traces 7,254,445


Gene Name Update

244 genes had their names updated (5.2%). It is still possible to search by the old gene name.



Website Upgrades

Several external gene links have been added to the gene summary page, including links to NCBI Entrez Gene, CCDS, Swiss-Prot and TrEMBL.

The sample summary page now also contains sample source information.



General Statistics


Experiments 984673
Tumours 246369
Mutant Samples 50032
Mutations 52146
Unique Mutations 10533
Papers curated 5271
Genes 4762
Fusions 685





5th Sep 2007 COSMIC 33: Improved CGP data release

COSMIC 33: Improved CGP data release

The WTSI Cancer Genome Project (CGP) announces an updated data release policy. We will now be releasing confirmed somatic mutations on a bi-monthly basis. Confirmed and annotated somatic mutations identified in the previous two months will be released in COSMIC, continuing on at two-monthly intervals. Data will still appear within current COSMIC architecture of gene family/gene set and under appropriate studies. This new policy will result in expedited pre-publication release of curated somatic mutations as they are identified.

This new data will be available in the COSMIC blue pages, but will be most noticeable in COSMIC's CGP resequencing studies site (red pages), as this distinguishes CGP data from the literature curation.

CGP resequencing data is broadly divided (in the red pages) into 3 categories, 'Kinase', 'Pilot' and a new project, 'Renal'. Whilst the Kinase data is completed and published, the other two studies are much larger and still in progress. A collection of approximately 4000 genes has been selected for resequencing in a set of 40 matched pair cell lines ('Pilot' project) and 96 primary clear cell renal cancers. Each tumour sample in these projects has a matched normal sample, which allows the distinction of somatic mutations from germline variants. The pilot project currently comprises 1865 somatic sequence changes, whilst the Renal project, although less advanced than the Pilot, has identified 84 mutations to date. These will be automatically updated with all our confirmed data every bimonthly release.
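The role of the matched normal sample can be pictured with a small Python sketch. This is only an illustration under simplifying assumptions, not the CGP analysis pipeline: variants are reduced to hypothetical (chromosome, position, reference, alternative) tuples, the function name somatic_variants is invented for this example, and the positions shown are made up.

```python
from typing import Iterable, Set, Tuple

# Hypothetical variant representation: (chromosome, position, reference, alternative)
Variant = Tuple[str, int, str, str]


def somatic_variants(tumour: Iterable[Variant], normal: Iterable[Variant]) -> Set[Variant]:
    """Keep variants seen in the tumour but absent from the matched normal.

    Anything also present in the normal sample is treated as germline
    and therefore excluded.
    """
    return set(tumour) - set(normal)


if __name__ == "__main__":
    # Made-up example calls for one tumour/normal pair
    tumour_calls = [("7", 1000, "A", "T"), ("12", 2000, "C", "T"), ("17", 3000, "G", "A")]
    normal_calls = [("12", 2000, "C", "T")]
    for variant in sorted(somatic_variants(tumour_calls, normal_calls)):
        print(variant)
```

In this toy case the chromosome 12 variant appears in both samples and is discarded as germline, leaving two candidate somatic changes.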



Literature curation


RUNX1 (AML1) has been fully curated

RUNX1 is one subunit of the PEBP2 transcription factor, binding to DNA at enhancer sequences. This gene is one of the most frequent targets of chromosome translocations associated with leukaemia. Small somatic mutations have also been observed, most frequently in myeloblastic leukaemia types (acute myeloblastic leukaemia, myelodysplastic syndrome), and it is these that we have curated in COSMIC. Our data suggest a somatic mutation rate of approximately 10% in this phenotype.


Curated Gene Update

The following curated genes have received updates from the literature: APC, ATM, BRAF, CDH1, CDKN2A, CTNNA1, CTNNB1, CYLD, EGFR, ERBB2, ERG, ETV1, FBXW7, FGFR3, FLT3, GATA1, HRAS, JAK2, KIT, KRAS, MADH4, MPL, MSH6, NF1, NF2, NOTCH1, NPM1, NRAS, PIK3CA, PTCH, PTEN, PTPN11, RB1, RET, SMARCB1, SMO, SOCS1, STK11, SUFU, TMPRSS2, VHL, WT1, WTX.


General Statistics

This release includes 1563 new mutations identified in the set of 4799 genes; 1495 genes are new this month.

Experiments 968416
Tumours 239766
Mutant Samples 48959
Mutations 51054
Unique Mutations 10390
Papers curated 5103
Genes 4799
Fusions 445
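The counts of new mutations and new genes quoted for this release can be cross-checked against the totals published for v32 (49491 mutations, 3304 genes) and v33 (51054 mutations, 4799 genes). The short Python sketch below does this subtraction; the function name release_delta is hypothetical and used only for illustration.

```python
from typing import Dict


def release_delta(previous: Dict[str, int], current: Dict[str, int]) -> Dict[str, int]:
    """Difference between two releases for every statistic present in both."""
    return {name: current[name] - previous[name] for name in current if name in previous}


# Totals taken from the General Statistics of the two releases
v32_totals = {"Mutations": 49491, "Genes": 3304}
v33_totals = {"Mutations": 51054, "Genes": 4799}

if __name__ == "__main__":
    delta = release_delta(v32_totals, v33_totals)
    print(delta["Mutations"])  # 1563 new mutations
    print(delta["Genes"])      # 1495 new genes
```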





8th Aug 2007 COSMIC v32

COSMIC v32

This release includes four new tumour suppressor genes and improved availability in Ensembl.

New external integration: Ensembl

We are continually striving to improve the utility of the data in COSMIC by integrating it closely with external resources. In this release, we provide a much closer integration with the Ensembl genome browser than previously. All our gene & mutation data now have location coordinates on the NCBI36 genome sequence, allowing us to use Ensembl "DAS" technology to display this information within their genome browser, aligned with their standard genome annotations. We have made this easily available, via a single link from our pages.






Literature curation

Four new tumour suppressor genes have been introduced to COSMIC this month, all receiving full literature curation of their somatic mutation data.

NF1

Neurofibromatosis is a familial disease with a complex phenotype including tumours of the central nervous system, caused by mutations in the NF1 tumour suppressor gene. Somatic mutations in tumours have also been identified in this gene, and it is these that we have fully curated.

NF2

The central form of neurofibromatosis is a similar familial central nervous system tumour syndrome, caused by mutations in the NF2 tumour suppressor gene. Somatic mutations in tumours have also been identified in this gene, and it is these that we have fully curated.

SOCS1

SOCS1 downregulates cellular cytokine signalling by its direct interaction with JAK1. It was first implicated in cancer after aberrant methylation was observed to abolish its activity, causing hepatocellular carcinoma. Somatic mutations which inactivate this tumour suppressor have also been observed, and these have been curated.

TCF1

TCF1 binds to the promoters of several (largely liver-specific) genes, to enhance their expression. Somatic and germline mutations in this gene have been found which cause liver adenomas, and we have curated the somatic component.



The following curated genes have received updates from the scientific literature: KRAS, PIK3CA, JAK2, BRAF, HRAS, KIT, PDGFRA, PTEN, CDKN2A, VHL, EGFR, FBXW7, MEN1, RET



General Statistics for this release


Experiments 521624
Tumours 235207
Mutant Samples 47470
Mutations 49491
Unique Mutations 9699
Papers curated 5053
Genes 3304
Fusions 445





27th Jul 2007 COSMIC (v31) now includes Gene Fusion Data

COSMIC (v31) now includes Gene Fusion Data

The CGP COSMIC team is pleased to announce the addition of gene fusion/translocation somatic mutation data from the literature to the database. Currently, the census of known cancer genes is dominated by somatically generated fusion genes that have been identified primarily in leukaemias, lymphomas and soft tissue tumours. Until now, we have concentrated on curating somatically point mutated cancer genes for COSMIC. Almost all known cancer genes that have somatic point mutations are, however, now curated in COSMIC. In the coming months we will therefore be searching the scientific literature and annotating genes involved in gene fusions and their partners for addition into the COSMIC database.

We have launched this new facility, complete with new views for this data type, with the curation of TMPRSS2, a gene frequently found to be fused to ETS family transcription factors in adenocarcinoma of the prostate. These mutations have served to spur increased investigation into the potential role of fusion genes in adult solid tumours. The move to curate fusion genes is an important addition and will further enhance COSMIC as the most comprehensive source for somatic mutation data from human cancers.



Fusion Gene Pairs



TMPRSS2/ETV1

TMPRSS2/ERG



Website Upgrades

The fusion data has been integrated into existing pages and overviewed in new pages: Translocations Overview and Translocations Summary.

This new data can be viewed graphically and textually.

FusionImage1

The image above shows the table of inferred breakpoints (determined from a sample's observed fusion mRNA spectrum) for a fusion gene pair.

FusionImage2

The image above shows a graphical representation of the observed mRNA transcripts from which the inferred breakpoints are calculated.

Further information on the new gene fusion website features is available in the help pages.



Genes from Literature Curation

A new homepage has been created for genes which have received full curation of the scientific literature. This page distinguishes these genes from those in CGP's data release, for which no literature has been curated.

Curated Gene Update

The following curated genes have also received updates from the scientific literature: CDKN2A, GATA1, NOTCH1, NPM1, NRAS, JAK2, KRAS, PIK3CA, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, RB1, MET, EGFR, FLT3, WT1, APC, MADH4, FBXW7, FGFR3.



General Statistics for this release


Experiments 515535
Tumours 230057
Mutant Samples 46978
Mutations 48911
Unique Mutations 9014
Papers curated 4938
Genes 3302
Fusions 438




6th Jun 2007 COSMIC v30

COSMIC v30

Today we release full literature curations of five tumour suppressor genes MEN1, ATM, CYLD, FBXW7, WTX; 4712 samples were examined in 112 papers, recording 468 mutations. Additionally, we release two new CGP resequencing studies which add a further 91 new genes to COSMIC.



Literature Curation

Curation of the scientific literature has been completed for five new genes from the cancer census. All five genes are tumour suppressors, causing phenotypes via their inactivation:



MEN1 (Multiple Endocrine Neoplasia Type 1)

Somatic mutations in this gene have been found in tumours from several endocrine sites, recapitulating those seen in patients carrying germline mutations including tumours in the pituitary, pancreas and parathyroid. MEN1 encodes a nuclear protein thought to be a transcriptional regulator.



CYLD (Cylindromatosis)

This gene has been found to have mutations in sporadic cylindromas, tumours arising from skin adnexal structures (such as hair follicles and glands), principally on the face and scalp. CYLD encodes a deubiquitinating enzyme regulating cell signalling including the NF-kappaB pathway.



FBXW7 (CDC4)

Mutations inactivating FBXW7 have been found in a range of cancer types including colorectal, ovarian and T-ALL. The protein is involved in targeting a number of key proteins, including NOTCH1 and MYC, for ubiquitin-mediated degradation.



ATM

This gene encodes a protein kinase involved in cell cycle checkpoint control. Amongst other key cell cycle components, it has been shown to phosphorylate TP53 and CHEK2 in response to DNA damage. Germline mutations cause Ataxia-telangiectasia (AT), a recessive disorder characterized by cerebellar ataxia, telangiectases, immune defects, and a predisposition to malignancy, primarily lymphoid in origin.



WTX (FAM123B)

Recently discovered, WTX is inactivated in approximately 30% of Wilms Tumours. Located on the X chromosome, this tumour suppressor only requires a 'single-hit' for tumourigenic inactivation.



New tumour suppressor gene statistics:


Gene Samples Experiments Mutations Papers
MEN1 1680 1683 196 66
ATM 1714 1692 198 33
FBXW7 1207 1204 60 10
WTX 82 82 7 1
CYLD 29 29 7 2


The following curated genes have also received updates: BRAF, HRAS, CEBPA, CTNNB1, KIT, PDGFRA, PTEN, SMARCB1, ERBB2, JAK2, CDKN2A, PTPN11, NRAS, KRAS, PIK3CA, APC, CDH1, MADH4, EGFR, FLT3, MPL, WT1, FGFR3



CGP resequencing studies

91 new genes have been examined in our pilot set of matched pair cell lines, resulting in the discovery of 22 new mutations:

Study Genes Experiments Samples Mutations
Integrin alpha family 16 640 40 11
Miscellaneous genes of interest from literature sources 75 3000 40 11


General Statistics for this release


Experiments 499958
Tumours 217944
Mutant Samples 44491
Mutations 46364
Unique Mutations 8855
Papers curated 4794
Genes 3302




9th May 2007 COSMIC v29 released

COSMIC v29 released

COSMIC release 29 includes 22 new CGP resequencing studies, comprising 567 new genes within which 192 new mutations have been identified. Additional updates to our curation of the scientific literature have also been included, adding a total of 1041 mutations to this release.



CGP Resequencing Studies



567 genes have been examined in our pilot set of matched pair cell lines:



2
Study Genes Mutations
PAX transcription factor family 11 5
Tripartite motif-containing protein family 56 28
Genes on APC/CTNNB1 pathway 59 25
FK506/rapamycin binding protein family 26 5
Diacylglycerol kinases and other lipid kinases 18 17
SMAD protein family 24 7
Histone acetyltransferase 7 2
Dual specificity phosphatases 23 2
Genes associated with ERB family of RTKs 8 3
Genes associated with MYC proteins 21 12
Ubiquitin specific peptidase family 50 16
C-X-C/C-C motif chemokine receptor genes 19 8
Essential For Cell Division - derived from a siRNA screen in human cells 21 5
Genes from RNAi TSG gene screen 5
Glycolysis associated genes 23 4
Integrin beta family 8 8
Small ubiquitin-like modifier (SUMO) protein family 14 2
14_3_3 family of scaffold protein 8 1
STAT and SOCS gene families 43 7
Serpin/TIMP peptidase inhibitor families 46 17
Sorting NeXin family 27 3
Genes associated with RAS proteins 53 13




Literature curation



89 new publications have been curated, updating the information for the following genes: JAK2, HRAS, CEBPA, PTEN, RB1, RET, ERBB2, CDKN2A, GATA1, NRAS, KRAS, PIK3CA, EGFR, CTNNA1, APC, CDH1, MADH4



General Statistics for this release


Experiments 482902
Tumours 206972
Mutant Samples 42266
Mutations 44062
Unique Mutations 8420
Papers curated 4515
Genes 3220



4th Apr 2007 COSMIC v28 Released

This month's COSMIC release substantially expands the CGP resequencing data, adding 1033 new genes to the system, together with updates to the scientific literature curation.

CGP Resequencing Studies


26 new studies have been included in this release, containing 1033 new genes which have been examined through the pilot matched pair cell line set.

Study Genes Mutations
ADAM metallopeptidase family 40 27
Cyclins and Genes associated with RB 68 20
Nfkappa signalling family 58 14
Phospholipase C Family 13 16
Protein Kinase anchor proteins 32 15
Ral Guanine nucleotide dissociation factors 6 3
Hypoxia inducible factor pathway 23 11
Ser/Thr Phosphatases (PPP) 69 17
Integrin Binding proteins 27 1
K homology RNA-binding domain, type I 25 9
Cytochrome C oxidase family 24 3
DNA methylation and histone deacetylation 38 10
Heat shock proteins 81 20
Ets transcription factor family 28 5
High Mobility Group proteins 24 2
Immediate early/regulator of G-protein signalling family 25 5
Kallikrein protease family 16 5
Matrix metallopeptidase family 21 7
Genes implicated in stem cell regulation 63 12
TCA cycle genes 56 16
Forkhead transcription factor family 43 11
TP53 responsive genes 76 22
Ubiquitination pathway genes 63 21
Ubiquitin Ligases 72 36
DEAD Box proteins 60 25
Genes associated with TP53 and targets 47 16


Curated Gene Update


The following fully curated genes also received minor updates: APC, BRAF, CDKN2A, CTNNB1, EGFR, ERBB2, HRAS, KIT, KRAS, NOTCH1, NPM1, NRAS, PDGFRA, PTEN, PTPN11, RB1, WT1.

General Statistics for this Release

Experiments 455765
Tumours 204457
Mutant Samples 41259
Mutations 43021
Unique Mutations 8122
Papers curated 4426
Genes 2671


14th Mar 2007 COSMIC v27 released

This month's release of COSMIC comprises upgrades to both the web site (which now allows searching by gene/sample name or keyword) and the data, with new CGP resequencing studies and curated genes. COSMIC now contains data on over 200,000 tumour samples and 400,000 individual experiments. Of these 202109 tumours, 40331 were found to contain one or more mutations (19.9%).



CGP Resequencing Studies

Two new studies examine our pilot data set comprising 40 cancer cell lines that have a matched normal cell line, allowing all of the mutations to be confirmed as somatic.



Notch signalling proteins
This group of proteins comprises the Notch receptors and other proteins which are involved in Notch signalling. The Notch signalling pathway allows cells to communicate with each other and plays a crucial role in developmental regulation. NOTCH1 mutations have been associated with T-cell acute lymphoblastic leukaemia.



Phosphatidylinositol metabolism
This gene set includes proteins which control the synthesis and turnover of phosphatidylinositol, which is synthesised in the endoplasmic reticulum before translocating to cytosolic membrane surfaces, where it plays an important role in many cellular processes, including cell signalling. Mutations in the phosphatidylinositol-3-kinase PIK3CA and the lipid phosphatase PTEN are associated with many types of cancer.



Literature Curation



STK11 (LKB1)
STK11 is a tumour suppressor, physically associating with p53 to effect growth suppression via p53-dependent apoptosis pathways; restoring gene activity into cancer cell lines defective for its expression results in a G1 cell cycle arrest. It has been identified as the cause of Peutz-Jeghers syndrome, an autosomal dominant disorder inducing an increased risk of melanocytic macules, gastrointestinal polyps and various neoplasms.



STK11 Statistics
Samples 2344
Mutations 92
Unique Sequence Changes 63




WT1
Wilms tumour is a solid cancer usually occurring in childhood, caused by malignant transformation of renal stem cells retaining embryonic differentiation potential. Several tumour suppressor genes have been associated with the development of WT, most classically the WT1 zinc finger DNA binding protein located at chromosome 11p13. A number of isoforms of the transcription factor WT1 exist, unusually exerting control over expression of target genes during both their transcription and splicing.



WT1 Statistics
Samples 1710
Mutations 106
Unique Sequence Changes 68




Website Upgrades



Search Facility
A major update to the COSMIC website this month is the Exalead search facility, allowing for easier navigation of the site. In the 'Text Search' field on the home page, you can search for a gene name or accession number, a sample name or id, or a tumour primary site or sub-site. There is a help page for more advanced searches, which can be accessed by clicking on the question mark in the search box, or the help button in the sidebar.



General Statistics for this release


Experiments 408164
Tumours 202109
Mutant Samples 40331
Mutations 42057
Unique Mutations 7736
Papers curated 4348
Genes 1638



14th Feb 2007 COSMIC third anniversary release (v26)

This release comprises a significant increase in the number of CGP resequencing studies. The five new studies all examine our pilot sample set comprising 40 cancer cell lines that all have a matched normal cell line, allowing all of the mutations to be confirmed as somatic.

Nuclear receptors and cofactors

A related but diverse array of transcription factors interacting with a wide range of coregulatory proteins to form a complex network of multicomponent assemblies serving as coactivators or corepressors of transcription.

SCF and APC cell cycle control complex components

Skp1-cullin-F-box-protein complex (SCF) and the anaphase-promoting complex/cyclosome (APC) are ubiquitination complexes regulating progression through the cell cycle.

Nucleocytoplasmic transport components

Factors involved in both import and export of proteins from the nucleus, including nuclear pore components.

Human homologues of putative target "cancer" genes from transposon screens in the mouse

Human orthologues of genes targeted by insertions in transposon insertion screens for cancer genes in the mouse.

Protein Tyrosine Phosphatases (PTP)

Critical regulators of signal transduction, effecting the reversible phosphorylation of tyrosine residues in cell signalling proteins.

Curated Gene Update

The following fully curated genes also received minor updates: BRAF, CDKN2A, EGFR, ERBB2, FLT3, KIT, KRAS, PDGFRA, PTEN, PTPN11.

General Statistics for this release

Experiments 394675
Tumours 194928
Mutant Samples 39520
Mutations 41228
Unique Mutations 7505
Papers curated 4180
Genes 1516


10th Jan 2007 COSMIC v25 released

This month's COSMIC release comprises significant updates to CGP resequencing studies and curation of the scientific literature.

Update to CGP Resequencing Studies

The six non-kinase CGP resequencing studies have received substantial updates to the number of genes included and the number of mutations found (the kinase studies were updated in November 2006). Fifty-two new genes have been added to the DNA repair study, together with three in the Apoptosis study and two in the GAP-GEF study. The number of mutations discovered in each of the six studies has increased as shown below:

Study Mutations (v24) Mutations (v25)
Inositol Polyphosphate Phosphatases 8 12
Heterotrimeric G-Proteins 5 6
DNA repair genes 114 194
Apoptosis genes 38 82
Small monomeric GTPases 8 28
GAP-GEF genes 48 91


Literature Curation

In addition to the CGP resequencing studies, significant updates have been made to those genes which have received complete scientific literature curation. Three genes have been extensively updated: BRAF (19.1%, increased to 19224 samples), JAK2 (25.1%, increased to 11190 samples) and NOTCH1 (75.4%, increased to 488 samples). Eighteen other genes have received minor updates (less than 10% increase in sample number): ABL1, APC, CDKN2A, CEBPA, CTNNB1, EGFR, ERBB2, FLT3, KRAS, MET, NRAS, PDGFRA, PIK3CA, PTEN, PTPN11, RET, SRC, VHL.

General Statistics for this release

Experiments 387254
Tumours 193513
Mutant Samples 39003
Mutations 40672
Unique Mutations 7272
Papers curated 4082
Genes 1398


14th Dec 2006 COSMIC v24 released

This month's release of COSMIC includes the curation of NPM1 and CDH1.

Curation details for NPM1

NPM1 (Nucleophosmin) is a nucleocytoplasmic shuttling protein and critical regulator of TP53. Frequent mutations have been found in both childhood and adult AML. 20 papers have been manually curated for this gene, resulting in the addition of 45 unique mutations (exon 12).


NPM1 statistics

Samples 3870
Experiments 3875
Mutants 1171
Papers 20
Unique Mutations 45

Curation details for CDH1

CDH1 (E-cadherin) is a calcium ion-dependent cell adhesion molecule; loss of function of this gene is implicated in cancer invasion and metastasis. In particular, somatic mutations of this gene have been reported in gastric and lobular breast cancer. 181 mutations have been added to COSMIC for this gene from the curation of 46 papers.


CDH1 statistics

Samples 1958
Experiments 1970
Mutants 205
Papers 46
Unique Mutations 181

General Statistics for this release

Experiments 380741
Tumours 190358
Mutant Samples 38206
Mutations 39839
Unique Mutations 7032
Papers curated 4023
Genes 1343


30th Nov 2006 COSMIC v23 released

This month's release of COSMIC includes a major update to the protein kinase screens.

Protein Kinase Somatic Data Information

The Cancer Genome Project is pleased to release the full set of protein kinase somatic mutation data resulting from the screening of over 200 human cancers through the full set of 518 annotated genes. Over 1000 mutations have been identified in a combined total of 247 megabases sequenced. This dataset is intended to serve as a catalyst for further biological investigation of mutated kinases and pathways, hopefully leading to new insights and therapeutic opportunities in human cancer.


http://www.sanger.ac.uk/genetics/CGP/Studies/Kinases/

Copy number data update

Oligo array CGH data (using the Affymetrix 10K SNP array) for a further 233 cancer cell lines and 70 primary tumours have been made available, increasing the total available from 834 to 1136 samples.



General Statistics for this release

Experiments 374169
Tumours 184092
Mutant Samples 36252
Mutations 37857
Unique Mutations 6758
Papers curated 3945
Genes 1342


11th Oct 2006 COSMIC v22 released

This month's release of COSMIC includes curation from the scientific literature of the APC tumour suppressor gene. In addition, information on the similarity between cell lines is now recorded and displayed in COSMIC.

Curation of APC literature


Mutations in the APC gene are one of the initiating events in colorectal tumorigenesis, both familial and sporadic. Our curation confirms that the majority of mutations occur in the central portion of the gene (the mutation cluster region, 'MCR') where mutations are associated with the most severe phenotype of huge numbers of polyps at a young age, often with extracolonic manifestations. Mutations outside the MCR cause a much milder and late onset phenotype, generating few polyps. We curated 206 papers for APC, finding 1420 mutated samples out of 7115 (almost 20%). As expected, there were no major hotspots; the most frequent mutation was p.R1450*, found in 87 samples (6%).
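The proportions quoted above can be rechecked with a few lines of Python. This is only an illustrative sketch: the function name `percent` is hypothetical, and it assumes that the 6% figure for p.R1450* is measured against the 1420 mutated samples rather than against all 7115 samples.

```python
def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Figures quoted in the APC curation summary.
mutated_samples = 1420
total_samples = 7115
r1450_samples = 87

# Share of all curated APC samples that carry a mutation.
mutated_share = percent(mutated_samples, total_samples)

# Share of mutated samples carrying p.R1450* (assumed denominator: 1420).
r1450_share = percent(r1450_samples, mutated_samples)

print(f"Mutated samples: {mutated_share:.1f}%")
print(f"p.R1450* among mutated samples: {r1450_share:.1f}%")
```

Under that assumption the two values round to the "almost 20%" and "6%" given in the text.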

Web site


On the web site, we have begun to include information in Cosmic on samples found to be significantly similar by their genotype as assessed using Affymetrix SNP arrays. To date, 241 cell lines have been found to have genotypes which are greater than 80% identical with at least one other. This data is now displayed in the Cosmic Sample page under the heading of "Other Cancer Samples from the Same Individual"; here is an example: http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=sample&id=687448

General Statistics for this release

Experiments 290332
Tumours 173929
Mutant Samples 32891
Mutations 34390
Papers curated 3781
Genes 1342


14th Sep 2006 COSMIC v21 released

This month's release of COSMIC includes major updates to the Cancer Cell Line Project and microsatellite instability status data sets. In addition, published somatic mutation data from two additional genes, MPL and FGFR1, have been added to COSMIC.

Cancer Cell Line Project Major Update


The Cancer Cell Line Project aims to systematically screen a large panel of cancer cell lines for mutations in known cancer genes, thus empowering these cell lines as biological reagents for anti-cancer agent development and for further work on cancer molecular and cellular biology.

For this release of Cosmic, a further 137 cell lines have been added to the working set and 78 duplicate cell lines have been removed. This brings the total number of samples to 787. A further 98 mutations have also been added (See: http://www.sanger.ac.uk/genetics/CGP/CellLines/).

Statistics for the CGP Cancer Cell Line Project

Experiments 12887
Samples 787
Mutant samples 1087
Mutations 1144
Unique Mutations 3519
Genes 21

MSI Data Update


The microsatellite instability (MSI) status for CGP samples under study has been updated, bringing the total number of samples with MSI status to 1,530 (see: http://www.sanger.ac.uk/genetics/CGP/MSI/msi_page.shtml).

Curation of MPL and FGFR1


Somatic mutations reported in the published scientific literature for the FGFR1 and MPL genes have now been added to COSMIC.
MPL- http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=gene&ln=MPL
FGFR1- http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=gene&ln=FGFR1

General Statistics for this release

Experiments 278786
Tumours 164905
Mutant samples 30566
Mutations 31933
Papers Curated 3611
Genes 1339


5th Jul 2006 COSMIC v20 released

This month's release includes NCI-60 updates and mutation data from the scientific literature for VHL.

NCI-60 update


The CGP is pleased to release mutation data for 24 known cancer genes on the NCI-60 series of cancer cell lines. These data should allow for greater power in interpretation of biological data using the lines as well as providing a genetic framework for evaluating response to the large series of compounds screened against this reference cell line set.


Microsatellite instability


Microsatellite instability occurs due to a defect in mismatch repair. This is usually a result of inactivation of MSH2, MLH1 or MSH6 due to a mutation or to reduced expression associated with promoter methylation. Analysis of microsatellite instability was carried out using the BAT markers as described by Rodriguez-Bigas et al. All samples were screened using the markers BAT25, BAT26, D5S346, D2S123 and D17S250. The results, when available, are posted on the sample overview page; an example can be seen at http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=sample&id=905950


Curation of VHL


VHL mutation data from the literature is now available. We have curated 93 papers covering 3412 experiments. These experiments used 3386 samples, in which 879 mutations were recorded.


General Statistics for this release

Experiments 270623
Tumours 160594
Mutant samples 29154
Mutations 30439
Papers curated 3519
Genes 1338


7th Jun 2006 COSMIC v19 released

This month's release of COSMIC includes the Cancer Genome Project screen of the GAP-GEF gene set and new information displays.

GAP-GEF Screen


This gene set, consisting of 173 genes, comprises proteins that regulate the activity of proteins with GTPase activity. GTPase activating proteins (GAPs) promote hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) promote GDP/GTP exchange. Both classes modulate the function of the small monomeric GTPases (including the RAS oncogene family) and other key signalling proteins that use the conversion of GTP to GDP as a molecular switch to regulate function. This system of GTPase/GAP/GEFs regulates a wide variety of cellular processes including growth, differentiation, survival and motility.


Web Improvements


Zygosity and somatic/germline status information are now available for mutations in the COSMIC, CGP Resequencing and Cancer Cell Line Project websites. The somatic/germline status is listed on the sample detail page and in the export function, with the following statuses:


  • Not specified
  • Confirmed somatic mutant
  • Reported elsewhere as a somatic variant
  • Confirmed germline variant
  • Reported elsewhere as a germline variant
  • Variant of unknown origin

Zygosity information is available on the mutation detail page with the following statuses:


  • Unknown
  • Homozygous
  • Heterozygous

General Statistics for this release

Experiments 264296
Tumours 155902
Mutant samples 27732
Mutations 28859
Papers curated 3419
Genes 1337


4th May 2006 COSMIC v18 released

The CGP Resequencing Studies website is released this month. It will act as a repository for data from CGP resequencing efforts to identify novel somatic mutations in human cancer, and its pages have their own distinctive red colour scheme. Data on sets of genes/samples systematically screened for mutations were previously integrated into the "blue" COSMIC pages. This will continue, with data now also being submitted, prepublication, to the new site and held there. This will allow users to browse, search and evaluate these data more effectively. The web resources that are now available are detailed below:

  • COSMIC (Blue): All data screened from literature and CGP based projects.
  • CGP Resequencing Studies (Red): Somatic mutations from systematic large scale resequencing of genes in human cancers.
  • CGP Cancer Cell Line Project (Green): Resequencing of known cancer genes and other analyses of human cancer cell lines.

Curation of PTCH

37 papers from the scientific literature have been curated for the PTCH gene in this release, adding 897 experiments and 168 mutations.


General Statistics for this release (COSMIC)
Experiments 254544
Tumours 153528
Mutant samples 27426
Mutations 28534
Papers curated 3393
Genes 1176

COSMIC DAS track

Ensembl has recently moved to the NCBI 36 assembly of the human genome, whilst COSMIC genes and mutations are currently mapped to build 35. This has caused some disparity with the COSMIC DAS track. We therefore suggest using the COSMIC DAS track only on the most recent Ensembl archive site (http://feb2006.archive.ensembl.org/index.html). Provided below is a link that will open the appropriate website with the DAS source attached:

http://feb2006.archive.ensembl.org/Homo_sapiens/contigview?conf_script=contigview;c=7:139949999.5:1;w=200000;h=7;add_das_source=(name=COSMIC+url=http://das.ensembl.org/das+dsn=cosmic_ncbi_35+type=ensembl_location+color=blue+strand=b+labelflag=n+stylesheet=y+group=n+depth=0+score=n+active=1)


4th Apr 2006 Cosmic v17 release

This month's release of COSMIC includes the Cancer Genome Project screen of small monomeric GTPases and mutation data from the scientific literature for MADH4.

CGP Small Monomeric GTPase Screen

The small monomeric GTPases function as key molecular switches impacting a large variety of cellular functions such as motility, cell signalling, transcription and the binding, hydrolysis and exchange of GTP/GDP. The RAS subfamily (HRAS, NRAS, KRAS) of small monomeric GTPases were amongst the first identified human oncogenes and are mutationally activated in a wide variety of human cancers.

Curation of MADH4

70 papers from the scientific literature have been curated for the MADH4 gene in this release, adding 2275 experiments and 259 mutations.

Gene Updates

Further data from the scientific literature for 9 genes, including KRAS and NRAS, has been added for this release. A detailed breakdown for each gene can be seen below.

Gene Name Additional experiments Additional mutations
CDKN2A 668 71
CTNNB1 140 8
EGFR 150 4
ERBB2 84 1
HRAS 54 1
KRAS 3050 647
NRAS 1611 203
PTEN 37 7
PTPN11 309 17

General Statistics for this release
Experiments 249331
Tumours 149251
Mutant samples 26574
Mutations 27637
Papers curated 3324
Genes 1175


8th Mar 2006 COSMIC v16 released

Released for March are data from a kinase domain screen of malignant gliomas. These data cover approximately 400kb of sequence in each of 9 tumours, including data from recurrent/resistant tumours.

We have recently completed a screen for somatic mutations of the kinase domain encoding exons of the entire protein kinase family in a series of human malignant gliomas. The results are presented in this release of COSMIC. No commonly mutated kinase domain was found in these studies. However, as is the case with our other work in this area, deep sequencing data from human tumours are informative about the processes that have contributed to oncogenesis in the patient. Two gliomas recurrent after temozolomide (alkylator) chemotherapy, but not a third recurrent after XRT alone, had the highest mutation prevalence of any tumours we have analysed to date. These data suggest a link between mutation prevalence and recurrent/resistant brain tumours treated with alkylator chemotherapy.

Statistics
Experiments 235213
Tumours 143427
Mutant samples 25360
Mutations 26388
Papers curated 3207
Genes 1035


7th Feb 2006 A COSMIC Expansion

The Catalogue Of Somatic Mutations In Cancer is two years old and has mutation data for over 1,000 genes, curated from over 3,000 published papers and unpublished data from the Cancer Genome Project.

The original aim of COSMIC continues with the curation of somatic mutation information from the literature for known cancer genes. During 2005, data for 9 genes were collected: ABL1, CDKN2A, EGFR, GATA1, JAK2, MSH6, NOTCH1, PTPN11 and SMO. In addition, genes that were curated in 2004 were updated as new data were published.

The number of genes in COSMIC expanded rapidly when the Cancer Genome Project at the Wellcome Trust Sanger Institute published 3 studies of somatic mutations in the protein kinase gene family (518 genes in total). This data provides a unique insight to the somatic mutations in breast, lung and testicular cancers.

More recently the Cancer Genome Project has been submitting unpublished somatic mutation data to COSMIC. The data comes from genes involved in apoptosis, DNA repair, maintenance and metabolism and the Inositol Polyphosphate Phosphatase and Heterotrimeric G-Protein families.

In another new departure, the COSMIC software was used to create a new web site, the Cancer Cell Line Project. This separate site, with its own 'mint' colour scheme, contains the results from the sequence analysis of 14 known cancer genes in over 700 cancer cell lines. Initial sequence data for 4 genes analysed in the NCI-60 is also available. This work is in progress and more results will be posted in the coming months. What is more, the number of genes in this project will continue to increase, providing genetic data for this wide set of cancer cell lines.

There have been many enhancements to the web site over the past 12 months. A tissue overview provides a summary of mutations reported in a selected tissue. New pages were created to show more details of mutations and samples and give greater depth to the data. There are also links to other data such as genome copy number information.

COSMIC has been summarised in The British Journal of Cancer (Forbes et al, 2006).

This month sees the update of BRAF, CDKN2A, EGFR, ERBB2, HRAS, KRAS, NRAS, PTEN, PTPN11 and SMARCB1. In addition, the Cancer Genome Project has submitted unpublished data for genes involved in apoptosis.

There are plans to continue the development of COSMIC in terms of data content and data presentation. We are always happy to receive feedback and suggestions (email: cosmic@sanger.ac.uk).

Statistics
Experiments 228,669
Tumours 142,569
Mutant samples 25,176
Mutations 26,194
Papers curated 3,013
Genes 1,035


10th Jan 2006 COSMIC v14 released

The COSMIC team is proud to announce the release of COSMIC-14 with data for CDKN2A (p16) and more unpublished data from the CGP.

DNA REPAIR, MAINTENANCE AND METABOLISM

The Cancer Genome Project has released further unpublished somatic mutation data from a screen of 41 cancer cell lines. The 302 genes in this release are involved or associated with DNA repair, maintenance and metabolism. The genes can be viewed together or in 5 subgroups; Telomerase Complex, SWI/SNF, DNA replication, Nucleotide Metabolism and DNA Damage Response and Repair. In total 119 somatic mutations were identified in this study.

CURATION OF CDKN2A

CDKN2A (also known as p16) is a tumour suppressor. It induces cell cycle arrest by inhibiting the phosphorylation of Rb by the cyclin-dependent kinases CDK4 and CDK6. So far 453 papers have been curated for this gene with 2,591 mutations recorded from 16,883 samples.

STATISTICS

Experiments 219,037
Tumours 140,212
Mutant samples 24,817
Mutations 2,637
Papers curated 3,379
Genes 870


13th Dec 2005 COSMIC v13 released

Somatic mutation data from new gene families

In a major new departure, the Cancer Genome Project is proud to release further somatic mutation data. The results from the sequencing of two gene families, Inositol Polyphosphate Phosphatases and Heterotrimeric G-Proteins, have been added to the data for the Protein Kinase genes. This data will be expanded in the future with the addition of further gene sets.

Updates to existing genes

Nine genes in COSMIC have been updated with further data: NRAS, RB1, ERBB2, HRAS, PTEN, TP53, KRAS, APC and CDKN2A.

New DAS data source

The Cancer Genome Project is pleased to announce the release of a DAS source devoted to the genes and mutations within COSMIC. Using this source you will be able to view the genes and mutations from COSMIC within a genome browser or the DAS client of your choice.

All 587 genes in COSMIC are exported as features. Each of these features displays the genomic 'footprint', which encompasses both exonic and intronic sequence between the start and end points of the CDS sequence. A link is attached to each feature, providing a mechanism for the client to link back directly to the gene entry on the COSMIC website.

In addition to the gene footprints, there are also a large number of unique mutations. These are also displayed as features, with links back to the mutation summary page in COSMIC. The database currently holds 2812 unique mutations, of which 1035 are currently exported. This subset is comprised of all the single nucleotide substitutions. More complex mutations will be included, as the genomic coordinates are mapped.

The DAS source can be found at the following URI:

http://das.ensembl.org/das/cosmic_genomic

The easiest way to view this source is to place the following URI in your browser:

http://www.sanger.ac.uk/turl/6d8

This will attach the DAS source and display some of the mutations found in BRAF. Additional configuration can be performed on the track, by clicking on the track name. For more information, see the help pages on the Ensembl website.
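For readers who prefer a script to a genome browser, the following Python sketch shows how a DAS `features` request URL could be assembled for this source. It rests on several assumptions: the `features?segment=REF:start,stop` query form follows the general DAS 1.x convention rather than anything stated in this announcement, the helper name `das_features_url` is hypothetical, and the chromosome 7 coordinates are purely illustrative. The sketch only builds the URL and makes no network request, since the server may no longer be available.

```python
from urllib.parse import urlencode

# DAS source URI quoted in the announcement.
DAS_SOURCE = "http://das.ensembl.org/das/cosmic_genomic"


def das_features_url(source: str, chromosome: str, start: int, end: int) -> str:
    """Build a DAS 'features' request URL for one genomic segment.

    Assumes the conventional DAS query form: features?segment=REF:start,stop
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # Keep ':' and ',' unescaped so the segment reads REF:start,stop.
    query = urlencode({"segment": f"{chromosome}:{start},{end}"}, safe=":,")
    return f"{source.rstrip('/')}/features?{query}"


# Illustrative region on chromosome 7 (hypothetical coordinates).
url = das_features_url(DAS_SOURCE, "7", 139850000, 140050000)
print(url)
```

A DAS client would then fetch this URL and parse the XML response; that step is left out here because the response format is not described in this announcement.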

COSMIC statistics
Experiments 190,576
Tumours 124,381
Mutant samples 23,232
Mutations 2,228
Papers curated 2,812
Genes 587


1st Nov 2005 COSMIC version 12 released

The November release of COSMIC has further data on 9 known cancer genes.

GENE UPDATES

The genes with additional data are BRAF, PTEN, RB1, EGFR, TP53, CDKN2A, NRAS, KRAS and PIK3CA.

VERSION

We have implemented a versioning system for the data in COSMIC. The current release is version 12 with a plan to release a new version every month.

CANCER CELL LINE PROJECT

There are additional mutations for the known cancer genes being sequenced through the cancer cell lines. Notably there is data for homozygous deletions in the CDKN2A gene.

COPY NUMBER DATA

The Cancer Genome Project has released more copy number data derived from the analysis of cancer cell lines and primary tumours using Affymetrix SNP microarrays. So far a total of 834 samples have been analysed consisting of 161 primary tumours and 673 cancer cell lines. This data is freely available from the CGP website. The primary tumours overlap with those being sequenced by the CGP while the cancer cell lines include those being sequenced in the Cancer Cell Line Project.

COSMIC STATISTICS
Tumours 124,367
Experiments 188,529
Mutations 23,157
Papers 2,224
Genes 538


3rd Oct 2005 COSMIC Update

COSMIC has been updated with the addition of 2 new curated genes and new mutation descriptions.

MUTATION DESCRIPTIONS

COSMIC has adopted the Human Genome Variation Society sequence variation/mutation nomenclature for the bulk of the mutations in COSMIC. This represents a major upgrade with the aim of improving clarity and enables the listing of intronic variants for the first time.
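As an illustration of what such standardised descriptions make possible, the Python sketch below parses the simplest protein-level form, a single amino acid substitution written with one-letter codes, such as the p.R1450* description quoted in the v22 APC entry of this archive. The class and function names are hypothetical, and the pattern deliberately covers only this one form; it is not a full implementation of the nomenclature and does not reflect how COSMIC itself stores mutations.

```python
import re
from typing import NamedTuple, Optional


class ProteinSubstitution(NamedTuple):
    reference: str  # one-letter code of the original amino acid
    position: int   # residue number in the protein
    variant: str    # one-letter code of the new residue, or "*" for a stop


# Matches only simple substitutions such as "p.R1450*".
_SUBSTITUTION = re.compile(r"^p\.([A-Z])(\d+)([A-Z*])$")


def parse_protein_substitution(description: str) -> Optional[ProteinSubstitution]:
    """Parse a simple protein substitution, or return None if it is not one."""
    match = _SUBSTITUTION.match(description)
    if match is None:
        return None
    reference, position, variant = match.groups()
    return ProteinSubstitution(reference, int(position), variant)


parsed = parse_protein_substitution("p.R1450*")
print(parsed)
```

Anything more complex (deletions, insertions, intronic or coding-level descriptions) falls outside this pattern and returns `None`.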

GENE UPDATES

Two genes have further data in COSMIC: EGFR and PTEN.

PROTEIN KINASE MUTATIONS IN TESTICULAR CANCER

The sequence analysis of the protein kinase gene family in human testicular germ-cell tumours of adolescents and adults has been published. The mutation data from this work was previously available in COSMIC and is now joined by the published analysis of the data.

CANCER CELL LINE PROJECT

There are additional mutations from the screening of known cancer genes through an extensive set of cancer cell lines.

STATISTICS FOR COSMIC
Tumours 123,197
Experiments 186,181
Mutations 22,711
Papers 2,157
Genes 537


6th Sep 2005 COSMIC Update

COSMIC has been updated with the addition of 3 new curated genes; MSH6, NOTCH1 and PTPN11.

There is a new member to the COSMIC family; the Cancer Cell Line Project. This portal uses the COSMIC code to serve mutation data from the cancer cell lines being sequenced by the Cancer Genome Project at the Wellcome Trust Sanger Institute. The cell line data is presented in the same style as the COSMIC data with a unique colour scheme. There are links to jump from the Cancer Cell Line Project pages to view all of the data in COSMIC. At present there is data from 12 known cancer genes in the Cancer Cell Line Project database.

In addition, the results from the screen of all 518 protein kinase genes in lung cancer, which were available in the previous release of COSMIC, have been published in Cancer Research.

NEW GENES IN COSMIC

  • MSH6 - is a member of the MutS homolog family and is required for DNA mismatch specific binding. Almost one third of tumours of the large intestine have somatic mutations in this gene.
  • NOTCH1 - has somatic small intragenic mutations in 60% of haematopoietic and lymphoid tumours.
  • PTPN11 - is a nontransmembrane protein-tyrosine phosphatase. Approximately 6% of haematopoietic and lymphoid tumours have mutations in this gene.

COSMIC STATISTICS

Experiments 186014
Tumours 123039
Mutations 22598
References 2153
Genes 537


1st Sep 2005 Protein kinase mutations in lung cancer

The Cancer Genome Project has sequenced all protein kinase genes in lung cancer - the most common cause of cancer deaths worldwide

There are over 27,000 new cases of lung cancer in the United Kingdom each year. Protein kinases are frequently mutated in human cancer and inhibitors of mutant protein kinases have proven to be effective anticancer drugs. The Cancer Genome Project has screened the complete coding sequence of all 518 protein kinase genes in 33 lung cancers. This study, published in Cancer Research, is the largest survey reported to date of somatic mutations in lung cancer.

The Cancer Genome Project at the Wellcome Trust Sanger Institute was established in 2000. Its goal is to identify mutations that occur in cancer cells to enable the development of new diagnostics and new treatments and advance our understanding of the biology of cancer.

The Wellcome Trust Sanger Institute Cancer Genome Project and their collaborators have published the latest results of their survey of genes that might be implicated in cancer. The report is published in Cancer Research on Thursday 1st September 2005 and is also available through COSMIC.

The gene set chosen was a class called protein kinases, key controllers of cell growth and death. Members of this family have been shown to be important in cancer. However, the whole set has never been sequenced in a single set of lung tumours. The study generated over 40 million bases of DNA sequence (1.3 million for each sample).

This work identified 188 somatic mutations in 141 protein kinase genes. There was considerable variation in the number of mutations found in each tumour. The results indicate that several mutated protein kinases may be contributing to lung cancer development, but that mutations in each one are infrequent. Larger studies are warranted to further explore these initial findings. Cancer is a complex set of diseases that will affect 1 in 3 people. This work in the CGP is but one part of a global effort to further understanding of cancer and move towards better diagnosis and treatment.




3rd Aug 2005 COSMIC Website Update

COSMIC Website Update

The COSMIC web site has been updated with additional data from the literature and unpublished data from the Cancer Genome Project.

SOMATIC MUTATION DATA FOR KNOWN CANCER GENES

Data for 3 genes has been curated from the literature and included in COSMIC: ABL1, GATA1 and SMO.

SOMATIC MUTATIONS OF THE PROTEIN KINASE GENE FAMILY

The screen of the protein kinase gene family by the Cancer Genome Project now includes two new tumour types; lung cancer and testicular germ-cell tumours. There are marked differences in the mutation prevalence between these two tumour types.

CANCER CELL LINE PROJECT

The mutation data for 9 further genes has been included on the web site giving a total of 550 mutations. The genes are APC, CDH1, CTNNB1, HRAS, MADH4, PIK3CA, PTEN, RB1 and STK11. The sequencing of these genes is not necessarily complete but the cell lines with mutations have been confirmed and the experiments will continue to finish this work.

COSMIC STATISTICS

179,563 Experiments
118,134 Tumours
22,005 Mutations
2,090 References
534 Genes




23rd May 2005 Cosmic Update

Cosmic Update

COSMIC now includes data from a screen of all protein kinase genes in breast cancer and an update of mutation data from the literature.

New Data

The data in COSMIC has expanded to include a new data type and the number of known cancer genes has been extended with updates on some of the existing cancer genes.

A screen of the coding sequence of the protein kinase genes in breast cancer.

The Wellcome Trust Sanger Institute Cancer Genome Project and their collaborators have published the latest results of their survey of genes and their mutations in cancer. The report was published online in Nature Genetics on Sunday 22 May 2005. This data has been integrated with the existing data in COSMIC and made available through the web site.

New cancer genes in COSMIC

The mutation data for two further cancer genes has been curated from the scientific literature and added to COSMIC.

  • EGFR - mutations in the epidermal growth factor receptor (EGFR) have been reported in lung cancer and have been associated with the tumour response of patients receiving gefitinib.

  • JAK2 - A single somatic point mutation (V617F) has been identified in JAK2 in patients with polycythaemia vera. The mutation alters a highly conserved valine present in the negative regulatory JH2 domain, and is predicted to dysregulate kinase activity.

Updates to existing COSMIC genes

Further published data has been curated for 5 genes in COSMIC: BRAF, ERBB2, FGFR2, PDGFRA and PIK3CA.

Website

Home Page
  • We have created a new mailing list: COSMIC-announce, with a subscription link located at the bottom of the home page. As a subscriber to this list you will receive announcements about the latest COSMIC news and website releases.

Browsing by Gene

More improvements have been made to the gene selection pages. The alphabetical lists have been separated into 3 groups to reduce the amount of guesswork involved in finding your gene of interest.

  • Genes from the Cancer Gene Census: - This list contains genes that have been included in the Cancer Gene Census. All of the genes in previous releases of COSMIC are included in this census.

  • Other Genes with Mutations: - These genes are not in the census, but have been found during the curation of the literature and so are included in the database. All these genes have a documented mutation which is thought to be linked to cancer.

  • Other Genes without Mutations: - The final list contains all the other genes that have been recorded during curation. These do not have a documented mutation in the references found in COSMIC.

The karyotype has also been updated. Genes from the census can be located quickly by clicking on the red triangles. All other genes are indicated by blue lines across the chromosome.



Mutation Overview Page

Each mutation in COSMIC now has its own overview page containing information about the type of mutation and samples/tissues containing the mutation. This page can be reached by clicking on various links throughout the website.

  • Main Histogram:


  • Main Mutation table:


  • Sample Overview Page:


The overview page is divided into 8 main sections:

  • Mutation Id: This id is used to identify a mutation within the COSMIC database and is assigned as the mutation is curated.

  • Mutation type: The mutation type is used to describe the type of mutation that has occurred. This can be anything from a single base inframe substitution, to a frameshift deletion.

  • Mutation Location: Here, an image displays the location of the mutation within the peptide sequence.

    • The grey bar at the top of this section shows the full length sequence. Below this can be found a red box, which indicates the area around the mutation. At the bottom of the image, the red box has been expanded and the peptide sequence around the mutation is shown. Here you will find a red triangle which indicates the starting point of the mutation. Clicking on the triangle will produce a pop-up window showing the mutation at both the peptide and nucleotide level.


    • Additionally there is a link, 'Show all mutations in area', to the main histogram page for the gene. This link will show the gene histogram zoomed into the area displayed on this page. This allows you to see any other mutations that have been identified in the surrounding area.


  • Gene: The name of the gene in which the mutation was found. Clicking on the gene name will link to the summary page for that gene.

  • AA Mutation: This section details the change that has occurred in the peptide sequence as a result of the mutation. Formatting is as follows:
    • Substitutions - X(Y)Z
      Where X is the amino acid found in the wildtype sequence. Y is a number representing the position, within the peptide sequence, at which the mutation occurred. Finally, Z is the amino acid found in the mutant sequence.

    • Deletions - delY(Z)
      Where Y is a number representing the position at which the deletion starts and Z is the amino acid sequence which has been deleted.

    • Insertions - insY(Z)
      Where Y is a number representing the position at which the insertion begins and Z is the amino acid sequence that is inserted.


  • CDS Mutation: This section details the change that has occurred in the nucleotide sequence as a result of the mutation. Formatting is identical to the method used for the peptide sequence.

  • Tissue Distribution (Top 5): The top five tissues in which this mutation has been identified are described in the following bar chart.

    • Each bar represents the number of samples, for a specific tissue type, that have exhibited the selected mutation. A label indicating the name of the tissue type and the number of samples is located below each bar.

    • Clicking on one of the bars will take you to the tissue overview page for the selected tissue.


  • Associated Samples: A list showing all the samples, including their primary tissue types, that have the selected mutation. Clicking on a sample name will take you to the sample summary page for the selected sample. Clicking on the primary tissue type will take you to the tissue overview page.
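
The AA Mutation formats listed above (substitutions X(Y)Z, deletions delY(Z), insertions insY(Z)) can be read mechanically. The following is only a sketch of how such strings might be parsed; the function and type names are hypothetical and are not part of COSMIC. It assumes the brackets are optional, so that both V(600)E and V600E are accepted, and that residues are written as single upper-case letters.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    kind: str       # "substitution", "deletion" or "insertion"
    position: int   # Y: position in the peptide sequence
    wildtype: str   # wildtype residue(s); empty for insertions
    mutant: str     # mutant residue(s); empty for deletions


# Assumption: brackets around the position or sequence are optional.
_SUBSTITUTION = re.compile(r"([A-Z*])\(?(\d+)\)?([A-Z*])")
_DELETION = re.compile(r"del(\d+)\(?([A-Z*]+)\)?")
_INSERTION = re.compile(r"ins(\d+)\(?([A-Z*]+)\)?")


def parse_aa_mutation(text: str) -> Mutation:
    """Parse one AA Mutation string into its parts (hypothetical helper)."""
    text = text.strip()
    m = _SUBSTITUTION.fullmatch(text)
    if m:
        return Mutation("substitution", int(m.group(2)), m.group(1), m.group(3))
    m = _DELETION.fullmatch(text)
    if m:
        return Mutation("deletion", int(m.group(1)), m.group(2), "")
    m = _INSERTION.fullmatch(text)
    if m:
        return Mutation("insertion", int(m.group(1)), "", m.group(2))
    raise ValueError(f"unrecognised mutation string: {text!r}")
```

For example, parse_aa_mutation("V600E") yields a substitution at position 600 with wildtype V and mutant E. Since the CDS Mutation field is described as using the same formatting, the same approach would probably carry over to nucleotide strings with a different residue alphabet.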

Sample Overview

Two new sections have been added to this page:

  • Tumour Features: Details about the tumour from which the sample was obtained are listed here whenever they have been supplied by the reference source.

  • External Data Sources: Additional data sources, with information about the sample, are listed here when available. This includes information from some of the studies within the Cancer Genome Project.

References

COSMIC now includes review papers. There is a review section that can be found at the bottom of the reference overview page for each gene. This section includes references that review other works. As the data from these references has already been added to the data from the original sources, this data is not added again.

Statistics
529 Genes
114300 Tumours
20536 Mutations
1894 References




4th Mar 2005 COSMIC Website Update

COSMIC Website Update

COSMIC presents 'Tissue Overview', another way to view somatic mutation data. The Tissue Overview page details the Top 5 Genes for any tissue / histology selection, ranked by mutation frequency and data volume. In addition, it lists other genes with and without mutations for the selection. From the Tissue Overview page you can click through to the specific details of the listed genes.

Website

Home Page
  • We have updated the entry point system.
    • Detailed Search - This has been the standard search pathway and has not changed from previous releases. Please continue to use this pathway to build complex queries if you are interested in specific subtissues or histologies.
    • Quick Search - This is a new pathway greatly reducing the number of steps required to access the tissue overview page and any subsequent pages. This increase in speed does however reduce the complexity of the available search to just primary tissues.
Tissue Overview

As stated above, this new page details all the genes that have samples for the tissues / histologies selected. It is split into three major sections, with the first section detailing what we feel are the most important genes, based on mutation frequency and data volume.

  • Section One: Top Genes With Sample Data
    • This section provides an interactive bar chart and table showing data for the highest ranked genes containing samples from the chosen tissues / histologies.

    The coloured bars in the image represent:

    • All Samples
    • Samples With Mutations
    • Clicking on any portion of the bar or name associated with a particular gene will reveal a pop-up menu.

      • Sample Number: (count) - This indicates the number of samples that have been found with the selected tissue/histology type
      • Mutated Samples: (count) - This is the number of the above samples that have shown mutations.
      • Go to Full Gene Display - Clicking on this link will take you to the histogram display for the selected gene.
  • Below the bar chart image is a table that displays all the information found in the image, in a tabular format.

  • Section Two: Other genes with mutations
    • This section contains a list of additional genes with mutated samples that didn't make it into the top 5. Each gene name is linked to the full histogram image.
  • Section Three: Other genes without mutations
    • This section contains a list of additional genes without mutated samples that didn't make it into the top 5. Again, each gene name is linked to the full histogram image.
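
The three sections above amount to a ranking followed by a split. The sketch below is illustrative only: the text does not say how mutation frequency and data volume are combined, so it assumes genes are ordered by mutation frequency first and by number of samples second. All names are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class GeneCounts(NamedTuple):
    gene: str
    samples: int   # samples recorded for the selected tissue / histology
    mutated: int   # how many of those samples carry a mutation


def tissue_overview(
    counts: List[GeneCounts], top_n: int = 5
) -> Tuple[List[GeneCounts], List[GeneCounts], List[GeneCounts]]:
    """Split genes into (top genes, other with mutations, other without)."""

    def rank_key(c: GeneCounts) -> Tuple[float, int]:
        # Assumed ordering: frequency first, data volume as tie-breaker.
        frequency = c.mutated / c.samples if c.samples else 0.0
        return (frequency, c.samples)

    ranked = sorted(counts, key=rank_key, reverse=True)
    top, rest = ranked[:top_n], ranked[top_n:]
    other_with = [c for c in rest if c.mutated > 0]
    other_without = [c for c in rest if c.mutated == 0]
    return top, other_with, other_without
```

A different weighting of frequency against data volume would change only rank_key; the split into the three sections stays the same.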




4th Feb 2005 COSMIC's first anniversary

COSMIC's first anniversary

The COSMIC database and web site have been updated and now have somatic mutation data from 21 genes.

New Data
  • CEBPA is mutated in 7% of haematopoietic and lymphoid tissue tumours. It arrests cell proliferation by inhibiting the kinases CDK2 and CDK4.
  • CTNNB1 or beta-catenin is mutated in a variety of tumours. The gene encodes an adherens junction protein that is critical for the establishment and maintenance of epithelial layers.
  • KIT is characterised by two clusters of mutations in and around the kinase domain of the gene, with frequent mutations in haematopoietic and lymphoid tissue tumours (19%) and soft tissue tumours (32%).
  • PTEN has mutations through the whole coding sequence with a hot spot at codon 130. Tumours of the central nervous system and endometrium frequently have mutations in this gene (19% and 34% respectively).
  • SRC is homologous to the v-src gene of the Rous sarcoma virus and has one mutation that has been found in 10 samples.
  • SUFU encodes a component of the sonic hedgehog/patched signaling pathway and is mutated in central nervous system tumours.
Statistics
21 genes
104,682 tumours
18,478 samples have mutations
1,755 unique mutations
1,672 papers have been curated




17th Dec 2004 COSMIC Update

COSMIC Update

The COSMIC team is proud to release somatic mutation data for CSF1R, RB1, RET and SMARCB1. This information has been curated from the scientific literature. Somatic mutation data from 15 genes can be queried and viewed through the COSMIC web site.

Data
  • The 4 new genes in COSMIC give data on specific tumour types and increase the breadth of information that can be queried and displayed.
  • CSF1R, also known as the oncogene FMS, is a receptor kinase that is mutated in ~5% of myelodysplastic syndrome cases. Mutations in this gene have been associated with a predisposition to myeloid malignancy.
  • RB1 is mutated in more than 11% of the tumours that have been studied. It is frequently somatically mutated in cases of retinoblastoma (47%) while germline mutations predispose to the same disease.
  • RET, a tyrosine kinase receptor, is somatically mutated in 38% of thyroid medullary carcinomas. Germline mutations in the RET gene are associated with multiple endocrine neoplasia, type IIA and type IIB, medullary thyroid carcinoma and Hirschsprung disease.
  • SMARCB1, also known as SNF5/INI1, is frequently somatically mutated in soft tissue rhabdoid tumours (41%). These are highly malignant cancers that usually occur in young children.
Statistics
15 genes
73,767 tumours
13,420 samples have mutations
536 unique mutations
1,104 papers have been curated




12th Nov 2004 COSMIC Update

COSMIC Update

The COSMIC team are proud to include somatic mutation data for FGFR2, FGFR3, FLT3, MET, PDGFRA and PIK3CA on the COSMIC web site.

Data

The number of genes with data in COSMIC has more than doubled in this release of the database. The additional data represents a set of genes that have a lower, but nevertheless important, mutation frequency in human cancer as a whole. In specific malignancies genes such as FLT3 do have a significant role as can be seen from the data collected in COSMIC.

Gene     Number of analysed samples   Number of samples with mutations
BRAF     5158                         736
ERBB2    714                          8
FGFR2    30                           2
FGFR3    1735                         481
FLT3     7610                         1499
HRAS     11876                        477
KRAS2    35716                        8302
MET      1081                         59
NRAS     13884                        1132
PDGFRA   146                          25
PIK3CA   396                          89
TOTAL    78346                        12810

Number of unique mutations 307

Number of curated papers 976
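
The mutation frequency behind the figures above is simply the number of samples with mutations divided by the number of analysed samples. As a small worked sketch (the variable and function names are illustrative, the counts are copied from the table), the per-gene percentages and the TOTAL row can be recomputed like this:

```python
# (analysed samples, samples with mutations), copied from the table above
COUNTS = {
    "BRAF": (5158, 736),
    "ERBB2": (714, 8),
    "FGFR2": (30, 2),
    "FGFR3": (1735, 481),
    "FLT3": (7610, 1499),
    "HRAS": (11876, 477),
    "KRAS2": (35716, 8302),
    "MET": (1081, 59),
    "NRAS": (13884, 1132),
    "PDGFRA": (146, 25),
    "PIK3CA": (396, 89),
}


def mutation_frequency(analysed: int, mutated: int) -> float:
    """Percentage of analysed samples that carry a mutation."""
    return 100.0 * mutated / analysed


total_analysed = sum(analysed for analysed, _ in COUNTS.values())
total_mutated = sum(mutated for _, mutated in COUNTS.values())
overall = mutation_frequency(total_analysed, total_mutated)

if __name__ == "__main__":
    for gene, (analysed, mutated) in COUNTS.items():
        print(f"{gene:8s}{mutation_frequency(analysed, mutated):6.2f}%")
    print(f"{'TOTAL':8s}{overall:6.2f}%")
```

The two sums reproduce the TOTAL row (78346 analysed samples, 12810 with mutations), and the FLT3 row illustrates the point made in the text: its frequency is well above that of most of the other newly added genes.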

Website Changes

Home Page
  • We have added a link to an ATOM feed for those people with ATOM enabled news feed readers. Adding this link to your feeds list will allow you to see the latest news from the COSMIC site as and when it is available.
Distribution View
  • A totals column has been added to the 'Details' table to show the total number of mutated samples that are listed.
  • Links to show only negative data have been added to the 'More Details' links in the 'Details' table.
  • The Insertions and deletions table has been split to show different information for the two types of mutation.
Gene Selection
  • Genes can now be selected by chromosome, from the karyotype graphic, or as always from an alphabetical list.
References Page
  • The complete list of references for a specific gene can now be exported in a variety of formats including Excel.
Mutation Data Page
  • All pages listing more than 100 samples have been split to reduce their size. However, the export function will still export all the samples as selected.




29th Sep 2004 COSMIC Update

COSMIC Update

We are pleased to announce an update to the COSMIC website. To coincide with the Nature paper on ERBB2 we have added all the data for this gene to COSMIC. There have also been a number of improvements to the interface that we hope you will find useful.

New Data

ERBB2

Today, Nature publish our recent findings, the first description of small intragenic ERBB2 mutations in human cancer. Primarily found in non-small cell lung adenocarcinomas, the mutations identified are suggestive of inappropriate activation of ERBB2 kinase activity.

This addition brings 8 new mutations and 714 new samples to the database, increasing the total number of mutant samples to 10655 and the total number of samples to 58032.

Website Changes

Distribution View
  • The summary table has been removed in favour of a new gene summary page, which contains all the data from this table plus much more.
  • The mutations tables have been expanded to show insertions, deletions and complex mutations.
  • Information about the negative samples is now available and can be viewed by clicking on the 'More Details' link in the Details table. Like the positive samples, this data can also be exported in various formats.
  • A new insertions and deletions track has been added to the main image. This will allow us to display a larger number of genes with more complex mutation sets.
  • A complex mutations track has also been added to display those mutations (multiple base substitutions) which don't quite fit into any of the other categories.
Gene Summary Page

This has grown from the original four row summary table, on the distribution page, into a full page overview of the information stored about a specific gene.

  • Mutation hot spots: The mutation summary shows those areas of the transcript that have a high density of mutations. This can be used to go directly to the area of interest on the mutation distribution view.
  • References: A quick glance will show the most recently published paper that was analysed by the COSMIC staff.
Sample Summary Page

Here you will find a page containing all the information about a particular sample. Some of the previously unavailable information, such as details about the individual, has been made available.

  • Genes Tested: Quickly identify all the genes in COSMIC that have been tested against the selected sample.
  • References: Locate all the references that have included the sample.
Reference Summary Page

For the first time in COSMIC you can see all the samples from one paper in one location. In addition to this there are also details about the genes screened and the mutations that were found.




28th Jun 2004 Cosmic Update

Cosmic Update

We are pleased to announce a minor update to the COSMIC website. The user interface has been updated to include new features that we hope will make your experience with the site more productive and enjoyable.

Web Site Changes

  • Shorter URLs - These have been shortened to reduce the amount of text required to link to a specific page within the site. The old style identifiers have been replaced with shortened initials. For example, 'locus_name' has been replaced with 'ln'. The old style links still work and any existing bookmarks should not be affected by this change.
  • Nucleotide Tracks - All mutations can now be viewed with respect to the changes they would cause to the nucleotide sequence, in addition to the already present amino acid changes. The nucleotide views can be accessed by selecting 'cDNA' in the navigation menu on the main display pages. An example of this view can be seen here.
  • Navigation & Selection Improvements - The selection process has been updated to allow users to select sub types across a range of up to five tissue types, adding a new level of refinement to the search process.
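
The shorter URLs described above can be thought of as an alias table applied to the query string. The sketch below is an illustration, not the site's code: only the 'locus_name' to 'ln' pair comes from the text, and the function name and any other parameter names are assumptions.

```python
from urllib.parse import parse_qsl, urlencode

# Old-style identifier -> shortened form. Only this pair is named in the text.
ALIASES = {"locus_name": "ln"}


def shorten_query(query: str) -> str:
    """Rewrite old-style parameter names to their short forms.

    Names that are already short, or that have no known alias,
    pass through unchanged, so old and new links give the same result.
    """
    pairs = parse_qsl(query, keep_blank_values=True)
    return urlencode([(ALIASES.get(name, name), value) for name, value in pairs])
```

Under this scheme an old bookmark carrying locus_name=BRAF and a new link carrying ln=BRAF normalise to the same query, which is one way the old style links could keep working.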




23rd Jun 2004 COSMIC Detailed

COSMIC Detailed

The British Journal of Cancer has released an advance online version of an article describing COSMIC. Detailed information is provided about the curation and structure of the database, followed by a description of the facilities provided by the website.




6th May 2004 COSMIC Website Unavailable 8th May

COSMIC Website Unavailable 8th May

On Saturday 8th May the COSMIC website, as part of the Sanger website, will be unavailable whilst major network upgrades and essential maintenance work are carried out. We apologise in advance for this loss of service.




20th Feb 2004 Nucleotide data available

Nucleotide data available

COSMIC displays mutations at the amino acid level to show the potential implication of the mutations on the protein sequence. In addition to this, COSMIC holds the mutations at the nucleotide level. This data is available through the Export function that can be found at the top of the Distribution figure (example) or at the bottom of the expanded Mutation Data tables (example).




4th Feb 2004 COSMIC version 1 released

COSMIC version 1 released

Wellcome Trust Sanger Institute launches Catalogue Of Somatic Mutations In Cancer. In the quest to develop rational approaches to treating cancer, researchers need efficient access to existing knowledge. COSMIC (Catalogue Of Somatic Mutations In Cancer), launched today by the Cancer Genome Project at The Wellcome Trust Sanger Institute, is a new tool that provides integrated genetic data from cancer genes, and will make research faster and easier.




3rd Feb 2004 BRAF V599E becomes V600E

BRAF V599E becomes V600E

The original BRAF mutations reported by Davies et al were mapped to the DNA sequence NM_004333[gi;4757867], with the common BRAF mutation being V599E. On the 24th July 2003 this sequence was updated to NM_004333[gi;33188458] with the insertion of 3bp in the coding sequence. The net effect of this update was to increase the length of the BRAF protein by one amino acid and increase the position of all published mutations by one amino acid. The beginnings of both versions of the protein are:


MAALSGGGGGGAEPGQALFNGDMEPEAGAGR PAASSAADP	NM_004333[gi;4757867]
||||||||||||||||||||||||||||||  |||||||||
MAALSGGGGGGAEPGQALFNGDMEPEAGAGAGAAASSAADP	NM_004333[gi;33188458]


The BRAF mutations in COSMIC are mapped to the latest version of the cDNA and V599E has become V600E.
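
The renumbering described above is a fixed offset applied to every position downstream of the inserted codon. As a minimal sketch (the function name is hypothetical, and the insertion point of 31 is simply read off the alignment printed above and should be treated as an assumption), a substitution can be remapped like this:

```python
import re


def shift_substitution(mutation: str, insert_after: int = 31, inserted: int = 1) -> str:
    """Renumber a substitution such as V599E after residues are inserted upstream.

    insert_after: last position in the OLD protein before the insertion
                  (assumed here from the printed alignment).
    inserted:     number of amino acids inserted (3bp corresponds to 1).
    """
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if m is None:
        raise ValueError(f"not a substitution: {mutation!r}")
    wildtype, position, mutant = m.group(1), int(m.group(2)), m.group(3)
    if position > insert_after:
        # Only positions downstream of the insertion move.
        position += inserted
    return f"{wildtype}{position}{mutant}"
```

Calling shift_substitution("V599E") returns "V600E", matching the change made in COSMIC. Positions at or before the insertion point are left untouched.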




webmaster@sanger.ac.uk

Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK  Tel:+44 (0)1223 834244

Last Modified Tue Mar 20 14:26:52 2007

Genome Research Limited is a charity registered in England with number 1021457
