COSMIC v66 Release
COSMIC v66 contains curation of cancer genes TSC1 & TSC2, four new fusion gene pairs, 17 whole-genome sequencing publications, and extensive updates from ICGC & TCGA.
We have reviewed the current knowledge on genes involved in human cancer and as a result, added 17 new genes to the Cancer Gene Census (a list of genes with significant proof they contribute to human cancer). We will be adding more genes to this census in future releases, when we consider their involvement proved in the literature.
This July release is the last performed on a bimonthly schedule. COSMIC's next release will be in October, when the schedule will become one release every three months.
TSC1 and TSC2
Somatic alterations in tumour suppressor gene TSC1 have been detected in sporadic tumours such as bladder cancer, renal cell carcinoma and hepatocellular carcinoma, and TSC2 alterations in sporadic pulmonary LAM, renal angiomyolipoma and head and neck cancers. TSC1 and TSC2 gene products, hamartin and tuberin, form a protein complex that plays a critical role in growth control as a primary regulator of the mammalian target of Rapamycin (mTOR) pathway. Germline mutations in these genes cause tuberous sclerosis complex (TSC), a neurocutaneous syndrome characterized by seizures, mental retardation, and benign tumours of many organs.
A subgroup of mesotheliomas is characterised by EWSR1-YY1 fusions. The EWSR1 breakpoint is similar to that found in other fusions involving EWSR1 such as EWSR1-FLI1 and EWSR1-DDIT3. YY1 encodes a ubiquitously distributed transcription factor belonging to the GLI-Kruppel class of zinc finger proteins. The EWSR1-YY1 fusion protein includes the transactivation domain of EWSR1 and the DNA-binding domain of YY1.
Further broadening the range of tumours associated with EWSR1 fusions, EWSR1-NFATC1 has been identified in haemangioma of the bone. NFATC1 encodes a component of the nuclear factor of activated T cells DNA-binding transcription complex which is involved primarily in immune response. The transactivation domain of EWSR1 is retained in the fusion where it's fused to the DNA-binding domain, the REL-homology region, of NFATC1.
IRF2BP2-CDX1 has been identified as an alternative fusion to HEY1-NCOA2 in mesenchymal chondrosarcoma. IRF2BP2 encodes an interferon regulatory factor-2 (IRF2) binding protein that interacts with the C-terminal transcriptional repression domain of IRF2. CDX1 belongs to the homeobox gene family. For the fusion protein, the N terminal includes the IRF2BP2 zinc finger motif and the C terminal includes the CDX1 homeodomain.
STRN, encoding a calmodulin-binding protein, has been identified as a novel ALK fusion partner in lung adenocarcinoma. As in most ALK fusions the kinase domain of ALK is preserved in the fusion protein.
Tarpey et al (2013). Frequent mutation of the major cartilage collagen gene COL2A1 in chondrosarcoma. Nature genetics (epub)
Zhang et al (2013). Genetic heterogeneity of diffuse large B-cell lymphoma. Proceedings of the National Academy of Sciences of the United States of America 110:1398
Reuss et al (2013). Secretory meningiomas are defined by combined KLF4 K409Q and TRAF7 mutations. Acta neuropathologica 125:351
Ho et al (2013). The mutational landscape of adenoid cystic carcinoma. Nature genetics 45:791
Lui et al (2013). Frequent mutation of the PI3K pathway in head and neck cancer defines predictive biomarkers. Cancer discovery 3:761
Murtaza et al (2013). Non-invasive analysis of acquired resistance to cancer therapy by sequencing of plasma DNA. Nature 497:108
Chen et al (2013). Next-generation-sequencing-based risk stratification and identification of new genes involved in structural and sequence variations in near haploid lymphoblastic leukemia. Genes, chromosomes & cancer 52:564
Han SW et al (2013). Targeted sequencing of cancer-related genes in colorectal cancer using next-generation sequencing. PloS one 8:e64271
Bettegowda et al (2013). Exomic Sequencing of Four Rare Central Nervous System Tumor Types. Oncotarget 4:572
Clark et al (2013). Genomic Analysis of Non-NF2 Meningiomas Reveals Mutations in TRAF7, KLF4, AKT1, and SMO. Science 339:1077
Yost SE et al (2013). High-resolution mutational profiling suggests the genetic validity of glioblastoma patient-derived pre-clinical models. PloS one 8:e56185
Zang et al (2012). Exome sequencing of gastric adenocarcinoma identifies recurrent somatic mutations in cell adhesion and chromatin remodeling genes. Nature genetics 44:570
Iyer et al (2012). Genome sequencing identifies a basis for everolimus sensitivity. Science 338:221
Kridel R et al (2012). Whole transcriptome sequencing reveals recurrent NOTCH1 mutations in mantle cell lymphoma. Blood 119:1963
Totoki et al (2011). High-resolution characterization of a hepatocellular carcinoma genome. Nature genetics 43:464
Wang et al (2011). Exome sequencing identifies frequent mutation of ARID1A in molecular subtypes of gastric cancer. Nature genetics 43:1219
Lee et al (2010). The mutation spectrum revealed by paired genome sequences from a lung cancer patient. Nature 465:473
Ovarian Serous Cystadenocarcinoma
Acute Myeloid Leukemia
Bladder Urothelial Carcinoma
Breast Invasive Carcinoma
Cervical Squamous Cell Carcinoma
Kidney Renal Clear Cell Carcinoma
Lung Squamous Cell Carcinoma
Uterine Corpus Endometrioid Carcinoma
Head and Neck Squamous Cell Carcinoma
Japanese liver cancer
The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion
The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines
Cosmic Whole Genomes, all tumours with genome-wide somatic annotations
Genomics of Drug Sensitivity, analysis of drug sensitivity data in human cancer cell lines
CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content
COSMIC v65 Release
COSMIC v65 includes full curation of the genes SH2B3, MAP2K1 and MAP2K2, recently identified as causing blood and epithelial cancers, together with 5 new USP6 gene fusions, found in aneurysmal bone cysts. In addition, substantial updates have been made to studies curated from the ICGC and TCGA. 6989 samples in COSMIC now have full-genome annotations.
In recent years, COSMIC has released a new version every two months, six times a year. However, in order to focus on maximising the content of each release and to reduce the workload for those integrating the COSMIC database into their own resources, we will be changing to a three-monthly schedule from July. After July, the next COSMIC release will be in October 2013, with a release every three months after that.
SH2B3 (LNK, 12q24.12) is a plasma membrane-bound adapter protein and a negative regulator of cytokine signalling involved in normal haematopoiesis. Its functions include inhibition of wild type and mutant JAK2 signalling and it is overexpressed in myeloproliferative neoplasms (MPN) as well as myelodysplastic syndrome and leukaemic cells; growth of some transformed cells is inhibited by overexpression of SH2B3 and loss of LNK in murine models enhances development of MPNs. Mutations have mainly been found in MPNs and idiopathic erythrocytosis, predominantly heterozygous PH2 domain mutations (hotspot E208_D234) in wild type JAK2/MPL blast phase MPNs, implicating SH2B3 in MPN progression. However, mutations have also been found in chronic phase MPNs, and can occur in JAK2/MPL mutated tumours and in other SH2B3 domains. Mutations have also been seen in small numbers of early T-cell precursor acute lymphoblastic leukaemia samples and solid tumours. The loss of inhibition of JAK-STAT activation may be related to haploinsufficiency of SH2B3 or due to a dominant-negative effect of the mutant protein.
MAP2K1 and MAP2K2
CDH11-USP6, THRAP3-USP6, OMD-USP6, CNBP-USP6, COL1A1-USP6
(CDH11-USP6_ENST00000250066, THRAP3-USP6_ENST00000250066, OMD-USP6_ENST00000250066, CNBP-USP6_ENST00000250066, COL1A1-USP6_ENST00000250066)
Fusions involving USP6, partnered with one of five different genes (CDH11, THRAP3, OMD, CNBP or COL1A1), are found in aneurysmal bone cyst, a locally aggressive bone lesion with a propensity to recur. In each translocation the entire ubiquitin-specific protease coding sequence is fused downstream to the promoter region of the partner gene.
TCGA Cervical Squamous cell
TCGA Bladder Urothelial
TCGA Breast Carcinoma
TCGA Colon Cancer
TCGA Lung Adenocarcinoma
ISC/MICINN Chronic Lymphocytic Leukaemia
TCGA Acute Myeloid Leukaemia
TCGA Ovarian Serous Cystadenocarcinoma
TCGA Prostate Adenocarcinoma
TCGA Rectum Adenocarcinoma
TCGA Lung Squamous cell
Leich E et al (2013). Multiple myeloma is affected by multiple and heterogeneous somatic mutations in adhesion- and receptor tyrosine kinase signaling molecules. Blood cancer journal 3:e102
Sausen M et al (2013). Integrated genomic analyses identify ARID1A and ARID1B alterations in the childhood cancer neuroblastoma. Nature genetics 45(1):12-7
Agrawal N et al (2013). Exomic Sequencing of Medullary Thyroid Cancer Reveals Dominant and Mutually Exclusive Oncogenic Mutations in RET and RAS. The Journal of clinical endocrinology and metabolism 98(2):E364-9
Dulak AM et al (2013). Exome and whole-genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity. Nature genetics 45(5):478-86
Landau DA et al (2013). Evolution and impact of subclonal mutations in chronic lymphocytic leukemia. Cell 152(4):714-26
Pugh TJ et al (2013). The genetic landscape of high-risk neuroblastoma. Nature genetics 45(3):279-84
Green MR et al (2013). Hierarchy in somatic mutations arising during genomic evolution and progression of follicular lymphoma. Blood 121(9):1604-11
Streppel MM et al (2013). Next-generation sequencing of endoscopic biopsies identifies ARID1A as a tumor-suppressor gene in Barrett's esophagus. Oncogene 1476-5594
Zhou D et al (2013). Exome capture sequencing of adenoma reveals genetic alterations in multiple cellular pathways at the early stage of colorectal tumorigenesis. PloS one 8(1):e53310
Kim SC et al (2013). A high-dimensional, deep-sequencing study of lung adenocarcinoma in female never-smokers. PloS one 8(2):e55596
Demeure MJ et al (2012). Cancer of the ampulla of Vater: analysis of the whole genome sequence exposes a potential therapeutic vulnerability. Genome medicine 4(7):56
Newey PJ et al (2012). Whole-Exome Sequencing Studies of Nonhereditary (Sporadic) Parathyroid Adenomas. The Journal of clinical endocrinology and metabolism 1538-7445
COSMIC v64 Release
COSMIC v64 contains full curation of gene fusions CIC-DUX4 and ACTB-GLI1 in solid tumours. Twelve additional genome-wide sequencing publications bring the numbers of WGS samples in COSMIC to 5023.
The new COSMIC website (http://cancer.sanger.ac.uk) now replaces the old one (www.sanger.ac.uk/cosmic) which is no longer available. Existing links and bookmarks will still work, but will redirect to the appropriate page on the new website. Substantial help is available to navigate the new system here, but if you have any questions or comments about the new website, please contact us at firstname.lastname@example.org
The recurrent CIC-DUX4 fusion is found in a subset of paediatric and young adult primitive round cell undifferentiated soft tissue sarcomas, distinct from the Ewing's sarcoma family of tumours. CIC is a member of the HMG-box superfamily of transcription factors and DUX4 is a double-homeobox gene belonging to the family of double homeo-domain transcription activators. The CIC-DUX4 fusion preserves most of the functional regions of the CIC gene, including the DNA-binding HMG-box and most of the MAPK phosphorylation sites, but both DUX4 homeobox domains are lost.
The novel ACTB-GLI1 fusion has been found in a discrete set of soft tissue sarcomas with distinctive pericytic features. The DNA-binding zinc finger domains of GLI1 are retained in the fusion and the GLI1 promoter region is replaced with that of the ubiquitously expressed ACTB gene.
Horn S et al (2013). TERT Promoter Mutations in Familial and Sporadic Melanoma. Science 339:959
Roberts KG et al (2012). Genetic alterations activating kinase and cytokine receptor signaling in high-risk acute lymphoblastic leukemia. Cancer cell 22:153
Le Gallo M et al (2012). Exome sequencing of serous endometrial tumors identifies recurrent somatic mutations in chromatin-remodeling and ubiquitin ligase complex genes. Nature genetics 44:1310
Agrawal N et al (2012). Comparative genomic analysis of esophageal adenocarcinoma and squamous cell carcinoma. Cancer discovery 2:899
Kannan K et al (2012). Whole-exome sequencing identifies ATRX mutation as a key molecular determinant in lower-grade glioma. Oncotarget 3:1194
Lindberg J et al (2012). The Mitochondrial and Autosomal Mutation Landscapes of Prostate Cancer. Eur Urol. 63:702
Nichols AC et al (2012). A Pilot Study Comparing HPV-Positive and HPV-Negative Head and Neck Squamous Cell Carcinomas by Whole Exome Sequencing. ISRN Oncol. 2012:809370
Seshagiri S et al (2012). Recurrent R-spondin fusions in colon cancer. Nature 488:660
Seo JS et al (2012). The transcriptional landscape and mutational profile of lung adenocarcinoma. Genome research 22:2109
Liu J et al (2012). Genome and transcriptome sequencing of lung cancers reveal diverse mutational and splicing events. Genome research 22:2315
Piazza R et al (2012). Recurrent SETBP1 mutations in atypical chronic myeloid leukemia. Nature Genetics 45:18
Rossi D et al (2012). The coding genome of splenic marginal zone lymphoma: activation of NOTCH2 and other pathways regulating marginal zone development. The Journal of experimental medicine 209:1537
This release features improvements increasing functionality of the GDSC website to facilitate analysis and interpretation of results.
Drug overview pages
These pages provide a visual summary of the screening results for each drug. Cell line IC50 values including confidence intervals are plotted as well as summary statistics for each drug. The overview page also contains separate plots for cell line IC90, IC75, IC25 and AUC (area under the curve) values for each drug.
IC50 scatter plots filtered by mutation and tissue type
The scatter plots of cell line IC50 values for significant drug-gene associations have been improved so that they can be filtered by mutation type (coding mutation, amplification or deletion) or by tissue type. A non-parametric test is performed for each resulting scatter plot to assess the significance of each association.
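As an illustration of the kind of non-parametric comparison described above, the sketch below applies a two-sided Mann-Whitney U test to two groups of cell line IC50 values. This is only an assumption: the release notes do not say which non-parametric test the website uses, and the function names and IC50 values here are hypothetical.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Two-sided Mann-Whitney U test (normal approximation, no tie correction)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = rank_with_ties(list(group_a) + list(group_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    u = min(u_a, n_a * n_b - u_a)
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p_value


# Hypothetical log IC50 values for cell lines with and without a mutation.
mutant_ic50 = [-2.1, -1.8, -1.5, -2.4, -1.2]
wild_type_ic50 = [0.3, 1.1, -0.2, 0.8, 1.5, 0.1]

u_stat, p = mann_whitney_u(mutant_ic50, wild_type_ic50)
print(f"U = {u_stat}, p = {p:.4f}")
```

With small groups an exact test would be preferable to the normal approximation used here; the approximation keeps the sketch short.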
Genomics of Drug Sensitivity in Cancer Team (http://www.cancerRxgene.org), please contact us at: email@example.com
COSMIC v63 Release
COSMIC v63 includes full curation of cancer genes STAT3 and TNFRSF14, together with further FGFR and EWSR1 fusion gene pairs. Nine additional systematic screen papers from 2012 ensure our curation of cancer genome analysis remains very current.
The new COSMIC website (cancer.sanger.ac.uk) will replace the old one (www.sanger.ac.uk/cosmic) in March 2013, with our v64 release. Existing links and bookmarks will still work, but will redirect to the appropriate page on the new website. If you have any questions or comments about the new website, please contact us at firstname.lastname@example.org.
COSMIC has annotated cancer mutations across more than 24,000 genes over the last eleven years, and these have been collected from a variety of sources. In order to better standardise our gene information, we are now updating our gene sequences onto the better CCDS standard of human transcripts, where the sequences have been agreed by consensus between several genome annotation projects. In this release, we include the update of the first 19584 gene transcripts to CCDS standard.
We're planning an expansion of the COSMIC project to include more useful somatic datatypes and further analytical software/webpages. We're based in Cambridge, UK, and our bioinformatics development work is focused on the Perl programming language, making much use of relational databases (Oracle, PostgreSQL). If you have expertise in these areas and would enjoy working on this challenging project, please reply to our job advert below, or email email@example.com for more details by 15th February 2013.
TNFRSF14 (tumour necrosis factor receptor superfamily member 14; 1p36). LIGHT-mediated triggering of non-mutated TNFRSF14 renders B-cell lymphomas more immunogenic and more sensitive to FAS-induced apoptosis, and non-mutated TNFRSF14 can inhibit proliferation of adenocarcinoma cells, suggesting a tumour suppressor role. Somatic mutations have been found in follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBCL), the majority being nonsense and missense mutations, but also including frame-shift, splice site and insertion mutations distributed across the gene, consistent with loss of function of a tumour suppressor. Mutated DLBCL are associated with higher risk clinical features and a worse response to rituximab. FL are often associated with del 1p36; individuals carrying a TNFRSF14 mutation have a worse prognosis than those carrying a 1p36 deletion alone, and patients with both alterations have the worst prognosis.
Another novel fusion gene involving EWSR1, this time partnered with ZNF444, has been identified in a small subset of myoepithelial tumours of soft tissue. ZNF444 encodes a zinc finger protein which activates transcription of a scavenger receptor gene involved in the degradation of acetylated low density lipoprotein. The EWSR1 breakpoint in this fusion is in a position frequently found in other EWSR1 fusions.
(FGFR3-BAIAP2L1, FGFR3-TACC3, FGFR1_ENST00000447712-TACC1)
FGFR3 fusions involving 2 different partners, which generate constitutively activated fusion proteins, have been identified in urothelial carcinoma. The FGFR3 component of the fusion is the same in all cases; the tyrosine kinase coding domains are retained but the final exon that includes the PLCgamma1 binding site is lost. The TACC3 fusion component retains the transforming acidic coiled-coil domain that mediates microtubule binding and the BAIAP2L1 component retains the IRSp53/MIM domain that mediates actin binding and Rac interaction. Recurrent FGFR-TACC fusions have also been found in a small subset of glioblastoma multiforme.
Hodis et al (2012). A landscape of driver mutations in melanoma. Cell 150:251
Imielinski et al (2012). Mapping the hallmarks of lung adenocarcinoma with massively parallel sequencing. Cell 150:1107
Castellarin et al (2012). Clonal evolution of high-grade serous ovarian carcinoma from primary to recurrent disease. J Pathol (epub)
Biankin et al (2012). Pancreatic cancer genomes reveal aberrations in axon guidance pathway genes. Nature 491:399
Love et al (2012). The genetic landscape of mutations in Burkitt lymphoma. Nat Genet 44:1321
Nikolaev et al (2012). A Single-Nucleotide Substitution Mutator Phenotype Revealed by Exome Sequencing of Human Colon Adenomas. Cancer Res 72:6279
Wang et al (2011). Whole-exome sequencing of human pancreatic cancers and characterization of genomic instability caused by MLH1 haploinsufficiency and complete deficiency. Genome Res 22:208
Gerlinger et al (2012). Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. NEJM 366:883
Dolnik et al (2012). Commonly altered genomic regions in acute myeloid leukemia are enriched for somatic mutations involved in chromatin remodeling and splicing. Blood 120:e83
COSMIC v62 Release
COSMIC v62 includes full curation of genes H3F3A, BCOR and HIST1H3B, together with RSPO2/3 fusions in colon cancer and NTRK1 fusions in thyroid cancer. In addition, 1068 whole-genome screens are included from recent TCGA releases, with many more from 13 newly curated systematic screen publications.
H3F3A and HIST1H3B
Mutations in H3F3A (encoding histone H3.3) or in the related HIST1H3B (encoding H3.1) have been identified as molecular drivers in diffuse intrinsic pontine glioma, and paediatric and young adult glioblastoma. Mutations consistently occur at 2 key regulatory sites within the highly conserved N-terminal histone tail which influences the dynamic regulation of chromatin structure and accessibility. These hotspot mutations appear linked to tumour location.
EIF3E-RSPO2 and PTPRK-RSPO3
The R-spondin family members RSPO2 and RSPO3 have been identified in recurrent fusions in microsatellite-stable colon adenocarcinoma at a frequency of 10%. R-spondins encode secreted proteins that can potentiate canonical WNT signalling. In the EIF3E-RSPO2 fusion, EIF3E exon 1 fuses to RSPO2 to produce a functional RSPO2 protein driven by the EIF3E promoter. In the most commonly identified PTPRK-RSPO3 fusion, PTPRK exon 1 fuses to RSPO3 exon 2, preserving the coding sequence of RSPO3 and replacing its secretion signal sequence with that of PTPRK.
TRK oncogenes, fusions involving NTRK1, are found in a subset of papillary thyroid carcinomas. NTRK1 encodes a cell-surface transmembrane tyrosine kinase protein acting as receptor for nerve growth factor. In TRKs the 3' terminal sequence of the tyrosine kinase domain of NTRK1 fuses with the 5' terminal sequence of one of 3 activating genes, TPM3, TPR or TFG, all of which contain coiled-coil domains that mediate protein dimerization and consequent tyrosine kinase activation.
Huang et al (2012). Exome sequencing of hepatitis B virus-associated hepatocellular carcinoma. Nature Genetics 44:1117-1121
Xu et al (2012). Single-cell exome sequencing reveals single-nucleotide mutation characteristics of a kidney tumor. Cell 148:886-895
Greif et al (2012). GATA2 zinc finger 1 mutations associated with biallelic CEBPA mutations define a unique genetic entity of acute myeloid leukemia. Blood 120:395-403.
Dahlman et al (2012). BRAF L597 mutations in melanoma are associated with sensitivity to MEK inhibitors. Cancer Discovery 2:791-797
Northcott et al (2012). Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 488:49-56
Wang et al (2012). Mutations in isocitrate dehydrogenase 1 and 2 occur frequently in intrahepatic cholangiocarcinomas and share hypermethylation targets with glioblastomas. Oncogene (epub)
Walker et al (2012). Intraclonal heterogeneity and distinct molecular mechanisms characterize the development of t(4;14) and t(11;14) myeloma. Blood 120:1077-1086
Koo et al (2012). Janus Kinase 3-Activating Mutations Identified in Natural Killer/T-cell Lymphoma. Cancer Discovery 2:591-597
Peifer et al (2012). Integrative genome analyses identify key somatic driver mutations of small-cell lung cancer. Nature genetics 44:1104-1110
Kalender Atak et al (2012). High accuracy mutation detection in leukemia on a selected panel of cancer genes. PLoS One 7:e38463
Kuhn et al (2012). Identification of Molecular Pathway Aberrations in Uterine Serous Carcinoma by Genome-wide Analyses. J Natl Cancer Inst 104:1503-1513
Rudin et al (2012). Comprehensive genomic analysis identifies SOX2 as a frequently amplified gene in small-cell lung cancer. Nature Genetics 44:1111-1116
Barber et al (2011). Comprehensive Genomic Analysis of a BRCA2 Deficient Human Pancreatic Cancer. PLoS One 6:e21639
Bladder Urothelial Carcinoma (TCGA, US).
Breast Invasive Carcinoma (TCGA, US).
Cervical Squamous Cell Carcinoma (TCGA, US).
Kidney Renal Clear Cell Carcinoma (TCGA, US).
Lung Adenocarcinoma (TCGA, US).
Lung Squamous Cell Carcinoma (TCGA, US).
Uterine Corpus Endometrioid Carcinoma (TCGA, US).
Prostate Adenocarcinoma (TCGA, US).
This release sees the addition of 4,901 new IC50 values including data for 4 new anti-cancer drugs as well as new data for previously released compounds.
Number of new drugs: 4
Total number of drugs: 142
Number of new IC50 values: 4,901
Total number of IC50 values: 78,070
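The totals above can be cross-checked against the figures given in these notes for the earlier Drug Sensitivity v2 release (138 drugs, 73,169 IC50 values). The short sketch below does that arithmetic; it assumes v2 was the immediately preceding drug sensitivity release, which the notes imply but do not state, and the variable names are illustrative only.

```python
# Totals reported in these notes for the earlier Drug Sensitivity v2 release.
previous_total_drugs = 138
previous_total_ic50 = 73_169

# Additions reported for this release.
new_drugs = 4
new_ic50 = 4_901

# Expected totals if v2 was the immediately preceding release (an assumption).
total_drugs = previous_total_drugs + new_drugs
total_ic50 = previous_total_ic50 + new_ic50

print(total_drugs, total_ic50)  # prints: 142 78070
```

Both sums match the totals listed for this release.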
The therapeutic target(s) of drugs in this release are:
Cell lines are now colour coded based on whether they have a coding mutation, amplification or homozygous deletion in a given cancer gene. This makes it simple to determine what types of mutations occur in a specific cancer gene and whether mutation type influences drug response.
Genomics of Drug Sensitivity in Cancer Team (http://www.cancerRxgene.org), please contact us at: firstname.lastname@example.org
COSMIC v61 Release
COSMIC v61 focuses on whole genome screen publications with information from 17 major new reports, including the new TCGA Colon & Rectal cancer studies. In addition, the full literature on point mutations in PHF6 has been curated, along with 4 new gene fusion pairs.
PHF6, encoding a plant homeodomain (PHD) factor containing 4 nuclear localization signals and 2 PHD-type finger domains, and with a proposed role in transcriptional regulation, has been identified as an X-linked tumour suppressor gene in T-cell acute lymphoblastic leukaemia and acute myeloid leukaemia. Mutations are evenly distributed throughout the gene with recurrent missense mutations in the second zinc finger domain. Mutation prevalence is greater in male than in female patients.
A novel ALK fusion involving FN1 which encodes fibronectin, a ubiquitous component of extracellular matrix and plasma, has been found in ovarian malignant stromal sarcoma. The resultant fusion protein contains the amino-terminal 1201 amino acids of FN1 and the carboxyl-terminal 598 amino acids of ALK which include the transmembrane and cytoplasmic regions.
An additional ALK fusion partner has been identified in lung carcinoma. KLC1, encoding a member of the kinesin light chain family, fuses to the canonical ALK exon 20 recombination site in bronchioloalveolar carcinoma.
A recurrent oncogenic BRAF fusion involving FAM131B, a currently uncharacterized gene on chromosome 7q34, has been shown to be an alternative mechanism of MAPK pathway activation in pilocytic astrocytoma. In common with other BRAF and RAF1 fusions, the FAM131B-BRAF fusion product lacks the RAF auto-inhibitory domain. Of note is the small number of FAM131B exons, consisting mostly of 5' UTR, included in the fusion.
An oncogenic KRAS fusion has been identified in a metastatic prostate cancer cell line. UBE2L3-KRAS encodes a protein encompassing most of the UBE2L3 protein, a member of the E2 ubiquitin-conjugating enzyme family, and full length KRAS.
Focus on recent high-impact systematic screens:
Guichard et al (2012). Integrated analysis of somatic mutations and focal copy-number changes identifies key genes and pathways in hepatocellular carcinoma. Nat Genet. 44:694-8.
Jones et al (2012). Low-grade serous carcinomas of the ovary contain very few point mutations. J Pathol. 226:413-20.
Gui et al (2011). Frequent mutations of chromatin remodeling genes in transitional cell carcinoma of the bladder. Nat Genet. 43:875-8.
Galante et al (2011). Distinct patterns of somatic alterations in a lymphoblastoid and a tumor genome derived from the same individual. Nucleic Acids Res. 39:6056-68.
Robinson et al (2012). Novel mutations target distinct subgroups of medulloblastoma. Nature. 488:43-8.
The Cancer Genome Atlas Network (2012). Comprehensive molecular characterization of human colon and rectal cancer. Nature. 487:330-7.
Pugh et al (2012). Medulloblastoma exome sequencing uncovers subtype-specific somatic mutations. Nature. 488:106-10.
Zhang et al (2011). Key pathways are frequently mutated in high-risk childhood acute lymphoblastic leukemia: a report from the Children's Oncology Group. Blood. 118:3080-7.
Yip et al (2012). Concurrent CIC mutations, IDH mutations, and 1p/19q loss distinguish oligodendrogliomas from other cancers. J Pathol. 226:7-16.
Lee et al (2012). A remarkably simple genome underlies highly malignant pediatric rhabdoid cancers. J Clin Invest. 122:2983-8.
Bass et al (2011). Genomic sequencing of colorectal adenocarcinomas identifies a recurrent VTI1A-TCF7L2 fusion. Nat Genet. 43:964-8.
Zhang et al (2012). The genetic basis of early T-cell precursor acute lymphoblastic leukaemia. Nature. 481:157-63.
Jones et al (2012). Dissecting the genomic complexity underlying medulloblastoma. Nature. 488:100-5.
Fujimoto et al (2012). Whole-genome sequencing of liver cancers identifies etiological influences on mutation patterns and recurrent mutations in chromatin regulators. Nat Genet. 44:760-4.
Peña-Llopis et al (2012). BAP1 loss defines a new class of renal cell carcinoma. Nat Genet. 44:751-9.
Ong et al (2012). Exome sequencing of liver fluke-associated cholangiocarcinoma. Nat Genet. 44:690-3.
Duns et al (2012). Targeted exome sequencing in clear cell renal cell carcinoma tumors suggests aberrant chromatin regulation as a crucial step in ccRCC development. Hum Mutat. 33:1059-62.
ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, ATRX, AXIN1, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NFE2L2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PBRM1, PDGFRA, PHF6, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, SRC, SRSF2, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, U2AF1, VHL, WT1, ZRSR2
COSMIC v60; Drug Sensitivity v2 Release
The 60th release of COSMIC includes the first full version of our new website, 9 new systematic screens and significant updates to the mutation spectra of known cancer genes.
Curated cancer genes
For this release and the next, we are focusing on updating existing curated genes to bring these up-to-date with recent publications.
Berger et al (2012). Melanoma genome sequencing reveals frequent PREX2 mutations. Nature 485:502-6.
Molenaar et al (2012). Sequencing of neuroblastoma identifies chromothripsis and defects in neuritogenesis genes. Nature 483:589-93.
Grasso et al (2012). The mutational landscape of lethal castration-resistant prostate cancer. Nature 487:239-43.
Barbieri et al (2012). Exome sequencing identifies recurrent SPOP, FOXA1 and MED12 mutations in prostate cancer. Nat Genet. 44:685-9.
Berger et al (2011). The genomic complexity of primary human prostate cancer. Nature 470:214-20.
Morin et al (2011). Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma. Nature 476:298-303.
Nikolaev et al (2011). Exome sequencing identifies recurrent somatic MAP2K1 and MAP2K2 mutations in melanoma. Nat Genet. 44:133-9.
Wu et al (2011). Whole-exome sequencing of neoplastic cysts of the pancreas reveals recurrent mutations in components of ubiquitin-dependent pathways. Proc Natl Acad Sci U S A. 108:21188-93.
Guo et al (2011). Frequent mutations of genes encoding ubiquitin-mediated proteolysis pathway components in clear cell renal cell carcinoma. Nat Genet. 44:17-9.
ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, ATRX, AXIN1, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NFE2L2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PBRM1, PDGFRA, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, U2AF1, VHL, WT1
New data
This release sees the addition of 25,421 new IC50 values including data for 9 new anti-cancer drugs as well as new data for previously released compounds.
Number of new drugs: 9
Total number of drugs: 138
Number of new IC50 values: 25,421
Total number of IC50 values: 73,169
Number of new cell lines: 76
Total number of cell lines: 714
We have enhanced the analysis of drug sensitivity data by including elastic net (EN) modeling. This approach is able to scan across the genome to identify genomic, transcriptomic and tissue-type features associated with drug sensitivity or resistance. EN modeling results are presented as heatmaps and the results of this analysis are freely downloadable from the website.
EN modeling is available in addition to the multivariate ANOVA of drug sensitivity.
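As an illustration of the idea behind elastic net modeling, the sketch below fits a small elastic net by coordinate descent on invented data. It is not the analysis pipeline behind the website: the cell line values, the feature names (`braf_mutant`, `unrelated_feature`) and the penalty settings (`alpha`, `l1_ratio`) are all assumptions chosen for the example. The point it shows is that the combined L1/L2 penalty keeps a feature that tracks drug response and sets an uninformative feature to exactly zero.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def centre(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def elastic_net(columns, y, alpha=0.1, l1_ratio=0.5, n_iter=100):
    """Coordinate descent for
    (1/2n)*||y - Xw||^2 + alpha*(l1_ratio*|w|_1 + 0.5*(1 - l1_ratio)*|w|_2^2).
    columns: centred feature columns; y: centred response."""
    n = len(y)
    weights = [0.0] * len(columns)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Residual of the current model with feature j left out.
            partial = [
                y[i] - sum(weights[k] * columns[k][i]
                           for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            rho = sum(col[i] * partial[i] for i in range(n)) / n
            z = sum(v * v for v in col) / n
            weights[j] = soft_threshold(rho, alpha * l1_ratio) / (
                z + alpha * (1 - l1_ratio))
    return weights


# Hypothetical data for 8 cell lines (invented for illustration only).
braf_mutant = [1, 1, 1, 1, 0, 0, 0, 0]        # 1 = mutant, 0 = wild type
unrelated_feature = [1, 0, 1, 0, 1, 0, 1, 0]  # carries no real signal
log_ic50 = [-2.1, -1.9, -2.0, -2.0, 2.0, 2.0, 1.9, 2.1]

weights = elastic_net(
    [centre(braf_mutant), centre(unrelated_feature)],
    centre(log_ic50),
)
print(weights)
```

In this toy example the first weight is negative (mutant lines have lower IC50 values, i.e. greater sensitivity), while the weight for the uninformative feature is shrunk to zero. A real analysis would use many more features and choose the penalty settings by cross-validation.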
This new functionality allows users to generate scatter plots of cell line IC50 values for significant drug-gene associations. Users are able to select which data to plot depending on their drug or gene of interest. Scatter plot images are downloadable and cell line IC50 values are directly linked to the COSMIC database facilitating integration of drug sensitivity data with detailed cell line information.
New COSMIC website
A new, improved COSMIC website (http://cancer.sanger.ac.uk/)
We are pleased to announce the development of a new website for the COSMIC database, designed to improve the exploration of this increasingly complex data. This more modern system will additionally form a better platform to extend COSMIC with new types of data and additional analysis functions. This new website is now available to the scientific community, presenting the current v59 COSMIC release. We invite all comments and feedback by email: email@example.com.
Entry into the system has been kept as simple as possible, focusing on a single search box, which is much more helpful in finding the correct information. Behind this, the 'By Gene' and 'By Tissue' options have also been enhanced. The tissue browser has been significantly redesigned, showing all available site/histology options with counts of mutated samples; once a phenotype has been selected, press 'Go' and substantial details will immediately appear beneath, with links deeper into the new website. The gene browser requests a search string, as little as one letter, and will search all gene names and synonyms, then present a list of available gene options.
The website content has been broadly reorganised in line with the current Sanger style. This allows much information to be presented on one webpage, organised into separate tabs along the top of the main panel, rather than as a series of boxes on a long deep web page. This should reduce the scrolling needed to navigate each page and presents the information in more logical tab-formatted units. For instance, on the Gene Overview page:
Amongst numerous improvements, particular new tools include a better zoomable mutations histogram and searchable, exportable data tables. The new mutations histogram is zoomable in a click-and-drag style (instead of +/- 5aa as before). Additional filter options are also available, and these are more noticeably presented; all tabs on the Gene Analysis webpage reflect the filters selected.
All tables of data in COSMIC are now presented in a new style which allows sorting by selected columns, searching of the table contents, and exporting of this information, as displayed on the screen.
This new system will form a major release in the near future, and the two websystems will run in parallel from the same release database until early 2013, allowing everyone time to adjust to the new system. We invite you to send us any comments on the new website, including what you like or dislike, or any difficulties you experience so that we can make this system the best possible for all of our users. Please contact us at: firstname.lastname@example.org
Good luck with your research,
The COSMIC Team
COSMIC v59 Release
This latest release of COSMIC includes full mutation spectra across three new cancer genes and 9 new gene fusions. Also included are updates from the latest TCGA and ICGC releases, together with 4 recent whole-genome publications.
TCGA: The TCGA recently released large mutation datasets on three cancer types: Rectal, Colorectal and AML. This data has been curated from the TCGA data portal.
ICGC: The March release of the ICGC DCC (version 8) included four major new datasets, on liver, paediatric brain and two independent sets of pancreatic cancers (QCMG, Australia; OICR, Canada). This information has been curated from the ICGC data release portal.
CGP: The complete genome sequences of 21 breast tumours have allowed the elucidation of the mutational process moulding these tumours. This data has been curated from pre-publication datasets here.
Shah, et al. (2012). The clonal and mutational evolution spectrum of primary triple-negative breast cancers. Nature (epub, pre-publication).
Focus on Blood tumours
Ding, et al. (2012). Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature 481:506-10.
Graubert, et al. (2012). Recurrent mutations in the U2AF1 splicing factor in myelodysplastic syndromes. Nature Genetics 44:53-7.
Yoshida, et al. (2011). Frequent pathway mutations of splicing machinery in myelodysplasia. Nature 478:64-9.
ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, ATRX, AXIN1, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, SRC, SRSF2, STK11, TET2, TNFAIP3, TP53, TSHR, U2AF1, VHL, WT1, ZRSR2
Genomics of Drug Sensitivity in Cancer v1 release
We have launched a new website to present genomic markers of sensitivity to anti-cancer compounds screened across our >1000 cancer cell line resource.
COSMIC v58 Release
Five new fusion gene pairs and one new cancer gene are curated in this 58th release of COSMIC, together with 6 recent genome-wide mutation screens.
Yan XJ, et al. (2011). Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia. Nat
Pasqualucci L, et al. (2011). Analysis of the coding genome of diffuse large B-cell lymphoma. Nat Genet. 43:830-7.
Quesada V, et al. (2012). Exome sequencing identifies recurrent mutations of the splicing factor SF3B1 gene in chronic lymphocytic leukemia. Nat Genet. 44:47-52.
Jiao X, et al. (2012). Somatic mutations in the NOTCH, NF-KB, PIK3CA, and hedgehog pathways in human breast cancers. Genes Chromosomes Cancer.
The mutation data from exome sequencing of 10 gastric tumours has been available in ICGC release 7 and is now incorporated into this release of COSMIC, here.
Stephens P and Tarpey P, et al. (2012). 100 breast exomes have been sequenced by the CGP and are presented here as a pre-publication release. Nature (In Press).
ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, AXIN1, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MYD88, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCB1, SRC, STK11, TET2, TNFAIP3, TP53, TSHR, VHL, WT1
Four genes have been newly implicated in cancer: YWHAE, FAM22A and FAM22B translocations are associated with endometrial stromal carcinoma, whilst BCOR mutations are associated with retinoblastoma, AML (acute myeloid leukaemia) and APL (acute promyelocytic leukaemia). The census has been updated to reflect these new findings.
COSMIC v57 Release
Twelve new fusion gene pairs and one new cancer gene are curated in this 57th release of COSMIC, together with 8 recent genome-wide mutation screens.
Focus on Melanoma
Prickett TD, et al. (2011). Exon capture analysis of G protein-coupled receptors identifies activating mutations in GRM3 in melanoma. Nat Genet. [Epub ahead of print]
Wei X, et al. (2011). Analysis of the disintegrin-metalloproteinases family reveals ADAM29 and ADAM7 are often mutated in melanoma. Hum Mutat. 32:E2148-75.
Wei X, et al. (2010). Mutational and functional analysis reveals ADAMTS18 metalloproteinase as a novel driver in melanoma. Mol Cancer Res. 8:1513-25.
Cárdenas-Navia LI, et al. (2010). Novel somatic mutations in heterotrimeric G proteins in melanoma. Cancer Biol Ther. 10:33-7.
Cronin JC, et al. (2009). Frequent mutations in the MITF pathway in melanoma. Pigment Cell Melanoma Res. 22:435-44.
Palavalli LH, et al. (2009). Analysis of the matrix metalloproteinase family reveals that MMP8 is often mutated in melanoma. Nat Genet. 41:518-20.
Solomon DA, et al. (2008). Mutational inactivation of PTPRD in glioblastoma multiforme and malignant melanoma. Cancer Res. 68:10300-6.
Focus on Squamous Cell Carcinoma
Durinck S, et al. (2011). Temporal dissection of tumorigenesis in primary cancers. Cancer Discov. 1:137-143.
(with related data from Wang NJ et al (2011). Loss-of-function mutations in Notch receptors in cutaneous and lung squamous cell carcinoma. PNAS 108:17761-6.)
ABL1, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BRAF, BRCA1, CARD11, CBL, CDC73, CDH1, CDKN2A, CREBBP, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, EZH2, FAM123B, FGFR3, FLT3, FOXL2, HRAS, IDH1, IDH2, IL7R, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MLH1, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NFE2L2, NOTCH1, NOTCH2, NPM1, NRAS, PDGFRA, PIK3CA, PIK3R1, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, TET2, TNFAIP3, TP53, VHL, WT1
COSMIC v56 Release
Three new cancer genes together with 13 new fusion gene pairs and 7 recent genome-wide screens have been fully curated into COSMIC for this latest release.
TCGA (2011). Integrated genomic analyses of ovarian carcinoma. Nature 474:609-15
Stransky et al (2011). The mutational landscape of head and neck squamous cell carcinoma. Science 333:1157-60
Li et al (2011). Inactivating mutations of the chromatin remodeling gene ARID2 in hepatocellular carcinoma. Nature Genetics 43:828-9
Prickett et al (2009). Analysis of the tyrosine kinome in melanoma reveals recurrent mutations in ERBB4. Nature Genetics 41:1127-32
Bettegowda et al (2011). Mutations in CIC and FUBP1 contribute to human oligodendroglioma. Science 333:1453-5
Papaemmanuil et al (2011). Somatic SF3B1 mutation in myelodysplasia with ring sideroblasts. N Engl J Med. 365:1384-95
Malcovati et al (2011). Clinical significance of SF3B1 mutations in myelodysplastic syndromes and myelodysplastic/myeloproliferative neoplasms. Blood (epub).
ABL1, AKT1, ALK, APC, ARID1A, ASXL1, ATM, ATRX, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CREBBP, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EP300, ERBB2, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GATA1, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PIK3CA, PIK3R1, PPP2R1A, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, SRC, STK11, TET2, TNFAIP3, TP53, TSHR, WT1
COSMIC v55 Release
New curations include the tumour suppressor gene PRDM1, fusions of CRTC1/CRTC3-MAML2 and three new systematic screen publications. A new release of the Cancer Gene Census raises the number of known cancer genes to 464.
A number of genes have recently been implicated in oncogenesis and when this is confirmed, they are added to our Census of known cancer genes. The latest release details 464 known cancer genes, recently including SF3B1, ARID2, CCNE1, CDK12, FUBP1, XPO1, MED12.
Agrawal et al (2011). Exome sequencing of head and neck squamous cell carcinoma reveals inactivating mutations in NOTCH1. Science 333:1154-7.
Wei et al (2011). Exome sequencing identifies GRIN2A as frequently mutated in melanoma. Nature Genetics 43:442-6.
Zang et al (2010). Genetic and structural variation in the gastric cancer kinome revealed through targeted deep sequencing. Cancer Research 71:29-39.
ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BAP1, BRAF, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CSF1R, CTNNB1, DNMT3A, EGFR, EML4, EP300, ERBB2, ERG, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MET, MLL3, MPL, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SMAD4, SMARCA4, SMARCB1, SMO, SRC, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, VHL, WT1
COSMIC v54 Release
Five new cancer genes have received full curation of their mutation spectrum, together with seven new fusion gene pairs focusing on the zinc finger protein PLAG1. Four new systematic screens are included, covering a range of cancer phenotypes. This extensive literature curation brings the total number of publications in COSMIC to over 12,000.
Parsons DW, et al (2011). The genetic landscape of the childhood cancer medulloblastoma. Science 331:435-9.
A large genome-wide candidate gene screen assessing childhood medulloblastoma, this study identified two new chromatin remodelling genes as cancer genes, MLL2 and MLL3.
Kan Z, et al (2010). Diverse somatic mutation patterns and pathway alterations in human cancers. Nature 466:869-73. This study examined 441 tumours from a variety of sites and morphologies, through over 1500 known or candidate cancer genes, defining roles for over 100 in oncogenesis.
Jones S, et al (2010). Frequent Mutations of Chromatin Remodeling Gene ARID1A in Ovarian Clear Cell Carcinoma. Science. 330:228-31. A whole-exome resequencing study of 8 ovarian clear cell carcinomas further implicates the chromatin remodelling gene ARID1A, together with PPP2R1A, in cancer.
Puente, et al (2011). Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia. Nature. 475:101-5. Full-genome resequencing of four Chronic Lymphocytic Leukaemia patients suggests significant roles for at least four known cancer genes (NOTCH1, XPO1, MYD88 and KLHL6). The data in this publication has been extended with extra annotations from their submission to the ICGC DCC (R5 release).
ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, VHL, WT1
COSMIC v53 Release
COSMIC v53 release includes full curation of IKZF1, PIK3R1, PAX5 and 11 new gene fusion pairs.
This fifty-third release of COSMIC brings the number of fully curated cancer genes to 98, together with a total of 81 curated fusion gene pairs. With the inclusion of 10 systematic screen publications, together with substantial output from the ICGC, TCGA and CGP studies, over 3 million curated gene screening experiments are now available in COSMIC, covering 19439 genes, with mutations identified on 11111 of these.
Jones S, et al (2008). Core signalling pathways in human pancreatic cancers revealed by global genomic analyses. Science 321:1801-6.
Analysis of over 20,000 candidate genes for DNA mutations through 114 tumours revealed high mutation rates in 12 cell signalling pathways. The curation of this publication describes the mutations from this study not included in the recent ICGC release r3, detailing 337 additional mutations.
Jiao Y, et al (2011). DAXX/ATRX, MEN1, and mTOR pathway genes are frequently altered in pancreatic neuroendocrine tumors. Science 331:1199-203. This exome resequencing analysis of pancreatic neuroendocrine tumours examined the link between gene mutations and clinical prognosis, finding genes in the DAXX/ATRX, MEN1, and mTOR pathways particularly important. Full exomes were sequenced in 10 tumours, with key genes followed up in a further 58; 218 mutations were described.
ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, JAK1, JAK2, KDR, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, PAX5, PBRM1, PDGFRA, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, VHL, WT1
COSMIC v52 integrates Genomics of Drug Sensitivity
Also, 4 new genes and 16 new fusion pairs are curated from the literature, TCGA and ICGC portal data are updated and some improvements are made to COSMIC web pages.
The Genomics of Drug Sensitivity in Cancer, a collaborative project between the Sanger Institute and Massachusetts General Hospital, is screening a range of anti-cancer therapeutics against a large number of genetically characterized human cancer cell lines, generating drug sensitivity correlations. This release of COSMIC includes references to this work, detailing drugs where mutant gene/drug interactions have been shown to modify cell growth responses. For example, the recently described drug PLX4720 has been shown to have a significant growth modifying effect on cells containing mutant BRAF.
21 new genes have been added to the cancer gene census. With the rise of large-scale genomic sequencing, the number of novel genes implicated in cancer is increasing. We aim to keep the Census up-to-date with the latest publications.
The front page of COSMIC has been redesigned to make it much easier to find data and sub-projects that were previously difficult to identify. For instance, the Cell Line Project, characterising the cancer genotypes of 800 common tumour cell lines, has always been a significant COSMIC sub-project, and is now appropriately highlighted. Also, the FTP site, which comprises export files of each release, is now much easier to find.
In addition, we have improved the informativeness of the Tissue Overview page (e.g. the summary page for Skin). The graphic has been extended to include the top 20 genes mutated in the tissues/phenotypes selected, followed by a much simpler summary of the mutation load in the selection.
This release (v52) of COSMIC contains full curations of 4 new cancer genes together with 16 new fusion gene pairs. In addition, our curation of TCGA data, output via the TCGA portal, has been updated with further new Ovarian serous carcinoma mutations. Also, we have completed our curation of the validated mutations in the third release of the ICGC, bringing in structural rearrangements for two Japanese Liver cancer screens, HX5T & RK-003-C.
ABL1, AKT1, APC, ASXL1, ATRX, BRAF, BRCA1, BRCA2, CARD11, CBL, CDKN2A, CTNNB1, DAXX, EGFR, EML4, ERBB2, EZH2, FAM123B, FGFR3, FLT3, GATA1, GATA3, HNF1A, HRAS, IDH1, IDH2, JAK2, KIT, KRAS, MAP2K4, MEN1, MET, MPL, MYD88, NF2, NOTCH1, NPM1, NRAS, PDGFRA, PIK3CA, PPP2R1A, PTCH1, PTEN, PTPN11, RB1, RET, SMAD4, SMARCB1, STK11, TET2, TNFAIP3, TP53, VHL
COSMIC v51 Release
COSMIC release v51 includes the curation of 3 newly identified cancer genes, 4 gene fusions and updates from the ICGC and TCGA international consortia. In addition, upgrades to the website are now available, to improve the analysis of mutation distribution within a gene, and to navigate the increasing number of curated large-scale systematic screens. Both the Genomics of Drug Sensitivity in Cancer website and the Cancer Gene Census Listing were also recently updated (in December 2010).
DNMT3A is a member of a family of methyltransferase enzymes which catalyse addition of methyl groups to sequences containing CpG dinucleotides. Somatic mutations have been found in 22% of adult acute myeloid leukaemia patients and are associated with poor overall survival. Over half of the cases examined have a recurrent missense mutation at arginine 882. The mutations have been shown to impair normal enzymatic function and are heterozygous. These data suggest potential dominant negative activity of the mutations. Additional missense, nonsense and frame-shift mutations have also been found throughout the latter half of the gene.
Mutational inactivation of BAP1, a tumour suppressor gene, has been identified in uveal melanomas where it coincides with metastasis. BAP1 encodes a nuclear ubiquitin carboxy-terminal hydrolase (UCH) as well as a UCH37-like domain, and binding domains for BRCA1, BARD1 and HCFC1.
As previously described for GNAQ, mutations affecting Q209 have been found in GNA11 in melanocytic tumours. The frequency of mutations increases progressively from blue nevus to primary melanomas to uveal melanoma metastases; an inverse pattern to that seen with GNAQ Q209 mutations. Activation of this GTPase pathway appears to be a predominant route to the development of uveal melanoma.
TAF15 can replace EWSR1 as a fusion partner to NR4A3 in extraskeletal myxoid chondrosarcoma. The resulting transcript (TAF15-NR4A3) is structurally and functionally similar to the EWSR1-NR4A3 fusion.
FUS-CREB3L2 is tumour specific for low-grade fibromyxoid sarcoma (LGFMS) so enables the accurate diagnosis of a sarcoma with sometimes indistinct histological features. In these fusions there's a diversity of genomic breakpoints and these are often exonic rather than intronic. The rare variant FUS-CREB3L1 is occasionally detected in LGFMS.
Myxoid/round cell liposarcoma is characterized by the recurrent fusion FUS-DDIT3 where the 5' half of the FUS gene is fused to the entire reading frame of DDIT3, which encodes a leucine-zipper transcription factor belonging to the CCAAT/enhancer-binding protein family.
TCGA - Full exome resequencing of Ovarian tumours
The Cancer Genome Atlas (TCGA) has recently released somatic mutation data from the exon screening of 325 serous ovarian cystadenocarcinoma tumours. We have now curated the majority of this information into COSMIC, comprising over 13,000 mutations. This can now be viewed here.
ICGC - Curation of third ICGC release 'Simple Mutations'
The International Cancer Genome Consortium (ICGC) has recently completed its third data release. We have curated into COSMIC all the Validated Somatic Simple Mutation data from this ICGC release, which can be viewed in the following
We have improved the Distribution section of the Histogram page, providing much more analytical pie charts. Once a gene is selected, the mutation spectrum can be explored in a number of different ways. Nucleotide substitution breakdowns are presented as pie charts, and lengths of insertions and deletions are presented as histograms. The same filters as usual can be applied to this Distribution section, including tumour phenotype, mutation type and sample source. Links are also provided to view the mutation data in tabulated detail form, and to export it for external analysis. An example of this new system, presenting the data from the KIT oncogene can be seen here.
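The nucleotide substitution breakdown and the insertion/deletion length histogram described above can be illustrated with a short tally. This is a minimal sketch, not the COSMIC implementation: the input lists are invented, and collapsing complementary changes onto the pyrimidine strand (so that G>A is counted as C>T) is an assumption made for illustration; the website's pie charts may group substitutions differently.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Label a substitution, reporting purine references on the opposite strand."""
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def substitution_breakdown(substitutions):
    """Return the fraction of substitutions in each class (the pie chart slices)."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in substitutions)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def length_histogram(lengths):
    """Count how many insertions or deletions have each length, in bases."""
    return dict(sorted(Counter(lengths).items()))


# Hypothetical (reference, mutant) base pairs and deletion lengths.
example_substitutions = [("C", "T"), ("G", "A"), ("C", "T"), ("T", "G"), ("A", "C")]
example_deletion_lengths = [1, 3, 1, 6, 3, 1]

breakdown = substitution_breakdown(example_substitutions)
deletions = length_histogram(example_deletion_lengths)
print(breakdown)
print(deletions)
```

The same functions could be applied after filtering the input by tumour phenotype, mutation type or sample source, mirroring the filters available on the page.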
In the last few years, cancer genome screens have been growing substantially in size, and both whole-genome and large candidate gene screens are being curated into COSMIC. As well as curating publications, we are also collecting somatic mutation information from the data portals of large international consortia, beginning with the TCGA and ICGC. A new page now makes these easier to identify and navigate; please click here.
Genomics of Drug Sensitivity in Cancer
The Genomics of Drug Sensitivity Website was updated on 23rd December 2010 with cell line sensitivity data and genomic correlates of sensitivity for Docetaxel, Gefitinib, CI-1040, BIBW 2992 and PLX4720. This large-scale project is a joint collaboration between the Wellcome Trust Sanger Institute (WTSI) and Massachusetts General Hospital Cancer Centre to correlate genomics with response to cancer drugs in 1,000 cancer cell lines. Within the next year we plan to integrate the data from this project into the COSMIC database and develop new web-based tools to browse and mine these data. Click here to visit the site.
Cancer Cell Line Project
The Cancer Cell Line Project website holds the data from the Cancer Genome Project's large-scale systematic characterization of a panel of 770 cancer cell lines across 64 known cancer genes. Click here to visit the site.
Cancer Gene Census
The Cancer Gene Census was updated at the beginning of December 2010. The Census is an up-to-date listing of genes causally implicated in cancer and now stands at 436 genes. Click here to view the listing.
ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BAP1, BRAF, BRCA1, BRCA2, CBL, CDC73, CDH1, CDKN2A, CEBPA, CTNNA1, CTNNB1, DNMT3A, EGFR, EML4, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MET, MLH1, MPL, MSH2, MSH6, MYB_ENST00000341911, NF1, NF2, NOTCH2, NPM1, NRAS, PBRM1, PDGFRA, PHOX2B, PIK3CA, PPP2R1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SMAD4, SMARCA4, SMARCB1, TET2, TNFAIP3, TP53, TSHR, VHL, WT1
COSMIC v50 Release
Our 50th release significantly enhances the genomic focus of the COSMIC system, including a full genome browser linked directly into the COSMIC websystem. Also included are genome-wide examinations of a further 24 tumours, comprising rearrangement screens of 17 pancreatic tumours and full exome analyses of 7 renal tumours. Two new cancer genes are curated from the scientific literature, together with a further systematic screen publication.
With the inclusion of increasing quantities of genomic data in COSMIC, new methods to navigate and visualise this information are now required. We have implemented a version of GMOD GBrowse and linked it into COSMIC where genomic co-ordinates are available, including coding and non-coding mutations, gene footprints, structural rearrangements and copy number variants (from CONAN ). Full genome annotations have been imported from Ensembl, so that COSMIC data can be examined in the context of these full genome annotations. Throughout the COSMIC websystem, links to GBrowse are available as either links in descriptive text, or via the icon , which will present the selected data in the local genomic context.
The recent Campbell et al (2010) publication, "The patterns and dynamics of genomic instability in metastatic pancreatic cancer", describing genome structure rearrangements as early events in this cancer type, can be viewed in COSMIC, here.
The Cancer Genome Project has sequenced the full exomes of 7 renal tumours. These data, soon to be published, are available in COSMIC.
In the Ding et al (2010) paper, 3 basal-like breast tumours were taken from one individual and their genomes were fully resequenced and comparatively analysed. We have curated the coding, non-coding and structural rearrangement mutations described in their study, available here.
ARID1A has been identified as a tumour suppressor gene in ovarian clear cell and endometrioid carcinomas. It encodes AT-rich interactive domain-containing protein 1A, which is a component of the ATP-dependent chromatin remodelling complex SWI/SNF.
PPP2R1A has been identified as an oncogene in ovarian clear cell carcinoma and in breast and lung carcinomas. It encodes a regulatory subunit of serine-threonine phosphatase 2 and this subunit forms the scaffold of the holoenzyme.
ABL1, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BRAF, CDC73, CDH1, CEBPA, CSF1R, CTNNB1, CYLD, EGFR, EML4, ERBB2, EZH2, FBXW7, FGFR3, FLT3, FOXL2, GATA1, GNAQ, HRAS, IDH1, IDH2, JAK2, JAK3, KIT, KRAS, MEN1, MET, MLH1, MPL, NF2, NPM1, NRAS, PDGFRA, PIK3CA, PPP2R1A, PTEN, PTPN11, RUNX1, SETD2, SMARCA4, SMARCB1, STK11, TET2, TP53, TSHR, VHL, WT1
COSMIC v49 Release
This release of COSMIC focuses on curation of data from the scientific literature. 57 curated genes have received updates; an additional systematic candidate gene screen has added 1121 mutations, and a further full-genome screen has contributed 45 confirmed coding mutations.
Shah et al (2009) describes the full-genome examination of a metastatic ER+ lobular breast cancer and compares the mutation spectrum to the primary tumour, finding 32 somatic non-synonymous coding mutations in the metastasis, but only 13 in the primary:
Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution.
Shah SP, Morin RD, Khattra J, Prentice L, Pugh T, Burleigh A, Delaney A, Gelmon K, Guliany R, Senz J, Steidl C, Holt RA, Jones S, Sun M, Leung G, Moore R, Severson T, Taylor GA, Teschendorff AE, Tse K, Turashvili G, Varhol R, Warren RL, Watson P, Zhao Y, Caldas C, Huntsman D, Hirst M, Marra MA, Aparicio S. (2009) Nature 461:809-13.
A further analysis expanding that of Sjoblom et al (2006), Wood et al (2007) examines the same tumour set through a set of RefSeq genes additional to the earlier analysis of CCDS genes. 1121 mutations were identified, expanding the mutation spectrum defined in the earlier publication:
The genomic landscapes of human breast and colorectal cancers.
Wood LD, Parsons DW, Jones S, Lin J, Sjoblom T, Leary RJ, Shen D, Boca SM, Barber T, Ptak J, Silliman N, Szabo S, Dezso Z, Ustyanksky V, Nikolskaya T, Nikolsky Y, Karchin R, Wilson PA, Kaminker JS, Zhang Z, Croshaw R, Willis J, Dawson D, Shipitsin M, Willson JK, Sukumar S, Polyak K, Park BH, Pethiyagoda CL, Pant PV, Ballinger DG, Sparks AB, Hartigan J, Smith DR, Suh E, Papadopoulos N, Buckhaults P, Markowitz SD, Parmigiani G, Kinzler KW, Velculescu VE, Vogelstein B. (2007) Science. 318:1108-13.
ABL1, AKT1, ALK, APC, ASXL1, ATM, BRAF, BRCA1, CBL, CDKN2A, CEBPA, CRLF2, CTNNA1, CTNNB1, CYLD, EGFR, EML4, ERBB2, EZH2, FAM123B, FBXW7, FGFR2, FLT3, FOXL2, GATA1, GNAQ, GNAS, HRAS, IDH1, IDH2, JAK1, JAK2, KIT, KRAS, MEN1, MET, MPL, NOTCH1, NPM1, NRAS, PDGFRA, PHOX2B, PIK3CA, PTCH1, PTEN, PTPN11, RB1, RUNX1, SETD2, SMARCA4, STK11, TET2, TNFAIP3, TP53, TSHR, VHL, WT1
COSMIC v48 Release
This release brings the majority of curated p53 mutation data into COSMIC in collaboration with IARC, significantly improving our coverage of the key cancer genes. Other new curated genes are TET2 & SETD2, together with ten new fusion pairs. Two new systematic screens are also included. The system has been updated to provide genomic coordinates on both NCBI36 and the later GRCh37 genome builds.
In collaboration with the p53 group at IARC (http://www-p53.iarc.fr/), we have imported the majority of p53 mutation data into COSMIC. The system has previously lacked substantial coverage of this gene, since it has been fully curated at IARC. However, our new collaboration has brought these two datasets together in COSMIC. Over 73% of the IARC Somatic dataset is now present in COSMIC, comprising a total of 20129 samples mutated, of 66242 samples analysed. The remainder of IARC's curated p53 data will become available in a later release. This p53 data can be viewed here.
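As a worked example of the sample counts quoted above, the proportion of analysed samples carrying a p53 mutation follows directly from the two totals. The snippet is only an arithmetic illustration; the variable names are hypothetical and the figure is a simple ratio, not a prevalence estimate adjusted for tumour type or screening method.

```python
# Counts quoted in the v48 release note for the imported p53 data.
samples_mutated = 20129
samples_analysed = 66242

# Fraction of analysed samples in which a p53 mutation was recorded.
mutated_fraction = samples_mutated / samples_analysed
print(f"{mutated_fraction:.1%} of analysed samples carry a p53 mutation")
```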
SETD2, a tumour suppressor gene, encodes a histone H3 lysine 36 methyltransferase and is inactivated in clear cell renal carcinoma.
TET2, The tumour suppressor gene TET2 (ten-eleven-translocation gene, 4q 24) was found to be heterozygously deleted in MDS and Leukaemia patients whose remaining copy carried a somatic point mutation. Subsequently a wide spectrum of somatic mutations - often nonsense or frame shifts - have been found in a variety of myeloproliferative neoplasms, leukaemias (AML, sAML, CMML) and mastocytosis; These mutations are thought likely to be an early event in the pathology of these diseases and there is some evidence that TET2 is both a tumour suppressor and haematopoietic regulator.
ASPSCR1-TFE3, A characteristic translocation in alveolar soft part sarcoma results in the fusion of the N-terminal region of ASPSCR1 to the C-terminal region of TFE3. Two alternative fusion breakpoints are observed in TFE3 resulting in expression of 2 distinct fusion transcripts. ASPSCR1-TFE3 is also found in a subset of renal cell carcinomas.
PRCC-TFE3, SFPQ-TFE3, NONO-TFE3, CLTC-TFE3 Papillary renal carcinomas have fusions involving the TFE3 transcription factor gene. Most commonly the fusions are ASPSCR1-TFE3 or PRCC-TFE3 but variant translocations have also been identified in which TFE3 is fused to SFPQ, NONO or CLTC.
ETV6-NTRK3, A recurrent rearrangement in congenital (infantile) fibrosarcoma fuses the helix-loop-helix protein dimerization domain of ETV6 with the protein tyrosine kinase domain of NTRK3. ETV6-NTRK3 fusions are also found in congenital mesoblastic nephroma, a pathogenetically related tumour, and in a rare form of breast cancer, secretory carcinoma.
SLC45A3-BRAF, ESRP1-RAF1, AGTRAP-BRAF While SLC45A3-BRAF and ESRP1-RAF1 fusions have been identified in ETS rearrangement-negative prostate cancers, AGTRAP-BRAF has been found in stomach cancer; all highlighting the role of RAF pathway fusions in solid tumours.
MYB-NFIB, A recurrent fusion of the MYB oncogene to NFIB, a member of the human nuclear factor I gene family of transcription factors, has been identified in adenoid cystic carcinomas of the breast and the head and neck. The common feature of multiple splice variations is the deletion of exon 15 of MYB and its 3'-UTR.
Mardis et al (2009) describes the full-genome sequencing of a single AML tumour, resulting in 12 coding mutations and 52 confirmed high quality non-coding mutations, with a follow-up study examining a set of 187 AML tumours through the genes mutated in the primary sample.
Ding et al (2008) examines 188 lung adenocarcinomas through 623 genes known to be involved with cancer. 1013 mutations were described, the most mutated genes being TP53, KRAS, STK11 and EGFR.
All genes and mutations with NCBI36 genome co-ordinates have now been updated to GRCh37. New DAS tracks have been created, and the website and data exports have been altered to include both co-ordinate systems.
ABL1, ACVR1B, AKT1, ALK, APC, ATM, BRAF, BRCA1, BRCA2, CBL, CDC73, CDH1, CDKN2A, CEBPA, CSF1R, CTNNB1, CYLD, EGFR, EML4, ERBB2, ERG, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, PDGFRA, PHOX2B, PIK3CA, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SMAD4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TET2, TP53, TSHR, VHL, WT1
COSMIC v47 Release
This latest release of COSMIC includes six new point-mutated genes with full literature curation, together with curation of five new fusion genes involving ALK. Additionally, recent updates to the public TCGA Glioblastoma dataset have been included, as have a number of whole gene deletions interpreted from SNP6.0 microarray data. Recent publications have been curated to update sixty one other curated genes.
GATA2, This gene encodes a member of the GATA family of zinc-finger transcription factors and the encoded protein plays an essential role in regulating transcription of genes involved in normal haematopoietic cell differentiation and survival. Somatic mutations in GATA2 are associated with acute myeloid transformation in a subset of chronic myelogenous leukaemia (CML) patients.
GATA3, This gene encodes a protein which belongs to the GATA family of transcription factors. The protein contains two GATA-type zinc fingers; it is a regulator of T-cell development and plays an important role in endothelial cell biology. Germline mutations in this gene are found in individuals with HDR syndrome (hypoparathyroidism with sensorineural deafness and renal dysplasia). These mutations cluster in the region of the highly conserved second zinc finger. Somatic mutations in the same region have been identified in tumour tissue from both familial and sporadic breast cancer patients.
EZH2, This gene encodes a member of the Polycomb-group (PcG) family, a histone methyltransferase responsible for trimethylating Lys27 of histone H3 (H3K27). Somatic mutations have been found within the catalytic SET domain in diffuse large B-cell lymphoma and follicular lymphoma.
KDR, KDR encodes the kinase insert domain receptor, one of the two receptors of the VEGF (Vascular endothelial growth factor) - a major growth factor for endothelial cells. It is a vascular specific, type III receptor tyrosine kinase. Germline mutations of this gene are implicated in infantile capillary haemangiomas. Somatic mutations have been found in samples derived from angiosarcomas of the breast/chest wall.
CRLF2, The type I cytokine receptor subunit CRLF2 (thymic stromal lymphopoietin receptor, TSLPR) has been identified as a proto-oncogene in adult and high risk paediatric B-ALL. Over-expression is common in cases lacking rearrangements of TEL, MLL, TCF3 and BCR-ABL and can result from CRLF2 rearrangement or mutually exclusive somatic gain of function point mutations including those in CRLF2, JAK1 or JAK2. The predominant CRLF2 mutation, Phe232Cys, promotes constitutive dimerization and cytokine independent growth.
JAK1, JAK1 is a member of the Janus kinase family, which comprises JAKs 1, 2 and 3 as well as Tyk2. Activating JAK1 somatic mutations have been found in the SH, pseudokinase and kinase domains in T-ALL, B-ALL and, more rarely, AML patients. Mutations have also been found in a variety of non-haematopoietic cancers. Mutations include JAK1 V658F, corresponding to the JAK2 V617F mutation commonly found in PV and ET as well as other MPNs, and R724H, corresponding to JAK2 R683 and JAK3 R657, mutations of which have been found in DS-ALL/B-ALL and DS-AMKL respectively.
The following gene fusions have been curated from the scientific literature:
CARS-ALK, TFG-ALK, TPM3-ALK, TPM4-ALK Variant ALK fusions, including ATIC-ALK, TPM3-ALK, TPM4-ALK and TFG-ALK, have been identified in ALK-positive anaplastic large cell lymphoma. Each translocation product retains the ALK kinase domain. ALK activation is also a recurrent oncogenic event in inflammatory myofibroblastic tumours, where this is sometimes achieved through fusion with ATIC, TPM3, TPM4 or CARS.
We have updated our recent curation of the TCGA somatic Glioblastoma mutation data, now including phase II data direct from the public TCGA data portal. The combined data can be browsed here.
55 whole gene or whole-exon deletions have been defined in the core cell lines by interpretation of SNP6.0 microarray data. While many of these have been confirmed, the few unverified mutations are presented with links to the originating data for independent examination. Mutations involving the deletion of whole gene sequences have been reannotated "p.0?" in line with current HGVS recommendations.
COSMIC is now combined into the new Sanger-wide search system. This allows much richer searching using multiple terms such as "BRAF melanoma", making it much easier to find data without complex navigation through the website. The system additionally searches other Sanger genomic data, giving indications where compatible data might be found elsewhere on the Institute website.
ABL1, AKT1, ALK, APC, ASXL1, ATM, BRAF, BRCA1, BRCA2, CBL, CDH1, CDKN2A, CEBPA, CSF1R, CTNNB1, CYLD, EGFR, ERBB2, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GATA1, GATA2, GATA3, HNF1A, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NPM1, NRAS, PDGFRA, PIK3CA, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SMAD4, SMARCA4, SMO, SOCS1, SRC, STK11, SUFU, VHL, WT1
COSMIC v46 Release
The second full-genome resequencing study from the CGP at the Sanger Institute, UK is now available, together with the curation of Parsons et al (2008), a systematic candidate gene screen of Glioblastomas. In addition, the published literature has been fully curated for fusion mutations between seven new gene pairs.
The recent Pleasance et al (2010) publication "A small-cell lung cancer genome with complex signatures of tobacco exposure" (Nature 463, 184-190) is now available within COSMIC; please click here.
The largest published candidate gene screen of Glioblastomas, Parsons et al (2008), is now curated in COSMIC; please click here:
An integrated genomic analysis of human glioblastoma multiforme.
Parsons DW, Jones S, Zhang X, Lin JC, Leary RJ, Angenendt P, Mankoo P, Carter H, Siu IM, Gallia GL, Olivi A, McLendon R, Rasheed BA, Keir S, Nikolskaya T, Nikolsky Y, Busam DA, Tekleab H, Diaz LA, Hartigan J, Smith DR, Strausberg RL, Marie SK, Shinjo SM, Yan H, Riggins GJ, Bigner DD, Karchin R, Papadopoulos N, Parmigiani G, Vogelstein B, Velculescu VE, Kinzler KW Science. 2008;321;1807-12. PMID: 18772396 DOI: 10.1126/science.1164382
FUS-ERG, FUS-FEV, FUS-ATF1 Both FUS-ERG and FUS-FEV fusions have been identified as alternatives to EWSR1-ETS transcription factor fusions in Ewing's sarcoma, and FUS-ERG also occurs in t(16;21) myeloid leukaemia as well as in these solid tumours. FUS-ATF1 is found in angiomatoid fibrous histiocytoma, where the fusion of the N-terminus of FUS and the DNA binding domain of ATF1 is similar to the EWSR1-ATF1 fusion found in clear cell sarcoma.
SS18-SSX1 This fusion is characteristic for synovial sarcoma along with SS18-SSX2 and more rarely, SS18-SSX4 fusions. Through its N-terminal SNH domain SS18 protein is involved in the remodelling of chromatin structures and functions as a transcriptional activator whereas SSX proteins have 2 putative transcription-repressor domains, one of which, an SSXRD domain in the C-terminal region, is preserved in the fusion protein.
SRGAP3-RAF1 This oncogenic fusion has been identified in paediatric pilocytic astrocytoma as an alternative to the previously described KIAA1549-BRAF fusion. It also activates the ERK/MAPK pathway; the auto-inhibitory domain of RAF1 being replaced by SRGAP3.
COL1A1-PDGFB This recurrent fusion characterizes dermatofibrosarcoma protuberans and its juvenile form, giant cell fibroblastoma. The fusion consistently deletes exon 1 of PDGFB, releasing this growth factor from its normal regulation. The breakpoints in COL1A1, which encodes an extracellular matrix protein, occur in various exons in the alpha-helical domain.
JAZF1-SUZ12 A fusion involving these two genes is common but not universal in endometrial stromal sarcomas, occurring less frequently in high-grade tumours. The genes encode novel proteins with zinc finger motifs and these are retained in the fusion.
ABL1, ACVR1B, AKT1, ALK, APC, ASXL1, ATM, BRAF, BRCA1, BRCA2, CBL, CDC73, CDH1, CDKN2A, CEBPA, CSF1R, CTNNA1, CTNNB1, CYLD, EGFR, EML4, ERBB2, ERG, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK2, JAK3, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, PDGFRA, PHOX2B, PIK3CA, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TNFAIP3, TSHR, VHL, WT1
COSMIC v45 Release
The first full-genome resequencing study is now available, together with the genome-wide rearrangement screens of 24 breast tumours. In addition, five new cancer genes have been curated from the literature.
To make the data easier to investigate in depth, the website has been upgraded with new specialisation features, together with new views on mutation spectrum and distribution. Finally, we are introducing a new COSMIC Biomart, where all COSMIC's information will be available in this industry-standard data mining tool.
The recent Pleasance et al (2010) publication "A comprehensive catalogue of somatic mutations from a human cancer genome" (Nature 463, 191-196) is now available within COSMIC; please click here.
Also, the CGP Stephens et al (2009) paper "Complex landscapes of somatic rearrangement in human breast cancer genomes" (Nature 462, 1005-1010) is now available in COSMIC; please click here. A paired-end genome-wide Illumina sequencing strategy revealed numerous rearrangements in very diverse patterns between the samples examined.
GNAQ is the alpha subunit of one of the heterotrimeric GTP-binding proteins that mediate stimulation of protein kinase C signalling. Mutations in GNAQ, occurring at codon 209 in the catalytic domain, have been found as common and early mutational events in uveal melanomas.
TNFAIP3 is a negative regulator of the NF-kappa B pathway functioning through the removal of activating Lys63-linked ubiquitins and the Lys48-linked ubiquitination of receptor-interacting proteins. TNFAIP3 has been shown to be a genetic target in B-lineage lymphomas such as mucosa-associated lymphoma and Hodgkin's lymphoma of nodular sclerosing histology.
CBL encodes a protein with multiadaptor function and E3 ubiquitin ligase activity that targets a variety of tyrosine kinases for degradation. Mutations in CBL have been identified in myeloid malignancies, occurring in the critical linker and ring finger domains of the protein.
JAK3 is a member of the non-receptor tyrosine kinase family which includes JAK2. Rare but significant JAK3 activating mutations located in the JH2 (pseudokinase) and JH6 (receptor binding) domains have been found in Down syndrome and Non-DS acute megakaryoblastic leukaemia (AML-M7). Mutations have also been found in various myeloproliferative neoplasms, lymphomas and carcinomas.
NOTCH2 is a Type 1 transmembrane protein with an extracellular domain consisting of multiple epidermal growth factor-like (EGF) repeats, and an intracellular domain consisting of multiple different domain types. The Notch2 receptor and its 5 ligands, which include Jagged1, Jagged2, and Delta-like 1, 3 and 4, send signals that are important for development before birth. After birth, Notch2 signaling is involved in tissue repair. Mutations in the NOTCH2 gene have been identified in a small percentage of people with Alagille syndrome and malformations in the kidneys, especially in filtering structures. NOTCH2 is also preferentially expressed in mature B cells, is essential for marginal zone B-cell generation, and mutations are evident in a subset of individuals with diffuse large B-cell lymphomas.
The main histogram page of the COSMIC website has been improved to provide better ways of selecting and viewing subsets of data. In the navigation bar on the left side, new options are now available to redraw the histogram and associated tables based on four parameters: mutation type (e.g. deletion, nonsense substitution, etc.), sample source (cultured or tissue sample), somatic status (confirmed somatic or unknown) and systematic screen (genome-wide screen). In addition to redrawing the histogram and tables, a new "Distribution" button displays pie charts of relevant information about the data selected.
The sample summary page has also been upgraded, with every CGP sample (examined through numerous genes) receiving a mutation spectrum diagram. This comprises a histogram showing the relative frequencies of each substitution type, together with a count of insertion/deletion mutations. This is highly useful when looking for mutation signatures which may show characteristics of, for instance, tobacco or UV light exposure.
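The kind of tally behind such a mutation spectrum can be sketched in a few lines of Python. This is only an illustration, not the code used by the COSMIC website: the function names, the (reference, alternate) input format, and the choice to collapse complementary changes into six pyrimidine-based classes (so that G>T is counted as C>A) are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical helper names; the six-class convention is an assumption,
# not something stated in the release notes.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
CLASSES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")


def substitution_class(ref, alt):
    """Express a single-base change relative to the pyrimidine (C or T) strand."""
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def mutation_spectrum(mutations):
    """Return (relative frequency per substitution class, insertion/deletion count).

    `mutations` is an iterable of (ref, alt) allele strings, e.g. ("G", "T")
    for a substitution or ("AT", "A") for a deletion.
    """
    counts = Counter()
    indels = 0
    for ref, alt in mutations:
        ref, alt = ref.upper(), alt.upper()
        if len(ref) != len(alt):
            # Alleles of different length are treated as insertions/deletions.
            indels += 1
        elif len(ref) == 1 and ref in COMPLEMENT and alt in COMPLEMENT and ref != alt:
            counts[substitution_class(ref, alt)] += 1
        # Anything else (e.g. multi-base substitutions) is ignored in this sketch.
    total = sum(counts.values())
    spectrum = {c: (counts[c] / total if total else 0.0) for c in CLASSES}
    return spectrum, indels


example = [("G", "T"), ("C", "A"), ("G", "T"), ("C", "T"), ("AT", "A"), ("T", "C")]
spectrum, indels = mutation_spectrum(example)
for cls in CLASSES:
    print(f"{cls}: {spectrum[cls]:.2f}")
print("insertions/deletions:", indels)
```

In this toy input, three of the five substitutions fall into the C>A class and one mutation is counted as an insertion/deletion; plotting the six frequencies as bars gives a histogram of the kind described above.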
The new COSMIC Biomart is now available; please click here. This system allows much more specialised selection of data in COSMIC and is very useful for data mining. In addition, it can be directly linked to Ensembl for federated querying across both databases.
JAK2, JAK3, MAP2K4, GNAS, MPL, SOCS1, WT1, CYLD, FBXW7, MEN1, NF1, RUNX1, ASXL1, NOTCH2, IDH1, IDH2, APC, CDH1, VHL, GNAQ, BRAF, HRAS, CEBPA, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RB1, RET, SMARCB1, AKT1, EGFR, ERBB2, CDKN2A, CBL, GATA1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, MSH6, PRKAR1A, KRAS, PIK3CA, MET, TNFAIP3
COSMIC v44 Release
This release of COSMIC includes 4 new curated genes, 8 new curated fusion pairs and the TCGA systematic screen publication of 91 Glioblastoma tumour samples. In addition, a new CGP study is available (Adenoid cystic carcinoma) together with substantial updates to existing data.
IDH2 encodes a mitochondrial NADP(+)-dependent isocitrate dehydrogenase which catalyzes oxidative decarboxylation of isocitrate to alpha-ketoglutarate. It is now implicated in the pathogenesis of malignant gliomas, and some secondary glioblastomas lacking IDH1 mutations have IDH2 mutations at the analogous amino acid (R172).
AKT1 encodes a serine-threonine protein kinase which is activated by phosphorylated phosphoinositides and is a central mediator of the PI3kinase signalling pathway. A common mutation (E17K) has been identified in the pleckstrin homology domain in cancers of the colon, breast, lung and ovary.
ASXL1 belongs to a family of proteins regulating chromatin remodelling. Originally implicated via aCGH on MDS/AML samples, ASXL1 mutations are mainly frameshifts; the predicted truncated proteins lack the PHD finger domain, potentially compromising the function of the associated chromatin modifiers.
FOXL2, forkhead box L2 is a winged helix/forkhead transcription factor gene, encoding a nuclear protein that is specifically expressed in eyelids and in fetal and adult ovarian follicular cells. Germline mutations in FOXL2 are responsible for BPES - blepharophimosis ptosis epicanthus inversus syndrome - an autosomal dominant disorder consisting of eyelid abnormalities (only, in Type II) and ovarian failure (Type I). Somatic mutations have recently been described in ovarian granulosa cell tumours.
The following gene fusions have been curated from the scientific literature:
EML4 / ALK
MSN / ALK
NPM1 / ALK
CLTC / ALK
SEC31A / ALK
RANBP2 / ALK
SS18 / SSX2
SS18 / SSX4
Comprehensive genomic characterization defines human glioblastoma genes and core pathways. The first systematic screen of the Cancer Genome Atlas Research Network (PMID 18772890) is now curated in COSMIC.
Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Cancer Genome Atlas Research Network Nature. 2008;455;1061-8. PMID: 18772890 DOI: 10.1038/nature07385
Adenoid cystic carcinoma is a slow growing tumour of the secretory glands, arising most commonly in the salivary glands but also occurring in other parts of the body. As part of an ongoing research effort funded by the Adenoid Cystic Carcinoma Research Fund (www.accrf.org), 400 candidate genes (including genes implicated in cancer, cell signaling and growth control) were sequenced for small point mutations. This work was carried out on 25 samples (provided by ACCRF collaborative research group member Dr. Adel El-Naggar) utilising an approach of PCR product generation for the entire set of PCR amplimers, followed by individual concatenation of all amplimers for each tumour and matching normal DNA sample, then sequencing this material utilising next generation sequencing. In total 8 somatic point mutations were identified in 8 genes. No highly prevalent point mutation was identified in this set of genes.
KRAS, PIK3CA, FGFR2, MET, ABL1, FGFR1, JAK2, MAP2K4, GNAS, EML4, FOXL2, PTCH1, MPL, SOCS1, HNF1A, WT1, NF2, CYLD, FBXW7, MEN1, NF1, RUNX1, IDH1, IDH2, ASXL1, FAM123B, APC, CDH1, SMAD4, VHL, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RB1, RET, SMARCB1, SUFU, ACVR1B, AKT1, ALK, ATM, EGFR, ERBB2, SRC, STK11, CDKN2A, GATA1, SMO, NOTCH1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, BRCA2, MLH1, MSH2, MSH6, PRKAR1A
COSMIC v43 Release
The COSMIC curation systems have been extended to encompass the entry of large-scale systematic screen papers. For this release, we have entered the first such paper, the Sjoblom et al (2006) screen of human breast and colorectal cancers. This release also contains two new genes successfully curated from the scientific literature (IDH1, SMARCA4) and the finalisation of two of the Cancer Genome Project's current resequencing studies.
For this release of COSMIC we have entered the Sjoblom et al (2006) systematic screen paper of human breast and colorectal cancers. An additional 8,648 genes have been added to COSMIC along with the 1,672 mutations from the paper. The COSMIC reference overview page for this publication is available here.
The consensus coding sequences of human breast and colorectal cancers. Sjoblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D, Leary RJ, Ptak J, Silliman N, Szabo S, Buckhaults P, Farrell C, Meeh P, Markowitz SD, Willis J, Dawson D, Willson JK, Gazdar AF, Hartigan J, Wu L, Liu C, Parmigiani G, Park BH, Bachman KE, Papadopoulos N, Vogelstein B, Kinzler KW, Velculescu VE. Science. 2006 Oct 13;314(5797):268-74. Epub 2006 Sep 7. PMID: 16959974
The resequencing of candidate genes in Pilot and Renal tumour sets has now been completed. The finalised studies examined 2978 samples through 4766 genes, discovering a total of 5437 mutations. All of these can be found in COSMIC's CGP Resequencing Studies Site.
IDH1 is a catalytic enzyme causing NADP+ dependent oxidative decarboxylation of isocitric acid. It plays an important role in the control of glucose-stimulated insulin secretion and the cholesterol and fatty acid biosynthetic pathways. Originally implicated in human cancer in genome-wide sequencing scans, when mutated it is an indicator for the longer survival of these patients.
SMARCA4 is a scaffold protein, forming a functional part of the SWI/SNF complex involved in the control of transcription.
FBXW7, MEN1, NF1, BRAF, HRAS, CSF1R, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RET, SMARCB1, SUFU, ACVR1B, ATM, EGFR, ERBB2, SRC, CDKN2A, FAM123B, GATA1, SMO, NOTCH1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, BRCA2, APC, CDH1, SMAD4, VHL, TSHR, MLH1, MSH2, MSH6, SMARCA4, RUNX1, PHOX2B, GNAS, KRAS, PIK3CA, FGFR2, FGFR1, IDH1, JAK2, JAK3, MAP2K4, TET2, PRKAR1A, CDC73, PTCH1, MPL, CTNNA1, SOCS1, HNF1A, WT1, ERG, NF2
COSMIC v42 Release
For this release of COSMIC two known cancer genes (GNAS and ALK) and 3 gene fusions (FCHSD1 / BRAF, KIAA1549 / BRAF, EWSR1 / NR4A3) have been successfully curated from the scientific literature. The Cancer Cell Line Project has also been updated with the addition of 80 mutations.
The Cancer Cell Line Data has been updated with the addition of 80 mutations. The project has also published a further set of variants identified by the screen which have been classified as Tentatively Oncogenic Variant (TOV) or Unknown Variant (UV). These variants are currently available from our website as an excel file.
Two further cancer genes have been curated with the addition of 95 mutations for ALK and 235 mutations for GNAS.
The following gene fusions have been curated from the scientific literature:
FCHSD1 / BRAF
KIAA1549 / BRAF
EWSR1 / NR4A3
KRAS, PIK3CA, ABL1, FGFR1, JAK2, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, RB1, SUFU, ERBB2, SRC, STK11, CDKN2A, GATA1, SMO, NPM1, PTPN11, NRAS, BRCA2, MLH1, MSH2, MSH6, APC, CDH1, SMAD4, MET, EGFR, FLT3, PTCH1, MPL, WT1, CYLD, FBXW7, NF1, ALK, FGFR3, RET, NOTCH1, NF2, GNAS
COSMIC v41 release
This release of COSMIC comprises an update of published data in which 44 genes have been updated with the addition of 22516 samples and a further 7387 mutations.
STK11, CDKN2A, GATA1, SMO, NPM1, PTPN11, NRAS, BRCA2, MSH2, KRAS, PIK3CA, JAK2, MAP2K4, BRAF, HRAS, CEBPA, CTNNB1, KIT, PDGFRA, PTEN, RB1, ATM, ERBB2, FBXW7, NF1, FAM123B, APC, CDH1, VHL, MET, EGFR, FLT3, PTCH1, MPL, SOCS1, HNF1A, WT1, CYLD, FGFR3, RET, RUNX1, TSHR, PHOX2B, NOTCH1.
COSMIC release 40
This release of COSMIC comprises an update of the existing genes totalling almost 3000 new mutations.
COSMIC release 39, Annotating Cancer Genomes
For this release of COSMIC the database and web interfaces have been upgraded to handle Next Generation Sequencing Data. This is part of ongoing work to allow COSMIC to handle the increased volumes and complexity of somatic data that is anticipated from Next Generation Sequencers. In particular, for this release we have concentrated on adapting COSMIC to handle large-scale structural variants (including translocations, large insertions/deletions, inversions, and duplications).
The structural variants from the Campbell et al. 2008 paper, which comprehensively characterizes 2 lung cancer cell lines, have been entered into COSMIC (click here for study overview). Sample Summary pages are available for both cancer cell lines (NCI-H2171 and NCI-H1770).
Circular plots (Circos plots developed by Martin Krzywinski) have been added to the sample overview page, giving a clear overview of all the structural variants along with copy number changes and COSMIC point mutations for a particular sample (Figure 1). More detailed views of complex rearrangements are available on the mutation details page.
Figure 1. Circos Plot showing structural variants in relation to copy number and COSMIC Point Mutations.
Tabular views and exports are also available for these data (Figure 2). Due to the complexity of these rearrangements, a short descriptive term for the variant is given where possible (e.g. deletion, tandem duplication, translocation). The variant is also fully described using HGVS mutation nomenclature. For example, in chr11:g.36585230_76606619del, chr11: denotes the chromosome involved, g. indicates genomic coordinates, 36585230 is the deletion start point, 76606619 is the deletion end point, and del indicates a deletion event.
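As an illustration of how such a description breaks down, the following Python sketch splits the example string into its parts. It is a minimal sketch covering only the simple "chromosome:g.start_end event" form shown above, not a full HGVS parser; the names GenomicVariant and parse_simple_hgvs are hypothetical, and accepting dup and inv alongside del is an assumption made for the example.

```python
import re
from typing import NamedTuple


class GenomicVariant(NamedTuple):
    chromosome: str
    start: int
    end: int
    event: str


# Matches only the simple range form, e.g. chr11:g.36585230_76606619del.
_SIMPLE_RANGE = re.compile(
    r"^chr(?P<chrom>\d{1,2}|X|Y):g\.(?P<start>\d+)_(?P<end>\d+)(?P<event>del|dup|inv)$"
)


def parse_simple_hgvs(description):
    """Split a simple genomic range description into its components."""
    match = _SIMPLE_RANGE.match(description)
    if match is None:
        raise ValueError(f"unsupported description: {description!r}")
    return GenomicVariant(
        chromosome=match["chrom"],
        start=int(match["start"]),
        end=int(match["end"]),
        event=match["event"],
    )


variant = parse_simple_hgvs("chr11:g.36585230_76606619del")
# Both end points are part of the described range, so the span is inclusive.
length = variant.end - variant.start + 1
print(variant)
print("bases spanned:", length)
```

Translocations and other rearrangements involving two chromosomes need a richer representation than a single range and are deliberately left out of this sketch.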
Figure 2. Summary Structural Variants Table
The NCI/Nature Pathway Interaction Database Primer on COSMIC has been published and is available from here.
The Cancer Gene Census was updated on 11th August 2008. The Census now contains information on 379 genes, of which 343 harbour somatic alterations and 70 germline alterations.
COSMIC release 38
For this release of COSMIC we have concentrated our efforts on significantly updating the following genes:
BRAF, HRAS, CTNNB1, KIT, PDGFRA, PTEN, RB1, ERBB2, MAP2K4, CDKN2A, GATA1, SMO, NPM1, NRAS, MLH1, MSH2, KRAS, PIK3CA, JAK2, APC, SMAD4, EGFR, FLT3, PTCH1, MPL, HNF1A, FBXW7, NF1, FGFR3, RET, NF2, NOTCH1.
In collaboration with the HUGO Gene Nomenclature Committee (HGNC) and the Atlas of Genetics and Cytogenetics in Oncology and Haematology (Atlas Genetics Oncology), links are now available from COSMIC's gene summary page to further information at these resources.
An article describing COSMIC, its contents and usage, has been published in Current Protocols in Human Genetics, unit 10.11. Describing in detail how the website and exported datasheets may be used and interpreted, this is available at the Wiley Interscience website.
COSMIC release 37
This month's release extends our complete curation of oncogenic EWSR1 fusion partners, together with two new curated genes, PHOX2B & PRKAR1A. CGP's resequencing studies and cell line projects are also significantly updated, each receiving over 100 new mutations. In total, over 1200 new mutations have been added to COSMIC in this release.
PHOX2B encodes a highly conserved homeobox transcription factor; mutations in this gene are known to cause congenital central hypoventilation syndrome with associated neuroblastoma.
PRKAR1A encodes a regulatory subunit of the cAMP dependent protein kinase holoenzyme. An apparent tumour suppressor gene, it has also been observed to be oncogenic in fusions with RET and RARA.
EWSR1 has been observed in oncogenic gene fusions with over 15 partners. This month we release our curation of the literature describing its fusion with a further six partners, bringing the total number of curated partners to 14.
The following curated genes have received significant updates:
BRAF, HRAS, KIT, PTEN, RB1, SMARCB1, ERBB2, STK11, CDKN2A, PTPN11, NRAS, BRCA2, MLH1, MSH2, KRAS, PIK3CA, JAK2, APC, VHL, MSH6, MET, EGFR, MPL, FBXW7, PRKAR1A, RET, RUNX1, NOTCH1, NF2, PHOX2B.
COSMIC release 36
The March 2008 release of COSMIC contains full curation of the TSHR gene together with a further 6 EWSR1 gene fusion pairs.
TSHR - Thyroid stimulating hormone receptor is a 7-TM cell surface receptor expressed in follicular thyroid cells. Upon binding of its ligand, thyrotropin, a signalling cascade is commenced resulting in a range of transcriptional alterations. Somatic mutations in this gene have been described in thyroid adenomas and carcinomas.
EWSR1 is fused to multiple partner genes via recurrent chromosomal translocation in, primarily, Ewing sarcoma. We are currently curating the complete mutation data for this gene, which has so far been fused with over 10 partners. Having already released our curation of EWSR1 with ERG & FLI1, we now release the data for six more gene partners.
The following curated genes have received significant updates:
BRAF, BRCA1, BRCA2, CDH1, CDKN2A, CEBPA, EGFR, ERBB2, FLT3, HRAS, KRAS, MLH1, MSH2, MSH6, NF2, NRAS, PDGFRA, PTEN, SMARCB1, STK11, TSHR, VHL
COSMIC release 35
This release of COSMIC contains curation of four new tumour suppressor genes, and further curation of EWSR1/FLI1 gene fusions in Ewing's sarcoma. We also announce a significant upgrade to the CGP Trace Archive, which is now updated daily with our latest sequencing results.
MLH1 is a tumour suppressor gene, involved in mismatch repair. The encoded protein is a subunit of the large 'BRCA1-associated genome surveillance complex' (BASC) involved in DNA damage detection and repair. This particular subunit dimerises with PMS2 to provide endonuclease capacity within the complex. MLH1 germline mutations give rise to HNPCC (hereditary non-polyposis colorectal cancer). Somatic mutations in this gene are important in sporadic colorectal cancers. Mutations of MLH1 lead to a mutator phenotype often manifested by microsatellite instability.
MSH2 is a tumour suppressor gene, also involved in mismatch repair. It resides within the 'BRCA1-associated genome surveillance complex' (BASC) which detects and repairs DNA damage. MSH2, in complex with MSH6, forms a sliding clamp which traverses the DNA backbone detecting mismatched bases. MSH2 germline mutations also give rise to HNPCC. Similar to MLH1, somatic mutations in MSH2 are found predominantly in colorectal cancers. Mutations of MSH2 lead to a mutator phenotype often manifested by microsatellite instability.
CDC73 (HRPT2) is a tumour suppressor forming part of the PAF protein complex, which is associated with RNA polymerase II and may therefore be involved in both initiation of RNA synthesis and RNA elongation. Mutations in this gene have been identified in tumours of the parathyroid, most often causing the endocrine disorder hyperparathyroidism (with or without jaw tumour).
MAP2K4 is one part of the mitogen-activated protein kinase (MAPK) pathway, a signal transduction cascade which mediates certain extracellular signals via RAS/RAF resulting in transcriptional control of a wide range of genes. The MAP2K family of peptides regulate MAPK activity by phosphorylation. MAP2K4 mutations appear involved in many tumour types.
Ewing's sarcoma is a rare bone tumour, infrequently of extraskeletal origin, most frequently occurring in teenage children. The majority of these tumours contain a t(11;22)(q24;q12) translocation which fuses the EWSR1 gene on chromosome 22 with the FLI1 gene on chromosome 11. We have now curated the existing literature describing fusions between this gene pair.
The following curated genes have been updated for this release:
CDKN2A, PTPN11, NRAS, MLH1, MSH2, KRAS, JAK2, MAP2K4, BRAF, HRAS, CTNNB1, MEN1, NF1, APC, VHL, EGFR, FLT3, PTCH, MPL, WT1, RET, CDC73, RUNX1, EWSR1, FLI1.
Genomic co-ordinates for individual mutations are now available in the data export section, together with the datasheets in the FTP site.
The CGP trace archive has been updated to contain all the sequencing traces used in our analysis of the samples and genes presented in the CGP Resequencing project (COSMIC red pages). The number of traces available for download is now approaching 9.5 million. The Archive itself has also been upgraded, so that it receives daily updates of CGP sequencing traces as they pass through our sequencing pipeline. Daily updates are available as separate files; these will be integrated into the main download files once per week.
This release of COSMIC includes the addition of BRCA1, BRCA2, and the EWSR1/ERG gene fusion from the scientific literature. The website has been enhanced with an update of old gene names and the addition of further links (NCBI Entrez Gene, CCDS, Swiss-Prot and TrEMBL). The CGP Trace and Genotype Archive holding the group's sequence traces and genotype data is also now available.
BRCA1 and BRCA2 are tumour suppressor genes initially identified as inherited cancer susceptibility genes for breast and ovarian cancer. Both proteins have been shown to have roles in genome surveillance, detection of DNA damage and its subsequent repair. However, they associate with different DNA repair complexes and generate different tumour histologies and spectra. Somatic mutations of either gene are rare, with BRCA2 being more frequently found to have somatic mutations, particularly in ovarian and pancreatic carcinomas.
We report that mutations in these two genes have been discovered at fairly low frequencies (2-3%), with BRCA2 mutated in a wider tissue range than BRCA1.
Fusions of EWSR1 and ERG are common events in skeletal (and the rarer extraskeletal) Ewing's sarcoma. These fusions, found at a frequency of approximately 10% in bone tumours, result from complex rearrangements, since the two partner genes are not transcribed in the same chromosomal direction.
The CGP Resequencing screens and the following curated genes have received updates: BRAF, HRAS, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, ACVR1B, ATM, ERBB2, BRCA1, BRCA2, KRAS, PIK3CA, FGFR2, ABL1, FGFR1, JAK2, SRC, STK11, CDKN2A, PTPN11, NRAS, FAM123B, APC, SMAD4, VHL, MSH6, MET, EGFR, FLT3, FBXW7, MEN1, NF1, RUNX1, FGFR3, RET.
The group's sequence traces and genotype data are now available from the CGP Trace and Genotype Archive site. To access the data, a Data Transfer Agreement must be completed and approved. A unique username and password will then be provided to access this resource.
244 genes had their names updated (5.2%). It is still possible to search by the old gene name.
Several external gene links have been added to the gene summary page: NCBI Entrez Gene, CCDS, Swiss-Prot and TrEMBL.
The sample summary page now also contains sample source information.
COSMIC 33: Improved CGP data release
The WTSI Cancer Genome Project (CGP) announces an updated data release policy. We will now be releasing confirmed somatic mutations on a bi-monthly basis. Confirmed and annotated somatic mutations identified in the previous two months will be released in COSMIC, continuing on at two-monthly intervals. Data will still appear within current COSMIC architecture of gene family/gene set and under appropriate studies. This new policy will result in expedited pre-publication release of curated somatic mutations as they are identified.
This new data will be available in the COSMIC blue pages, but will be most noticeable in COSMIC's CGP resequencing studies site (red pages), as this distinguishes CGP data from the literature curation.
CGP resequencing data is broadly divided (in the red pages) into 3 categories, 'Kinase', 'Pilot' and a new project, 'Renal'. Whilst the Kinase data is completed and published, the other two studies are much larger and still in progress. A collection of approximately 4000 genes has been selected for resequencing in a set of 40 matched pair cell lines ('Pilot' project) and 96 primary clear cell renal cancers. Each tumour sample in these projects has a matched normal sample, which allows the distinction of somatic mutations from germline variants. The Pilot project currently comprises 1865 somatic sequence changes, whilst the Renal project, although less advanced than the Pilot, has identified 84 mutations to date. These will be automatically updated with all our confirmed data every bimonthly release.
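The role of the matched normal sample can be illustrated with a minimal sketch. It is not the CGP analysis pipeline: the `Variant` type, the function name and the example variants are all hypothetical, and the only assumption is that a variant seen in both tumour and matched normal is treated as germline, while one seen only in the tumour is treated as somatic.

```python
from typing import NamedTuple, Set, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record: gene symbol plus a change label."""
    gene: str
    change: str


def split_somatic_germline(
    tumour: Set[Variant], normal: Set[Variant]
) -> Tuple[Set[Variant], Set[Variant]]:
    """Partition tumour variants using the matched normal sample.

    Variants present only in the tumour are called somatic; variants
    also present in the matched normal are called germline.
    """
    somatic = tumour - normal   # absent from the normal sample
    germline = tumour & normal  # shared with the normal sample
    return somatic, germline


# Illustrative, made-up variant labels
tumour_calls = {Variant("GENE_A", "change_1"), Variant("GENE_B", "change_2")}
normal_calls = {Variant("GENE_B", "change_2")}
somatic, germline = split_somatic_germline(tumour_calls, normal_calls)
```

Without a matched normal, the second set cannot be computed, which is why unmatched samples leave somatic status unconfirmed.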
RUNX1 is one subunit of the PEBP2 transcription factor, binding to DNA at enhancer sequences. This gene is one of the most frequent targets of chromosome translocations associated with leukaemia. Small somatic mutations have also been observed, most frequently in myeloblastic leukaemia types (acute myeloblastic leukaemia, myelodysplastic syndrome), and it is these that we have curated in COSMIC. Our data suggest a somatic mutation rate of approximately 10% in this phenotype.
The following curated genes have received updates from the literature:
APC, ATM, BRAF, CDH1, CDKN2A, CTNNA1, CTNNB1, CYLD, EGFR, ERBB2, ERG, ETV1, FBXW7, FGFR3, FLT3, GATA1, HRAS, JAK2, KIT, KRAS, MADH4, MPL, MSH6, NF1, NF2, NOTCH1, NPM1, NRAS, PIK3CA, PTCH, PTEN, PTPN11, RB1, RET, SMARCB1, SMO, SOCS1, STK11, SUFU, TMPRSS2, VHL, WT1, WTX.
This release includes 1563 new mutations identified in the set of 4799 genes; 1495 genes are new this month.
This release includes four new tumour suppressor genes and improved availability in Ensembl.
We are continually striving to improve the utility of the data in COSMIC by integrating it closely with external resources. In this release, we provide a much closer integration with the Ensembl genome browser than previously. All our gene & mutation data now have location coordinates on the NCBI36 genome sequence, allowing us to use Ensembl "DAS" technology to display this information within their genome browser, aligned with their standard genome annotations. We have made this easily available, via a single link from our pages.
Four new tumour suppressor genes have been introduced to COSMIC this month, all receiving full literature curation of their somatic mutation data.
Neurofibromatosis is a familial disease with a complex phenotype including tumours of the central nervous system, caused by mutations in the NF1 tumour suppressor gene. Somatic mutations in tumours have also been identified in this gene, and it is these that we have fully curated.
The central form of neurofibromatosis is a similar familial central nervous system tumour syndrome, caused by mutations in the NF2 tumour suppressor gene. Somatic mutations in tumours have also been identified in this gene, and it is these that we have fully curated.
SOCS1 downregulates cellular cytokine signalling by its direct interaction with JAK1. It was first implicated in cancer after aberrant methylation was observed to inactivate the gene, causing hepatocellular carcinoma. Somatic mutations which inactivate this tumour suppressor have also been observed, and these have been curated.
TCF1 binds to the promoters of several (largely liver-specific) genes, to enhance their expression. Somatic and germline mutations in this gene have been found which cause liver adenomas, and we have curated the somatic component.
The following curated genes have received updates from the scientific literature: KRAS, PIK3CA, JAK2, BRAF, HRAS, KIT, PDGFRA, PTEN, CDKN2A, VHL, EGFR, FBXW7, MEN1, RET
COSMIC (v31) now includes Gene Fusion Data
The CGP COSMIC team is pleased to announce the addition of gene fusion/translocation somatic mutation data from the literature to the database. Currently, the census of known cancer genes is dominated by somatically generated fusion genes that have been identified primarily in leukaemias, lymphomas and soft tissue tumours. Until now, we have concentrated on curating somatically point mutated cancer genes for COSMIC. Almost all known cancer genes that have somatic point mutations are, however, now curated in COSMIC. In the coming months we will therefore be searching the scientific literature and annotating genes involved in gene fusions and their partners for addition into the COSMIC database.
We have launched this new facility, complete with new views for this data type, with the curation of TMPRSS2, a gene frequently found to be fused to ETS family transcription factors in adenocarcinoma of the prostate. These mutations have served to spur increased investigation into the potential role of fusion genes in adult solid tumours. The move to curate fusion genes is an important addition and will further enhance COSMIC as the most comprehensive source for somatic mutation data from human cancers.
The fusion data has been integrated into existing pages and overviewed in new pages: Translocations Overview and Translocations Summary.
This new data can be viewed graphically and textually.
Figure: the table of inferred breakpoints (determined from a sample's observed fusion mRNA spectrum) for a fusion gene pair.
Figure: a graphical representation of the observed mRNA transcripts from which the inferred breakpoints are calculated.
Further information on the new gene fusion website features is available in the help pages.
A new homepage has been created for genes which have received full curation of the scientific literature. This is a new page which allows the distinction of these genes from CGP's data release, for which no literature has been curated.
The following curated genes have also received updates from the scientific literature: CDKN2A, GATA1, NOTCH1, NPM1, NRAS, JAK2, KRAS, PIK3CA, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, RB1, MET, EGFR, FLT3, WT1, APC, MADH4, FBXW7, FGFR3.
Today we release full literature curations of five tumour suppressor genes: MEN1, ATM, CYLD, FBXW7 and WTX. In total, 4712 samples were examined in 112 papers, recording 468 mutations. Additionally, we release two new CGP resequencing studies which add a further 91 new genes to COSMIC.
Curation of the scientific literature has been completed for five new genes from the cancer census. All five genes are tumour suppressors, causing phenotypes via their inactivation:
Somatic mutations in MEN1 have been found in tumours from several endocrine sites, recapitulating those seen in patients carrying germline mutations, including tumours in the pituitary, pancreas and parathyroid. MEN1 encodes a nuclear protein thought to be a transcriptional regulator.
CYLD has been found to have mutations in sporadic cylindromas, tumours arising from skin adnexal structures (such as hair follicles and glands), principally on the face and scalp. CYLD encodes a deubiquitinating enzyme regulating cell signalling including the NF-kappaB pathway.
Mutations inactivating FBXW7 have been found in a range of cancer types including colorectal, ovarian and T-ALL. The protein is involved in targeting a number of key proteins, including NOTCH1 and MYC, for ubiquitin-mediated degradation.
ATM encodes a protein kinase involved in cell cycle checkpoint control. Amongst other key cell cycle components, it has been shown to phosphorylate TP53 and CHEK2 in response to DNA damage. Germline mutations cause ataxia-telangiectasia (AT), a recessive disorder characterized by cerebellar ataxia, telangiectases, immune defects, and a predisposition to malignancy, primarily lymphoid in origin.
Recently discovered, WTX is inactivated in approximately 30% of Wilms Tumours. Located on the X chromosome, this tumour suppressor only requires a 'single-hit' for tumourigenic inactivation.
The following curated genes have also received updates: BRAF, HRAS, CEBPA, CTNNB1, KIT, PDGFRA, PTEN, SMARCB1, ERBB2, JAK2, CDKN2A, PTPN11, NRAS, KRAS, PIK3CA, APC, CDH1, MADH4, EGFR, FLT3, MPL, WT1, FGFR3.
91 new genes have been examined in our pilot set of matched pair cell lines, resulting in the discovery of 22 new mutations:
COSMIC v29 released
COSMIC release 29 includes 22 new CGP resequencing studies, comprising 567 new genes within which 192 new mutations have been identified. Additional updates to our curation of the scientific literature have also been included, adding a total of 1041 mutations to this release.
COSMIC v28 Released
This month's COSMIC release comprises a substantial increase in the CGP resequencing data, adding 1033 new genes to the system, together with updates to the scientific literature curation.
COSMIC v27 released
This month's release of COSMIC comprises upgrades to both the web site (which now allows searching by gene/sample name or keyword) and data, with new CGP resequencing studies and curated genes. COSMIC now contains data on over 200,000 tumour samples and 400,000 individual experiments. Of these 202109 tumours, 40331 were found to contain one or more mutations (20.0%).
COSMIC third anniversary release (v26)
This release comprises a significant increase in the number of CGP resequencing studies. The five new studies all examine our pilot sample set comprising 40 cancer cell lines that all have a matched normal cell line, allowing all of the mutations to be confirmed as somatic.
COSMIC v25 released
This month's COSMIC release comprises significant updates to CGP resequencing studies and curation of the scientific literature.
The six non-kinase CGP resequencing studies have received substantial updates to the number of genes included and the number of mutations found (the kinase studies were updated in November 2006). Fifty two new genes have been added to the DNA repair study, together with three in the Apoptosis and two in the GAP-GEF studies. The number of mutations discovered in each of the six studies has increased as shown below:
In addition to the CGP resequencing studies, significant updates have been made to those genes which have received complete scientific literature curation. Three genes have been extensively updated, BRAF (19.1%, increased to 19224 samples), JAK2 (25.1%, increased to 11190 samples) and NOTCH1 (75.4%, increased to 488 samples), whilst eighteen other genes have received minor updates (less than 10% increase in sample number): ABL1, APC, CDKN2A, CEBPA, CTNNB1, EGFR, ERBB2, FLT3, KRAS, MET, NRAS, PDGFRA, PIK3CA, PTEN, PTPN11, RET, SRC, VHL.
COSMIC v24 released
This month's release of COSMIC includes the curation of NPM1 and CDH1.
NPM1 (Nucleophosmin) is a nucleocytoplasmic shuttling protein and
critical regulator of TP53. Frequent mutations have been found in both
childhood and adult AML.
20 papers have been manually curated for this
gene resulting in the addition of 45 unique mutations (exon 12).
CDH1 (E-cadherin) is a calcium ion-dependent cell adhesion molecule;
loss of function of this gene is implicated in cancer invasion and
metastasis. In particular, somatic mutations of this gene have been
reported in gastric and lobular breast cancer. 181 mutations have been
added to COSMIC for this gene from the curation of 46 papers.
COSMIC v23 released
This month's release of COSMIC includes a major update to the protein kinase screens.
The Cancer Genome Project is pleased to release the full set of protein
kinase somatic mutation data resulting from the screening of over 200
human cancers through the full set of 518 annotated genes. Over 1000
mutations have been identified in a combined total of 247 megabases
sequenced. This dataset is intended to serve as a catalyst for further
biological investigation of mutated kinases and pathways, hopefully
leading to new insights and therapeutic opportunities in human cancer.
Oligo array CGH data (using the Affymetrix 10K SNP array) for a further
233 cancer cell lines and 70 primary tumours has been made available
increasing the total available from 834 to 1136 samples.
COSMIC v22 released
This month's release of COSMIC includes the curation from the scientific literature of the APC tumour suppressor gene. Information on the similarity between cell lines is also now recorded and displayed in COSMIC.
COSMIC v21 released
This month's release of COSMIC includes major updates to the Cancer Cell
Line Project and microsatellite instability status data sets. In addition,
published somatic mutation data from two additional genes, MPL and
FGFR1, have been added to COSMIC.
The Cancer Cell Line Project aims to systematically screen a large panel of
cancer cell lines for mutations in known cancer genes, thus empowering
these cell lines as biological reagents for further work in anti-cancer
agent development and further work on cancer molecular and cellular
For this release of Cosmic, a further 137 cell lines have been added to
the working set and 78 duplicate cell lines have been removed. This brings
the total number of samples to 787. A further 98 mutations have also been
COSMIC v20 released
This month's release includes NCI-60 updates and mutation data from the scientific
literature for VHL.
The CGP is pleased to release mutation data for 24 known cancer genes on the
NCI-60 series of cancer cell lines. These data should allow for greater
power in interpretation of biological data using the lines as well as
providing a genetic framework for evaluating response to the large
series of compounds screened against this reference cell line set.
Microsatellite instability occurs due to a defect in mismatch repair.
This is usually a result of inactivation of MSH2, MLH1 or MSH6 due to a
mutation or to reduced expression associated with promoter methylation.
Analysis of microsatellite instability was carried out using the BAT markers
as described by
Rodriguez-Bigas et al. All samples were screened using the
markers BAT25, BAT26, D5S346, D2S123 and D17S250.
Details of this, when available, are posted on the sample overview page. An example of
which can be seen at
VHL mutation data from the literature is now available. We have curated 93 papers
covering 3412 experiments. These experiments used 3386 samples, in which 879
mutations were recorded.
COSMIC v19 released
This month's release of COSMIC includes the Cancer Genome Project screen of the GAP-GEF gene set and new information displays.
This gene set, consisting of 173 genes, encodes proteins that
regulate the activity of proteins with GTPase activity.
GTPase activating proteins (GAPs) promote hydrolysis of GTP to GDP. Guanine
nucleotide exchange factors (GEFs) promote GDP/GTP exchange. Both
classes modulate the function of the small monomeric GTPases (including
the RAS oncogene family) and other key signalling proteins that use the
conversion of GTP to GDP as a molecular switch to regulate function. This
system of GTPase/GAP/GEFs regulates a wide variety of cellular processes
including growth, differentiation, survival and motility.
Zygosity and somatic/germline status information are now available for
mutations in the COSMIC, CGP Resequencing and Cancer Cell Line Project websites.
The somatic/germline status is listed on the sample detail page and the
export function with the following statuses:-
Zygosity information is available on the mutation detail page with the
COSMIC v18 released
The CGP Resequencing Studies Website is released this month, which will act
as a repository for data from CGP resequencing efforts to identify novel
somatic mutations in human cancer. The pages have their own distinctive red
colour scheme to denote this. Prior data on sets of genes/samples
systematically screened for mutations were previously integrated into the "blue"
COSMIC pages. This will continue with data now being submitted,
prepublication, to and held on the new site. This will allow users to
browse, search and evaluate these data more effectively. The web resources
that are now available are detailed below:-
37 papers from the scientific literature have been curated for the PTCH gene in this release, adding a further 897 experiments and 168 mutations.
Ensembl has recently moved to the NCBI 36 assembly of the human genome whilst COSMIC genes and mutations are currently mapped to build 35. This has caused some disparity with the COSMIC DAS track. We therefore suggest using the COSMIC DAS track only on the most recent Ensembl archive site (http://feb2006.archive.ensembl.org/index.html). Provided below is a link that will open the appropriate website with the DAS source attached:
Cosmic v17 release
This month's release of COSMIC includes the Cancer Genome Project screen of
small monomeric GTPases and mutation data from the scientific literature for MADH4.
The small monomeric
GTPases function as key molecular switches impacting a large variety of cellular
functions such as motility, cell signalling, transcription and the binding, hydrolysis and
exchange of GTP/GDP. The RAS subfamily (HRAS, NRAS, KRAS) of small monomeric GTPases were
amongst the first identified human oncogenes and are mutationally activated
in a wide variety of human cancers.
70 papers from the scientific literature have been curated for the MADH4
gene in this release, adding a further 2275 experiments and 259 mutations.
Further data from the scientific literature for 9 genes, including KRAS
and NRAS, has been added for this release. A detailed breakdown for each gene can
be seen below.
COSMIC v16 released
Released for March are data from a kinase domain screen of malignant
gliomas. These data cover approximately 400kb of sequence in each of 9
tumours, including data from recurrent/resistant tumours.
We have recently completed a screen for somatic mutations of the kinase
domain encoding exons of the entire protein kinase family in a series of
human malignant gliomas.
The results are presented in this release of
COSMIC. No commonly mutated kinase domain was found in these studies.
However, as is the case with our other work in this area, deep
sequencing data from human tumours is informative about the processes
that have contributed to oncogenesis in the patient. Two gliomas
recurrent after temozolomide (alkylator) chemotherapy, but not a third
recurrent after XRT alone, had the highest mutation prevalence of any
tumours we have analysed to date. These data suggest a link between
mutation prevalence and recurrent/resistant brain tumours treated with
A COSMIC Expansion
The Catalogue Of Somatic Mutations In Cancer is two years old and has mutation data for over 1,000 genes, curated from over 3,000 published papers and unpublished data from the Cancer Genome Project.
The original aim of COSMIC continues with the curation of somatic mutation information from the literature for known cancer genes. During 2005 data for 9 genes was collected: ABL1, CDKN2A, EGFR, GATA1, JAK2, MSH6, NOTCH1, PTPN11 and SMO. In addition to this, genes that were curated in 2004 were updated as new data was published.
The number of genes in COSMIC expanded rapidly when the Cancer Genome Project at the Wellcome Trust Sanger Institute published 3 studies of somatic mutations in the protein kinase gene family (518 genes in total). This data provides a unique insight to the somatic mutations in breast, lung and testicular cancers.
More recently the Cancer Genome Project has been submitting unpublished somatic mutation data to COSMIC. The data comes from genes involved in apoptosis, DNA repair, maintenance and metabolism and the Inositol Polyphosphate Phosphatase and Heterotrimeric G-Protein families.
In another new departure the COSMIC software was used to create a new web site, the Cancer Cell Line Project. This separate
site, with its own 'mint' colour scheme, contains the results from the sequence analysis of 14 known cancer genes in over 700 cancer cell lines. Initial sequence data for 4 genes analysed in the NCI-60 is also available. This work is in progress and more results will be posted in the coming months. What is more, the number of genes in this project will continue to increase, providing genetic data for this wide set of cancer cell lines.
There have been many enhancements to the web site over the past 12 months. A tissue overview provides a summary of mutations reported in a selected tissue. New pages were created to show more details of mutations and samples and give greater depth to the data. There are also links to other data such as genome copy number information.
COSMIC has been summarised in The British Journal of Cancer (Forbes et al, 2006).
This month sees the update of BRAF, CDKN2A, EGFR, ERBB2, HRAS, KRAS, NRAS, PTEN, PTPN11 and SMARCB1. In addition the Cancer Genome Project has submitted unpublished data for genes involved in apoptosis.
There are plans to continue the development of COSMIC in terms of data content and data presentation. We are always happy to receive feedback and suggestions (email: email@example.com).
COSMIC v14 released
The COSMIC team is proud to announce the release of COSMIC-14 with data for
CDKN2A (p16) and more unpublished data from the CGP.
The Cancer Genome Project has released further unpublished somatic mutation data from a screen of 41 cancer cell lines. The 302 genes in this release are involved in or associated with DNA repair, maintenance and metabolism. The genes can be viewed
together or in 5 subgroups: Telomerase Complex, SWI/SNF, DNA replication,
Nucleotide Metabolism and DNA Damage Response and Repair. In total 119 somatic mutations were identified in this study.
CDKN2A (also known as p16) is a tumour suppressor. It induces cell cycle arrest by
inhibiting the phosphorylation of Rb by the cyclin-dependent kinases CDK4 and CDK6.
So far 453 papers have been curated for this gene with 2,591 mutations recorded
from 16,883 samples.
COSMIC v13 released
Somatic mutation data from new gene families
In a major new departure the Cancer Genome Project is proud to release
further somatic mutation data. The results from the sequencing of two
gene families, Inositol Polyphosphate Phosphatases and Heterotrimeric
G-Proteins, have been added to the data for the Protein Kinase genes.
This data will
be expanded in the future with the addition of further gene sets.
Nine genes in COSMIC have been updated with further data: NRAS, RB1,
ERBB2, HRAS, PTEN, TP53, KRAS, APC and CDKN2A.
The Cancer Genome Project is pleased to announce the release of a DAS
source devoted to the genes and mutations within COSMIC. Using this
source you will be able to view the genes and mutations from COSMIC
within a genome browser or the DAS client of your choice.
All 587 genes in COSMIC are exported as features. Each of these features
displays the genomic 'footprint', which encompasses both exonic and
intronic sequence between the start and end points of the CDS sequence.
A link is attached to each feature, providing a mechanism for the client
to link back directly to the gene entry on the COSMIC website.
In addition to the gene footprints, there are also a large number of
unique mutations. These are also displayed as features, with links back
to the mutation summary page in COSMIC. The database currently holds
2812 unique mutations, of which 1035 are currently exported. This subset
is comprised of all the single nucleotide substitutions. More complex
mutations will be included, as the genomic coordinates are mapped.
The DAS source can be found at the following URI:
The easiest way to view this source is to place the following URI in
This will attach the DAS source and display some of the mutations found
in BRAF. Additional configuration can be performed on the track, by
clicking on the track name. For more information, see the help pages on
the Ensembl website.
COSMIC version 12 released
The November release of COSMIC has further data on 9 known cancer genes.
The genes with additional data are BRAF, PTEN, RB1, EGFR, TP53, CDKN2A,
NRAS, KRAS and PIK3CA.
We have implemented a versioning system for the data in COSMIC. The
current release is version 12 with a plan to release a new version every
There are additional mutations for the known cancer genes being
sequenced through the cancer cell lines. Notably there is data for
homozygous deletions in the CDKN2A gene.
The Cancer Genome Project has released more copy number data derived
from the analysis of cancer cell lines and primary tumours using
Affymetrix SNP microarrays. So far a total of 834 samples have been
analysed consisting of 161 primary tumours and 673 cancer cell lines.
This data is freely available from the CGP website. The primary
tumours overlap with those being sequenced by the CGP while the cancer
cell lines include those being sequenced in the Cancer Cell Line
COSMIC has been updated with the addition of 2 new curated genes and new mutation descriptions.
COSMIC has adopted the Human Genome Variation Society sequence variation/mutation nomenclature for the bulk of the mutations in COSMIC. This represents a major upgrade with the aim of improving clarity and enables the listing of intronic variants for the first time.
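As an illustration of what the HGVS style makes possible, the sketch below parses the simplest kind of coding-DNA substitution description, including the intronic offset form (for example a `+2` after the last exonic base). It is a hedged example only: it covers a small subset of the nomenclature, is not the parser used by COSMIC, and the names `Substitution` and `parse_substitution` and the example strings are assumptions made for illustration.

```python
import re
from typing import NamedTuple, Optional


class Substitution(NamedTuple):
    """Hypothetical record for a simple HGVS-style coding DNA substitution."""
    position: int       # nearest coding (CDS) nucleotide position
    intron_offset: int  # 0 if exonic; +n / -n bases into the adjacent intron
    ref: str            # reference base
    alt: str            # mutant base

    @property
    def is_intronic(self) -> bool:
        return self.intron_offset != 0


# Matches only descriptions such as "c.1799T>A" or "c.88+2T>G".
# Deletions, insertions, UTR positions and other forms are not handled.
_PATTERN = re.compile(r"^c\.(\d+)([+-]\d+)?([ACGT])>([ACGT])$")


def parse_substitution(description: str) -> Optional[Substitution]:
    """Return the parsed substitution, or None if the text is not recognised."""
    match = _PATTERN.match(description)
    if match is None:
        return None
    position, offset, ref, alt = match.groups()
    return Substitution(int(position), int(offset) if offset else 0, ref, alt)


exonic = parse_substitution("c.1799T>A")    # illustrative exonic substitution
intronic = parse_substitution("c.88+2T>G")  # illustrative intronic variant
```

The explicit offset from the nearest coding base is what allows an intronic change to be written down unambiguously in this style.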
Two genes have further data in COSMIC: EGFR and PTEN.
The sequence analysis of the protein kinase gene family in human testicular germ-cell tumours of adolescents and adults has been published. The mutation data from this work was previously available in COSMIC and is now joined by the published analysis of the data.
There are additional mutations from the screening of known cancer genes through an extensive set of cancer cell lines.
COSMIC has been updated with the addition of 3 new curated genes:
MSH6, NOTCH1 and PTPN11.
There is a new member of the COSMIC family: the Cancer Cell
Line Project. This portal uses the COSMIC code to serve mutation
data from the cancer cell lines being sequenced by the Cancer Genome
Project at the Wellcome Trust Sanger Institute. The cell line data
is presented in the same style as the COSMIC data with a unique colour
scheme. There are links to jump from the Cancer Cell Line Project pages
to view all of the data in COSMIC. At present there is data from 12
known cancer genes in the Cancer Cell Line Project database.
In addition the results from the screen of all 518 protein kinase
genes in lung cancer, that were available in the previous release of COSMIC,
have been published in
Protein kinase mutations in lung cancer
The Cancer Genome Project has sequenced all protein kinase genes in lung cancer - the most common cause of cancer deaths worldwide
There are over 27,000 new cases of lung cancer in the United Kingdom each year. Protein kinases are frequently mutated in human cancer and inhibitors of mutant protein kinases have proven to be effective anticancer drugs. The Cancer Genome Project has screened the complete coding sequence of all 518 protein kinase genes in 33 lung cancers. This study, published in Cancer Research, is the largest survey reported to date of somatic mutations in lung cancer.
The Cancer Genome Project at the Wellcome Trust Sanger Institute was established in 2000. Its goal is to identify mutations that occur in cancer cells to enable the development of new diagnostics and new treatments and advance our understanding of the biology of cancer.
The Wellcome Trust Sanger Institute Cancer Genome Project and their collaborators have published the latest results of their survey of genes that might be implicated in cancer. The report is published in Cancer Research on Thursday 1st September 2005 and is also available through COSMIC.
The gene set chosen was a class called protein kinases, key controllers of cell growth and death. Members of this family have been shown to be important in cancer. However, the whole set has never been sequenced in a single set of lung tumours. The study generated over 40 million bases of DNA sequence (1.3 million for each sample).
This work identified 188 somatic mutations in 141 protein kinase genes. There was considerable variation in the number of mutations found in each tumour. The results indicate that several mutated protein kinases may be contributing to lung cancer development, but that mutations in each one are infrequent. Larger studies are warranted to further explore these initial findings. Cancer is a complex set of diseases that will affect 1 in 3 people. This work in the CGP is but one part of a global effort to further understanding of cancer and move towards better diagnosis and treatment.
COSMIC Website Update
The COSMIC web site has been updated with additional data from the
literature and unpublished data from the Cancer Genome Project.
Data for 3 genes has been curated from the literature and included in
COSMIC: ABL1, GATA1 and SMO.
The screen of the protein kinase gene family by the Cancer Genome
Project now includes two new tumour types; lung cancer and testicular
germ-cell tumours. There
are marked differences in the mutation prevalence between these two
The mutation data for 9 further genes has been included on the web site, giving a total of
550 mutations. The genes are APC, CDH1, CTNNB1, HRAS, MADH4, PIK3CA,
PTEN, RB1 and STK11. Sequencing of these genes is not necessarily
complete, but the cell lines with mutations have been confirmed, and
experiments will continue until the work is finished.
COSMIC now includes data from a screen of all protein kinase genes in breast cancer and an update of mutation data from the literature.
The data in COSMIC has expanded to include a new data type and the number of known cancer genes has been extended with updates on some of the existing cancer genes.
The Wellcome Trust Sanger Institute Cancer Genome Project and their collaborators have published the latest results of their survey of genes and their mutations in cancer. The report was published online in Nature Genetics on Sunday 22 May 2005. This data has been integrated with the existing data in COSMIC and made available through the web site.
The mutation data for two further cancer genes has been curated from the scientific literature and added to COSMIC.
Further published data has been curated for 5 genes in COSMIC: BRAF, ERBB2, FGFR2, PDGFRA and PIK3CA.
More improvements have been made to the gene selection pages. The alphabetical lists have been separated into 3 groups to reduce the amount of guesswork involved in finding your gene of interest.
The karyotype has also been updated. Genes from the census can be located quickly by clicking on the red triangles. All other genes are indicated by blue lines across the chromosome.
Each mutation in COSMIC now has its own overview page containing information about the type of mutation and samples/tissues containing the mutation. This page can be reached by clicking on various links throughout the website.
The overview page is divided into 8 main sections:
Two new sections have been added to this page:
COSMIC now includes review papers. A review section can be found at the bottom of the reference overview page for each gene. This section lists references that review other works. Because the data in these reviews has already been entered from the original sources, it is not added again.
COSMIC presents 'Tissue Overview', another way to view somatic mutation
data. The Tissue Overview page details the Top 5 Genes for any tissue /
histology selection, ranked by mutation frequency and data volume. In
addition, it lists other genes with and without mutations for the
selection. From the Tissue Overview page you can click through to the
specific details of the listed genes.
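The announcement does not say how mutation frequency and data volume are combined into a single ranking. As a rough illustration only, the Python sketch below assumes that genes are ordered by the fraction of screened samples carrying a mutation, with ties broken by the number of samples screened. The names `GeneCounts` and `top_genes`, the placeholder gene labels and all counts are invented for the example and are not taken from COSMIC.

```python
from fractions import Fraction
from typing import Iterable, List, NamedTuple


class GeneCounts(NamedTuple):
    """Sample counts for one gene within a tissue / histology selection."""
    gene: str
    mutated_samples: int
    total_samples: int


def top_genes(counts: Iterable[GeneCounts], limit: int = 5) -> List[str]:
    """Return up to `limit` gene names, highest ranked first.

    Assumed ranking (not confirmed by the source): mutation frequency
    first, then number of screened samples as a tie-breaker.
    """
    # Genes with no screened samples have no defined frequency, so skip them.
    screened = [c for c in counts if c.total_samples > 0]
    ranked = sorted(
        screened,
        # Fraction keeps the frequency exact, so ties compare reliably.
        key=lambda c: (Fraction(c.mutated_samples, c.total_samples),
                       c.total_samples),
        reverse=True,
    )
    return [c.gene for c in ranked[:limit]]


# Invented counts for placeholder genes, purely for illustration.
example = [
    GeneCounts("GENE_A", 40, 100),
    GeneCounts("GENE_B", 10, 200),
    GeneCounts("GENE_C", 8, 20),
    GeneCounts("GENE_D", 0, 50),
]
print(top_genes(example, limit=3))
```

With these invented counts, GENE_A and GENE_C share the same frequency, so the larger sample count places GENE_A first.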
As stated above, this new page details all the genes that have samples
for the tissues / histologies selected. It is split into three major sections,
with the first section detailing what we feel are the most important genes,
based on mutation frequency and data volume.
This section provides an interactive bar chart and table showing data for
the highest ranked genes containing samples from the chosen tissues /
histologies.
The coloured bars in the image represent:
COSMIC's first anniversary
The COSMIC database and web site have been updated and now have somatic
mutation data from 21 genes.
The COSMIC team is proud to release somatic mutation data for CSF1R,
RB1, RET and SMARCB1. This information has been curated from the
scientific literature. Somatic mutation data from 15 genes can be
queried and viewed through the COSMIC web site.
The COSMIC team are proud to include somatic mutation data for FGFR2, FGFR3, FLT3, MET, PDGFRA and PIK3CA on the COSMIC web site.
The number of genes with data in COSMIC has more than doubled in this release of the database. The additional data represents a set of genes that have a lower, but nevertheless important, mutation frequency in human cancer as a whole. In specific malignancies, genes such as FLT3 do have a significant role, as can be seen from the data collected in COSMIC.
Number of unique mutations: 307
Number of curated papers: 976
We are pleased to announce an update to the COSMIC website. To coincide with the Nature paper on ERBB2, we have added all the data for this gene to COSMIC. There have also been a number of improvements to the interface that we hope you will find useful.
Today, Nature publish our recent findings, the first description of small intragenic ERBB2 mutations in human cancer. Primarily found in non-small cell lung adenocarcinomas, the mutations identified are suggestive of inappropriate activation of ERBB2 kinase activity.
This addition brings 8 new mutations and 714 new samples to the database, increasing the total number of mutant samples to 10655 and the total number of samples to 58032.
This has grown from the original four row summary table, on the distribution page, into a full page overview of the information stored about a specific gene.
Here you will find a page containing all the information about a particular sample. Some of the previously unavailable information, such as details about the individual, has been made available.
For the first time in COSMIC you can see all the samples from one paper in one location. In addition to this there are also details about the genes screened and the mutations that were found.
We are pleased to announce a minor update to the COSMIC website. The user interface has been updated to include new features that we hope will make your experience with the site more productive and enjoyable.
Web Site Changes
The British Journal of Cancer have released an advance online version of an article describing COSMIC. Detailed information is provided about the curation and structure of the database, followed by a description of the facilities provided by the website.
COSMIC Website Unavailable 8th May
On Saturday 8th May the COSMIC website, as part of the Sanger website, will be unavailable whilst major network upgrades and essential maintenance work are carried out. We apologise in advance for this loss of service.
Nucleotide data available
COSMIC displays mutations at the amino acid level to show the potential implication of the mutations on the protein sequence. In addition, COSMIC holds the mutations at the nucleotide level. This data is available through the Export function that can be found at the top of the Distribution figure or at the bottom of the expanded Mutation Data tables.
COSMIC version 1 released
Wellcome Trust Sanger Institute launches Catalogue Of Somatic Mutations In Cancer.
In the quest to develop rational approaches to treating cancer, researchers need efficient access to existing knowledge. COSMIC (Catalogue Of Somatic Mutations In Cancer), launched today by the Cancer Genome Project at The Wellcome Trust Sanger Institute, is a new tool that provides integrated genetic data from cancer genes, and will make research faster and easier.
BRAF V599E becomes V600E
The original BRAF mutations reported by Davies et al were mapped to the DNA sequence NM_004333[gi;4757867], with the common BRAF mutation being V599E. On the 24th July 2003 this sequence was updated to NM_004333[gi;33188458] with the insertion of 3bp in the coding sequence. The net effect of this update was to increase the length of the BRAF protein by one amino acid and to increase the position of all published mutations by one amino acid. The beginning of each version of the protein is:
MAALSGGGGGGAEPGQALFNGDMEPEAGAGR PAASSAADP NM_004333[gi;4757867]
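The renumbering can be illustrated with a minimal Python sketch. It assumes, as described above, that every published mutation moves up by exactly one residue. The function name `renumber_substitution` and the label pattern it accepts are illustrative and are not part of COSMIC.

```python
import re

# Matches simple substitution labels such as "V599E": reference residue,
# 1-based position, mutant residue ("*" for a stop codon).
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z*])$")


def renumber_substitution(label: str, offset: int = 1) -> str:
    """Shift the residue position in a substitution label by `offset`."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    ref, position, alt = match.groups()
    return f"{ref}{int(position) + offset}{alt}"


# The common BRAF mutation under the old and new numbering.
print(renumber_substitution("V599E"))  # V600E
```

The residues themselves are unchanged; only the position is shifted, which is why V599E and V600E describe the same substitution.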
Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK Tel:+44 (0)1223 834244
Last Modified Tue Mar 20 14:26:52 2007
Genome Research Limited is a charity registered in England with number 1021457