COSMIC v46 Release
The second full-genome resequencing study from the CGP at the Sanger Institute, UK is now available, together with the curation of Parsons et al (2008), a systematic candidate gene screen of Glioblastomas. In addition, the published literature has been fully curated for fusion mutations between seven new gene pairs.
The recent Pleasance et al (2010) publication "A small-cell lung cancer genome with complex signatures of tobacco exposure" (Nature 463, 184-190) is now available within COSMIC; please click here.
The largest published candidate gene screen of glioblastomas, Parsons et al (2008), is now curated in COSMIC; please click here:
An integrated genomic analysis of human glioblastoma multiforme. Parsons DW, Jones S, Zhang X, Lin JC, Leary RJ, Angenendt P, Mankoo P, Carter H, Siu IM, Gallia GL, Olivi A, McLendon R, Rasheed BA, Keir S, Nikolskaya T, Nikolsky Y, Busam DA, Tekleab H, Diaz LA, Hartigan J, Smith DR, Strausberg RL, Marie SK, Shinjo SM, Yan H, Riggins GJ, Bigner DD, Karchin R, Papadopoulos N, Parmigiani G, Vogelstein B, Velculescu VE, Kinzler KW Science. 2008;321;1807-12. PMID: 18772396 DOI: 10.1126/science.1164382
FUS-ERG, FUS-FEV, FUS-ATF1 Both FUS-ERG and FUS-FEV fusions have been identified as alternatives to EWSR1-ETS transcription factor fusions in Ewing's sarcoma, and FUS-ERG also occurs in t(16;21) myeloid leukaemia as well as in these solid tumours. FUS-ATF1 is found in angiomatoid fibrous histiocytoma, where the fusion of the N-terminus of FUS and the DNA binding domain of ATF1 is similar to the EWSR1-ATF1 fusion found in clear cell sarcoma.
SS18-SSX1 This fusion is characteristic of synovial sarcoma, along with SS18-SSX2 and, more rarely, SS18-SSX4 fusions. Through its N-terminal SNH domain, SS18 protein is involved in the remodelling of chromatin structures and functions as a transcriptional activator, whereas SSX proteins have 2 putative transcription-repressor domains, one of which, an SSXRD domain in the C-terminal region, is preserved in the fusion protein.
SRGAP3-RAF1 This oncogenic fusion has been identified in paediatric pilocytic astrocytoma as an alternative to the previously described KIAA1549-BRAF fusion. It also activates the ERK/MAPK pathway, with the auto-inhibitory domain of RAF1 being replaced by SRGAP3.
COL1A1-PDGFB This recurrent fusion characterizes dermatofibrosarcoma protuberans and its juvenile form, giant cell fibroblastoma. The fusion consistently deletes exon 1 of PDGFB, releasing this growth factor from its normal regulation. The breakpoints in COL1A1, which encodes an extracellular matrix protein, occur in various exons in the alpha-helical domain.
JAZF1-SUZ12 A fusion involving these two genes is common but not universal in endometrial stromal sarcomas, occurring less frequently in high-grade tumours. The genes encode novel proteins with zinc finger motifs and these are retained in the fusion.
ABL1, ACVR1B, AKT1, ALK, APC, ASXL1, ATM, BRAF, BRCA1, BRCA2, CBL, CDC73, CDH1, CDKN2A, CEBPA, CSF1R, CTNNA1, CTNNB1, CYLD, EGFR, EML4, ERBB2, ERG, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK2, JAK3, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, PDGFRA, PHOX2B, PIK3CA, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TNFAIP3, TSHR, VHL, WT1
COSMIC v45 Release
The first full-genome resequencing study is now available, together with the genome-wide rearrangement screens of 24 breast tumours. In addition, five new cancer genes have been curated from the literature.
To make the data easier to investigate in depth, the website has been upgraded with new specialisation features, together with new views on mutation spectrum and distribution. Finally, we are introducing a new COSMIC Biomart, where all COSMIC's information will be available in this industry-standard data mining tool.
The recent Pleasance et al (2010) publication "A comprehensive catalogue of somatic mutations from a human cancer genome" (Nature 463, 191-196) is now available within COSMIC; please click here.
Also, the CGP Stephens et al (2009) paper "Complex landscapes of somatic rearrangement in human breast cancer genomes" (Nature 462, 1005-1010) is now available in COSMIC; please click here. A paired-end genome-wide Illumina sequencing strategy revealed numerous rearrangements in very diverse patterns between the samples examined.
GNAQ is the alpha subunit of one of the heterotrimeric GTP-binding proteins that mediate stimulation of protein kinase C signalling. Mutations in GNAQ, occurring at codon 209 in the catalytic domain, have been found as common and early mutational events in uveal melanomas.
TNFAIP3 is a negative regulator of the NF-kappa B pathway functioning through the removal of activating Lys63-linked ubiquitins and the Lys48-linked ubiquitination of receptor-interacting proteins. TNFAIP3 has been shown to be a genetic target in B-lineage lymphomas such as mucosa-associated lymphoma and Hodgkin's lymphoma of nodular sclerosing histology.
CBL encodes a protein with multiadaptor function and E3 ubiquitin ligase activity that targets a variety of tyrosine kinases for degradation. Mutations in CBL have been identified in myeloid malignancies, occurring in the critical linker and ring finger domains of the protein.
JAK3 is a member of the non-receptor tyrosine kinase family which includes JAK2. Rare but significant JAK3 activating mutations located in the JH2 (pseudokinase) and JH6 (receptor binding) domains have been found in Down syndrome and Non-DS acute megakaryoblastic leukaemia (AML-M7). Mutations have also been found in various myeloproliferative neoplasms, lymphomas and carcinomas.
NOTCH2 is a Type 1 transmembrane protein with an extracellular domain consisting of multiple epidermal growth factor-like (EGF) repeats, and an intracellular domain consisting of multiple different domain types. The Notch2 receptor and its 5 ligands, which include Jagged1, Jagged2, and Delta-like 1, 3 and 4, send signals that are important for development before birth. After birth, Notch2 signaling is involved in tissue repair. Mutations in the NOTCH2 gene have been identified in a small percentage of people with Alagille syndrome and malformations in the kidneys, especially in filtering structures. NOTCH2 is also preferentially expressed in mature B cells, is essential for marginal zone B-cell generation, and mutations are evident in a subset of individuals with diffuse large B-cell lymphomas.
The main histogram page of the COSMIC website has been improved to provide better ways of selecting and viewing subsets of data. In the navigation bar on the left side, new options are now available to redraw the histogram and associated tables based on four parameters: mutation type (e.g. deletion, nonsense substitution, etc.), sample source (cultured or tissue sample), somatic status (confirmed somatic or unknown) and systematic screen (genome-wide screen). In addition to redrawing the histogram and tables, a new "Distribution" button displays pie charts of relevant information about the data selected.
The sample summary page has also been upgraded, with every CGP sample (examined through numerous genes) receiving a mutation spectrum diagram. This comprises a histogram showing the relative frequencies of each substitution type, together with a count of insertion/deletion mutations. This is highly useful when looking for mutation signatures which may show characteristics of, for instance, tobacco or UV light exposure.
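As a rough illustration of what such a mutation spectrum summarises, the sketch below tallies substitution types and counts insertions/deletions separately. It is not COSMIC's own code: the function names and the input format (pairs of reference and mutant alleles) are hypothetical, and collapsing substitutions onto the six pyrimidine-based classes (e.g. treating G>T as C>A) is a common convention assumed here, not something stated for the COSMIC diagram.

```python
from collections import Counter

# Watson-Crick complements, used to express every substitution
# relative to a pyrimidine (C or T) reference base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
CLASSES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def substitution_class(ref, alt):
    """Return the substitution class, e.g. ('G', 'T') -> 'C>A'."""
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def mutation_spectrum(mutations):
    """Summarise (ref, alt) allele pairs for one sample.

    Single-base pairs are treated as substitutions; anything else is
    counted as an insertion/deletion (a simplifying assumption).
    Returns (relative frequency per class, indel count).
    """
    counts = Counter()
    indels = 0
    for ref, alt in mutations:
        if len(ref) == 1 and len(alt) == 1:
            counts[substitution_class(ref, alt)] += 1
        else:
            indels += 1
    total = sum(counts.values())
    freqs = {c: (counts[c] / total if total else 0.0) for c in CLASSES}
    return freqs, indels


# Hypothetical sample: three substitutions and one single-base deletion.
freqs, indels = mutation_spectrum([("G", "T"), ("C", "A"), ("C", "T"), ("AT", "A")])
```

A sample dominated by one class (for instance C>A in this toy input) is the kind of pattern the diagram is meant to make visible at a glance.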
The new COSMIC biomart is now available; please click here. This system allows much more specialised selection of data in COSMIC and is very useful for data mining. In addition, it can be directly linked to Ensembl for federated querying across both databases.
JAK2, JAK3, MAP2K4, GNAS, MPL, SOCS1, WT1, CYLD, FBXW7, MEN1, NF1, RUNX1, ASXL1, NOTCH2, IDH1, IDH2, APC, CDH1, VHL, GNAQ, BRAF, HRAS, CEBPA, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RB1, RET, SMARCB1, AKT1, EGFR, ERBB2, CDKN2A, CBL, GATA1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, MSH6, PRKAR1A, KRAS, PIK3CA, MET, TNFAIP3
COSMIC v44 Release
This release of COSMIC includes 4 new curated genes, 8 new curated fusion pairs and the TCGA systematic screen publication of 91 Glioblastoma tumour samples. In addition, a new CGP study is available (Adenoid cystic carcinoma) together with substantial updates to existing data.
IDH2 encodes a mitochondrial NADP(+)-dependent isocitrate dehydrogenase which catalyzes oxidative decarboxylation of isocitrate to alpha-ketoglutarate. It is now implicated in the pathogenesis of malignant gliomas and some secondary glioblastomas lacking IDH1 mutations have IDH2 mutations at the analogous amino acid (R172).
AKT1 encodes a serine-threonine protein kinase which is activated by phosphorylated phosphoinositides and is a central mediator of the PI3kinase signalling pathway. A common mutation (E17K) has been identified in the pleckstrin homology domain in cancers of the colon, breast, lung and ovary.
ASXL1 belongs to a family of proteins regulating chromatin remodelling. It was originally implicated via aCGH on MDS/AML samples. Mutations are mainly frameshifts; the predicted truncated proteins lack the PHD finger domain, potentially compromising the function of the associated chromatin modifiers.
FOXL2, forkhead box L2 is a winged helix/forkhead transcription factor gene, encoding a nuclear protein that is specifically expressed in eyelids and in fetal and adult ovarian follicular cells. Germline mutations in FOXL2 are responsible for BPES - blepharophimosis ptosis epicanthus inversus syndrome - an autosomal dominant disorder consisting of eyelid abnormalities (only, in Type II) and ovarian failure (Type I). Somatic mutations have recently been described in ovarian granulosa cell tumours.
The following gene fusions have been curated from the scientific literature: EML4 / ALK MSN / ALK NPM1 / ALK CLTC / ALK SEC31A / ALK RANBP2 / ALK SS18 / SSX2 SS18 / SSX4
Comprehensive genomic characterization defines human glioblastoma genes and core pathways. The first systematic screen of the Cancer Genome Atlas Research Network (PMID 18772890) is now curated in COSMIC.
Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Cancer Genome Atlas Research Network Nature. 2008;455;1061-8. PMID: 18772890 DOI: 10.1038/nature07385
Adenoid cystic carcinoma is a slow growing tumour of the secretory glands, arising most commonly in the salivary glands but also occurring in other parts of the body. As part of an ongoing research effort funded by the Adenoid Cystic Carcinoma Research Fund (www.accrf.org), 400 candidate genes (including genes implicated in cancer, cell signaling and growth control) were sequenced for small point mutations. This work was carried out on 25 samples (provided by ACCRF collaborative research group member Dr. Adel El-Naggar). PCR products were generated for the entire set of PCR amplimers, all amplimers were then concatenated individually for each tumour and matching normal DNA sample, and this material was sequenced using next generation sequencing. In total 8 somatic point mutations were identified in 8 genes. No highly prevalent point mutation was identified in this set of genes.
KRAS, PIK3CA, FGFR2, MET, ABL1, FGFR1, JAK2, MAP2K4, GNAS, EML4, FOXL2, PTCH1, MPL, SOCS1, HNF1A, WT1, NF2, CYLD, FBXW7, MEN1, NF1, RUNX1, IDH1, IDH2, ASXL1, FAM123B, APC, CDH1, SMAD4, VHL, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RB1, RET, SMARCB1, SUFU, ACVR1B, AKT1, ALK, ATM, EGFR, ERBB2, SRC, STK11, CDKN2A, GATA1, SMO, NOTCH1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, BRCA2, MLH1, MSH2, MSH6, PRKAR1A
COSMIC v43 Release
The COSMIC curation systems have been extended to encompass the entry of large-scale systematic screen papers. For this release, we have entered the first such paper, the Sjoblom et al (2006) screen of human breast and colorectal cancers. This release also contains two new genes successfully curated from the scientific literature (IDH1, SMARCA4) and the finalisation of two of the Cancer Genome Project's current resequencing studies.
For this release of COSMIC we have entered the Sjoblom et al (2006) systematic screen paper of human breast and colorectal cancers. An additional 8,648 genes have been added to COSMIC along with the 1,672 mutations from the paper. The COSMIC reference overview page for this publication is available here.
The consensus coding sequences of human breast and colorectal cancers. Sjoblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D, Leary RJ, Ptak J, Silliman N, Szabo S, Buckhaults P, Farrell C, Meeh P, Markowitz SD, Willis J, Dawson D, Willson JK, Gazdar AF, Hartigan J, Wu L, Liu C, Parmigiani G, Park BH, Bachman KE, Papadopoulos N, Vogelstein B, Kinzler KW, Velculescu VE. Science. 2006 Oct 13;314(5797):268-74. Epub 2006 Sep 7. PMID: 16959974
The resequencing of candidate genes in Pilot and Renal tumour sets has now been completed. The finalised studies examined 2978 samples through 4766 genes, discovering a total of 5437 mutations. All of these can be found in COSMIC's CGP Resequencing Studies Site.
IDH1 is a catalytic enzyme causing NADP+ dependent oxidative decarboxylation of isocitric acid. It plays an important role in the control of glucose-stimulated insulin secretion and the cholesterol and fatty acid biosynthetic pathways. Originally implicated in human cancer in genome-wide sequencing scans, when mutated it is an indicator for the longer survival of these patients.
SMARCA4 is a scaffold protein, forming a functional part of the SWI/SNF complex involved in the control of transcription.
FBXW7, MEN1, NF1, BRAF, HRAS, CSF1R, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RET, SMARCB1, SUFU, ACVR1B, ATM, EGFR, ERBB2, SRC, CDKN2A, FAM123B, GATA1, SMO, NOTCH1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, BRCA2, APC, CDH1, SMAD4, VHL, TSHR, MLH1, MSH2, MSH6, SMARCA4, RUNX1, PHOX2B, GNAS, KRAS, PIK3CA, FGFR2, FGFR1, IDH1, JAK2, JAK3, MAP2K4, TET2, PRKAR1A, CDC73, PTCH1, MPL, CTNNA1, SOCS1, HNF1A, WT1, ERG, NF2
COSMIC v42 Release
For this release of COSMIC two known cancer genes (GNAS and ALK) and 3 gene fusions (FCHSD1 / BRAF, KIAA1549 / BRAF, EWSR1 / NR4A3) have been successfully curated from the scientific literature. The Cancer Cell Line Project has also been updated with the addition of 80 mutations.
The Cancer Cell Line Data has been updated with the addition of 80 mutations. The project has also published a further set of variants identified by the screen which have been classified as Tentatively Oncogenic Variant (TOV) or Unknown Variant (UV). These variants are currently available from our website as an excel file.
Two further cancer genes have been curated with the addition of 95 mutations for ALK and 235 mutations for GNAS.
The following gene fusions have been curated from the scientific literature: FCHSD1 / BRAF KIAA1549 / BRAF EWSR1 / NR4A3
Genes updated: KRAS, PIK3CA, ABL1, FGFR1, JAK2, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, RB1, SUFU, ERBB2, SRC, STK11, CDKN2A, GATA1, SMO, NPM1, PTPN11, NRAS, BRCA2, MLH1, MSH2, MSH6, APC, CDH1, SMAD4, MET, EGFR, FLT3, PTCH1, MPL, WT1, CYLD, FBXW7, NF1, ALK, FGFR3, RET, NOTCH1, NF2, GNAS
COSMIC v41 release
This release of COSMIC comprises an update of published data in which 44 genes have been updated with the addition of 22516 samples and a further 7387 mutations.
STK11, CDKN2A, GATA1, SMO, NPM1, PTPN11, NRAS, BRCA2, MSH2, KRAS, PIK3CA, JAK2, MAP2K4, BRAF, HRAS, CEBPA, CTNNB1, KIT, PDGFRA, PTEN, RB1, ATM, ERBB2, FBXW7, NF1, FAM123B, APC, CDH1, VHL, MET, EGFR, FLT3, PTCH1, MPL, SOCS1, HNF1A, WT1, CYLD, FGFR3, RET, RUNX1, TSHR, PHOX2B, NOTCH1.
COSMIC release 40
This release of COSMIC comprises an update of the existing genes totalling almost 3000 new mutations.
COSMIC release 39, Annotating Cancer Genomes
For this release of COSMIC the database and web interfaces have been upgraded to handle Next Generation Sequencing Data. This is part of ongoing work to allow COSMIC to handle the increased volumes and complexity of somatic data that is anticipated from Next Generation Sequencers. In particular, for this release we have concentrated on adapting COSMIC to handle large-scale structural variants (including translocations, large insertions/deletions, inversions, and duplications).
The structural variants from the Campbell et al. 2008 paper, which comprehensively characterizes 2 lung cancer cell lines, have been entered into COSMIC (click here for study overview). Sample Summary pages are available for both cancer cell lines (NCI-H2171 and NCI-H1770).
Circular plots (Circos plots developed by Martin Krzywinski) have been added to the sample overview page which gives a clear overview of all the structural variants along with copy number changes and COSMIC point mutations for a particular sample (Figure 1). More detailed views of complex rearrangements are available on the mutation details page.
Figure 1. Circos Plot showing structural variants in relation to copy number and COSMIC Point Mutations.
Tabular views and exports are also available for these data (Figure 2). Due to the complexity of these rearrangements, a short descriptive term for the variant is given where possible (e.g. deletion, tandem duplication, translocation). The variant is also fully described using HGVS mutation nomenclature. For example, in chr11:g.36585230_76606619del, chr11: denotes the chromosome involved, g. indicates genomic coordinates, 36585230 is the deletion start point, 76606619 is the deletion end point and del indicates a deletion event.
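The parts of such a description can be pulled apart programmatically. The following is a minimal sketch that handles only the simple genomic deletion form shown in the example above; the function name and the returned field names are hypothetical, and other HGVS forms (duplications, inversions, translocations) are deliberately not covered.

```python
import re

# Matches only the simple form chr<N>:g.<start>_<end>del used in the example.
GENOMIC_DELETION = re.compile(
    r"^chr(?P<chrom>\d{1,2}|X|Y|M):g\.(?P<start>\d+)_(?P<end>\d+)del$"
)


def parse_genomic_deletion(description):
    """Split a genomic deletion description into its components."""
    match = GENOMIC_DELETION.match(description)
    if match is None:
        raise ValueError(f"not a simple genomic deletion: {description!r}")
    start = int(match.group("start"))
    end = int(match.group("end"))
    if end < start:
        raise ValueError("deletion end point precedes start point")
    return {
        "chromosome": match.group("chrom"),
        "start": start,
        "end": end,
        # Both coordinates are treated as inclusive, so the deleted
        # span covers end - start + 1 bases.
        "length": end - start + 1,
    }


variant = parse_genomic_deletion("chr11:g.36585230_76606619del")
print(variant["chromosome"], variant["start"], variant["end"])
```

Anchoring the pattern at both ends means that any description in a different form is rejected with an error, which seems safer than silently misreading a more complex rearrangement.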
Figure 2. Summary Structural Variants Table
The NCI/Nature Pathway Interaction Database Primer on COSMIC has been published and is available from here.
The Cancer Gene Census was updated on 11th August 2008. The Census now contains information on 379 genes, of which 343 harbour somatic alterations and 70 germline.
COSMIC release 38
For this release of COSMIC we have concentrated our efforts on significantly updating the following genes: BRAF, HRAS, CTNNB1, KIT, PDGFRA, PTEN, RB1, ERBB2, MAP2K4, CDKN2A, GATA1, SMO, NPM1, NRAS, MLH1, MSH2, KRAS, PIK3CA, JAK2, APC, SMAD4, EGFR, FLT3, PTCH1, MPL, HNF1A, FBXW7, NF1, FGFR3, RET, NF2, NOTCH1.
In collaboration with the Human Gene Nomenclature committee (HGNC) and the Atlas of Genetics and Cytogenetics in Oncology and Haematology (Atlas Genetics Oncology), links are now available from COSMIC's gene summary page to further information at these resources.
An article describing COSMIC, its contents and usage, has been published in Current Protocols in Human Genetics, unit 10.11. Describing in detail how the website and exported datasheets may be used and interpreted, this is available at the Wiley Interscience website.
COSMIC release 37
This month's release extends our complete curation of oncogenic EWSR1 fusion partners, together with two new curated genes, PHOX2B & PRKAR1A. CGP's resequencing studies and cell line projects are also significantly updated, each receiving over 100 new mutations. In total, over 1200 new mutations have been added to COSMIC this release.
PHOX2B This gene encodes a highly conserved homeobox transcription factor known to cause congenital central hypoventilation syndrome with associated neuroblastoma.
PRKAR1A This is a regulatory subunit of the cAMP dependent protein kinase holoenzyme. An apparent tumour suppressor gene, it has also been observed to be oncogenic in fusions with RET and RARA.
EWSR1 has been observed in oncogenic gene fusions with over 15 partners. This month we release our curation of the literature describing its fusion with a further six partners, bringing the total to 14.
The following curated genes have received significant updates: BRAF, HRAS, KIT, PTEN, RB1, SMARCB1, ERBB2, STK11, CDKN2A, PTPN11, NRAS, BRCA2, MLH1, MSH2, KRAS, PIK3CA, JAK2, APC, VHL, MSH6, MET, EGFR, MPL, FBXW7, PRKAR1A, RET, RUNX1, NOTCH1, NF2, PHOX2B.
COSMIC release 36
The March 2008 release of COSMIC contains full curation of the TSHR gene together with a further 6 EWSR1 gene fusion pairs.
TSHR - Thyroid stimulating hormone receptor is a 7-TM cell surface receptor expressed in follicular thyroid cells. Upon binding of its ligand, thyrotropin, a signalling cascade is commenced resulting in a range of transcriptional alterations. Somatic mutations in this gene have been described in thyroid adenomas and carcinomas.
EWSR1/ATF1; EWSR1/CREB1; EWSR1/DDIT3; EWSR1/ETV1; EWSR1/SP3; EWSR1/WT1 EWSR1 is fused to multiple partner genes via recurrent chromosomal translocation, primarily in Ewing sarcoma. We are currently curating the complete mutation data for this gene, which has so far been found fused with over 10 partners. Having already released our curation of EWSR1 with ERG & FLI1, we now release the data for six more gene partners.
The following curated genes have received significant updates: BRAF, BRCA1, BRCA2, CDH1, CDKN2A, CEBPA, EGFR, ERBB2, FLT3, HRAS, KRAS, MLH1, MSH2, MSH6, NF2, NRAS, PDGFRA, PTEN, SMARCB1, STK11, TSHR, VHL
COSMIC release 35
This release of COSMIC contains the new curation of four new tumour suppressor genes, and further curation of EWSR1/FLI1 gene fusions in Ewing's sarcoma. We also announce a significant upgrade to the CGP Trace Archive, which is now updated daily with our latest sequencing results.
MLH1 is a tumour suppressor gene, involved in mismatch repair. The encoded protein is a subunit of the large 'BRCA1-associated genome surveillance complex' (BASC) involved in DNA damage detection and repair. This particular subunit dimerises with PMS2 to provide endonuclease capacity within the complex. MLH1 germline mutations give rise to HNPCC (hereditary non-polyposis colorectal cancer). Somatic mutations in this gene are important in sporadic colorectal cancers. Mutations of MLH1 lead to a mutator phenotype often manifested by microsatellite instability.
MSH2 is a tumour suppressor gene, also involved in mismatch repair. It resides within the 'BRCA1-associated genome surveillance complex' (BASC) which detects and repairs DNA damage. MSH2, in complex with MSH6, forms a sliding clamp which traverses the DNA backbone detecting mismatched bases. MSH2 germline mutations also give rise to HNPCC. Similar to MLH1, somatic mutations in MSH2 are found predominantly in colorectal cancers. Mutations of MSH2 lead to a mutator phenotype often manifested by microsatellite instability.
CDC73 (HRPT2) is a tumour suppressor forming part of the PAF protein complex, which is associated with RNA polymerase II and may therefore be involved in both initiation of RNA synthesis and RNA elongation. Mutations in this gene have been identified in tumours of the parathyroid, most often causing the endocrine disorder hyperparathyroidism (with or without jaw tumour).
MAP2K4 is one part of the mitogen-activated protein kinase (MAPK) pathway, a signal transduction cascade which mediates certain extracellular signals via RAS/RAF resulting in transcriptional control of a wide range of genes. The MAP2K family of peptides regulate MAPK activity by phosphorylation. MAP2K4 mutations appear involved in many tumour types.
Ewing's sarcoma is a rare bone tumour, infrequently of extraskeletal origin, most frequently occurring in teenage children. The majority of these tumours contain a t(11;22)(q24;q12) translocation which fuses the EWSR1 gene on chromosome 22 with the FLI1 gene on chromosome 11. We have now curated the existing literature describing fusions between this gene pair.
The following curated genes have been updated for this release: CDKN2A, PTPN11, NRAS, MLH1, MSH2, KRAS, JAK2, MAP2K4, BRAF, HRAS, CTNNB1, MEN1, NF1, APC, VHL, EGFR, FLT3, PTCH, MPL, WT1, RET, CDC73, RUNX1, EWSR1, FLI1.
Genomic co-ordinates for individual mutations are now available in the data export section, together with the datasheets in the FTP site.
The CGP trace archive has been updated to contain all the sequencing traces used in our analysis of the samples and genes presented in the CGP Resequencing project (COSMIC red pages). The number of traces available for download is now approaching 9.5 million. The Archive itself has also been upgraded, so that it receives daily updates of CGP sequencing traces as they pass through our sequencing pipeline. Daily updates are available as separate files; these will be integrated into the main download files once per week.
COSMIC 34
This release of COSMIC includes the addition of BRCA1, BRCA2, and the EWSR1/ERG gene fusion from the scientific literature. The website has been enhanced with an update of old gene names and the addition of further links (NCBI Entrez Gene, CCDS, Swiss-Prot and TrEMBL). The CGP Trace and Genotype Archive holding the group's sequence traces and genotype data is also now available.
BRCA1 and BRCA2 are tumour suppressor genes initially identified as inherited cancer susceptibility genes for breast and ovarian cancer. Both proteins have been shown to have roles in genome surveillance, detection of DNA damage and its subsequent repair. However, they associate with different DNA repair complexes and generate different tumour histologies and spectra. Somatic mutations of either gene are rare, with BRCA2 being more frequently found to have somatic mutations, particularly in ovarian and pancreatic carcinomas.
We report that mutations in these two genes have been discovered at fairly low frequencies (2-3%), with BRCA2 mutated in a wider tissue range than BRCA1.
Fusions of EWSR1 and ERG are common events in skeletal (and the rarer extraskeletal) Ewing's Sarcoma. These fusions, found at a frequency of approximately 10% in bone tumours result from complex rearrangements, since the two partner genes are not transcribed in the same chromosomal direction.
The CGP Resequencing screens and the following curated genes have received updates: BRAF, HRAS, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, ACVR1B, ATM, ERBB2, BRCA1, BRCA2, KRAS, PIK3CA, FGFR2, ABL1, FGFR1, JAK2, SRC, STK11, CDKN2A, PTPN11, NRAS, FAM123B, APC, SMAD4, VHL, MSH6, MET, EGFR, FLT3, FBXW7, MEN1, NF1, RUNX1, FGFR3, RET.
The group's sequence traces and genotype data are now available from the CGP Trace and Genotype Archive site. In order to access the data, a Data Transfer Agreement must be completed and approved. A unique username and password will then be provided to access this resource.
244 genes had their names updated (5.2%). It is still possible to search by the old gene name.
There has been an addition of several external gene links on the gene summary page. This includes links to NCBI Entrez gene, CCDS, Swiss-Prot and TrEMBL.
The sample summary page now also contains sample source information.
COSMIC 33: Improved CGP data release
The WTSI Cancer Genome Project (CGP) announces an updated data release policy. We will now be releasing confirmed somatic mutations on a bi-monthly basis. Confirmed and annotated somatic mutations identified in the previous two months will be released in COSMIC, continuing on at two-monthly intervals. Data will still appear within current COSMIC architecture of gene family/gene set and under appropriate studies. This new policy will result in expedited pre-publication release of curated somatic mutations as they are identified.
This new data will be available in the COSMIC blue pages, but will be most noticeable in COSMIC's CGP resequencing studies site (red pages), as this distinguishes CGP data from the literature curation.
CGP resequencing data is broadly divided (in the red pages) into 3 categories, 'Kinase', 'Pilot' and a new project, 'Renal'. Whilst the Kinase data is completed and published, the other two studies are much larger and still in progress. A collection of approximately 4000 genes has been selected for resequencing in a set of 40 matched pair cell lines ('Pilot' project) and 96 primary clear cell renal cancers. Each tumour sample in these projects has a matched normal sample, which allows the distinction of somatic mutations from germline variants. The pilot project currently comprises 1865 somatic sequence changes, whilst the Renal project, although less advanced than the Pilot, has identified 84 mutations to date. These will be automatically updated with all our confirmed data every bimonthly release.
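The role of the matched normal sample can be sketched as a simple set comparison: a sequence change seen in the tumour but not in the matched normal is a candidate somatic mutation, while one seen in both is treated as germline. This is only an illustration of the principle under that assumption, not the CGP analysis pipeline; the function name, the variant tuple layout and the example coordinates are all hypothetical.

```python
def split_somatic_germline(tumour_variants, normal_variants):
    """Classify tumour variants using a matched normal sample.

    Each variant is a hashable tuple such as (chromosome, position,
    reference allele, mutant allele).
    """
    tumour = set(tumour_variants)
    normal = set(normal_variants)
    return {
        "somatic": tumour - normal,   # present only in the tumour
        "germline": tumour & normal,  # shared with the matched normal
    }


# Hypothetical variants, for illustration only.
tumour = [("3", 1000, "C", "T"), ("7", 2500, "G", "A"), ("17", 4200, "A", "G")]
normal = [("7", 2500, "G", "A")]

result = split_somatic_germline(tumour, normal)
print(sorted(result["somatic"]))
print(sorted(result["germline"]))
```

In practice a call like this would be followed by confirmation of each candidate, which is why the text above speaks of confirmed somatic mutations.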
RUNX1 is one subunit of the PEBP2 transcription factor, binding to DNA at enhancer sequences. This gene is one of the most frequent targets of chromosome translocations associated with leukemia. Small somatic mutations have also been observed, most frequently in myeloblastic leukaemia types (Acute myeloblastic Leukaemia, MyeloDysplastic Syndrome) and it is these that we have curated in COSMIC. Our data suggests a somatic mutation rate of approximately 10% in this phenotype.
The following curated genes have received updates from the literature: APC, ATM, BRAF, CDH1, CDKN2A, CTNNA1, CTNNB1, CYLD, EGFR, ERBB2, ERG, ETV1, FBXW7, FGFR3, FLT3, GATA1, HRAS, JAK2, KIT, KRAS, MADH4, MPL, MSH6, NF1, NF2, NOTCH1, NPM1, NRAS, PIK3CA, PTCH, PTEN, PTPN11, RB1, RET, SMARCB1, SMO, SOCS1, STK11, SUFU, TMPRSS2, VHL, WT1, WTX.
This release includes 1563 new mutations identified in the set of 4799 genes; 1495 genes are new this month.
COSMIC v32
This release includes four new tumour suppressor genes and improved availability in Ensembl.
We are continually striving to improve the utility of the data in COSMIC by integrating it closely with external resources. In this release, we provide a much closer integration with the Ensembl genome browser than previously. All our gene & mutation data now have location coordinates on the NCBI36 genome sequence, allowing us to use Ensembl "DAS" technology to display this information within their genome browser, aligned with their standard genome annotations. We have made this easily available, via a single link from our pages.
Four new tumour suppressor genes have been introduced to COSMIC this month, all receiving full literature curation of their somatic mutation data.
Neurofibromatosis is a familial disease with a complex phenotype including tumours of the central nervous system, caused by mutations in the NF1 tumour suppressor gene. Somatic mutations in tumours have also been identified in this gene, and it is these that we have fully curated.
The central form of neurofibromatosis is a similar familial central nervous system tumour syndrome, caused by mutations in the NF2 tumour suppressor gene. Somatic mutations in tumours have also been identified in this gene, and it is these that we have fully curated.
SOCS1 downregulates cellular cytokine signalling by its direct interaction with JAK1. It was first implicated in cancer after aberrant methylation was observed to inactivate its activity causing Hepatocellular Carcinoma. Somatic mutations have also been observed which inactivate this tumour suppressor and these have been curated.
TCF1 binds to the promoters of several (largely liver-specific) genes, to enhance their expression. Somatic and germline mutations in this gene have been found which cause liver adenomas, and we have curated the somatic component.
The following curated genes have received updates from the scientific literature: KRAS, PIK3CA, JAK2, BRAF, HRAS, KIT, PDGFRA, PTEN, CDKN2A, VHL, EGFR, FBXW7, MEN1, RET
COSMIC (v31) now includes Gene Fusion Data
The CGP COSMIC team is pleased to announce the addition of gene fusion/translocation somatic mutation data from the literature to the database. Currently, the census of known cancer genes is dominated by somatically generated fusion genes that have been identified primarily in leukaemias, lymphomas and soft tissue tumours. Until now, we have concentrated on curating somatically point mutated cancer genes for COSMIC. Almost all known cancer genes that have somatic point mutations are, however, now curated in COSMIC. In the coming months we will therefore be searching the scientific literature and annotating genes involved in gene fusions and their partners for addition into the COSMIC database.
We have launched this new facility, complete with new views for this data type, with the curation of TMPRSS2, a gene frequently found to be fused to ETS family transcription factors in adenocarcinoma of the prostate. These mutations have served to spur increased investigation into the potential role of fusion genes in adult solid tumours. The move to curate fusion genes is an important addition and will further enhance COSMIC as the most comprehensive source for somatic mutation data from human cancers.
The fusion data has been integrated into existing pages and overviewed in new pages: Translocations Overview and Translocations Summary.
This new data can be viewed graphically and textually.
The image above shows the table of inferred breakpoints (determined from a sample's observed fusion mRNA spectrum) for a fusion gene pair.
The image above shows a graphical representation of the observed mRNA transcripts from which the inferred breakpoints are calculated.
Further information on the new gene fusion website features is available in the help pages.
A new homepage has been created for genes which have received full curation of the scientific literature. This page distinguishes these genes from those in the CGP's data release, for which no literature has been curated.
The following curated genes have also received updates from the scientific literature: CDKN2A, GATA1, NOTCH1, NPM1, NRAS, JAK2, KRAS, PIK3CA, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, RB1, MET, EGFR, FLT3, WT1, APC, MADH4, FBXW7, FGFR3.
COSMIC v30
Today we release full literature curations of five tumour suppressor genes MEN1, ATM, CYLD, FBXW7, WTX; 4712 samples were examined in 112 papers, recording 468 mutations. Additionally, we release two new CGP resequencing studies which add a further 91 new genes to COSMIC.
Curation of the scientific literature has been completed for five new genes from the cancer census. All five genes are tumour suppressors, causing phenotypes via their inactivation:
Somatic mutations in this gene have been found in tumours from several endocrine sites, recapitulating those seen in patients carrying germline mutations including tumours in the pituitary, pancreas and parathyroid. MEN1 encodes a nuclear protein thought to be a transcriptional regulator.
This gene has been found to have mutations in sporadic cylindromas, tumours arising from skin adnexal structures (such as hair follicles and glands), principally on the face and scalp. CYLD encodes a deubiquitinating enzyme regulating cell signalling including the NF-kappaB pathway.
Mutations inactivating FBXW7 have been found in a range of cancer types including colorectal, ovarian and T-ALL. The protein is involved in targeting a number of key proteins, including NOTCH1 and MYC, for ubiquitin-mediated degradation.
This gene encodes a protein kinase involved in cell cycle checkpoint control. Amongst other key cell cycle components, it has been shown to phosphorylate TP53 and CHEK2 in response to DNA damage. Germline mutations cause Ataxia-telangiectasia (AT), a recessive disorder characterized by cerebellar ataxia, telangiectases, immune defects, and a predisposition to malignancy, primarily lymphoid in origin.
Recently discovered, WTX is inactivated in approximately 30% of Wilms Tumours. Located on the X chromosome, this tumour suppressor only requires a 'single-hit' for tumourigenic inactivation.
The following curated genes have also received updates: BRAF, HRAS, CEBPA, CTNNB1, KIT, PDGFRA, PTEN, SMARCB1, ERBB2, JAK2, CDKN2A, PTPN11, NRAS, KRAS, PIK3CA, APC, CDH1, MADH4, EGFR, FLT3, MPL, WT1, FGFR3.
91 new genes have been examined in our pilot set of matched pair cell lines, resulting in the discovery of 22 new mutations:
COSMIC v29 released
COSMIC release 29 includes 22 new CGP resequencing studies, comprising 567 new genes within which 192 new mutations have been identified. Additional updates to our curation of the scientific literature have also been included, adding a total of 1041 mutations to this release.
COSMIC v28 Released
This month's COSMIC release comprises a substantial increase in the CGP resequencing data, adding 1033 new genes to the system, together with updates to the scientific literature curation.
COSMIC v27 released
This month's release of COSMIC comprises upgrades to both the web site (which now allows searching by gene/sample name or keyword) and data, with new CGP resequencing studies and curated genes. COSMIC now contains data on over 200,000 tumour samples and 400,000 individual experiments. Of these 202109 tumours, 40331 were found to contain one or more mutations (19.9%).
COSMIC third anniversary release (v26)
This release comprises a significant increase in the number of CGP resequencing studies. The five new studies all examine our pilot sample set comprising 40 cancer cell lines that all have a matched normal cell line, allowing all of the mutations to be confirmed as somatic.
COSMIC v25 released
This month's COSMIC release comprises significant updates to CGP resequencing studies and curation of the scientific literature.
The six non-kinase CGP resequencing studies have received substantial updates to the number of genes included and the number of mutations found (the kinase studies were updated in November 2006). Fifty two new genes have been added to the DNA repair study, together with three in the Apoptosis and two in the GAP-GEF studies. The number of mutations discovered in each of the six studies has increased as shown below:
In addition to the CGP resequencing studies, significant updates have been made to those genes which have received complete scientific literature curation. Three genes have been extensively updated, BRAF (19.1%, increased to 19224 samples), JAK2 (25.1%, increased to 11190 samples) and NOTCH1 (75.4%, increased to 488 samples), whilst eighteen other genes have received minor updates (less than 10% increase in sample number): ABL1, APC, CDKN2A, CEBPA, CTNNB1, EGFR, ERBB2, FLT3, KRAS, MET, NRAS, PDGFRA, PIK3CA, PTEN, PTPN11, RET, SRC, VHL.
COSMIC v24 released
This month's release of Cosmic includes the curation of NPM1 and CDH1.
NPM1 (Nucleophosmin) is a nucleocytoplasmic shuttling protein and critical regulator of TP53. Frequent mutations have been found in both childhood and adult AML. 20 papers have been manually curated for this gene resulting in the addition of 45 unique mutations (exon 12).
CDH1 (E-cadherin) is a calcium ion-dependent cell adhesion molecule, and loss of function of this gene is implicated in cancer invasion and metastasis. In particular, somatic mutations of this gene have been reported in gastric and lobular breast cancer. 181 mutations have been added to Cosmic for this gene from the curation of 46 papers.
COSMIC v23 released
This month's release of Cosmic includes a major update to the protein kinase screens.
The Cancer Genome Project is pleased to release the full set of protein kinase somatic mutation data resulting from the screening of over 200 human cancers through the full set of 518 annotated genes. Over 1000 mutations have been identified in a combined total of 247 megabases sequenced. This dataset is intended to serve as a catalyst for further biological investigation of mutated kinases and pathways, hopefully leading to new insights and therapeutic opportunities in human cancer.
Oligo array CGH data (using the Affymetrix 10K SNP array) for a further 233 cancer cell lines and 70 primary tumours has been made available increasing the total available from 834 to 1136 samples.
COSMIC v22 released
This month's release of Cosmic includes the curation from the scientific literature of the APC tumour suppressor gene. In addition, information on the similarity between cell lines is now recorded and displayed in Cosmic.
COSMIC v21 released
This month's release of Cosmic includes major updates to the Cancer Cell Line Project and microsatellite instability status data sets. In addition, published somatic mutation data from two additional genes, MPL and FGFR1, have been added to Cosmic.
The Cancer Cell Line Project aims to systematically screen a large panel of cancer cell lines for mutations in known cancer genes, thus empowering these cell lines as biological reagents for further work in anti-cancer agent development and further work on cancer molecular and cellular biology.
For this release of Cosmic, a further 137 cell lines have been added to the working set and 78 duplicate cell lines have been removed. This brings the total number of samples to 787. A further 98 mutations have also been added (See: http://www.sanger.ac.uk/genetics/CGP/CellLines/).
COSMIC v20 released
This month's release includes NCI-60 updates and mutation data from the scientific literature for VHL.
The CGP is pleased to release mutation data for 24 known cancer genes on the NCI-60 series of cancer cell lines. These data should allow for greater power in interpretation of biological data using the lines as well as providing a genetic framework for evaluating response to the large series of compounds screened against this reference cell line set.
Microsatellite instability occurs due to a defect in mismatch repair. This is usually a result of inactivation of MSH2, MLH1 or MSH6 due to a mutation or to reduced expression associated with promoter methylation. Analysis of microsatellite instability was carried out using the BAT markers as described by Rodriguez-Bigas et al. All samples were screened using the markers BAT25, BAT26, D5S346, D2S123 and D17S250. Details of this, when available, are posted on the sample overview page, an example of which can be seen at http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=sample&id=905950
VHL mutation data from the literature is now available. We have curated 93 papers covering 3412 experiments. These experiments used 3386 samples, in which 879 mutations were recorded.
COSMIC v19 released
This month's release of COSMIC includes the Cancer Genome Project screen of the GAP-GEF gene set and new information displays.
This gene set, consisting of 173 genes, is comprised of proteins that function to regulate the activity of proteins with GTPase activities. GTPase activating proteins (GAPs) promote hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) promote GDP/GTP exchange. Both classes modulate the function of the small monomeric GTPases (including the RAS oncogene family) and other key signalling proteins that use the conversion of GTP to GDP as a molecular switch to regulate function. This system of GTPase/GAP/GEFs regulates a wide variety of cellular processes including growth, differentiation, survival and motility.
Zygosity and somatic/germline status information are now available for mutations in COSMIC, CGP Resequencing and Cancer Cell Project websites. The somatic/germline status is listed on the sample detail page and the export function with the following statuses:-
Zygosity information is available on the mutation detail page with the following statuses:-
COSMIC v18 released
The CGP Resequencing Studies Website is released this month, which will act as a repository for data from CGP resequencing efforts to identify novel somatic mutations in human cancer. The pages have their own distinctive red colour scheme to denote this. Data on sets of genes/samples systematically screened for mutations were previously integrated into the "blue" COSMIC pages. This will continue, with data now being submitted, prepublication, to the new site and held there. This will allow users to browse, search and evaluate these data more effectively. The web resources that are now available are detailed below:-
37 papers from the scientific literature have been curated for the PTCH gene in this release, adding an additional 897 experiments and 168 mutations.
Ensembl has recently moved to the NCBI 36 assembly of the human genome whilst COSMIC genes and mutations are currently mapped to build 35. This has caused some disparity with the COSMIC DAS track. Therefore we suggest only using the COSMIC DAS track on the most recent Ensembl archive site (http://feb2006.archive.ensembl.org/index.html). Provided below is a link that will open the appropriate website with the DAS source attached:
Cosmic v17 release
This month's release of COSMIC includes the Cancer Genome Project screen of small monomeric GTPases and mutation data from the scientific literature for MADH4.
The small monomeric GTPases function as key molecular switches impacting a large variety of cellular functions such as motility, cell signalling, transcription and the binding, hydrolysis and exchange of GTP/GDP. The RAS subfamily (HRAS, NRAS, KRAS) of small monomeric GTPases were amongst the first identified human oncogenes and are mutationally activated in a wide variety of human cancers.
70 papers from the scientific literature have been curated for the MADH4 gene in this release, adding an additional 2275 experiments and 259 mutations.
Further data from the scientific literature for 9 genes, including KRAS and NRAS, has been added for this release. A detailed breakdown for each gene can be seen below.
COSMIC v16 released
Released for March are data from a kinase domain screen of malignant gliomas. These data cover approximately 400kb of sequence in each of 9 tumours, including data from recurrent/resistant tumours.
We have recently completed a screen for somatic mutations of the kinase domain encoding exons of the entire protein kinase family in a series of human malignant gliomas. The results are presented in this release of COSMIC. No commonly mutated kinase domain was found in these studies. However, as is the case with our other work in this area, deep sequencing data from human tumours is informative about the processes that have contributed to oncogenesis in the patient. Two gliomas recurrent after temozolomide (alkylator) chemotherapy, but not a third recurrent after XRT alone, had the highest mutation prevalence of any tumours we have analysed to date. These data suggest a link between mutation prevalence and recurrent/resistant brain tumours treated with alkylator chemotherapy.
A COSMIC Expansion
The Catalogue Of Somatic Mutations In Cancer is two years old and has mutation data for over 1,000 genes, curated from over 3,000 published papers and unpublished data from the Cancer Genome Project.
The original aim of COSMIC continues with the curation of somatic mutation information from the literature for known cancer genes. During 2005 data for 9 genes was collected; ABL1, CDKN2A, EGFR, GATA1, JAK2, MSH6, NOTCH1, PTPN11 and SMO. In addition to this, genes that were curated in 2004 were updated as new data was published.
The number of genes in COSMIC expanded rapidly when the Cancer Genome Project at the Wellcome Trust Sanger Institute published 3 studies of somatic mutations in the protein kinase gene family (518 genes in total). This data provides a unique insight to the somatic mutations in breast, lung and testicular cancers.
More recently the Cancer Genome Project has been submitting unpublished somatic mutation data to COSMIC. The data comes from genes involved in apoptosis, DNA repair, maintenance and metabolism and the Inositol Polyphosphate Phosphatase and Heterotrimeric G-Protein families.
In another new departure the COSMIC software was used to create a new web site, the Cancer Cell Line Project. This separate site, with its own 'mint' colour scheme, contains the results from the sequence analysis of 14 known cancer genes in over 700 cancer cell lines. Initial sequence data for 4 genes analysed in the NCI-60 is also available. This work is in progress and more results will be posted in the coming months. What is more, the number of genes in this project will continue to increase, providing genetic data for this wide set of cancer cell lines.
There have been many enhancements to the web site over the past 12 months. A tissue overview provides a summary of mutations reported in a selected tissue. New pages were created to show more details of mutations and samples and give greater depth to the data. There are also links to other data such as genome copy number information.
COSMIC has been summarised in The British Journal of Cancer (Forbes et al, 2006).
This month sees the update of BRAF, CDKN2A, EGFR, ERBB2, HRAS, KRAS, NRAS, PTEN, PTPN11 and SMARCB1. In addition the Cancer Genome Project has submitted unpublished data for genes involved in apoptosis.
There are plans to continue the development of COSMIC in terms of data content and data presentation. We are always happy to receive feedback and suggestions (email: cosmic@sanger.ac.uk).
COSMIC v14 released
The COSMIC team is proud to announce the release of COSMIC-14 with data for CDKN2A(p16) and more unpublished data from the CGP.
The Cancer Genome Project has released further unpublished somatic mutation data from a screen of 41 cancer cell lines. The 302 genes in this release are involved or associated with DNA repair, maintenance and metabolism. The genes can be viewed together or in 5 subgroups; Telomerase Complex, SWI/SNF, DNA replication, Nucleotide Metabolism and DNA Damage Response and Repair. In total 119 somatic mutations were identified in this study.
CDKN2A (also known as p16) is a tumour suppressor. It induces cell cycle arrest by inhibiting the phosphorylation of Rb by the cyclin-dependent kinases CDK4 and CDK6. So far 453 papers have been curated for this gene with 2,591 mutations recorded from 16,883 samples.
COSMIC v13 released
Somatic mutation data from new gene families
In a major new departure the Cancer Genome Project is proud to release further somatic mutation data. The results from the sequencing of two gene families, Inositol Polyphosphate Phosphatases and Heterotrimeric G-Proteins, have been added to the data for the Protein Kinase genes. This data will be expanded in the future with the addition of further gene sets.
Nine genes in COSMIC have been updated with further data; NRAS, RB1, ERBB2, HRAS, PTEN, TP53, KRAS, APC and CDKN2A.
The Cancer Genome Project is pleased to announce the release of a DAS source devoted to the genes and mutations within COSMIC. Using this source you will be able to view the genes and mutations from COSMIC within a genome browser or the DAS client of your choice.
All 587 genes in COSMIC are exported as features. Each of these features displays the genomic 'footprint', which encompasses both exonic and intronic sequence between the start and end points of the CDS sequence. A link is attached to each feature, providing a mechanism for the client to link back directly to the gene entry on the COSMIC website.
In addition to the gene footprints, there are also a large number of unique mutations. These are also displayed as features, with links back to the mutation summary page in COSMIC. The database currently holds 2812 unique mutations, of which 1035 are currently exported. This subset is comprised of all the single nucleotide substitutions. More complex mutations will be included, as the genomic coordinates are mapped.
The DAS source can be found at the following URI:
http://das.ensembl.org/das/cosmic_genomic
The easiest way to view this source is to place the following URI in your browser:
http://www.sanger.ac.uk/turl/6d8
This will attach the DAS source and display some of the mutations found in BRAF. Additional configuration can be performed on the track, by clicking on the track name. For more information, see the help pages on the Ensembl website.
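As a rough illustration of how a DAS client might address this source, the sketch below builds a request URL for the features in one genomic segment. It assumes the standard DAS 1.x "features" command with a "segment=chromosome:start,stop" query; the helper name das_features_url is hypothetical, and the chromosome and coordinates are arbitrary placeholders rather than the true location of BRAF.

```python
from urllib.parse import urlencode

# DAS source URI as given in the announcement above.
DAS_SOURCE = "http://das.ensembl.org/das/cosmic_genomic"


def das_features_url(chromosome, start, stop, source=DAS_SOURCE):
    """Build a DAS 'features' request for one genomic segment.

    Hypothetical helper: assumes the DAS 1.x query form
    <source>/features?segment=<chromosome>:<start>,<stop>.
    """
    if start < 1 or stop < start:
        raise ValueError("expected 1 <= start <= stop")
    # Keep ':' and ',' unescaped so the segment stays human-readable.
    query = urlencode({"segment": f"{chromosome}:{start},{stop}"}, safe=":,")
    return f"{source}/features?{query}"


# Placeholder coordinates, for illustration only.
print(das_features_url("7", 1000000, 1100000))
```

The function only constructs the URL; fetching and parsing the XML response is left to the DAS client of your choice.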
COSMIC version 12 released
The November release of COSMIC has further data on 9 known cancer genes.
The genes with additional data are; BRAF, PTEN, RB1, EGFR, TP53, CDKN2A, NRAS, KRAS and PIK3CA.
We have implemented a versioning system for the data in COSMIC. The current release is version 12 with a plan to release a new version every month.
There are additional mutations for the known cancer genes being sequenced through the cancer cell lines. Notably there is data for homozygous deletions in the CDKN2A gene.
The Cancer Genome Project has released more copy number data derived from the analysis of cancer cell lines and primary tumours using Affymetrix SNP microarrays. So far a total of 834 samples have been analysed consisting of 161 primary tumours and 673 cancer cell lines. This data is freely available from the CGP website. The primary tumours overlap with those being sequenced by the CGP while the cancer cell lines include those being sequenced in the Cancer Cell Line Project.
COSMIC Update
COSMIC has been updated with the addition of 2 new curated genes and new mutation descriptions.
COSMIC has adopted the Human Genome Variation Society sequence variation/mutation nomenclature for the bulk of the mutations in COSMIC. This represents a major upgrade with the aim of improving clarity and enables the listing of intronic variants for the first time.
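To give a feel for what the nomenclature looks like in practice, here is a minimal sketch that parses the simplest form of a coding-sequence substitution, including the intronic offset form (for example "c.88+2T>G"). The function name and the example descriptions are illustrative assumptions, not taken from COSMIC, and the pattern deliberately ignores deletions, insertions, UTR positions and every other variant type the nomenclature covers.

```python
import re

# Simple coding-DNA substitutions only, e.g. "c.1799T>A" or "c.88+2T>G".
_SUBSTITUTION = re.compile(
    r"^c\.(?P<position>\d+)(?P<offset>[+-]\d+)?(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_substitution(description):
    """Split a simple 'c.' substitution into its parts (illustrative helper)."""
    match = _SUBSTITUTION.match(description)
    if match is None:
        raise ValueError(f"not a simple substitution: {description!r}")
    # A +N / -N suffix counts bases into the intron from the nearest exon base.
    offset = int(match["offset"]) if match["offset"] else 0
    return {
        "position": int(match["position"]),
        "offset": offset,
        "ref": match["ref"],
        "alt": match["alt"],
        "intronic": offset != 0,
    }


print(parse_substitution("c.88+2T>G"))
```

An exonic change has no offset, so the "intronic" flag is False for a description such as "c.1799T>A".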
Two genes have further data in COSMIC; EGFR and PTEN.
The sequence analysis of the protein kinase gene family in human testicular germ-cell tumours of adolescents and adults has been published. The mutation data from this work was previously available in COSMIC and is now joined by the published analysis of the data.
There are additional mutations from the screening of known cancer genes through an extensive set of cancer cell lines.
COSMIC has been updated with the addition of 3 new curated genes; MSH6, NOTCH1 and PTPN11.
There is a new member to the COSMIC family; the Cancer Cell Line Project. This portal uses the COSMIC code to serve mutation data from the cancer cell lines being sequenced by the Cancer Genome Project at the Wellcome Trust Sanger Institute. The cell line data is presented in the same style as the COSMIC data with a unique colour scheme. There are links to jump from the Cancer Cell Line Project pages to view all of the data in COSMIC. At present there is data from 12 known cancer genes in the Cancer Cell Line Project database.
In addition, the results from the screen of all 518 protein kinase genes in lung cancer, which were available in the previous release of COSMIC, have been published in Cancer Research.
Protein kinase mutations in lung cancer
The Cancer Genome Project has sequenced all protein kinase genes in lung cancer - the most common cause of cancer deaths worldwide
There are over 27,000 new cases of lung cancer in the United Kingdom each year. Protein kinases are frequently mutated in human cancer and inhibitors of mutant protein kinases have proven to be effective anticancer drugs. The Cancer Genome Project has screened the complete coding sequence of all 518 protein kinase genes in 33 lung cancers. This study, published in Cancer Research, is the largest survey reported to date of somatic mutations in lung cancer.
The Cancer Genome Project at the Wellcome Trust Sanger Institute was established in 2000. Its goal is to identify mutations that occur in cancer cells to enable the development of new diagnostics and new treatments and advance our understanding of the biology of cancer.
The Wellcome Trust Sanger Institute Cancer Genome Project and their collaborators have published the latest results of their survey of genes that might be implicated in cancer. The report is published in Cancer Research on Thursday 1st September 2005 and is also available through COSMIC.
The gene set chosen was a class called protein kinases, key controllers of cell growth and death. Members of this family have been shown to be important in cancer. However, the whole set has never been sequenced in a single set of lung tumours. The study generated over 40 million bases of DNA sequence (1.3 million for each sample).
This work identified 188 somatic mutations in 141 protein kinase genes. There was considerable variation in the number of mutations found in each tumour. The results indicate that several mutated protein kinases may be contributing to lung cancer development, but that mutations in each one are infrequent. Larger studies are warranted to further explore these initial findings. Cancer is a complex set of diseases that will affect 1 in 3 people. This work in the CGP is but one part of a global effort to further understanding of cancer and move towards better diagnosis and treatment.
COSMIC Website Update
The COSMIC web site has been updated with additional data from the literature and unpublished data from the Cancer Genome Project.
Data for 3 genes has been curated from the literature and included in COSMIC; ABL1, GATA1 and SMO.
The screen of the protein kinase gene family by the Cancer Genome Project now includes two new tumour types; lung cancer and testicular germ-cell tumours. There are marked differences in the mutation prevalence between these two tumour types.
The mutation data for 9 further genes has been included on the web site giving a total of 550 mutations. The genes are APC, CDH1, CTNNB1, HRAS, MADH4, PIK3CA, PTEN, RB1 and STK11. The sequencing of these genes is not necessarily complete but the cell lines with mutations have been confirmed and the experiments will continue to finish this work.
Cosmic Update
COSMIC now includes data from a screen of all protein kinase genes in breast cancer and an update of mutation data from the literature.
The data in COSMIC has expanded to include a new data type and the number of known cancer genes has been extended with updates on some of the existing cancer genes.
The Wellcome Trust Sanger Institute Cancer Genome Project and their collaborators have published the latest results of their survey of genes and their mutations in cancer. The report was published online in Nature Genetics on Sunday 22 May 2005. This data has been integrated with the existing data in COSMIC and made available through the web site.
The mutation data for two further cancer genes has been curated from the scientific literature and added to COSMIC.
Further published data has been curated for 5 genes in COSMIC; BRAF, ERBB2, FGFR2, PDGFRA and PIK3CA.
More improvements have been made to the gene selection pages. The alphabetical lists have been separated into 3 groups to reduce the amount of guesswork involved in finding your gene of interest.
The karyotype has also been updated. Genes from the census can be located quickly by clicking on the red triangles. All other genes are indicated by blue lines across the chromosome.
Each mutation in COSMIC now has its own overview page containing information about the type of mutation and samples/tissues containing the mutation. This page can be reached by clicking on various links throughout the website.
The overview page is divided into 8 main sections:
Two new sections have been added to this page:
COSMIC now includes review papers. There is a review section that can be found at the bottom of the reference overview page for each gene. This section includes references that review other works. As the data from these references has already been added to the data from the original sources, this data is not added again.
COSMIC presents 'Tissue Overview', another way to view somatic mutation data. The Tissue Overview page details the Top 5 Genes for any tissue / histology selection ranked by mutation frequency and data volume. In addition it lists other genes with and without mutations for the selection. From the Tissue Overview page you can click through to the specific details of the listed genes.
As stated above, this new page details all the genes that have samples for the tissues / histologies selected. It is split into three major sections, with the first section detailing what we feel are the most important genes, based on mutation frequency and data volume.
This section provides an interactive bar chart and table showing data for the highest ranked genes containing samples from the chosen tissues / histologies
The coloured bars in the image represent:
COSMIC's first anniversary
The COSMIC database and web site have been updated and now have somatic mutation data from 21 genes.
The COSMIC team is proud to release somatic mutation data for CSF1R, RB1, RET and SMARCB1. This information has been curated from the scientific literature. Somatic mutation data from 15 genes can be queried and viewed through the COSMIC web site.
The COSMIC team are proud to include somatic mutation data for FGFR2, FGFR3, FLT3, MET, PDGFRA and PIK3CA on the COSMIC web site.
The number of genes with data in COSMIC has more than doubled in this release of the database. The additional data represents a set of genes that have a lower, but nevertheless important, mutation frequency in human cancer as a whole. In specific malignancies genes such as FLT3 do have a significant role as can be seen from the data collected in COSMIC.
Number of unique mutations 307
Number of curated papers 976
We are pleased to announce an update to the COSMIC website. To coincide with the Nature paper on ERBB2 we have added all the data for this gene to COSMIC. There have also been a number of improvements to the interface that we hope you will find useful.
Today, Nature publish our recent findings, the first description of small intragenic ERBB2 mutations in human cancer. Primarily found in non-small cell lung adenocarcinomas, the mutations identified are suggestive of inappropriate activation of ERBB2 kinase activity.
This addition brings 8 new mutations and 714 new samples to the database, increasing the total number of mutant samples to 10655 and the total number of samples to 58032.
This has grown from the original four row summary table, on the distribution page, into a full page overview of the information stored about a specific gene.
Here you will find a page containing all the information about a particular sample. Some of the previously unavailable information, such as details about the individual, has been made available.
For the first time in COSMIC you can see all the samples from one paper in one location. In addition to this there are also details about the genes screened and the mutations that were found.
We are pleased to announce a minor update to the COSMIC website. The user interface has been updated to include new features that we hope will make your experience with the site more productive and enjoyable.
COSMIC Detailed
The British Journal of Cancer have released an advance online version of an article describing COSMIC. Detailed information is provided about the curation and structure of the database, followed by a description of the facilities provided by the website.
COSMIC Website Unavailable 8th May
On Saturday 8th May the COSMIC website, as part of the Sanger website, will be unavailable whilst major network upgrades and essential maintenance work is carried out. We apologise in advance for this loss of service.
Nucleotide data available
COSMIC displays mutations at the amino acid level to show the potential implication of the mutations on the protein sequence. In addition to this COSMIC holds the mutations at the nucleotide level. This data is available through the Export function that can be found at the top of the Distribution figure or at the bottom of the expanded Mutation Data tables.
COSMIC version 1 released
Wellcome Trust Sanger Institute launches Catalogue Of Somatic Mutations In Cancer. In the quest to develop rational approaches to treating cancer, researchers need efficient access to existing knowledge. COSMIC (Catalogue Of Somatic Mutations In Cancer), launched today by the Cancer Genome Project at The Wellcome Trust Sanger Institute, is a new tool that provides integrated genetic data from cancer genes, and will make research faster and easier.
BRAF V599E becomes V600E
The original BRAF mutations reported by Davies et al were mapped to the DNA sequence NM_004333[gi;4757867] with the common BRAF mutation being V599E. On the 24th July 2003 this sequence was updated to NM_004333[gi;33188458] with the insertion of 3bp in the coding sequence. The net effect of this update was to increase the length of the BRAF protein by one amino acid and increase the position of all published mutations by one amino acid. The beginnings of both versions of the protein are:
MAALSGGGGGGAEPGQALFNGDMEPEAGAGR PAASSAADP NM_004333[gi;4757867]
|||||||||||||||||||||||||||||| |||||||||
MAALSGGGGGGAEPGQALFNGDMEPEAGAGAGAAASSAADP NM_004333[gi;33188458]
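The renumbering described above amounts to adding one to the residue position in each mutation label. A minimal sketch, assuming labels of the simple form "V599E" (one-letter reference residue, position, one-letter replacement) and that the mutation lies downstream of the inserted residue, which the text states is the case for all published mutations; the function name renumber is hypothetical.

```python
import re

# One-letter reference residue, position, one-letter replacement (or '*' for stop).
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z*])$")


def renumber(label, shift=1):
    """Shift the residue position in a label such as 'V599E' by `shift`."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")
    ref, position, alt = match.groups()
    return f"{ref}{int(position) + shift}{alt}"


print(renumber("V599E"))  # prints V600E
```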
Last Modified Tue Mar 20 14:26:52 2007