Comprehensive study finds mutations in non-coding genome are infrequent drivers of cancer

Findings suggest efforts to develop new cancer treatments should primarily focus on protein-coding genes

A clearer picture of how DNA changes lead to cancer has emerged, following the most comprehensive evaluation of non-coding driver mutations to date by researchers at the Wellcome Sanger Institute, the Broad Institute of MIT and Harvard, Massachusetts General Hospital (MGH), Aarhus University Hospital and their collaborators.

The study, published today (5 February) in Nature as part of the global Pan-Cancer Project*, discovered several new cancer drivers in non-coding regions of the genome. The overall conclusion, however, reaffirms that the vast majority of cancer drivers occur in protein-coding regions of the human genome. This knowledge will help to focus efforts on discovering new causes and treatments for cancer.

A further 22 studies from the Pan-Cancer Project are also published today in Nature and related journals. The project represents an unprecedented international exploration of 2,600 cancer genomes, which significantly improves our fundamental understanding of cancer and zeroes in on mechanisms of cancer development.

Driver mutations are DNA changes that ‘drive’ cells down the path towards cancer. Depending on the type of cancer, anywhere from one to ten driver mutations are required for cancer to develop**.

Most large-scale genomic studies of cancer to date have focused on detecting driver mutations in protein-coding genes. Because these coding sequences represent less than two per cent of the human genome, recent years have seen investigations into the remaining 98 per cent, the ‘non-coding’ genome***. In 2013, driver mutations were discovered in the non-coding promoter region of the TERT gene across many cancer types, raising the possibility that there may be numerous non-coding driver mutations in the ‘dark matter’ of the genome.

This study is the most comprehensive evaluation of the extent of non-coding driver mutations in cancer to date, in terms of the number of methods employed, number of samples analysed, and the number of cancer, genome region and mutation types studied. Overall, 2,600 genomes of 38 different tumour types were analysed.

The team identified a number of new non-coding cancer-driving mutations, such as non-coding mutations in the 5’ untranslated region of the TP53 gene, which are associated with this gene being less strongly expressed, or ‘turned off’.

The study concluded, however, that mutations in the regulatory sequences surrounding cancer genes are relatively rare. Excluding mutations in the TERT gene, the number of non-coding driver mutations identified equated to around one (or fewer) in every 100 tumours. In comparison, protein-coding regions often harbour several driver mutations per tumour. Some non-coding drivers identified in previous studies were found to be the result of less accurate methodologies or of previously uncharacterised hyper-mutation processes.

“The fact that our results contrast so strongly with other studies is largely down to how rigorous our analysis has been. Despite using numerous methods, the largest dataset currently available and surveying a wide range of non-coding regions of the genome, we found very few genuine driver mutations outside protein-coding genes.”

Dr Federico Abascal of the Wellcome Sanger Institute

“The non-coding driver mutations we identified, such as in the TP53 gene, add to the short list of non-coding driver mutations that already includes TERT, FOXA1 and a few other genes. By rigorously analysing the mechanisms that contribute to increased mutation rates, we were not only able to find new drivers but also raise doubts about previously reported ones that are affected by local mutational processes and artefacts uncovered in our study. We hope that our analysis will serve as the basis for future cancer genome studies.”

Dr Gad Getz of the Broad Institute and MGH

This unexpected result has important implications for the treatment of cancer. While technological advancements and larger cohorts will undoubtedly lead to the discovery of more non-coding driver mutations, it is unlikely that the ratio of coding to non-coding drivers will change significantly. This implies that efforts to develop new cancer treatments should primarily focus on protein-coding genes.

“Overall, our study suggests that while increasingly large datasets will continue to yield new coding and non-coding driver mutations, the vast majority of cancer drivers occur in the two per cent of the genome that codes for proteins. To us, this was an unexpected and important result. For cancer patients, this means that the vast majority of clinically-relevant mutations in a cancer are likely to be found in protein-coding sequences, which will simplify efforts for the clinical use of genome sequencing in cancer.”

Dr Inigo Martincorena of the Wellcome Sanger Institute

Press Contacts

If you need help or have any queries, please contact us.

Emily Mobley
Media Manager
Tel +44 (0)1223 496 851
Email: emily.mobley@sanger.ac.uk

Dr Matthew Midgley
Media Officer
Tel +44 (0)1223 494 856
Email: matthew.midgley@sanger.ac.uk

Dr Samantha Wynne
Media Officer
Tel +44 (0)1223 492 368
Email: samantha.wynne@sanger.ac.uk

Press office
Wellcome Sanger Institute, Hinxton,
Cambridgeshire, CB10 1SA, UK
Tel +44 (0) 7748 379849
Email: press.office@sanger.ac.uk

Notes to Editors

*The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG), known as the Pan-Cancer Project, is the largest and most comprehensive study of whole cancer genomes yet. The collaboration, involving more than 1,300 scientists and clinicians from 37 countries, analysed more than 2,600 genomes of 38 different tumour types and has created a huge resource of primary cancer genomes, available to researchers worldwide to advance cancer research. https://dcc.icgc.org/pcawg

Main findings from the Pan-Cancer project:

  • The cancer genome is finite and knowable, but enormously complicated. By combining sequencing of the whole cancer genome with a suite of analysis tools, we can characterise every genetic change found in a cancer, all the processes that have generated those mutations, and even the order of key events during a cancer’s life history.
  • We are close to cataloguing all of the biological pathways involved in cancer and having a fuller picture of their actions in the genome. At least one causal mutation was found in virtually all of the cancers analysed, and the processes that generate mutations were found to be hugely diverse, from changes in single DNA letters to the reorganisation of whole chromosomes. Multiple novel regions of the genome controlling how genes switch on and off were identified as targets of cancer-causing mutations.
  • Through a new method of “carbon dating”, the Pan-Cancer Project discovered that we can identify mutations which occurred years, sometimes even decades, before the tumour appears. This opens, theoretically, a window of opportunity for early cancer detection.
  • Tumour types can be identified accurately according to the patterns of genetic changes seen throughout the genome, potentially aiding the diagnosis of a patient’s cancer where conventional clinical tests could not identify its type. Knowledge of the exact tumour type could also help tailor treatments.

For access to all the open tier data in the Pan-Cancer project, go to https://dcc.icgc.org/

**For more information on driver mutations in different types of cancer, see the Sanger Institute website https://www.sanger.ac.uk/news/1-10-mutations-are-needed-drive-cancer-scientists-find

***More information on protein-coding and non-coding genes is available at: https://www.yourgenome.org/facts/what-does-dna-do

Publication:

Esther Rheinbay, Morten Muhlig Nielsen and Federico Abascal et al. (2020). Analyses of non-coding somatic drivers in 2,658 cancer whole genomes. Nature. DOI: https://doi.org/10.1038/s41586-020-1965-x

The Nature collection landing page with all PanCancer publications will go live when the papers publish: https://www.nature.com/collections/pcawg/

Funding:

This research was funded by GDAC, the Broad Institute of MIT and Harvard, Independent Research Fund Denmark, The Danish Cancer Society, National Institutes of Health and Wellcome.

Selected websites

  • Aarhus University and Aarhus University Hospital

    Aarhus University is a Danish research-intensive university founded in 1928. It is a top-ten university among universities founded within the past 100 years, and it has a long tradition of partnerships with some of the world’s best research institutions and university networks. The Faculty of Health at Aarhus University seeks to improve public health and benefit society with outstanding basic research, clinical translation, and innovation.

    Aarhus University cooperates closely with Aarhus University Hospital, which is one of the largest and most advanced hospital complexes in Northern Europe. The hospital has a large focus on development and application of precision medicine based on genomics and integrative molecular profiling. For more information about The Faculty of Health at Aarhus University and Aarhus University Hospital, go to https://health.au.dk/en/ and https://www.en.auh.dk/

  • About Massachusetts General Hospital

    Massachusetts General Hospital, founded in 1811, is the original and largest teaching hospital of Harvard Medical School. The MGH Research Institute conducts the largest hospital-based research program in the nation, with an annual research budget of more than $1 billion and comprises more than 8,500 researchers working across more than 30 institutes, centers and departments. In August 2019 the MGH was once again named #2 in the nation by U.S. News & World Report in its list of “America’s Best Hospitals.”

  • About the Broad Institute of MIT and Harvard

    Broad Institute of MIT and Harvard was launched in 2004 to empower this generation of creative scientists to transform medicine. The Broad Institute seeks to describe all the molecular components of life and their connections; discover the molecular basis of major human diseases; develop effective new approaches to diagnostics and therapeutics; and disseminate discoveries, tools, methods, and data openly to the entire scientific community.

    Founded by MIT, Harvard, Harvard-affiliated hospitals, and the visionary Los Angeles philanthropists Eli and Edythe L. Broad, the Broad Institute includes faculty, professional staff, and students from throughout the MIT and Harvard biomedical research communities and beyond, with collaborations spanning over a hundred private and public institutions in more than 40 countries worldwide. For further information about the Broad Institute, go to https://www.broadinstitute.org

  • The Wellcome Sanger Institute

    The Wellcome Sanger Institute is a world-leading genomics research centre. We undertake large-scale research that forms the foundations of knowledge in biology and medicine. We are open and collaborative; our data, results, tools and technologies are shared across the globe to advance science. Our ambition is vast – we take on projects that are not possible anywhere else. We use the power of genome sequencing to understand and harness the information in DNA. Funded by Wellcome, we have the freedom and support to push the boundaries of genomics. Our findings are used to improve health and to understand life on Earth. Find out more at www.sanger.ac.uk or follow us on Twitter, Facebook, LinkedIn and on our Blog.

  • About Wellcome

    Wellcome exists to improve health by helping great ideas to thrive. We support researchers, we take on big health challenges, we campaign for better science, and we help everyone get involved with science and health research. We are a politically and financially independent foundation. https://wellcome.org/