Sifting through the Genome Baggage

A New Method to Find Important Genome Regions

Evolutionary forces tend to retain important DNA sequences, whilst allowing unimportant sequences to change. Consequently, protein-coding regions - only about 1.5 per cent of the human genome - are similar in all mammalian species.

But there is a further 3 per cent of mammalian genome sequence that does not code for protein, yet is conserved. Are these sequences important or are they merely passengers on the evolutionary journey?

A new study from an international team co-directed by researchers at the Wellcome Trust Sanger Institute and the Broad Institute, published in Nature Genetics, shows that the vast majority of conserved non-coding (CNC) regions are not simply areas that happen to have escaped mutation; rather, selection constrains the variation they carry. This remarkable conclusion suggests that searching CNC regions might uncover new, clinically important variants.

"Although we were aware of CNC regions, we could not tell whether they represented areas of the human genome that were relevant to the working of our genome, or were relics that had no present importance."

"Single-letter differences - called single nucleotide polymorphisms, or SNPs - in our genetic code are rarer in CNCs than in other, non-conserved regions. Crucially, we showed that this was not due to a lower rate of mutation, but to selection in these regions - they are under evolutionary pressure. This suggests these regions, which do not code for protein, perform important functions in our genome."

Dr Manolis Dermitzakis, Investigator, Division of Informatics at the Wellcome Trust Sanger Institute and a corresponding author

Our genome includes regulatory DNA sequences, which are important in the control of genetic activity. The structure and sequence of these regions are emerging, but new methods to identify significant sequences are needed. Many of the CNC variants detected here lie in known regulatory regions, but many others lie elsewhere.

Finding regions of the genome where evolution has acted on variation is like finding a new pool of targets in which to look for mutations that predispose to disease. The study also suggests ways in which the hunt for disease-associated variation can be made more productive.

"Our research suggests that CNCs are as important as coding sequences - but our genome has more than twice as much CNC sequence as gene sequence. This means there will be many more mutations to discover in CNCs that are associated with disease than there are in genes."

"If we include in our research a focus on these locations, we would expect to identify important variants more quickly. Our aim is to use the power of genomic information to improve our understanding of disease. This work suggests a method to harness and focus that power."

Dr Manolis Dermitzakis, Sanger Institute

Because SNPs in CNCs are relatively rare, they may not be well captured by standard methods of detecting variation (which tend to emphasise more common variants). If these regions are studied in more detail, greater biomedical benefit should follow.

Notes to Editors
  • Conserved noncoding sequences are selectively constrained and not mutation cold spots.

    Drake JA, Bird C, Nemesh J, Thomas DJ, Newton-Cheh C et al.

    Nature Genetics 2006;38;2;223-7

Corresponding Authors
  • Dr Manolis Dermitzakis, Wellcome Trust Sanger Institute
  • Joel N. Hirschhorn, Broad Institute of Harvard and MIT
Participating Centres
  • Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SA, UK
  • Program in Genomics and Division in Endocrinology, Children's Hospital, Boston, MA 02115, USA
  • Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02139, USA
  • Department of Biomolecular Engineering, University of California Santa Cruz, CA, 95064, USA
  • Division of Cardiology, Massachusetts General Hospital, Boston, MA 02114, USA
  • NHLBI's Framingham Heart Study, Framingham, MA, 01702, USA
  • Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
  • Zoological Institute, University of Bern, Bern, Switzerland
  • Department of Genetics, Harvard Medical School, Boston, MA 02115, USA