5 September 2012

Google Earth of Biomedical Research

An integrated encyclopaedia of DNA elements in the human genome

Encyclopaedia of DNA elements. [http://www.genome.gov/10005107#al-2]

The ENCODE Project today announces that most of what was previously considered 'junk DNA' in the human genome is actually functional. The project has found that 80 per cent of the human genome sequence is linked to biological function.

The collaborative project mapped more than 4 million regulatory regions, or genetic switches, where proteins specifically interact with the DNA. These findings represent a significant advance in understanding the precise and complex controls over how genes work within a cell type. This information will greatly enhance our understanding of both common and rare diseases that have a genetic component, such as cancers.

The Human Genome Project produced an almost complete list of the 3 billion pairs of chemical letters in the DNA that embodies the genetic code - but said nothing about the way this blueprint works. ENCODE set out to take this vast amount of data and ascribe function to the entire human genome. Now, after five years of concerted effort by more than 440 researchers in 32 labs around the world, working collaboratively in the ENCODE Project, the first holistic view of how the human genome actually does its job has emerged.

"If the Human Genome Project is like an Ordnance Survey map, then the ENCODE project is like Google Earth," says Dr Jennifer Harrow, Principal Investigator from the Wellcome Trust Sanger Institute. "The Human Genome Project gave us a broad overview of our genome, but the ENCODE maps allow researchers to inspect the chromosomes, genes, functional elements and individual nucleotides in the human genome in much the same way as Google Earth magnifies what we see on a map."

By integrating information from genome-wide studies with several datasets from the ENCODE project, researchers can now predict which variations may be central to diseases and in which cell types the affected genes might be active. This approach generated functional, biological information for DNA sequences for up to 80 per cent of all previously reported associations.

With this information, researchers can see all single-letter DNA variations, what state they are in, and what is happening around them, such as which binding sites are involved and which cell types they are active in. This type of information can potentially provide functional predictions about the genetics behind disease, making it an extremely powerful interpretation tool.

"We've come a long way," says Dr Ewan Birney of the European Bioinformatics Institute and lead analysis coordinator of the ENCODE data, "and we have learned an incredible amount by integrating the different types of data that ENCODE produced, which was done at a scale never before achieved in biology. This data integration was one of the keys to the success of the project."

The coordinated publication set includes one main integrative paper and five other papers in the journal Nature; 18 papers in Genome Research; and six papers in Genome Biology. The ENCODE data are so complex that the three journals have developed a pioneering way to present the information in an integrated form that they call 'threads.'

Since the same topics were addressed in different ways in different papers, the new website, http://www.nature.com/encode/, will allow anyone to follow a topic through all of the papers in the ENCODE publication set in which it appears, by clicking on the relevant 'thread' at the Nature ENCODE explorer page. For example, thread number one compiles figures, tables, and text relevant to genetic variation and disease from several papers and displays them all on one page. ENCODE scientists believe this will illuminate many biological themes emerging from the analyses.

"The ENCODE project is providing an encyclopaedia to understand how the sequence of the human genome forms the words that tell our bodies how to work at the cellular and molecular level," says Dr Tim Hubbard, lead principal investigator from the Wellcome Trust Sanger Institute. "This will serve as a critical reference for interpreting the relationship between genome variation and disease and in the development of stem cell-based therapies. By developing more revolutionary technologies for probing genome function, we expect to accelerate these efforts."

Notes to Editors

Publication details

Details of publications, funding and participating centres can be found at http://www.nature.com/encode/

The Wellcome Trust Sanger Institute

The Wellcome Trust Sanger Institute is one of the world's leading genome centres. Through its ability to conduct research at scale, it is able to engage in bold and long-term exploratory projects that are designed to influence and empower medical science globally. Institute research findings, generated through its own research programmes and through its leading role in international consortia, are being used to develop new diagnostics and treatments for human disease.


The Wellcome Trust

The Wellcome Trust is a global charitable foundation dedicated to achieving extraordinary improvements in human and animal health. We support the brightest minds in biomedical research and the medical humanities. Our breadth of support includes public engagement, education and the application of research to improve health. We are independent of both political and commercial interests.


Contact the Press Office

Mark Thomson, Senior Media and Public Relations Officer
Wellcome Trust Sanger Institute, Hinxton, Cambs, CB10 1SA, UK

Tel +44 (0)1223 492 384
Mobile +44 (0)7753 775 397
Fax +44 (0)1223 494 919
Email press.office@sanger.ac.uk