Standards for a new genomic age

Joint announcement sets six genome sequence standards

Wellcome Library, London
The Wellcome Trust Sanger Institute’s sequencing centre.

New standards in genome sequencing are called for today: the report authors assert that the world of genome sequencing must establish a suite of benchmarks against which a new genome sequence can be measured. The measures are independent of the technology used to deliver the sequence.

The result, they say, is that researchers will have a much clearer idea of where they stand, because the quality of any sequence will be estimated more transparently and unambiguously.

There is a desperate need to address the variety of data output from next-generation sequencing and to provide guidance on the quality of the assembled sequence. The rapid deployment of different, high-throughput, next-generation sequencing platforms has challenged the traditional sequence assembly and analysis systems.

“The first three decades of sequencing produced relatively limited amounts of data in a small range of formats designed to deliver quality assemblies of DNA sequences. In the past couple of years a quiet revolution has shaken genomics. As a community, we had to provide guidance for researchers to help them use the outpouring of next-generation sequences as efficiently as possible.”

Darren Grafham, from the Wellcome Trust Sanger Institute and joint first author on the report

“Standards are a major issue to be tackled in genomics right now. These proposals are guideposts meant to inform users and generators.”

Patrick Chain, from Los Alamos National Laboratory (LANL), New Mexico, USA, and joint first author

A range of next-generation sequencing technologies, increasingly deployed in research, generate massive amounts of data in any one of several formats. One example is the Wellcome Trust Sanger Institute where, over the past two years, sequence output has gone from around 100 million bases per day to around 60 billion bases per day.

Perhaps more important, many of these data are short sequence stretches for comparative genomics or other studies on related sequences, and not data designed to produce draft or finished genome assemblies.

“There is a widening gap between the output data, draft genomes and finished genomes and a developing confusion over which data sets are of a high quality.

“Until now, we have simply had no descriptors or standards to help researchers. Initial discussions began at the Sequencing and Finishing in the Future meeting and have culminated in today’s article.”

Chris Detter, Director of the LANL Joint Genome Institute and senior author on the report

The new standards will take into account the technologies, chemistry or computer programs used to produce and analyse the sequences to place new data into one of six categories.

The categories range from a ‘standard draft sequence’, the minimum for submission to the public DNA databases, to ‘finished sequence’, where a sequence is as complete as it reasonably can be with current methods and has fewer than one error per 100,000 bases.
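To make the error threshold for ‘finished sequence’ concrete, the short Python sketch below checks an assembly's error count against the figure quoted above. It is an illustration only, not part of the published standards: the function names and the example counts are hypothetical, and the Phred-scaled conversion is the usual sequencing convention rather than something the announcement specifies. One error in 100,000 bases corresponds to a Phred-scaled quality of about 50.

```python
import math

# Threshold quoted in the text: fewer than one error per 100,000 bases.
BASES_PER_ALLOWED_ERROR = 100_000


def meets_finished_error_threshold(errors: int, bases: int) -> bool:
    """Return True if the error rate is below one error per 100,000 bases.

    Hypothetical helper: uses integer arithmetic so the comparison is exact.
    """
    if bases <= 0 or errors < 0:
        raise ValueError("bases must be positive and errors non-negative")
    return errors * BASES_PER_ALLOWED_ERROR < bases


def phred_equivalent(errors: int, bases: int) -> float:
    """Convert an observed error rate to a Phred-scaled quality value.

    Phred scaling (Q = -10 * log10(error rate)) is an assumption added for
    illustration; it is not named in the announcement.
    """
    if errors <= 0 or bases <= 0:
        raise ValueError("errors and bases must be positive")
    return -10 * math.log10(errors / bases)


if __name__ == "__main__":
    # Made-up example counts, not real assembly data.
    print(meets_finished_error_threshold(errors=3, bases=1_000_000))
    print(meets_finished_error_threshold(errors=10, bases=1_000_000))
    print(round(phred_equivalent(errors=1, bases=100_000), 1))
```

The check covers only the error-rate part of the ‘finished’ category; completeness, the other half of the description, cannot be reduced to a single number in the same way.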

“Genome sequences are a resource that many researchers use to understand biology and disease. However, it is crucial they can know in advance the quality of any sequence so that they can make best use of it. These guidelines will help to maximize the value of new genome sequences by ensuring that they are used in the most appropriate way.”

Professor Julian Parkhill, Director of Sequencing and Head of Pathogen Genomics at the Sanger Institute

More information

Participating Centres

A full list of participating centres is available at the Science website.

Selected websites

  • The Wellcome Trust Sanger Institute

    The Wellcome Trust Sanger Institute, which receives the majority of its funding from the Wellcome Trust, was founded in 1992. The Institute is responsible for the completion of the sequence of approximately one-third of the human genome as well as genomes of model organisms and more than 90 pathogen genomes. In October 2006, new funding was awarded by the Wellcome Trust to exploit the wealth of genome data now available to answer important questions about health and disease.

  • The Wellcome Trust

    The Wellcome Trust is a global charitable foundation dedicated to achieving extraordinary improvements in human and animal health. We support the brightest minds in biomedical research and the medical humanities. Our breadth of support includes public engagement, education and the application of research to improve health. We are independent of both political and commercial interests.