International Team of Researchers Assembles Draft Sequence of Mouse Genome
In a landmark advance in genomics, the international Mouse Genome Sequencing Consortium today announced that it has assembled and deposited into public databases an advanced draft sequence of the mouse genome - the genetic blueprint for the most important animal model in biomedical research. The data are posted on the Internet at the three sites listed below, where they are freely available.
This achievement represents a major milestone for the Human Genome Project because it provides a key tool needed to interpret the human sequence, a draft version of which was published last year. Researchers will be better able to understand the function of many human genes because the mouse carries virtually the same set of genes as the human but can be used in laboratory research.
For most human illnesses, from cancer to autoimmune disease, important insights have come from the study of mouse models. The advanced draft of the mouse sequence will greatly accelerate precise identification of the genetic contributors to those illnesses, leading to better understanding of human disease and improved tests and treatments. The mouse sequence will also allow researchers to recognize regions of the human genome that control gene activity, because those regions have been conserved through the 100 million years of evolution separating humans and mice.
"The mouse genome project - carried out alongside finishing the human genome - has generated crucial publicly available information for biomedical research. Throughout the project we have refined both the technologies and the software to improve this resource and to bring it to researchers as swiftly as possible."
Jane Rogers, Ph.D., Head of Genome Sequencing at the Wellcome Trust Sanger Institute
The draft sequence was assembled by the Mouse Genome Sequencing Consortium, an international team of researchers from the Wellcome Trust Sanger Institute and the European Bioinformatics Institute in Hinxton, England; the Whitehead Institute in Cambridge, MA; and Washington University School of Medicine in St. Louis, MO, with funding from the Wellcome Trust in the UK and the National Institutes of Health in the USA.
The mouse genome is contained in 20 chromosome pairs and the current results suggest that it is about 2.7 billion base pairs in size, or about 15 percent smaller than the human genome. The human genome is 3.1 billion base pairs spread out over 23 pairs of chromosomes (22 autosomes and the X and the Y sex chromosomes).
Analysis of the genome assembly indicates that the mouse has roughly the same number of genes as the human. So far researchers have found more than 22,500 high-quality gene predictions, with additional predictions expected to take the total to about 30,000.
"This is a most exciting development for biomedical research. My group and research groups around the world have used the public mouse sequence as it has developed. The new assembly and gene analysis is a phenomenal achievement by the international consortium, which will speed our investigations into human illness."
Allan Bradley, Ph.D., Director of the Wellcome Trust Sanger Institute
The draft sequence shows the order of the DNA chemical bases A, T, C, and G along the mouse chromosomes. The current assembly includes more than 96 percent of the mouse genome with long, continuous stretches of DNA and represents a seven-fold coverage of the genome. This means that the location of every base, or DNA letter, in the mouse genome was determined an average of seven times, a frequency that ensures a high degree of accuracy.
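As a rough illustration of why seven-fold coverage gives high confidence, the sketch below uses a simplified statistical model that is an assumption here, not the consortium's stated method: reads are treated as landing uniformly at random, so the number of times a given base is read follows a Poisson distribution with a mean equal to the coverage. The function names are hypothetical, and the genome size is the rounded 2.7 billion base pairs quoted above.

```python
import math


def uncovered_fraction(coverage: float) -> float:
    """Expected fraction of bases read zero times under the Poisson assumption."""
    return math.exp(-coverage)


def prob_read_at_least(coverage: float, k: int) -> float:
    """Probability that a given base is read at least k times (Poisson model)."""
    below_k = sum(
        math.exp(-coverage) * coverage**i / math.factorial(i) for i in range(k)
    )
    return 1.0 - below_k


def total_bases_sequenced(genome_size: float, coverage: float) -> float:
    """Raw bases that must be read to reach the given average coverage."""
    return genome_size * coverage


if __name__ == "__main__":
    coverage = 7.0            # seven-fold coverage, as stated in the release
    genome_size = 2.7e9       # approximate mouse genome size in base pairs

    print(f"Fraction of bases never read: {uncovered_fraction(coverage):.6f}")
    print(f"Chance a base is read at least 3 times: {prob_read_at_least(coverage, 3):.4f}")
    print(f"Raw bases read in total: {total_bases_sequenced(genome_size, coverage):.3e}")
```

Under this idealized model, fewer than one base in a thousand would go unread at seven-fold coverage. Real sequencing is not uniform, so repetitive or hard-to-clone regions can still leave gaps in an assembly.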
"The mouse sequence is much further along in the process than the human sequence was at the draft stage. Methods for efficient sequencing of large genomes continue to advance dramatically, and the sophistication of the team that accomplished this goal is truly impressive. This sets a new standard for speed, accuracy, and public accessibility."
Francis S. Collins, M.D., Ph.D., director of the National Human Genome Research Institute, Bethesda, MD
The working draft sequence far exceeds the consortium's original quality expectations for this stage and was completed much sooner than initially expected, reflecting the tremendous efficiencies gained in sequencing and computational technologies in the past few years.
"It is remarkable that we were able to complete the mouse genome in such a short time and with such great accuracy. We are now working hard with an international group of experts to explore the content of the sequence and to use it to improve our understanding of the human sequence."
Robert Waterston, M.D., Ph.D., director of the Genome Center at Washington University, St. Louis, MO
The sequence information is immediately and freely available to the world. The information will be utilized thousands of times daily by scientists in academia and industry, as well as by commercial database companies providing information services to biotechnologists.
The results from this analysis can be found at several websites, including http://mouse.ensembl.org/ at the European Bioinformatics Institute; http://www.ncbi.nlm.nih.gov/genome/guide/mouse/ at the National Center for Biotechnology Information at the National Library of Medicine; and http://genome.ucsc.edu/ at the University of California, Santa Cruz. A comparison between the mouse sequence and the human sequence can be found at all three sites.
"The mouse sequence provides a very important chapter from evolution's lab notebook. Being able to read evolution's notebook and compare genomic information across species will allow us to glean important information about ourselves. That's because evolution preserves the most important genetic information across species; if specific DNA sequences have been preserved by evolution over hundreds of millions of years, then they must be functionally important."
Eric Lander, Ph.D., director of the Whitehead/MIT Center for Genome Research, Cambridge, MA
This milestone concludes the second phase of the consortium's mouse-sequencing effort. In Phase III, the consortium will produce a "finished" version with the remaining gaps (the roughly 4 percent where the sequence has yet to be determined) filled in and errors resolved. This phase will proceed by clone-based, or hierarchical, sequencing based on the publicly available mouse genome clone map. A mapped set of BAC clones that covers the entire mouse genome is being sequenced. The BAC data will be combined with the draft genome sequence to finish the mouse sequence to the same high quality to which the human sequence is being completed. Clone-based sequencing remains the only method proven to produce a complete, fully accurate version of a complex genome. The complete genome sequence of the mouse will be available within 3 years.