The finished human genome
The International Human Genome Sequencing Consortium, of which the Wellcome Trust Sanger Institute is a major partner, today published their scientific analysis of the finished human genome, the Gold Standard sequence that is already acting to prime new biomedical research.
The paper is published on 21 October 2004 in Nature and details the rigorous standards set and surpassed during the 13-year Human Genome Project (HGP). The analysis suggests that there are perhaps only 20,000-25,000 protein-coding genes in our human genome.
The Wellcome Trust Sanger Institute made the largest single contribution to the human genome sequence, and the 'genome browser' ENSEMBL, run by the Sanger Institute and the EMBL-European Bioinformatics Institute, is a leading resource for researchers around the globe.
Key results of the research are:
The number of gaps has been reduced 400-fold to only 341
It covers 99 per cent of the gene-containing parts of the genome and is 99.999 per cent accurate
The new sequence correctly identifies almost all known genes (99.74 per cent)
It defines 22,287 'gene loci', consisting of 19,599 protein-coding genes in the human genome and another 2,188 DNA segments that are predicted to be protein-coding genes
It identifies the 'birth' of 1,183 genes in the last 60-100 million years
It identifies the 'death' of 30 or so genes in a similar time period
The accuracy and completeness allow systematic searches for the causes of disease, for example, to find all key heritable factors predisposing to diabetes or mutations underlying breast cancer - with confidence that little can escape detection
At a practical level, it eliminates tedious confirmatory work by researchers, who can now rely on highly accurate information
More generally, the HGP demonstrates the tremendous potential value of coordinated projects to create community resources to propel biomedical research
"In our analysis we revised some predictions based on the unfinished, draft sequence of the human genome. The task of identifying genes remains challenging, but the finished human genome sequence, genome sequences from other organisms, better computational models and other improved resources, have combined to give a much clearer and more reliable picture of our genomic landscape."
Dr Jane Rogers, Head of Sequencing at the Wellcome Trust Sanger Institute
The sequence produced has an estimated error rate of less than one per 100,000 bases of code - tenfold better than the original goal. This means that gene identification can be more reliable and that studies of our genome and health - for example, of which genetic changes predispose some individuals to disease - can be carried out with greater confidence.
"Only a decade ago, most scientists thought humans had about 100,000 genes. When we analyzed the working draft of the human genome sequence three years ago, we estimated there were about 30,000 to 35,000 genes, which surprised many. This new analysis reduces that number even further and provides us with the clearest picture yet of our genome. The availability of the highly accurate human genome sequence in free public databases enables researchers around the world to conduct even more precise studies of our genetic instruction book and how it influences health and disease."
NHGRI Director Francis S. Collins, MD, PhD
Key challenges that lie ahead include: a systematic study of sequence variation among humans and of the association of that variation with disease; systematic identification of non-protein-coding elements in the human genome, especially regulatory controls and structural elements; and systematic identification of all the 'modules' in which genes and proteins function together, to place genetic information in a functional context.
"Collectively we have produced a sequence that is as accurate and complete as possible in the present state of the art. It will be open for continuous improvement over the years to come, and of course open for all to use for any purpose, without restraint or fee. Let us continue to work together to ensure that the enormous benefits from this new knowledge flow to all and not just to the few."
Sir John Sulston, former Director of The Wellcome Trust Sanger Institute