Family ties: Relationship between human and zebrafish genomes

Completion of the zebrafish reference genome yields strong comparisons with the human genome

Researchers demonstrate today that 70 per cent of protein-coding human genes are related to genes found in the zebrafish and that 84 per cent of genes known to be associated with human disease have a zebrafish counterpart. Their study highlights the importance of zebrafish as a model organism for human disease research.

The team developed a high-quality annotated zebrafish genome sequence to compare with the human reference genome. Only two other large genomes have been sequenced to this high standard: the human genome and the mouse genome. The completed zebrafish genome will be an essential resource that drives the study of gene function and disease in people.

At first glance, zebrafish may seem a strange comparator to humans, but like us they are vertebrates, and we share a common ancestor. They are remarkably similar to people biologically and have most of the same genes as humans, making them an important model for understanding how genes work in health and disease.

“Our aim with this project, like with all biomedical research, is to improve human health. This genome will allow researchers to understand how our genes work and how genetic variants can cause disease in ways that cannot be easily studied in humans or other organisms.”

Dr Derek Stemple, senior author from the Wellcome Trust Sanger Institute

Zebrafish research has already led to biological advances in cancer and heart disease research, and is advancing our understanding of muscle and organ development. Zebrafish have been used to verify the causal gene in muscular dystrophy disorders and also to understand the evolution and formation of melanomas or skin cancers.

“The vast majority of human genes have counterparts in the zebrafish, especially genes related to human disease. This high-quality genome is testament to the many scientists who worked on this project and will spur biological research for years to come.

“By modeling these human disease genes in zebrafish, we hope that resources worldwide will produce important biological information regarding the function of these genes and possibly find new targets for drug development.”

Professor Jane Rogers, senior author, formerly at The Genome Analysis Centre

The zebrafish genome has some unique features not seen in other vertebrates. It has the highest repeat content reported so far in any vertebrate species: almost twice as much as seen in the zebrafish's closest relative, the common carp. The team also identified chromosomal regions that influence sex determination, another feature unique to the zebrafish.

The zebrafish genome contains few pseudogenes – genes thought to have lost their function through evolution – compared to the human genome. The team identified 154 pseudogenes in the zebrafish genome, a fraction of the 13,000 or so pseudogenes found in the human genome.

“To realize the benefits the zebrafish can make to human health, we need to understand the genome in its entirety – both the similarities to the human genome and the differences. Armed with the zebrafish genome, we can now better understand how changes to our genomes result in disease.”

Professor Christiane Nüsslein-Volhard, author and Nobel laureate from the Max Planck Institute for Developmental Biology

“This genome will help to uncover the biological processes responsible for common and rare disease and opens up exciting new avenues for disease screening and drug development.”

Dr Derek Stemple, Sanger Institute

More information


A full list of funding can be found on the paper.

Participating Centres

  • Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
  • The Genome Analysis Centre, Norwich Research Park, Norwich, NR4 7UH, United Kingdom
  • Bioinformatics Unit, Centro Nacional de Investigaciones Cardiovasculares, 28029 Madrid, Spain
  • Ecole Normale Supérieure, Institut de Biologie de l’ENS, IBENS, Paris, F-75005 France
  • Inserm, U1024, Paris, F-75005 France
  • CNRS, UMR 8197, Paris, F-75005 France
  • EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
  • Illumina Cambridge Ltd., Chesterford Research Park, Little Chesterford, Saffron Walden, CB10 1XL, UK
  • Hubrecht Laboratory, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
  • Max Planck Institute for Developmental Biology, Spemannstr. 35, 72076 Tübingen, Germany
  • Stem Cell Program and Division of Hematology/Oncology, Children’s Hospital and Dana Farber Cancer Institute, 1 Blackfan Cir., Karp 7, Boston, MA 02115, USA
  • Children’s Hospital Oakland, 747 52nd St., Oakland, CA 94609, USA
  • Institute of Neuroscience, University of Oregon, 1254 University of Oregon, 222 Huestis Hall, Eugene, OR 97403-1254
  • Karlsruhe Institute of Technology (KIT), Campus North, Institute of Toxicology and Genetics (ITG), Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
  • Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA


Selected websites

  • TGAC

    TGAC is a research institute focused on the development of genomics and computational biology. TGAC is based within the Norwich Research Park and receives strategic funding from the Biotechnology and Biological Sciences Research Council (BBSRC) and other research funders.

    TGAC is one of eight institutes that receive strategic funding from BBSRC and received a total of £21.4M investment in 2011-12, including £16M capital funding.

    TGAC offers a state-of-the-art DNA sequencing facility, unique in its operation of multiple complementary technologies for data generation. The Institute is a UK hub for innovative bioinformatics through research, analysis and interpretation of multiple, complex data sets. It hosts one of the largest computing hardware facilities dedicated to life science research in Europe. It is also actively involved in developing novel platforms to provide access to computational tools and processing capacity for multiple academic and industrial users, and in promoting applications of computational bioscience. Additionally, the Institute offers a training programme through courses and workshops, and an outreach programme targeting schools, teachers and the general public through dialogue and science communication activities.

  • The Wellcome Trust Sanger Institute

    The Wellcome Trust Sanger Institute is one of the world’s leading genome centres. Through its ability to conduct research at scale, it is able to engage in bold and long-term exploratory projects that are designed to influence and empower medical science globally. Institute research findings, generated through its own research programmes and through its leading role in international consortia, are being used to develop new diagnostics and treatments for human disease.

  • The Wellcome Trust

    The Wellcome Trust is a global charitable foundation dedicated to achieving extraordinary improvements in human and animal health. We support the brightest minds in biomedical research and the medical humanities. Our breadth of support includes public engagement, education and the application of research to improve health. We are independent of both political and commercial interests.