Newly sequenced mouse genomes unearth unknown genes

Sixteen newly sequenced mouse strains reveal unexpected diversity that could impact disease research

Image credit: Wellcome Sanger Institute, Genome Research Limited
Newly sequenced mouse genomes have revealed new gene structures and coding loci absent from the current reference strain. This discovery could impact future research in genetics, drug development and beyond.

Scientists at EMBL's European Bioinformatics Institute (EMBL-EBI) and the Wellcome Sanger Institute have discovered significant diversity in the genomes of 16 laboratory strains of mouse, potentially impacting future research in genetics, drug development and beyond.

The research, published in the journal Nature Genetics, produced draft genome sequences for 16 of the most widely used mouse strains, revealing notable genetic diversity for the first time. The regions where variation was found include those affecting immunity, pathogen defence and sensory function. These regions also differ widely from the current reference strain, suggesting the discovery could significantly affect future human disease research.

A research staple

The lab mouse is a staple of research into health and disease, drug development, vaccines and genetics, and is the most widely used mammalian model organism. The similarity of its genome to the human genome, with 98 per cent of genes comparable to those in humans, has made the mouse instrumental in helping researchers understand disease and develop drug treatments.

Researchers use a variety of mouse strains to study human disease. For example, the non-obese diabetic (NOD) mouse is used to study type 1 diabetes. Prior to the current study, researchers had the complete genome for only one of these strains. By sequencing 16 of the most commonly used mouse strains, this study discovered hundreds of new forms of genes associated with disease, as well as a previously unknown gene. This gene is one of the largest mouse genes known to date, and has been associated with brain development.

Striking differences

“We examined the regions of the genome that were most different compared to the single genome that the whole community is using. One of the most striking things we found is that genes important for disease research were the most highly variable genes. We looked in detail at a few of these regions and found completely different gene structures compared to the reference strain.

“If you’re using mice for your experiments you need to be aware of the diversity that’s present in those types of genes. What we’ve generated is a resource for the community.”

Thomas Keane, Faculty member at EMBL-EBI

“Mice have played a critical role in defining the genetics of mammalian development and for modelling human disease. We have known for some time that there are differences between mouse strains in phenotypes such as response to viruses and pathogens. These genomes allow us to understand these differences, which could have profound implications for human disease research.”

David Adams, Senior Group Leader at the Wellcome Sanger Institute

What next?

Unlike previous research in this area, which looked only at differences between strains, this study constructed whole genome sequences. The ability to see across whole loci or regions means researchers will be able to study variation in a wider context, rather than one difference at a time.

These findings have the potential to impact the future of genetics research, drug development and the way in which research is carried out. The resource has now been made available to the wider scientific community. The 16 genomes have been incorporated into Ensembl, where they can be freely accessed and analysed.

Notes to Editors

Lilue, J et al. (2018). Multiple laboratory mouse reference genomes define strain specific haplotypes and novel functional loci. Nature Genetics. Published online DOI: 10.1038/s41588-018-0223-8

Training video available online

Ensembl webinar on how to effectively browse and compare data between strains.


This work was supported by UK Research and Innovation, the Wellcome Trust, the National Human Genome Research Institute, the Medical Research Council, the Biotechnology and Biological Sciences Research Council and many other funding bodies. Please see the paper for the full list of funders.

Selected Websites
Why use the mouse in research?
Humans and mice share many common genetic features and by examining the physiology, anatomy and metabolism of a mouse, scientists can gain a valuable insight into how humans function.

Of mice and men
The mouse is closely related to humans with a striking similarity to us in terms of anatomy, physiology and genetics. This makes the mouse an extremely useful model organism.

What are model organisms?
A model organism is a species that has been widely studied, usually because it is easy to maintain and breed in a laboratory setting and has particular experimental advantages.

Contact the Press Office

Emily Mobley, Media Manager

Tel +44 (0)1223 496 851

Dr Samantha Wynne, Media Officer

Tel +44 (0)1223 492 368

Dr Matthew Midgley, Media Officer

Tel +44 (0)1223 494 856

Wellcome Sanger Institute,
CB10 1SA,

Mobile +44 (0)7900 607793
