Mouse Genomes Project

Overview

The ability to manipulate the mouse genome, together with the wealth of disease models, inbred strains and genomic resources, makes the mouse the premier model organism for genetic approaches to mammalian biology. Over a century of mouse genetics has provided scores of inbred strains and of spontaneous and engineered mutations, making the mouse unsurpassed as a model system for investigating human disease. Access to the complete sequences of multiple inbred strains will add to these resources and will become a permanent foundation for a systems biology approach to phenotypic variation in the mouse.

The Mouse Genomes Project uses next-generation sequencing technologies to sequence the genomes of key laboratory mouse strains. The project consists of two arms:

  • Short-read sequencing of many laboratory mouse strains and identification of sequence variation (SNPs, short insertions and deletions, and larger structural variations) relative to the C57BL/6J mouse reference genome.
  • De novo genome assembly and strain-specific gene annotation of the most widely used strains.

Sequence variation

We and our collaborators have used short-read sequencing to identify SNPs, indels, and structural variations relative to the C57BL/6J mouse reference genome. The strains that have been sequenced and are in our variation catalog are:

129P2/OlaHsd    129S1/SvImJ     129S5SvEvBrd    A/J
AKR/J           BALB/cJ         BTBR            BUB/BnJ
C3H/HeH         C3H/HeJ         C57BL/10J       C57BL/6NJ
C57BR/cdJ       C57L/J          C58/J           CAST/EiJ
CBA/J           DBA/1J          DBA/2J          FVB/NJ
I/LnJ           KK/HiJ          LEWES/EiJ       LP/J
MOLF/EiJ        NOD/ShiLtJ      NZB/B1NJ        NZO/HlLtJ
NZW/LacJ        PWK/PhJ         RF/J            SEA/GnJ
SPRET/EiJ       ST/bJ           WSB/EiJ         ZALENDE/EiJ

The sample accession codes are listed here. The sequence variation can be queried via our query tool. For bulk download, the sequencing reads are available in BAM format and the variants in VCF format from our FTP site. All of the variation data have been published and can be used without restriction. The primary citation for the resource is:

  • Mouse genomic variation and its effect on phenotypes and gene regulation. Keane TM, Goodstadt L, Danecek P, White MA, Wong K et al. Nature 2011;477;7364;289-94 PUBMED: 21921910; PMC: 3276836; DOI: 10.1038/nature10413
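
For readers working with the VCF bulk downloads, the following is a minimal Python sketch of how SNPs and short indels might be tallied from VCF records. It is not the project's own pipeline. The classification rule (comparing REF and ALT allele lengths) is a common convention assumed here, the function names are hypothetical, and the records in EXAMPLE_VCF are invented for illustration and are not project data.

```python
import io


def classify_variant(ref, alt):
    """Return 'snp', 'indel', or 'other' for one REF/ALT allele pair."""
    # Symbolic alleles (<DEL>), breakends ([ or ]), and missing or
    # overlapping-deletion markers are not simple sequence changes.
    if alt.startswith("<") or "[" in alt or "]" in alt or alt in {"*", "."}:
        return "other"
    if len(ref) == 1 and len(alt) == 1:
        return "snp"
    if len(ref) != len(alt):
        return "indel"
    return "other"  # e.g. equal-length multi-nucleotide substitutions


def count_variants(handle):
    """Count variant classes over all ALT alleles in a VCF text stream."""
    counts = {"snp": 0, "indel": 0, "other": 0}
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]  # VCF columns 4 (REF) and 5 (ALT)
        for alt in alts.split(","):
            counts[classify_variant(ref, alt)] += 1
    return counts


# Invented records for illustration only; not taken from the project data.
EXAMPLE_VCF = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t1000\t.\tA\tG\t.\tPASS\t.\n"
    "1\t2000\t.\tAT\tA\t.\tPASS\t.\n"
    "2\t3000\t.\tC\tCTT,T\t.\tPASS\t.\n"
)

counts = count_variants(io.StringIO(EXAMPLE_VCF))
print(counts)  # {'snp': 2, 'indel': 2, 'other': 0}
```

For a compressed download, the same function could be given a handle from gzip.open(path, "rt"), where path is whichever VCF file was fetched. Larger structural variations would normally need a dedicated parser rather than this length-based rule.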

Assembled Genomes

We are producing de novo assembled reference genomes and strain-specific gene annotation for 16 laboratory and wild-derived strains:

129S1/SvImJ     A/J             AKR/J           BALB/cJ
C3H/HeJ         C57BL/6NJ       CAST/EiJ        CBA/J
DBA/2J          FVB/NJ          LP/J            NOD/ShiLtJ
NZO/HlLtJ       PWK/PhJ         SPRET/EiJ       WSB/EiJ

NOTE: These assembled chromosomes are released as unpublished, preliminary and incomplete sequences, and as such they have not yet been accessioned in the public genome sequence repositories (INSDC). The assembled sequences will be fully accessioned in public repositories at the time of publication. These data are released in accordance with the Fort Lauderdale and Toronto agreements. As producers of these data we reserve the right to be the first to publish a genome-wide analysis of the data we have generated. The pre-publication data that we release are embargoed for publication, except for analyses of single chromosomes in single strains or of single gene loci across multiple strains. We strongly encourage researchers to contact us (mousegenomes@sanger.ac.uk) with any queries about referencing or publishing analyses based on pre-publication data. We expect to accession and publish the genome sequences and strain-specific gene annotation in mid-to-late 2016.

Bibliography

  • Mouse genomic variation and its effect on phenotypes and gene regulation. Keane TM, Goodstadt L, Danecek P, White MA, Wong K et al. Nature 2011;477;7364;289-94

  • Next-generation sequencing of experimental mouse strains. Yalcin B, Adams DJ, Flint J and Keane TM. Mammalian Genome: official journal of the International Mammalian Genome Society 2012;23;9-10;490-8

  • The fine-scale architecture of structural variants in 17 mouse genomes. Yalcin B, Wong K, Bhomra A, Goodson M, Keane TM et al. Genome Biology 2012;13;3;R18

  • The genomic landscape shaped by selection on transposable elements across 18 mouse strains. Nellåker C, Keane TM, Yalcin B, Wong K, Agam A et al. Genome Biology 2012;13;6;R45

  • High levels of RNA-editing site conservation amongst 15 laboratory mouse strains. Danecek P, Nellåker C, McIntyre RE, Buendia-Buendia JE, Bumpstead S et al. Genome Biology 2012;13;4;R26

  • Sequencing and characterization of the FVB/NJ mouse genome. Wong K, Bumpstead S, Van Der Weyden L, Reinholdt LG, Wilming LG et al. Genome Biology 2012;13;8;R72

  • Sequence-based characterization of structural variation in the mouse genome. Yalcin B, Wong K, Agam A, Goodson M, Keane TM et al. Nature 2011;477;7364;326-9

Data Use

This sequencing centre plans to publish the completed and annotated sequences in a peer-reviewed journal as soon as possible. Permission of the principal investigator should be obtained before publishing analyses of the sequence, open reading frames or genes on a chromosome or genome scale. See our data sharing policy.

Data Release Policy

The Mouse Genomes Project releases sequence data, SNPs and other variant calls as a service to the research community. These data are released in accordance with the Fort Lauderdale and Toronto agreements. As producers of these data we reserve the right to be the first to publish a genome-wide analysis of the data we have generated. The pre-publication data that we release via this website are embargoed for publication, except for analyses of single chromosomes in single strains or of single gene loci across multiple strains. We strongly encourage researchers to contact us with any queries about referencing or publishing analyses based on pre-publication data obtained via this website. More information is available in the Wellcome Trust Sanger Institute's data sharing policy.

If you have a query about using the project data in your studies or publications, we are happy to help and can be contacted at mousegenomes@sanger.ac.uk

Contact

For any queries about the data produced by the project or how to use the data, we can be contacted at: mousegenomes@sanger.ac.uk