Mouse Genomes Project
The ability to manipulate the mouse genome, together with the wealth of disease models, inbred strains and genomic resources, makes the mouse the premier model organism for genetic approaches to mammalian biology. Over a century of mouse genetics has provided scores of inbred strains and spontaneous and engineered mutations, making the mouse unsurpassed as a model system for investigating human disease. Access to the complete sequences of multiple inbred strains will add to these resources and will become a permanent foundation for a systems biology approach to phenotypic variation in the mouse.
The Mouse Genomes Project aims to use new sequencing technologies to sequence the genomes of 17 key mouse strains. We are releasing the raw sequence data, SNPs, short indels, structural variants and assemblies of each strain, under our data release policy.
The whole-genome sequencing is available from the European Nucleotide Archive (ENA) under the following accessions:
RNA-Seq
We have also completed RNA-Seq of whole-brain RNA across 15 of the strains, and the data has been accessioned under ERP000614. In addition, we have completed RNA-Seq from a cross of C57BL/6J and DBA/2J across 6 different tissues; this data is available from the ENA under accession ERP000591.
ChIP-Seq
We sequenced liver DNA bound to chromatin that was precipitated using a marker for active gene promoters (histone 3, lysine 4 trimethylation; H3K4me3). The two samples used in our analysis are ERS001976 and ERS001977.
Data Release
The sequencing for the project is now complete. The strains have been sequenced to an average of 25x coverage on the Illumina GAII platform with a mixture of 54 bp, 76 bp, and 108 bp paired-end reads. The raw data has been accessioned at the Sequence Read Archive (SRA) and European Nucleotide Archive (ENA) under the study accession numbers given above. The data is also available from our FTP site in the form of:
- Read alignments in BAM format (reads are mapped with Maq)
- De novo assemblies in FASTA format
- SNPs in VCF format
- Indels in VCF format
- Structural variant calls made with SVMerge
Please see the README files in each folder for more information. The SNPs and indels have also recently been submitted to dbSNP and will appear there soon. The SVs have been submitted to DGVa under accession number estd118. Please note that these are covered by our data release policy.
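As a rough illustration of working with the SNP and indel files, the sketch below reads a multi-sample VCF and counts, for each strain column, how many records carry a non-reference genotype. It is a minimal sketch under stated assumptions: the sample column names (`DBA_2J`, `FVB_NJ`), the positions and the genotypes are invented for the example, and the real files may be organised differently (for instance one file per strain), so check the README files for the actual layout.

```python
import io

# Illustrative VCF text only: positions, alleles, genotypes and the
# sample column names are made up and do not come from the project data.
EXAMPLE_VCF = """\
##fileformat=VCFv4.1
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tDBA_2J\tFVB_NJ
1\t3000100\t.\tA\tG\t60\tPASS\t.\tGT\t1/1\t0/0
1\t3000250\t.\tC\tT\t45\tPASS\t.\tGT\t0/0\t1/1
2\t5000010\t.\tG\tGA\t30\tPASS\t.\tGT\t1/1\t0/0
"""


def count_non_reference(handle):
    """Count, per sample, the records whose genotype has a non-reference allele."""
    samples = []
    counts = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample names start at the tenth column of the header line.
            samples = fields[9:]
            counts = {name: 0 for name in samples}
            continue
        # The FORMAT column says where the GT value sits in each sample field.
        gt_index = fields[8].split(":").index("GT")
        for name, sample_field in zip(samples, fields[9:]):
            genotype = sample_field.split(":")[gt_index]
            alleles = genotype.replace("|", "/").split("/")
            # "0" is the reference allele and "." is a missing call.
            if any(allele not in ("0", ".") for allele in alleles):
                counts[name] += 1
    return counts


print(count_non_reference(io.StringIO(EXAMPLE_VCF)))
```

For a downloaded file, the same function could be given an open file handle instead of the in-memory example (using `gzip.open(path, "rt")` if the file is compressed).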
Data querying and visualization
We have set up a suite of querying and visualization tools to allow the community to gain access to the full range of variation data that has been produced as part of the project. The data is also available for download from our FTP site.
SNP and Indel Query
Use our query page to search for SNPs and indels by genomic region or gene. Select the strains, SNP quality, and consequences to display. The variation consequences were called against Ensembl v70. SNPs and indels can also be visualized on our LookSeq page. Please note that these are preliminary SNPs and indels, and the lists are periodically updated. We strongly recommend that you carry out independent experimental validation.
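For readers who prefer to work from the downloaded VCF files, a query by region, strain and quality similar in spirit to the one described above could be sketched as follows. This is not the implementation behind the query page. The function name `query_region`, the sample column names and the example records are all hypothetical, and using the VCF QUAL column as the quality threshold is an assumption.

```python
# Illustrative records only: coordinates, alleles, qualities and the
# sample column names are invented for this example.
EXAMPLE_VCF_LINES = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tDBA_2J\tFVB_NJ",
    "1\t3000100\t.\tA\tG\t60\tPASS\t.\tGT\t1/1\t0/0",
    "1\t3000250\t.\tC\tT\t20\tPASS\t.\tGT\t1/1\t1/1",
    "1\t3000900\t.\tG\tGA\t55\tPASS\t.\tGT\t1/1\t1/1",
    "2\t5000010\t.\tT\tC\t70\tPASS\t.\tGT\t0/0\t1/1",
]


def query_region(lines, chrom, start, end, strains, min_qual=0.0):
    """Return (chrom, pos, ref, alt) for variants in chrom:start-end (1-based,
    inclusive) with QUAL >= min_qual that are non-reference in any given strain."""
    samples = []
    hits = []
    for line in lines:
        if line.startswith("##"):
            continue  # meta-information line
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "#CHROM":
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        pos = int(fields[1])
        # A missing QUAL (".") is treated as 0 in this sketch.
        qual = 0.0 if fields[5] == "." else float(fields[5])
        if fields[0] != chrom or not (start <= pos <= end) or qual < min_qual:
            continue
        gt_index = fields[8].split(":").index("GT")
        genotypes = {
            name: value.split(":")[gt_index]
            for name, value in zip(samples, fields[9:])
        }
        for strain in strains:
            # Raises KeyError if the strain is not a column in the file.
            alleles = set(genotypes[strain].replace("|", "/").split("/"))
            if alleles - {"0", "."}:  # at least one non-reference allele
                hits.append((fields[0], pos, fields[3], fields[4]))
                break
    return hits


print(query_region(EXAMPLE_VCF_LINES, "1", 3000000, 3000500, ["DBA_2J"], min_qual=30))
```

A linear scan like this is only practical for small files or short regions. For genome-wide files, an indexed lookup (for example with tabix) would be the more usual choice.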
LookSeq Visualizer
We have implemented LookSeq to visualize read alignments in a region of interest. The Mouse Genomes Project LookSeq page displays data in 'pileup' view to visualize SNPs and indels, or 'read pair' view to visualize larger structural variants, and allows filtering of data by mapping quality.
References
- Next-generation sequencing of experimental mouse strains. Mammalian genome : official journal of the International Mammalian Genome Society 2012;23;9-10;490-8. PUBMED: 22772437; PMC: 3463794; DOI: 10.1007/s00335-012-9402-6
- Sequencing and characterization of the FVB/NJ mouse genome. Genome biology 2012;13;8;R72. PUBMED: 22916792; PMC: 3491372; DOI: 10.1186/gb-2012-13-8-r72
- The fine-scale architecture of structural variants in 17 mouse genomes. Genome biology 2012;13;3;R18. PUBMED: 22439878; PMC: 3439969; DOI: 10.1186/gb-2012-13-3-r18
- The genomic landscape shaped by selection on transposable elements across 18 mouse strains. Genome biology 2012;13;6;R45. PUBMED: 22703977; PMC: 3446317; DOI: 10.1186/gb-2012-13-6-r45
- High levels of RNA-editing site conservation amongst 15 laboratory mouse strains. Genome biology 2012;13;4;R26. PUBMED: 22524474; PMC: 3446300; DOI: 10.1186/gb-2012-13-4-r26
- Mouse genomic variation and its effect on phenotypes and gene regulation. Nature 2011;477;7364;289-94. PUBMED: 21921910; PMC: 3276836; DOI: 10.1038/nature10413
- Sequence-based characterization of structural variation in the mouse genome. Nature 2011;477;7364;326-9. PUBMED: 21921916; PMC: 3428933; DOI: 10.1038/nature10432
Announcements Mailing List
We have set up a mailing list where we will make announcements related to the project. To join this mailing list, please visit this link and sign up.
Enquiries
For further information, please contact mousegenomes@sanger.ac.uk


