Mouse Genomes Project
The ability to manipulate the mouse genome, together with the wealth of disease models, inbred strains and genomic resources, makes the mouse the premier model organism for genetic approaches to mammalian biology. Over a century of mouse genetics has provided scores of inbred strains and of spontaneous and engineered mutations, making the mouse unsurpassed as a model system for investigating human disease. Access to the complete sequences of multiple inbred strains will add to these resources and will become a permanent foundation for a systems biology approach to phenotypic variation in the mouse.
The Mouse Genomes Project aims to use new sequencing technologies to sequence the genomes of 17 key mouse strains. We are releasing the raw sequence data, SNPs, short indels, structural variants and assemblies of each strain, under our data release policy.
The whole-genome sequencing data is available from the European Nucleotide Archive (ENA) under the following accessions:
Data Release
The sequencing for the project is now complete. The strains have been sequenced to an average of 25x coverage on the Illumina GAII platform with a mixture of 54 bp, 76 bp, and 108 bp paired reads. The raw data has been accessioned at the Sequence Read Archive (SRA) and European Nucleotide Archive (ENA) under the study accession numbers given above. The data is also available from our FTP site in the form of:
- Read alignments in BAM format (reads are mapped with Maq)
- De novo assemblies in FASTA format
- SNPs in VCF format
- Indels in VCF format
- Structural variant calls made with SVMerge
Please see the README files in each folder for more information. The SNPs and indels have also been submitted to dbSNP recently and will appear soon. The SVs have been submitted to DGVa under accession number estd118. Please note that these are covered by our data release policy.
Data querying and visualization
We have set up a suite of querying and visualization tools to allow the community to gain access to the full range of variation data that has been produced as part of the project. The data is also available for download from our FTP site.
SNP and Indel Query
Use our query page to search for SNPs and indels by genomic region or gene. Select the strains, SNP quality, and consequences to display. The variation consequences were called against Ensembl v64. SNPs and indels can also be visualized on our LookSeq page. Please note that these are preliminary SNPs and indels, and the lists are periodically updated. We strongly recommend that you carry out independent experimental validation.
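For users who prefer to work from the downloaded VCF files rather than the query page, the same kind of search (region, strains, minimum quality) can be approximated locally. The following is only a minimal sketch using the Python standard library. It assumes a multi-sample VCF with one genotype column per strain and a GT entry in the FORMAT field; the function name `query_vcf`, the sample names `STRAIN_A` and `STRAIN_B`, and the example records are hypothetical and are not taken from the project files. Check the README files on the FTP site for the actual file layout.

```python
import io


def query_vcf(lines, chrom, start, end, min_qual=0.0, strains=None):
    """Yield (chrom, pos, ref, alt, genotypes) for records inside a region.

    lines    -- any iterable of VCF text lines (an open file, StringIO, ...)
    strains  -- optional set of sample names to report; None reports all
    Assumes a multi-sample VCF whose FORMAT column contains a GT entry.
    """
    sample_names = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            sample_names = fields[9:]  # sample columns follow FORMAT
            continue
        rec_chrom, pos = fields[0], int(fields[1])
        if rec_chrom != chrom or not (start <= pos <= end):
            continue
        # A missing quality (".") is treated as 0 in this sketch.
        qual = 0.0 if fields[5] == "." else float(fields[5])
        if qual < min_qual:
            continue
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue  # no genotype information to report
        gt_index = fmt.index("GT")
        genotypes = {
            name: sample.split(":")[gt_index]
            for name, sample in zip(sample_names, fields[9:])
            if strains is None or name in strains
        }
        yield rec_chrom, pos, fields[3], fields[4], genotypes


# Hypothetical example data, for illustration only.
EXAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.1",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "STRAIN_A", "STRAIN_B"]),
    "\t".join(["1", "1000", ".", "A", "G", "50", ".", ".",
               "GT:GQ", "1/1:99", "0/0:80"]),
    "\t".join(["1", "2000", ".", "C", "T", "10", ".", ".",
               "GT:GQ", "0/0:40", "1/1:30"]),
    "\t".join(["2", "1500", ".", "G", "A", "90", ".", ".",
               "GT:GQ", "1/1:99", "1/1:99"]),
])

if __name__ == "__main__":
    hits = query_vcf(io.StringIO(EXAMPLE_VCF), "1", 1, 5000,
                     min_qual=20, strains={"STRAIN_A"})
    for hit in hits:
        print(hit)
```

A plain line scan like this is adequate for small regions or small files. For whole-genome files an indexed lookup would be a better fit, and any variant found this way remains subject to the same caveat about independent experimental validation.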
LookSeq Visualizer
We have implemented LookSeq to visualize read alignments in a region of interest. The Mouse Genomes Project LookSeq page displays data in 'pileup' view to visualize SNPs and indels, or 'read pair' view to visualize larger structural variants, and allows filtering of data by mapping quality.
DAS Tracks
Coding SNPs can also be viewed in the Ensembl Genome Browser by adding a DAS track. Click here to find out how.
References
Keane TM, Goodstadt L, Danecek P, et al. (2011) Mouse genomic variation and its effect on phenotypes and gene regulation. Nature, 477(7364):289-294.
Yalcin B, Wong K, Agam A, et al. (2011) Sequence-based characterization of structural variation in the mouse genome. Nature, 477(7364):326-329.
Announcements Mailing List
We have set up a mailing list where we will make announcements related to the project. To join this mailing list, please visit this link and sign up.
Enquiries
For further information, please contact mousegenomes@sanger.ac.uk

