Software

As a leading genomics centre, the Sanger Institute often needs to develop software solutions to novel biological problems.

All our software is open access and available to the research community, because community improvement is essential to efficient software development.

Top downloads

  • Artemis - a free genome viewer and annotation tool that allows visualisation of sequence features and the results of analyses within the context of the sequence, and also its six-frame translation
    • ACT - a DNA sequence comparison viewer written in Java. It is based on the software for Artemis, the genome viewer and annotation tool
  • SSAHA2 - a pairwise sequence alignment program designed for the efficient mapping of sequencing reads onto genomic reference sequences

All software downloads

  • Alfresco - FRont-End for Sequence COmparison
  • Alien_hunter - an application for the prediction of putative Horizontal Gene Transfer (HGT) events with the implementation of Interpolated Variable Order Motifs (IVOMs)
  • Artemis - a free genome viewer and annotation tool that allows visualisation of sequence features and the results of analyses within the context of the sequence, and also its six-frame translation
    • ACT - a DNA sequence comparison viewer written in Java. It is based on the software for Artemis, the genome viewer and annotation tool
    • DNAPlotter - makes use of the existing circular plot in Jemboss and the Artemis sequence libraries
    • BamView - interactive display of read alignments in BAM data files
  • Blast - the sequencing projects' Blast Search Services
  • CAF - a text format for describing sequence assemblies
  • CnD - a copy number variant caller for inbred strains
  • CCRaVAT & QuTie - enables analysis of rare variants in large-scale case control and quantitative trait association studies
  • DAS - the Institute supports the Distributed Annotation System through a range of projects, websites and applications
  • Doublescan - a program for comparative ab initio prediction of protein coding genes in mouse and human DNA
  • Eponine - a probabilistic method for detecting transcription start sites in mammalian genomic sequence
  • Est_DB - a software suite and database system designed to support expressed sequence tag (EST) sequencing projects, and to provide comprehensive bioinformatic analysis of sequenced EST libraries, for gene discovery and other purposes
  • Evoker - a graphical tool for visualizing genotype intensity data in order to assess genotype calls as part of quality control procedures for genome-wide association studies
  • GAZE - integrates gene prediction signal and content sensor information into complete gene structures
  • Genevar - a database and Java application for the analysis and visualization of SNP-gene associations in eQTL studies
  • GFF - a format for describing genes and other features associated with DNA, RNA and Protein sequences
  • GLIDERS - Genome-wide linkage disequilibrium repository and search engine
  • Illuminus - a fast and accurate algorithm for assigning single nucleotide polymorphism (SNP) genotypes to microarray data from the Illumina BeadArray technology
  • Image - a package of analysis algorithms for processing gel images from restriction digest fingerprinting experiments
  • Logomat-M - a method to graphically visualize all central aspects of profile Hidden Markov Models (pHMMs), thus generalizing the concept of sequence logos
  • Logomat-P - illustrates the similarities of pairs of protein family profiles in an intuitive way
  • Lookseq - a web-based application for alignment visualization, browsing and analysis of genome sequence data
  • Margarita - infers genealogies from population genotype data and uses these to map disease loci
  • Mascot Percolator - a software package that interfaces the database search algorithm Mascot with Percolator
  • NestedMICA - a method for discovering over-represented short motifs in large sets of strings, for example in finding transcription-factor-binding sites in DNA sequences
  • NPG - a short-read sequencing pipeline
  • PEER - a Bayesian framework to account for complex non-genetic factors in high-dimensional phenotype data
  • Phusion - a software package for assembling genome sequences from whole-genome shotgun (WGS) reads
  • Projector - a program for the comparative, homology-based prediction of protein coding genes in mouse and human DNA
  • Proserver - a very lightweight DAS server written in Perl
  • PSILC - Pseudogene inference from loss of constraint
  • Quicktree - allows the reconstruction of phylogenies for very large protein families that would be infeasible using other popular methods
  • SCOOP - allows the comparison of families of proteins
  • SMALT - a highly efficient and accurate mapper of DNA sequencing reads from a variety of platforms including paired reads
  • SSAHA - a software tool for very fast matching and alignment of DNA sequences
  • SSAHA2 - a pairwise sequence alignment program designed for the efficient mapping of sequencing reads onto genomic reference sequences
  • SSAHAest - a software tool for very fast matching and alignment of ESTs/cDNAs to genomic DNA sequences
  • SSAHAsnp - a polymorphism detection tool, which detects homozygous SNPs and indels by aligning shotgun reads to the finished genome sequence