Analysis
Genotype/Phenotype analysis
- Evoker - a graphical tool for visualising genotype intensity data in order to assess genotype calls as part of quality control procedures for genome-wide association studies
- Genevar - a platform of database and web services for integrative analysis and visualisation of SNP-gene associations in eQTL studies
- GLIDERS - Genome-wide linkage disequilibrium repository and search engine
- Illuminus - a fast and accurate algorithm for assigning single nucleotide polymorphism (SNP) genotypes to microarray data from the Illumina BeadArray technology
- Olorin - an interactive filtering tool for next-generation sequencing data from studies of large complex disease pedigrees
- optiCall - a robust genotype-calling algorithm for calling rare, low-frequency and common variants from SNP microarray intensity data
- Optimist - a simple software package for inferring positive selection from marker dynamics in an asexual population
- PEER - a Bayesian framework to account for complex non-genetic factors in high-dimensional phenotype data
Protein analysis
- Doublescan - a program for comparative ab initio prediction of protein coding genes in mouse and human DNA
- Logomat-P - illustrates the similarities of pairs of protein family profiles in an intuitive way
- Mascot Percolator - a software package that interfaces the database search algorithm Mascot with Percolator
- Projector - a program for the comparative, homology-based prediction of protein coding genes in mouse and human DNA
- Quicktree - allows the reconstruction of phylogenies for very large protein families that would be infeasible using other popular methods
- SCOOP - allows the comparison of families of proteins
- Turbo SLoMo - a software tool which can localise and score sites of protein modification in mass spectrometry data
Sequence analysis
- Alfresco - FRont-End for Sequence COmparison
- Alien_hunter - an application for the prediction of putative Horizontal Gene Transfer (HGT) events with the implementation of Interpolated Variable Order Motifs (IVOMs)
- AMELIA - a program that employs an allele-matching approach that is robust to the presence of both directions of effect for variants within the locus analysed
- ARIEL - analysis software that employs a locus-wide regression-based collapsing approach that incorporates variant quality scores
- AutoCSA (Automatic Comparative Sequence Analysis) - mutation detection program designed to detect small mutations (1-50 bases) in sequence traces
- BioView - a suite of tools for generating lightweight chromatogram images from any trace file that can be cast as a biojava chromatogram interface
- Blast - the sequencing projects' Blast Search Services
- CAROL - a combined functional annotation score of non-synonymous coding variants
- CnD - a copy number variant caller for inbred strains
- CCRaVAT & QuTie - enable analysis of rare variants in large-scale case-control and quantitative trait association studies
- Dindel - accurate indel calls from short-read data
- EMu - software for inferring the mutational signatures present in a number of cancer mutation sets
- Eponine - a probabilistic method for detecting transcription start sites in mammalian genomic sequence
- ESGI - information about bioinformatics and computational tools available for the analysis of high-throughput genomic data
- Est_DB - a software suite and database system designed to support expressed sequence tag (EST) sequencing projects, and to provide comprehensive bioinformatic analysis of sequenced EST libraries, for gene discovery and other purposes
- Image - a package of analysis algorithms for processing gel images from restriction digest fingerprinting experiments
- Hexamer - scans for likely coding regions using 6-mers but without deriving information from base composition
- KATE - a program that analyses the effects of low frequency and rare variants on quantitative traits within a chromosomal region
- Logomat-M - a method to graphically visualise all central aspects of profile Hidden Markov Models (pHMMs), thus generalising the concept of sequence logos
- Margarita - infers genealogies from population genotype data and uses these to map disease loci
- NestedMICA - a method for discovering over-represented short motifs in large sets of strings, for example in finding transcription-factor-binding sites in DNA sequences
- PICNIC - an algorithm designed to identify copy number segments and genotypes in cancer using a SNP6 'cel' file as input
- RetroSeq - transposable element discovery from next-generation sequencing data
Annotation
- Anacode and Annotools - specialist analysis pipelines and annotation tools
- Artemis - a free genome viewer and annotation tool that allows visualisation of sequence features and the results of analyses within the context of the sequence, and also its six-frame translation
- ACT - a DNA sequence comparison viewer written in Java, based on the software for Artemis, the genome viewer and annotation tool
- BamView - interactive display of read alignments in BAM data files
- DNAPlotter - makes use of the existing circular plot in Jemboss and the Artemis sequence libraries
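The Artemis entry above mentions six-frame translation. As a rough illustration of that concept only (this is not Artemis code, and the function names are hypothetical), the following Python sketch translates a DNA sequence in all three forward and three reverse-complement reading frames, assuming the standard genetic code and an unambiguous ACGT alphabet:

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(seq):
    """Map frame labels (+1..+3, -1..-3) to protein strings."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in sorted(six_frame_translation("ATGGCCTAA").items()):
        print(frame, protein)
```

Stop codons are shown as `*`, so an open reading frame appears as a run of residues between two `*` characters in one of the six strings.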
Assembly
- Lookseq - a web-based application for alignment visualisation, browsing and analysis of genome sequence data
- NPG - short read sequencing pipeline
- PAGIT - tools to automatically generate high-quality sequence by ordering contigs, closing gaps, correcting sequence errors and transferring annotation
- Phusion - a software package for assembling genome sequences from whole genome shotgun (WGS) reads
- REAPR - a tool that evaluates the accuracy of a genome assembly using mapped paired-end reads
- SMALT - a highly efficient and accurate mapper of DNA sequencing reads from a variety of platforms including paired reads
- SSAHA - a software tool for very fast matching and alignment of DNA sequences
- SSAHA2 - a pairwise sequence alignment program designed for the efficient mapping of sequencing reads onto genomic reference sequences
- SSAHAest - a software tool for very fast matching and alignment of ESTs/cDNAs to genomic DNA sequences
- SSAHAsnp - a polymorphism detection tool, which detects homozygous SNPs and indels by aligning shotgun reads to the finished genome sequence
Database software
- DAS - the Institute supports the Distributed Annotation System (DAS) through a range of projects, websites and applications
- DBCon - database pooling, distributed configuration and SQL libraries for Java
- Proserver - a very lightweight DAS server written in Perl
Data formats
- CAF - a text format for describing sequence assemblies
- GFF - a format for describing genes and other features associated with DNA, RNA and protein sequences
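To give a feel for the feature records that GFF describes, here is a minimal Python sketch that parses a single line. It assumes the common nine-column, tab-separated layout (as used by GFF3); the `GffFeature` type, the `parse_gff_line` function and the sample record are hypothetical and for illustration only, not part of any tool listed here.

```python
from typing import NamedTuple, Optional


class GffFeature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int            # 1-based, inclusive
    end: int              # 1-based, inclusive
    score: Optional[float]
    strand: str
    phase: str
    attributes: str


def parse_gff_line(line: str) -> Optional[GffFeature]:
    """Parse one GFF line; return None for blank lines and comments."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    return GffFeature(
        seqid, source, ftype,
        int(start), int(end),
        None if score == "." else float(score),  # "." marks a missing score
        strand, phase, attrs,
    )


if __name__ == "__main__":
    # Made-up record, not taken from any real annotation.
    sample = "chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=gene0001"
    feature = parse_gff_line(sample)
    print(feature.type, feature.start, feature.end)
```

Because both coordinates are 1-based and inclusive, the length of a feature is `end - start + 1`.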

