SeqTools

A suite of tools for visualising sequence alignments.

Blixem is an interactive browser of pairwise alignments that have been stacked up in a "master-slave" multiple alignment; it is not a 'true' multiple alignment but a 'one-to-many' alignment. It displays an overview section showing the positions of genes and alignments around the alignment window, and a detail section showing the actual alignment of protein or nucleotide sequences to the genomic DNA sequence.

Dotter is a graphical dot-matrix program for detailed comparison of two sequences. Every residue in one sequence is compared to every residue in the other, with one sequence plotted on the x-axis and the other on the y-axis. Noise is filtered out so that alignments appear as diagonal lines.

Belvu is a multiple sequence alignment viewer and phylogenetic tool. It has an extensive set of user-configurable modes to colour residues by conservation or by residue type, and some basic alignment editing capabilities. It can generate distance matrices between sequences and construct distance-based trees, either graphically or as part of a phylogenetic software pipeline.

[Genome Research Limited]

Information

Blixem features

  • View alignments against both strands of the reference sequence.
  • View sequences in nucleotide or protein mode; in protein mode, Blixem will display the three-frame translation of the reference sequence.
  • Residues are highlighted in different colours depending on whether they are an exact match, conserved substitution or mismatch.
  • Gapped alignments are supported, with insertions and deletions being highlighted in the match sequence.
  • Matches can be sorted and filtered.
  • SNPs and other variations can be highlighted in the reference sequence.
  • Poly(A) tails can be displayed and poly(A) signals highlighted in the reference sequence.
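
The three-frame translation mentioned above can be illustrated with a minimal sketch. This is not Blixem's code; it assumes the standard genetic code, forward strand only, and marks any codon containing an unrecognised base as X. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (an assumption of this sketch;
# other translation tables exist).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame):
    """Translate dna starting at offset frame (0, 1 or 2).

    Incomplete trailing codons are dropped; unknown codons become 'X'.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(dna):
    """Return the translations for reading frames 1, 2 and 3."""
    return [translate(dna, frame) for frame in range(3)]


if __name__ == "__main__":
    for number, protein in enumerate(three_frame_translation("ATGGCCTAA"), 1):
        print(f"frame {number}: {protein}")
```

The reverse strand could be handled the same way by reverse-complementing the reference sequence first.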

Dotter features

  • Every residue in one sequence is compared with every residue in the other, and a matrix of scores is calculated.
  • Pairwise scores are averaged over a sliding window to make the score matrix more intelligible.
  • The averaged score matrix forms a three-dimensional landscape, with the two sequences in two dimensions and the height of the peaks in the third. This landscape is projected onto two dimensions using a grey-scale image: the darker the grey of a peak, the higher its score.
  • The contrast and threshold of the grey-scale image can be adjusted.
  • A tool is provided to examine the sequence alignment that the grey-scale image represents.
  • Compare a sequence against itself to find internal repeats.
  • Find overlaps between multiple sequences by making a dot-plot of all sequences versus themselves.
  • Run Dotter in batch mode to create large, time-consuming dot-plots as a background process.
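
The dot-matrix idea described in the list above can be sketched as follows. This is a simplified illustration, not Dotter's implementation: it assumes identity scoring (1 for a match, 0 otherwise) rather than a substitution matrix, averages along diagonals with a fixed window, and renders grey levels as text characters. All function names and the threshold and contrast defaults are hypothetical.

```python
def score_matrix(seq_x, seq_y):
    """Score every residue of seq_x (columns) against every residue of seq_y (rows)."""
    return [[1.0 if a == b else 0.0 for a in seq_x] for b in seq_y]


def window_average(scores, window):
    """Average each diagonal run of `window` cells; the result is smaller by window - 1."""
    rows, cols = len(scores), len(scores[0])
    return [
        [sum(scores[y + k][x + k] for k in range(window)) / window
         for x in range(cols - window + 1)]
        for y in range(rows - window + 1)
    ]


def to_grey(score, threshold, contrast):
    """Map a score to a grey level: 255 is white (low score), 0 is black (high score)."""
    level = min(1.0, max(0.0, (score - threshold) * contrast))
    return 255 - round(255 * level)


def render(averaged, threshold=0.5, contrast=2.0):
    """Render the averaged matrix as text, using darker characters for higher scores."""
    shades = " .:*#"
    lines = []
    for row in averaged:
        chars = []
        for score in row:
            grey = to_grey(score, threshold, contrast)
            chars.append(shades[(255 - grey) * (len(shades) - 1) // 255])
        lines.append("".join(chars))
    return "\n".join(lines)


if __name__ == "__main__":
    # Comparing a sequence against itself: the internal repeat shows up as
    # diagonals parallel to the main diagonal.
    sequence = "ACGTTGCAACGTTGCA"
    scores = score_matrix(sequence, sequence)
    print(render(window_average(scores, window=4)))
```

Raising the threshold suppresses weak diagonals, while raising the contrast sharpens the transition from white to black.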

Belvu features

  • Residues can be coloured by conservation, with user-configurable cutoffs and colours.
  • Residues can be coloured by residue type (user-configurable).
  • Colour schemes can be imported or exported.
  • Swissprot (or PIR) entries can be fetched by double clicking.
  • The position in the alignment can be easily tracked.
  • Simple editing commands for rows and columns are supported (although Belvu is not intended to be a full editor).
  • The alignment can be saved in Stockholm, Selex, MSF or FASTA format.
  • Distance matrices between sequences can be generated using a variety of distance metrics.
  • Distance matrices can be imported or exported.
  • Trees can be constructed based on various distance-based tree reconstruction algorithms.
  • Trees can be saved in New Hampshire format.
  • Belvu can perform bootstrap phylogenetic reconstruction.
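
To make the distance-matrix and tree-building steps concrete, here is a minimal sketch. It is not Belvu's code: the distance metric (the fraction of differing columns, skipping gaps) and the clustering method (UPGMA) are chosen for illustration only, and the function names are hypothetical. The tree is written in New Hampshire (Newick) format.

```python
def p_distance(a, b):
    """Fraction of aligned columns that differ, ignoring columns with a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Return {(name_a, name_b): distance} for every ordered pair of sequences."""
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for m in alignment
        for n in alignment
    }


def upgma(names, distances):
    """Cluster with UPGMA and return the tree as a Newick string."""
    dist = dict(distances)                            # work on a copy
    clusters = {name: (1, 0.0) for name in names}     # label -> (size, height)
    while len(clusters) > 1:
        labels = list(clusters)
        # Find the closest pair of current clusters.
        a, b = min(
            ((p, q) for i, p in enumerate(labels) for q in labels[i + 1:]),
            key=lambda pair: dist[pair],
        )
        size_a, height_a = clusters.pop(a)
        size_b, height_b = clusters.pop(b)
        height = dist[(a, b)] / 2
        merged = f"({a}:{height - height_a:.4f},{b}:{height - height_b:.4f})"
        # Distance to the merged cluster is the size-weighted average.
        for c in clusters:
            d = (size_a * dist[(a, c)] + size_b * dist[(b, c)]) / (size_a + size_b)
            dist[(merged, c)] = dist[(c, merged)] = d
        clusters[merged] = (size_a + size_b, height)
    (root,) = clusters
    return root + ";"


if __name__ == "__main__":
    alignment = {
        "s1": "ACGTACGT",
        "s2": "ACGTACGA",
        "s3": "TTGTACCA",
    }
    distances = distance_matrix(alignment)
    print(upgma(list(alignment), distances))
```

A bootstrap analysis would repeat the same two steps on alignments built by resampling columns with replacement, then count how often each grouping recurs.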

Software pipelines

As well as being used independently, Blixem, Dotter and Belvu can also be called from other tools as part of a software pipeline. A common workflow is to call Blixem from the ZMap genome browser to analyse a set of alignments in more detail, and to call Dotter from within Blixem to give a graphical representation of a particular alignment. Belvu has an extensive set of command-line arguments for specifying processing and output parameters, making it possible to perform complete processes in a single command-line call. See our team page for more information.

Background

Blixem, Dotter and Belvu were originally written as part of the AceDB genome database system. Version 4 of the programs involved an extensive re-write to take advantage of modern GUI toolkits and to separate them from AceDB to form this independent SeqTools package. They can be used independently or with any other tool that outputs data in a suitable format - the current preferred file formats are FASTA and GFF v3 for Blixem and Dotter; a variety of file formats are supported by Belvu.
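
As an illustration of the GFF v3 format mentioned above, the sketch below parses a single feature line into its nine tab-separated columns. It follows the general GFF3 layout only; which feature types and attributes Blixem and Dotter expect is not covered here, and the class name, function name and example line are hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import unquote


@dataclass
class Gff3Feature:
    seqid: str
    source: str
    type: str
    start: int        # 1-based, inclusive
    end: int          # 1-based, inclusive
    score: str        # "." when undefined
    strand: str       # "+", "-", "." or "?"
    phase: str        # "0", "1", "2" or "."
    attributes: dict


def parse_gff3_line(line):
    """Parse one GFF3 feature line (not a comment or directive)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    attributes = {}
    for item in fields[8].split(";"):
        if item:
            key, _, value = item.partition("=")
            # GFF3 percent-encodes reserved characters in attribute text.
            attributes[unquote(key)] = unquote(value)
    return Gff3Feature(
        fields[0], fields[1], fields[2],
        int(fields[3]), int(fields[4]),
        fields[5], fields[6], fields[7],
        attributes,
    )


if __name__ == "__main__":
    # Made-up example: an alignment of "EST1" to positions 100-250 of "chr1".
    example = "chr1\texample\tnucleotide_match\t100\t250\t95\t+\t.\tID=match1;Target=EST1 1 151 +"
    feature = parse_gff3_line(example)
    print(feature.type, feature.start, feature.end, feature.attributes["Target"])
```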

Supported platforms

Currently supported platforms are Linux, Mac OS X (Intel) and FreeBSD (via the ports collection).

Licence

SeqTools is free software and is distributed under the terms of the GNU General Public License.

Screenshots

  • Blixem - DNA mode
  • Blixem - protein mode
  • Dotter - DNA mode
  • Dotter - protein mode
  • Dotter greyramp tool
  • Dotter alignment tool - DNA mode
  • Dotter alignment tool - protein mode
  • Belvu - colour by conservation
  • Belvu - colour by residue
  • Belvu tree
  • Belvu conservation plot

Download

Production release

This is the recommended release for most users. It is well-tested, stable and supported code.

The latest version is 4.29, compiled at 12:07:22 on Sep 30 2014: seqtools-4.29.tar.gz

Development build

Reasonably stable development code, which contains most of the latest features.

The latest version is 4.29-4-gd795, compiled at 15:32:45 on Oct 17 2014: seqtools-4.29-4-gd795.tar.gz

Daily build

Experimental code; not guaranteed to be stable (or even to compile). Should only be used if you require the very latest changes.

The latest version is 4.29-4-gd795, compiled at 23:01:30 on Oct 19 2014: seqtools-4.29-4-gd795.tar.gz

Installation

Prerequisites

The SeqTools package requires GTK+ version 2.12 or later to be installed on your machine.

For more details, see the README file in the source code.
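
One way to check the GTK+ prerequisite before building is sketched below. It assumes that pkg-config is installed and that GTK+ 2 is registered under the module name gtk+-2.0; the function names are hypothetical.

```python
import subprocess

MINIMUM_GTK = (2, 12)  # minimum version stated in the prerequisites


def parse_version(text):
    """Turn a version string such as '2.24.33' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(version, minimum=MINIMUM_GTK):
    """True if the version tuple is at least the minimum."""
    return version >= minimum


def installed_gtk_version():
    """Ask pkg-config for the GTK+ 2 version; None if it cannot be found."""
    try:
        result = subprocess.run(
            ["pkg-config", "--modversion", "gtk+-2.0"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    return parse_version(result.stdout)


if __name__ == "__main__":
    version = installed_gtk_version()
    if version is None:
        print("GTK+ 2 was not found by pkg-config")
    elif meets_minimum(version):
        print("GTK+", ".".join(map(str, version)), "is recent enough")
    else:
        print("GTK+", ".".join(map(str, version)), "is older than 2.12")
```

If GTK+ is installed in a non-standard location, PKG_CONFIG_PATH may need to be set before running the check.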

Installation

To install on either Linux or Mac OS X:

  • Download the source code from the Download page. The downloaded file will be a tar file named seqtools-XXX.tar.gz, where XXX is the version number.
  • Unpack the tar file, e.g. by typing the following in a terminal:
    tar -xf seqtools-XXX.tar.gz
    
    This will create a directory called seqtools-XXX.
  • To install the package in the default location (usually /usr/bin), open a terminal in the seqtools-XXX directory and type the following commands:
    ./configure 
    make
    make install

Tips

  • You may need to run make install using sudo if your user account cannot write to the install location, i.e.:
    sudo make install
    
  • Alternatively, to install to a different location (e.g. one not requiring root privileges), use the --prefix argument when you run ./configure. For example, the following command would set the install location to foo/bar in your home directory ($HOME is used rather than ~, which many shells do not expand after --prefix=):
    ./configure --prefix="$HOME/foo/bar"
    
  • If GTK+ is not in the default location (/usr/lib) then you will need to pass its location to the configure script. GTK+ is usually installed in /usr/lib, /usr/local/lib, /opt/lib or /opt/local/lib. If GTK+ is in /opt/local/lib then you would call configure with the following arguments:
    ./configure PKG_CONFIG_PATH=/opt/local/lib/pkgconfig LDFLAGS="-Xlinker -rpath -Xlinker /opt/local/lib"
    
  • If you need to install GTK+ on a Mac, we recommend using MacPorts. Once MacPorts is installed you can install GTK+ using the following command:
    sudo port install gtk2
    

Documentation

Getting started

Run the programs without arguments to see their usage information, or try out the examples given in the examples directory of the source-code download.

For more details, see the README file in the source code.
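
When the programs are driven from a script, a first step could be to confirm that they are on the PATH and to capture their usage text. The sketch below assumes the executables are named blixem, dotter and belvu, and relies on the behaviour described above of printing usage information when run without arguments; the function names are hypothetical.

```python
import shutil
import subprocess

# Assumed executable names; adjust them if your installation differs.
TOOL_NAMES = ["blixem", "dotter", "belvu"]


def locate_tools(names=TOOL_NAMES):
    """Map each tool name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


def usage_text(path, timeout=10):
    """Run a program without arguments and return whatever it prints."""
    result = subprocess.run(
        [path],
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    for name, path in locate_tools().items():
        if path is None:
            print(f"{name}: not found on PATH")
        else:
            lines = usage_text(path).splitlines()
            print(f"{name}: {path}")
            print("\n".join(lines[:5]))  # first few lines of the usage text
```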

Help pages

Help pages, including a quick-start guide and user manual, are installed along with the programs. They can be accessed from within the programs using either the Help menu, the lifebuoy icon on the toolbar, or the Ctrl-H keyboard shortcut. They are included in the doc/User_doc directory in the source code and can also be viewed here.

User manuals

User manuals are installed along with the programs. The manuals for the current production versions can also be downloaded here:

Other documentation

Other documentation, such as design notes, is included in the doc directory in the source code. It can also be viewed here.

Publications

  • Scoredist: a simple and robust protein sequence distance estimator.

    Sonnhammer EL and Hollich V

    BMC bioinformatics 2005;6;108

  • A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis.

    Sonnhammer EL and Durbin R

    Gene 1995;167;1-2;GC1-10

  • A workbench for large-scale sequence homology analysis.

    Sonnhammer EL and Durbin R

    Computer applications in the biosciences : CABIOS 1994;10;3;301-7

Contact

SeqTools is maintained by the Annotools team at the Sanger Institute.
