LookSeq

LookSeq is a web-based application for alignment visualization, browsing and analysis of genome sequence data.

LookSeq supports multiple sequencing technologies, alignment sources, and viewing modes; low or high-depth read pileups; and easy visualization of putative single nucleotide and structural variation. The visible range, from whole chromosome to single base resolution, can be set manually or by scrolling or zooming the display with fast, on-the-fly rendering from the server-side alignment database. LookSeq uses a universal database for alignments of different sequencing technologies and algorithms. Sequence data from multiple sources can be viewed separately or aligned in a single display, facilitating direct comparison between datasets. LookSeq can also link to relevant external sites such as PubMed and other online analysis tools, via buttons or double-clicking on the displayed sequence annotation.

LookSeq requires no setup or installation, and is very intuitive to use.

[Genome Research Limited]

Tutorial

The following is a guided tour through LookSeq, our read alignment viewer.

  1. When you enter the LookSeq demo page, you will see a list of samples on the right side of your browser window. Select "sample 2a". LookSeq now shows the alignment of paired Solexa/Illumina reads to the reference sequence of P. falciparum (3D7), chromosome 1.
    • The X axis of the plot represents the entire chromosome 1 (positions 1 through 643292).
    • The Y axis of the plot represents the apparent fragment size of read pairs, that is, the distance between the two halves of the read pair as they align to the reference sequence.
    • The yellow band highlights the expected fragment size.
    • Blue marks denote perfect matches (reads completely identical to the reference); red marks denote mismatches to the reference (potential SNPs).
    • Grey marks within the X scale show the positions of potential SNPs.
    • Below the X axis is an annotation display.
      • Coding sequences (exons) are shown in blue, repeats in red, the centromere in green.
      • You can toggle the annotation display by (un)selecting the "annotation" checkbox right above the display.
  2. Now, click on the "50kb" button above the display. The display now zooms in to show a region of 50 kilobases. (You can always return to the initial display via the "Full chromosome" button.) At this zoom level, you can now begin to make out individual read pairs. The alignments are dense in some regions (usually around coding regions) and sparse in others.
  3. Find the dense region of reads on the left side of the display, around position 310,000 (annotated as "PFA0375c"). Move your mouse cursor to this read cluster, hold down the left mouse button, and drag the display to the right so that the cursor and the region beneath it are roughly in the middle of the display. Once there, release the mouse button. The display will now update, showing data in the area that was gray during the drag. This is a basic method for moving along the chromosome in either direction. Another is to use the left and right arrow buttons at the top left of the page, beneath the "MAL1" chromosome selector. Try dragging the display a few times to get a feel for it, but try to keep the dense read region in view.
  4. Double-click in the dense region. The display will center on the chromosome position you double-clicked and zoom in one level. You can zoom in and out one level at a time using the appropriate buttons (next to the arrow buttons).
  5. Now drag the display so the dense region is located in the middle of the display, if necessary. Then, click on the "2kb" button. You now see a region 2 kilobases in size. Within the cluster of blue reads, you can see vertical red markings; a few appear to be randomly distributed, but most align at one of three vertical positions. These indicate a potential SNP. (There are also "red reads" above and beneath the yellow band; these indicate inversions. You can disable them by unselecting the "inversions" checkbox above the display.)
  6. Find the potential SNP between positions 308,000 and 308,500, and double-click on it. Then, click on the "1:1" button. You are now at the highest zoom setting. The potential SNP shows "A"s at each marking, indicating that each read contains an A at this position.
  7. At this zoom level, a "classic" pileup view can be helpful. Change the radio button from "Paired reads" to "Pileup" (in the row that contains the "MAL1" chromosome selector). You now see a "text" pileup of the reads. Perfect matches are in lowercase, mismatches in uppercase and red. Reads aligning outside the expected fragment size have a gray background. (Also, inversions will show in lowercase orange.) A look at the red SNP position (308,327) reveals that the observed bases in all reads are "A", whereas the reference shows a "G". Also, the gray "A" beneath the reference indicates that this position was marked as a "potential G=>A" SNP by other algorithms.
  8. This looks like a SNP, but you can be more certain by adding more data. The same sample has been sequenced on several Solexa lanes, and LookSeq can merge that data into a single view. On the right side of your browser window, at the bottom of the "Lanes" list, click on the button "Multiple lanes". The radio buttons will change to checkboxes, and the button changes to "Single lane". Check the other lanes of this sample, labeled "sample 2b" through "sample 2g". To update the display (only necessary once), click on the "Update image" button (next to the "MAL1" selector). While this gives nice confirmation, it shows the limitation of viewing large pileups for Solexa-based alignments.
  9. Let's look for potential deletions. Follow the quick link below to a region on chromosome 1. On either side of the display, you can see a "tower"-like structure of read pairs. Such a tower is formed by read pairs that map farther apart than they should, based on their fragment size. Assuming the fragments are within the expected range, the fact that their distance is greater on the reference than in the sample indicates a deletion. A cluster of such reads results in the characteristic tower structure. That these reflect deletions is supported by the fact that few or no reads match within the "tower gap"; that gap contains the deletion boundary, making the reads too different from the reference sequence to map.
* quick link - http://q.sanger.ac.uk/h5xge7g4
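
The ideas behind the paired-read and pileup views in this tour can be sketched in a few lines of Python. This is only an illustration and not LookSeq's own code: the `ReadPair` class, its fields, the `same_orientation` flag used to mark inversions, the category names, and the example band of 200 to 400 bases are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Hypothetical record for one aligned read pair (not a LookSeq type)."""
    left_start: int                 # 1-based reference position where the leftmost read starts
    right_end: int                  # 1-based reference position where the rightmost read ends
    same_orientation: bool = False  # assumed marker for a possible inversion


def apparent_fragment_size(pair: ReadPair) -> int:
    """Distance spanned by the two reads on the reference (the Y axis value)."""
    return pair.right_end - pair.left_start + 1


def classify_pair(pair: ReadPair, expected_min: int, expected_max: int) -> str:
    """Place a read pair relative to the expected fragment size band."""
    if pair.same_orientation:
        return "inversion"
    size = apparent_fragment_size(pair)
    if size > expected_max:
        # The pair maps farther apart on the reference than the fragment
        # is long in the sample: a candidate for a deletion "tower".
        return "possible_deletion"
    if size < expected_min:
        return "below_expected"
    return "expected"


def pileup_base(read_base: str, ref_base: str) -> str:
    """Text pileup convention: matches in lowercase, mismatches in uppercase."""
    if read_base.upper() == ref_base.upper():
        return read_base.lower()
    return read_base.upper()


if __name__ == "__main__":
    pairs = [
        ReadPair(1000, 1299),                        # spans 300 bases
        ReadPair(1000, 2999),                        # spans 2000 bases
        ReadPair(1000, 1299, same_orientation=True),
    ]
    for pair in pairs:
        print(apparent_fragment_size(pair), classify_pair(pair, 200, 400))

    # One pileup column where the reference is "G" and the reads show "A".
    print("".join(pileup_base(base, "G") for base in "AAAg"))
```

A cluster of pairs classified as `possible_deletion` on both sides of a region with few or no mapped reads corresponds to the tower structure described in step 9.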