ssahaSNP: Sequence Search and Alignment by Hashing Algorithm

ssahaSNP is a polymorphism detection tool. It detects homozygous SNPs and indels by aligning shotgun reads to the finished genome sequence.

Highly repetitive elements are filtered out by ignoring kmer words that occur very frequently. Reads that are less repetitive or non-repetitive are placed uniquely on the reference genome sequence; where a read has multiple seeded regions, we take the best alignment according to the pair-wise alignment score. SNP candidates are then screened from the best alignment using the neighbourhood quality standard (NQS), which takes into account the quality value of the variant base as well as the quality values of the neighbouring bases. For insertions/deletions, we check whether the same indel is supported by more than one read, so that detected indels are reported with high confidence.
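The three filters described above can be illustrated with a minimal Python sketch. It is not the ssahaSNP implementation: the function names, the word length, the occurrence cutoff, the quality thresholds and the flank size are all hypothetical values chosen for illustration, and the actual alignment step is left out entirely.

```python
from collections import Counter


def kmer_counts(reference, k):
    """Count every overlapping k-mer word in the reference sequence."""
    return Counter(reference[i:i + k] for i in range(len(reference) - k + 1))


def usable_seeds(read, counts, k, max_occurrences):
    """Return (offset, word) seeds from a read, skipping words that are
    absent from the reference or occur more than max_occurrences times."""
    seeds = []
    for i in range(len(read) - k + 1):
        word = read[i:i + k]
        if 0 < counts.get(word, 0) <= max_occurrences:
            seeds.append((i, word))
    return seeds


def passes_nqs(quals, pos, min_snp_qual=20, min_flank_qual=15, flank=5):
    """NQS-style check on one read: the variant base and `flank` bases on
    each side must meet the quality thresholds (illustrative defaults)."""
    if pos - flank < 0 or pos + flank >= len(quals):
        return False  # too close to the read end to judge the neighbourhood
    if quals[pos] < min_snp_qual:
        return False
    neighbours = quals[pos - flank:pos] + quals[pos + 1:pos + flank + 1]
    return all(q >= min_flank_qual for q in neighbours)


def confirmed_indels(indel_calls, min_reads=2):
    """Keep indels seen in at least min_reads reads. Each call is a hashable
    description such as (position, kind, bases), one entry per read."""
    support = Counter(indel_calls)
    return {indel for indel, n in support.items() if n >= min_reads}


if __name__ == "__main__":
    counts = kmer_counts("ACGTACGTACGTTTGACCA", 4)
    print(usable_seeds("ACGTTTGA", counts, 4, max_occurrences=2))
    print(passes_nqs([30] * 11, pos=5))
    print(confirmed_indels([(100, "del", "A"), (100, "del", "A"), (250, "ins", "GT")]))
```

In this sketch a repeated word such as ACGT is simply never used as a seed, which is one simple way to keep repetitive sequence from generating spurious placements; how ssahaSNP chooses its own cutoff is described in the readme and preprint rather than here.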

For an academic licence, please send us an email. We provide limited support for academic users; please give us enough information to recreate any problems you may have. We would be happy to consider any suggestions for new features. If you require binaries for another platform, we will build them if we have access to one. If you use ssahaSNP to accomplish any scientific work, please cite this web page.

Usage instructions can be obtained by running the binary with the '-h' option.

There is a readme file describing how to do a complete analysis using ssahaSNP, parseSNP and parseIndel. Further documentation and a preprint can be found in our ftp directory.

  • Adam Spargo. First point of contact for problem reports, comments etc.
  • Zemin Ning. Issues specific to SNP/INDEL detection and parsing codes.
