Crumble

Crumble - Lossy compression of DNA sequence quality values

Crumble performs lossy compression of aligned DNA sequencing quality values, substantially reducing the size of CRAM files.

After alignment (for example with bwa) and sorting into coordinate order, Crumble analyses the sequence pileup to identify sites where the quality values are not needed to predict accurately whether a variant is present. Where the qualities appear to have no impact, they are replaced by a binary high or low quality, depending on whether the base call agrees with the likely variant call. Note that this process assumes a single diploid sample; the tool is not designed to work on other data sets, including cancer samples.
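The idea of replacing qualities with a binary high or low value can be illustrated with a toy sketch. This is not Crumble's actual algorithm: a simple majority rule stands in for its variant-aware analysis, and the function name `binarize_column` and the constants `HIGH_QUAL`, `LOW_QUAL`, `MIN_DEPTH` and `MIN_FRACTION` are all hypothetical values chosen for illustration.

```python
from collections import Counter

# All constants below are hypothetical, chosen only for illustration.
HIGH_QUAL = 40       # assigned to bases that agree with the likely call
LOW_QUAL = 5         # assigned to bases that disagree with the likely call
MIN_DEPTH = 8        # columns shallower than this are left untouched
MIN_FRACTION = 0.9   # majority fraction needed to treat a column as confident


def binarize_column(column):
    """Binarize qualities in one pileup column.

    column is a list of (base, quality) pairs covering a single
    reference position. A simple majority rule stands in for a real
    diploid variant caller (an assumption of this sketch).
    """
    if len(column) < MIN_DEPTH:
        return list(column)  # too little evidence: keep original qualities

    counts = Counter(base for base, _ in column)
    top_base, top_count = counts.most_common(1)[0]

    if top_count / len(column) < MIN_FRACTION:
        return list(column)  # ambiguous site: keep original qualities

    # Confident site: the original qualities carry little information,
    # so replace them with one of two values.
    return [
        (base, HIGH_QUAL if base == top_base else LOW_QUAL)
        for base, _ in column
    ]
```

Collapsing many distinct quality values into two makes the quality stream far more repetitive, which is what allows a downstream compressor to shrink it. Ambiguous or shallow columns keep their original qualities because those values may still influence a variant call.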

Options also exist to perform lossy compression of read names and removal of certain auxiliary tags.

Downloads

Crumble can be obtained from https://github.com/jkbonfield/crumble.

Further information

To report problems, please use the GitHub issue tracker.

Crumble uses the BSD Open Source software license.