Crumble

Overview

Crumble performs lossy compression of aligned DNA sequencing quality values, substantially reducing the size of CRAM files.

After alignment (for example with bwa) and sorting into coordinate order, Crumble analyses the sequence pileup to identify sites where the quality values are not needed to accurately predict the presence or absence of a variant. Where the qualities appear to have no impact, each one is replaced by one of two values, high or low, depending on whether the base call agrees with the likely variant call. Note that this process assumes a single diploid sample; the tool is not designed to work on other data sets, including cancer samples.
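The idea of replacing qualities with a binary high or low value can be illustrated with a small Python sketch. This is not Crumble's code or its actual decision rule: the function name `quantise_column`, the constants `HIGH_QUAL` and `LOW_QUAL`, and the simple majority-vote threshold `min_fraction` are all assumptions made for illustration, standing in for the tool's real test of whether qualities affect the variant call.

```python
from collections import Counter

# Hypothetical values chosen for illustration; not taken from Crumble.
HIGH_QUAL = 40
LOW_QUAL = 5


def quantise_column(bases, quals, min_fraction=0.9):
    """Return new qualities for one pileup column.

    bases: the base calls covering a single reference position.
    quals: the matching quality values, in the same order.
    min_fraction: assumed stand-in for "the call is confident", here
        simply the fraction of reads agreeing with the most common base.
    """
    if not bases:
        return list(quals)

    consensus, count = Counter(bases).most_common(1)[0]

    # Ambiguous column: qualities may matter, so keep them unchanged.
    if count / len(bases) < min_fraction:
        return list(quals)

    # Confident column: high quality if the base agrees with the
    # consensus, low quality otherwise.
    return [HIGH_QUAL if b == consensus else LOW_QUAL for b in bases]
```

Under these assumptions, `quantise_column("AAAT", [31, 12, 25, 38], min_fraction=0.75)` returns `[40, 40, 40, 5]`, while the evenly split column `quantise_column("AATT", [31, 12, 25, 38])` is left as `[31, 12, 25, 38]`. A majority vote ignores heterozygous sites, which a diploid-aware caller would have to treat as confident calls with two alleles.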

Options also exist to perform lossy compression of read names and removal of certain auxiliary tags.

Download and Installation

Crumble can be obtained from https://github.com/jkbonfield/crumble.

Learn and Support

To report problems, please use the GitHub issue tracker.

License and Citation

Crumble uses the BSD Open Source software license.

Authors

Sanger Contributors

Publications

  • Crumble: reference free lossy compression of sequence quality values.

    Bonfield JK, McCarthy SA and Durbin R

    Bioinformatics (Oxford, England) 2019;35;2;337-339