BlobToolKit

BlobToolKit is a software suite to aid researchers in identifying and isolating non-target data in draft and publicly available genome assemblies.

BlobToolKit processes assembly, read and analysis files for fully reproducible interactive exploration in the browser-based Viewer. It can also be used during genome sequence assembly to filter non-target DNA, helping researchers produce assemblies with high biological credibility.

About

Sequencing instruments cannot tell which species a DNA fragment came from, so the data produced for a target species may include contaminant DNA from other organisms, for example bacteria or parasites. Contaminants can be introduced during the processing of samples or co-extracted alongside the target DNA. If insufficient care is taken during the assembly process, the final assembled genome may be a muddle of data from several species. Such assemblies can confound sequence-based biological inference and, when deposited in public databases, may be included in later analytical research by users unaware of the underlying problems.

We have been running an automated BlobToolKit pipeline on eukaryotic assemblies publicly available in the International Nucleotide Sequence Database Collaboration (INSDC) and are making the results available through a public instance of the Viewer at https://blobtoolkit.genomehubs.org/view. We aim to complete analysis of all publicly available genomes and then keep pace with new genomes as they are released. We have worked to embed these views into the presentation of genome assemblies at the European Nucleotide Archive, providing an indication of assembly quality alongside the public record, with links out to allow full exploration in the Viewer.

Further information

Browse our list of Tutorials to learn how to use the Viewer, then visit our public instance at blobtoolkit.genomehubs.org/view. To start using BlobToolKit on your own datasets, check out our open-source code on GitHub.

Contact

If you need help or have any queries, please contact us using the details below.

Email us at blobtoolkit@genomehubs.org, or reach us on Twitter @rjchallis or @blaxterlab.


Sanger Institute Contributors

Professor Mark Blaxter

Programme Lead for Tree of Life Programme and Senior Group Leader

Dr Richard Challis

Senior Bioinformatician
