Data Quality Control (QC)
The areas in which the Data QC team are involved are as follows:
Data QC of all Illumina® Runs
DNA Pipelines Operations can receive thousands of incoming samples weekly, which are usually varied and often complex. The Data QC team are experienced at QC analysis of everything from high-throughput whole genome and transcriptome sequencing through to more challenging samples such as G&T sequencing, bisulphite, Hi-C, GC/AT-rich, cancerous, malarial, single cell, pull-down from custom baits and custom primer sequencing. Each of these sample types has its own characteristics, so the team needs a broad knowledge base. Each sample type must be appropriately understood so that we can manage customers’ expectations and deliver the best output whilst considering any limitations of the given platform/process combination.
Monitoring standards and data outputs
The Data QC team work closely with our in-house Scientific Service Representatives (SSRs), Production Software Development and DNA Pipelines Informatics teams, and externally with Illumina®, both to ensure that the data we release is the best that our customers can expect within their own research aims and to drive further improvements in the output and quality of data. This can include evaluating new reagents or platforms, but also examining trends within our processes and within each of our platforms: NovaSeq 6000, HiSeq X Ten, HiSeq 4000 and MiSeq. These processes require continual re-evaluation, which is vital to ensure standards are maintained.
Troubleshooting of problematic runs
Whilst we have robust systems in place, problems are inevitably encountered within sequencing runs and identified at the QC stage. We identify and understand the nature of each issue in order to determine the best course of action for resolution, working closely with other teams within DNA Pipelines. Our troubleshooting ensures that the correct data is subsequently sent to the appropriate customer, so they can have high confidence in using it.
25 Genomes for 25 Years
The project's primary goal was to sequence 25 novel genomes representing UK biodiversity, as part of the Wellcome Sanger Institute' ...
DNA Pipelines Informatics
The Sequencing Informatics group ensures that the harvesting, storage and analysis of DNA genotype and sequence information at the Sanger Institute ...
High-Throughput DNA Sequencing
The High Throughput DNA sequencing team within DNA Pipelines Operations is a highly automated high throughput team specialising in producing libraries ...
High-Throughput RNA and Laser Capture Microdissection Biopsy (LCMB) Sequencing
The High Throughput RNA and LCMB Sequencing team within DNA Pipelines Operations is a high throughput Illumina® library creation team ...
Long Read Sequencing
The long read sequencing team within DNA Pipelines Operations at the Wellcome Sanger Institute provides support for research projects requiring ...
New Pipeline Group (NPG)
DNA Pipelines Informatics
NPG is responsible for DNA Pipelines' production informatics analysis pipelines, Illumina sequencing QC tools and expertise, and internal archiving of ...