Pipeline Solutions

Overview

Delivering the software solutions, programs and workflows to enable seamless, standardized, reproducible storage of, and access to, the data produced by the scientific experiments, large-scale cellular generation and analysis, and DNA sequencing operations of the Sanger Institute.

The Pipeline Solutions Programme is led by Graham Beaver, Associate Director of Pipeline Solutions.

The department works in close partnership with Sanger’s Research Programmes and Scientific Operations. Pipeline Solutions teams work in an agile way to run and improve existing data pipelines and workflows, and to rapidly onboard and scale up new solutions.

Key Functions

The key functions of the Pipeline Solutions Division are:

  • Designing, developing, and operating new and existing pipelines for the Sanger Institute’s research programmes and supporting scientific operations. This work includes horizon-scanning and advising on emerging technologies and approaches to deliver new pipelines in the fields of spatial-omics and synthetic genomics.
  • Delivering informatics services that align the data, informatics and infrastructure of the Institute’s research programmes and scientific operations to enable researchers to exploit the opportunities that these rich data offer.
  • Designing, developing, and supporting new and existing Laboratory Information Management Systems (LIMS) and Electronic Lab Notebooks (ELN), with improved reporting and search functions.

Capabilities and Products

The Pipeline Solutions division seeks to add value to Sanger’s science by providing a range of Data Products, Platform and Process Products, and Knowledge Products:

Data Products

  • QC Data Product – developing and delivering QC Metrics, and providing interpretation as required, relating to the results of the Institute’s DNA Sequencing Operations.
  • Sequence Data Product – covering the full life cycle of Sanger’s sequencing data, from creation through to disposal. This includes ongoing maintenance, such as changes to the metadata while in centralised storage, along with any special sequencing requests.
  • Post Sequencing Analysis – working with the science teams across all research programmes to identify standard medium- to high-volume post-sequencing analysis requirements that can be delivered centrally. We design our solutions with the expectation that they will develop into Multi-Omics approaches.
  • Data Publishing – developing and delivering processes for publishing data to internal and external databases and partners, for example to EMBL-EBI’s ENA and EGA databases and archives.
  • Cell-Line Data – providing relevant data to both the Institute’s research programmes and scientific operations teams. This is achieved by delivering cell line and associated data to Sanger’s researchers, and providing related process data for cell lines.
  • CRISPR Target Data – assisting the design of CRISPR-based experiments by providing solutions for off-target scoring and for visualising target sites in their genomic context.

Platform and Process Products

  • Core LIMS & ELN Platform – covering development/configuration, operation, and support of the LIMS Platform for both Cellular and Sequencing, including Sample Management, Sample Tracking, Registry and Inventory, Task Flow, Barcoding, Reporting and Dashboarding, Search, and Audit.
  • Sequencing Specific LIMS Processes – configuring the LIMS to meet the requirements of Sequencing Specific processes, built upon the Core LIMS Platform, covering both Long Read and Short Read.
  • Cellular Specific LIMS Processes – configuring the LIMS to meet the requirements for Cellular Biology specific processes, built upon the Core LIMS Platform.
  • Spatial Specific LIMS Processes – configuring the LIMS to meet the requirements for Spatial specific processes, built upon the Core LIMS Platform.
  • Sample / Inventory Tracking and Management LIMS Processes – configuring the LIMS to meet the requirements for Sample Tracking and Inventory Tracking and Management specific processes, built upon the Core LIMS Platform.
  • Lab Automation Platform Support – configuring Lab Automation platforms (lab robotics) to support the scaling of scientific operations.

Knowledge Products

  • Support to R&D – including for the introduction of new Sequencing Technologies and Cellular Operations Processes.
  • Sequencing Consultancy and Support – particularly in support of investigations into issues within sequencing operations. This includes the use of Machine Learning to assist in investigations and to create early detection methods.

Our Approach

We support and empower our staff to deliver our services by:

  • providing the tools and training that enable them to perform their roles efficiently
  • encouraging and providing space for innovation
  • promoting and driving digital transformation
  • creating a culture where our members are supported to fully engage with, and understand the needs of, our researchers and scientific operations colleagues through active partnerships.

Innovation

As well as providing ongoing support for the sequencing and informatics operations of the Wellcome Sanger Institute, we actively support and advise on a wide range of areas of Research and Innovation for the Institute’s Research Programmes and Scientific Operations. We focus on scaling up informatics operations through Pipelines, and on control through the LIMS platforms, and are supporting the following areas of innovation:

  • Spatial and Multi-Omics – the growth of imagery and Spatial analysis within the Institute has resulted in a number of new opportunities. These include the creation of Spatial Analysis pipelines, the management of spatial data processes within the LIMS Platform, and analysis algorithms and AI/ML approaches to support the upscaling of this capability, as well as the ability to link the Spatial and Sequencing Datasets (Multi-Omics) for new analysis techniques;
  • Machine Learning / Artificial Intelligence (ML/AI) – advances in both ML and AI in bioinformatics are being embraced across the Wellcome Sanger Institute. ML/AI within Informatics Pipelines will be enhanced to provide greater insight into, and classification of, results. By mining the data points contained within the LIMS system, it will also provide insight into processes, including increasing efficiencies and finding the causes of failures within our operations;
  • Robotics and Lab Automation – the adoption of Robotics and Lab Automation approaches will continue, with further integration with the LIMS environment.