Cellular Genetics Informatics
We provide informatics support for the Human Cell Atlas, Human Induced Pluripotent Stem Cell Initiative and Human BioMolecular Atlas Program. We work with large amounts of single-cell RNA sequencing and microscopy imaging data.
More information about our group is available in our Travel Guide.
- provides comprehensive bioinformatics support for the Cellular Genetics programme
- develops and supports single-cell Nextflow analysis pipelines for various experimental protocols, including 10X, Smartseq2, scATACseq, TraCeR/BraCeR, CITEseq and inDrop. All of our pipelines have full HPC (LSF) and OpenStack Cloud integrations and are available on GitHub
- enables data sharing with external collaborators
- facilitates data submission to ArrayExpress
- maintains and supports the programme’s Jupyter Hub, which is dedicated to secondary data analysis. We provide software support and a set of standard analysis notebooks, including Seurat, scanpy and single-cell data integration notebooks
- enables data access for external collaborators via Jupyter Hub
- develops the programme’s internal Web portal, which includes a multiuser sample tracker, the ability for users to run our pipelines from the web, and integration with our Jupyter Hub
- develops and supports the Cellular Genetics Imaging portal which allows the users to both run the imaging analysis pipelines and visualise their results
- develops GPU-accelerated imaging pipelines
- deploys bespoke and publication-supporting websites containing models and concise data visualisations
- organises workshops for users on different topics, including Git, Jupyter Notebooks, Docker/Cloud, RStudio/Shiny and Nextflow
Illustrator: Christina Usher
Our Software/Analysis Stack
We run our pipelines and perform analysis on the Sanger’s High Performance Compute cluster (thousands of cores orchestrated by LSF) and the Sanger’s OpenStack Flexible Compute environment (private OpenStack cloud with thousands of cores orchestrated by Kubernetes). We use the following software/analysis stack (more information is in our GitHub organisation):
- Back End: Kubernetes, Docker, Singularity, Python
- Interactive Analysis Environments: Jupyter Hub
- Pipeline runners: Nextflow
- Secondary Analysis: R, Python, C, bash
- Imaging: OMERO, Bio-Formats, Fiji, CellProfiler, StarFish
Our team is growing. Please check the Sanger vacancies website for more information.
We have had excellent experiences with student internships and apprenticeships. We welcome students with their own funding (with a possibility of topping it up) to work on both infrastructure and research projects.