Cellular Genetics Informatics
We provide informatics support for the Human Cell Atlas, Human Induced Pluripotent Stem Cell Initiative and Human BioMolecular Atlas Program. We work with large amounts of single-cell RNA sequencing and microscopy imaging data.
More information about our group is available in our Travel Guide.
The team:
- provides comprehensive bioinformatics support for the Cellular Genetics programme
- develops and supports single-cell Nextflow analysis pipelines for various experimental protocols, including 10x, Smart-seq2, scATAC-seq, TraCeR/BraCeR, CITE-seq and inDrop. All of our pipelines have full HPC (LSF) and OpenStack Cloud integrations.
- enables data sharing with external collaborators
- facilitates data submission to ArrayExpress
- maintains and supports the programme’s Jupyter Hub, which is dedicated to secondary data analysis. We provide software support and a set of standard analysis notebooks, including Seurat, scanpy and single-cell data integration notebooks
- enables data access for external collaborators via Jupyter Hub
- develops the programme’s internal Web portal, which includes a multiuser sample tracker, the ability for users to run our pipelines from the web, and integration with our Jupyter Hub
- develops and supports the Cellular Genetics Imaging portal, which allows users both to run the imaging analysis pipelines and to visualise their results
- develops GPU-accelerated imaging pipelines
- deploys bespoke and publication-supporting websites containing models and concise data visualisations
- organises user workshops on a range of topics, including Git, Jupyter Notebooks, Docker/Cloud, RStudio/Shiny and Nextflow.
Illustrator: Christina Usher
Our Software/Analysis Stack
We run our pipelines and perform analysis on Sanger’s High Performance Compute cluster (thousands of cores orchestrated by LSF) and on Sanger’s OpenStack Flexible Compute environment (a private OpenStack cloud with thousands of cores orchestrated by Kubernetes). We use the following software/analysis stack (more information is available in our GitHub organisation):
- Back End: Kubernetes, Docker, Singularity, Terraform, Ansible
- Interactive Analysis Environments: Jupyter Hub
- Pipeline runners: Nextflow
- Secondary Analysis: R, Python, C, Bash
- Imaging: OMERO, Bio-Formats, Fiji, CellProfiler, starfish
Our team is growing; please check the Sanger vacancies website for more information.
We have had excellent experiences with student internships and apprenticeships. We welcome students with their own funding (with the possibility of a top-up) to work on both infrastructure and research projects.