Vertebrate Resequencing Informatics

We are a team of bioinformaticians, software developers and genomic data scientists primarily responsible for informatics and large-scale sequencing projects for the Durbin and Adams groups.

Projects

We have played lead or key roles in the data processing and analysis of large-scale sequencing projects, including 1000 Genomes, the Mouse Genomes Project, UK10K, HipSci and the Haplotype Reference Consortium.

Recently, in collaboration with the Durbin and GRIT groups at the Sanger Institute and a number of external partners, we have joined the Vertebrate Genomes Project and Genome 10K to begin producing genome assemblies for hundreds to thousands of species, using cutting-edge long-read sequencing technologies such as PacBio, Oxford Nanopore and 10x alongside Illumina.

Software

We develop tools and software to meet our data management and analysis needs at scale.

BCFtools is a set of tools for variant calling and manipulating variant data stored in VCF and BCF files. We also contribute to the development of HTSlib and SAMtools.

We develop pipelines and pipeline management systems to track and process our data. The 1000 Genomes and UK10K projects were made possible by the VRPipe and vr-runner systems. With the Sanger Institute recently moving to a cloud-oriented compute infrastructure, we are developing a new workflow runner system, wr.

Services

As part of our work with the Haplotype Reference Consortium, we have developed a free genotype imputation and phasing service, the Sanger Imputation Service.

Core team

Mr Sendu Bala

Senior Software Developer

Previous team members

Yasin Memari

Senior Bioinformatician

Dr Dirk-Dominik Dolle

Senior Bioinformatician

 
