Archived

Vertebrate Resequencing

Informatics

Archive Page

This page is maintained as a historical record and is no longer being updated.

The Vertebrate Resequencing group has become the Tree of Life Assembly group to support the work of the Tree of Life Programme.

We were a team of bioinformaticians, software developers and genomic data scientists primarily responsible for the informatics and large-scale sequencing projects for the Durbin and Adams groups.

We played lead or key roles in the data processing and analysis of large-scale sequencing projects, including 1000 Genomes, the Mouse Genomes Project, UK10K, HipSci and the Haplotype Reference Consortium.

In collaboration with the Durbin and GRIT groups at the Sanger Institute, along with a number of external partners, we joined the Vertebrate Genomes Project and Genome 10K to begin producing genome assemblies for hundreds to thousands of species, using cutting-edge long-read sequencing technologies like PacBio, Oxford Nanopore and 10x alongside Illumina.

Software

We developed tools and software to meet our data management and analysis needs at scale.

BCFtools is a set of tools for variant calling and manipulating variant data stored in VCF and BCF files. We also contributed to the development of HTSlib and SAMtools.

We developed pipelines and pipeline management systems to track and process our data. The 1000 Genomes and UK10K projects were made possible using the VRPipe and vr-runner systems. With the Sanger Institute recently moving to a cloud-oriented compute infrastructure, we are developing a new workflow runner (wr) system.

Services

As part of our work with the Haplotype Reference Consortium, we developed a free genotype imputation and phasing service, the Sanger Imputation Service.

Our people

Group lead

Dr Shane A. McCarthy

Tree of Life Assembly Team Lead

Shane leads the Vertebrate Resequencing team, which is responsible for handling the informatics and large-scale sequencing projects for the Durbin and Adams groups.

Core team

Mr Sendu Bala

Principal Software Developer

Previous core team members

Dr Dirk-Dominik Dolle

Senior Bioinformatician

Yasin Memari

Senior Bioinformatician

Associated research

Collaborations

Collaboration

Haplotype Reference Consortium

The Haplotype Reference Consortium (HRC) was a collaboration to create a large reference panel of human haplotypes by combining together sequencing ...

Collaboration

HipSci

Hundreds of induced pluripotent stem cell lines for cellular genetic analysis

Collaboration

UK10K Project

The UK10K project enabled researchers in the UK and beyond to better understand the link between low-frequency and rare genetic changes, ...

Tools & software

Tool

SAMtools / BCFtools / HTSlib

SAMtools, BCFtools and HTSlib are tools for manipulating sequence alignment (SAM, BAM, CRAM) and variant call (VCF and BCF) files.

Tool

Sanger Imputation Service

A free genotype imputation and phasing service provided by the Wellcome Sanger Institute.

Data

Data set

1000 Genomes

The 1000 Genomes Project developed a new map of the human genome at a resolution that was unmatched by other ...

Data set

Mouse Genomes Project

The Mouse Genomes Project is an ongoing effort to catalog all forms of genetic variation between the common laboratory mouse strains ...

Related groups

Science group

Adams Group

Somatic Functional Genomics and Cancer

We are a team of cancer biologists, geneticists and computational biologists interested in understanding how cancers develop and the ways of ...

Science group

Durbin Group

Computational Genomics

Population and evolutionary genomics, novel computational genomics methods, and related mathematical and statistical models.

Science group

Genome Reference Informatics Team

Tree of Life Programme

The Genome Reference Informatics Team analyses genome assemblies to reveal and correct quality issues and to identify and add variation. It ...

Science group

New Pipeline Group (NPG)

Sequencing Informatics

NPG is responsible for the delivery of DNA Pipelines' data products and the provision of informatics expertise and QC systems.

Science group

Sequence Analysis and Management (SAM)

Science Support - Informatics and Digital Solutions

The Sequence Analysis and Management team contributes to various software packages for processing DNA sequence data, including samtools, htslib, biobambam and ...

Science group

Tyler-Smith Group

Human evolution

We study variation in the DNA of people from different parts of the world, and also in related species such as ...
