Vertebrate Resequencing | Informatics

Vertebrate Resequencing

Our Research and Approach

We are a team of bioinformaticians, software developers and genomic data scientists, primarily responsible for informatics and large-scale sequencing projects for the Durbin and Adams groups.


We have played lead or key roles in the data processing and analysis of large-scale sequencing projects, including 1000 Genomes, the Mouse Genomes Project, UK10K, HipSci and the Haplotype Reference Consortium.

Recently, in collaboration with the Durbin and GRIT groups at the Sanger Institute and a number of external partners, we have joined the Vertebrate Genomes Project and Genome 10K to begin producing genome assemblies for hundreds to thousands of species, using cutting-edge long-read sequencing technologies such as PacBio, Oxford Nanopore and 10x alongside Illumina.


We develop tools and software to meet our data management and analysis needs at scale.

BCFtools is a set of tools for variant calling and manipulating variant data stored in VCF and BCF files. We also contribute to the development of HTSlib and SAMtools.

We develop pipelines and pipeline management systems to track and process our data. The 1000 Genomes and UK10K projects were made possible by the VRPipe and vr-runner systems. With the Sanger Institute recently moving to a cloud-oriented compute infrastructure, we are developing a new workflow runner (wr) system.


As part of our work with the Haplotype Reference Consortium, we have developed a free genotype imputation and phasing service, the Sanger Imputation Service.


Dr Shane A. McCarthy
Group Leader

Shane leads the Vertebrate Resequencing team, which is responsible for informatics and large-scale sequencing projects for the Durbin and Adams groups.


Publications


  • Birth, expansion, and death of VCY-containing palindromes on the human Y chromosome.

    Shi W, Massaia A, Louzada S, Handsaker J, Chow W et al.

    Genome biology 2019;20;1;207

  • Crumble: reference free lossy compression of sequence quality values.

    Bonfield JK, McCarthy SA and Durbin R

    Bioinformatics (Oxford, England) 2019;35;2;337-339

  • BCFtools/csq: haplotype-aware variant consequences.

    Danecek P and McCarthy SA

    Bioinformatics (Oxford, England) 2017;33;13;2037-2039

  • Rare Variant Analysis of Human and Rodent Obesity Genes in Individuals with Severe Childhood Obesity.

    Hendricks AE, Bochukova EG, Marenne G, Keogh JM, Atanassova N et al.

    Scientific reports 2017;7;1;4394

  • Enrichment of low-frequency functional variants revealed by whole-genome sequencing of multiple isolated European populations.

    Xue Y, Mezzavilla M, Haber M, McCarthy S, Chen Y et al.

    Nature communications 2017;8;15927

  • Common genetic variation drives molecular heterogeneity in human iPSCs.

    Kilpinen H, Goncalves A, Leha A, Afzal V, Alasoo K et al.

    Nature 2017;546;7658;370-375

  • Whole-Genome Sequencing Coupled to Imputation Discovers Genetic Signals for Anthropometric Traits.

    Tachmazidou I, Süveges D, Min JL, Ritchie GRS, Steinberg J et al.

    American journal of human genetics 2017;100;6;865-884

  • Whole-genome view of the consequences of a population bottleneck using 2926 genome sequences from Finland and United Kingdom.

    Chheda H, Palta P, Pirinen M, McCarthy S, Walter K et al.

    European journal of human genetics : EJHG 2017;25;4;477-484

  • Exploring the genetic architecture of inflammatory bowel disease by whole-genome sequencing identifies association at ADCY7.

    Luo Y, de Lange KM, Jostins L, Moutsianas L, Randall J et al.

    Nature genetics 2017;49;2;186-192

  • Using reference-free compressed data structures to analyze sequencing reads from thousands of human genomes.

    Dolle DD, Liu Z, Cotten M, Simpson JT, Iqbal Z et al.

    Genome research 2017;27;2;300-309

  • Whole-exome sequencing of 228 patients with sporadic Parkinson's disease.

    Sandor C, Honti F, Haerty W, Szewczyk-Krolikowski K, Tomlinson P et al.

    Scientific reports 2017;7;41188

  • Discovery and refinement of genetic loci associated with cardiometabolic risk using dense imputation maps.

    Iotchkova V, Huang J, Morris JA, Jain D, Barbieri C et al.

    Nature genetics 2016;48;11;1303-1312

  • Reference-based phasing using the Haplotype Reference Consortium panel.

    Loh PR, Danecek P, Palamara PF, Fuchsberger C, A Reshef Y et al.

    Nature genetics 2016;48;11;1443-1448

  • A reference panel of 64,976 haplotypes for genotype imputation.

    McCarthy S, Das S, Kretzschmar W, Delaneau O, Wood AR et al.

    Nature genetics 2016;48;10;1279-83

  • Whole-exome sequencing in an isolated population from the Dalmatian island of Vis.

    Jeroncic A, Memari Y, Ritchie GR, Hendricks AE, Kolb-Kokocinski A et al.

    European journal of human genetics : EJHG 2016;24;10;1479-87

  • TTC25 Deficiency Results in Defects of the Outer Dynein Arm Docking Machinery and Primary Ciliary Dyskinesia with Left-Right Body Asymmetry Randomization.

    Wallmeier J, Shiratori H, Dougherty GW, Edelbusch C, Hjeij R et al.

    American journal of human genetics 2016;99;2;460-9

  • DNAH11 Localization in the Proximal Region of Respiratory Cilia Defines Distinct Outer Dynein Arm Complexes.

    Dougherty GW, Loges NT, Klinkenbusch JA, Olbrich H, Pennekamp P et al.

    American journal of respiratory cell and molecular biology 2016;55;2;213-24

  • Deficient methylation and formylation of mt-tRNA(Met) wobble cytosine in a patient carrying mutations in NSUN3.

    Van Haute L, Dietmann S, Kremer L, Hussain S, Pearce SF et al.

    Nature communications 2016;7;12039

  • Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences.

    Poznik GD, Xue Y, Mendez FL, Willems TF, Massaia A et al.

    Nature genetics 2016;48;6;593-9

  • Health and population effects of rare gene knockouts in adult humans with related parents.

    Narasimhan VM, Hunt KA, Mason D, Baker CL, Karczewski KJ et al.

    Science (New York, N.Y.) 2016;352;6284;474-7

  • Deep Roots for Aboriginal Australian Y Chromosomes.

    Bergström A, Nagle N, Chen Y, McCarthy S, Pollard MO et al.

    Current biology : CB 2016;26;6;809-13

  • A Method for Checking Genomic Integrity in Cultured Cell Lines from SNP Genotyping Data.

    Danecek P, McCarthy SA, HipSci Consortium and Durbin R

    PloS one 2016;11;5;e0155014

  • An interactive genome browser of association results from the UK10K cohorts project.

    Geihs M, Yan Y, Walter K, Huang J, Memari Y et al.

    Bioinformatics (Oxford, England) 2015;31;24;4029-31

  • A global reference for human genetic variation.

    1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP et al.

    Nature 2015;526;7571;68-74

  • An integrated map of structural variation in 2,504 human genomes.

    Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A et al.

    Nature 2015;526;7571;75-81

  • The UK10K project identifies rare variants in health and disease.

    UK10K Consortium, Walter K, Min JL, Huang J, Crooks L et al.

    Nature 2015;526;7571;82-90

  • Homozygous loss-of-function variants in European cosmopolitan and isolate populations.

    Kaiser VB, Svinti V, Prendergast JG, Chau YY, Campbell A et al.

    Human molecular genetics 2015;24;19;5464-74

  • Immunofluorescence Analysis and Diagnosis of Primary Ciliary Dyskinesia with Radial Spoke Defects.

    Frommer A, Hjeij R, Loges NT, Edelbusch C, Jahnke C et al.

    American journal of respiratory cell and molecular biology 2015;53;4;563-73

  • Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture.

    Zheng HF, Forgetta V, Hsu YH, Estrada K, Rosello-Diez A et al.

    Nature 2015;526;7571;112-7

  • Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel.

    Huang J, Howie B, McCarthy S, Memari Y, Walter K et al.

    Nature communications 2015;6;8111

  • Deficiency of ECHS1 causes mitochondrial encephalopathy with cardiac involvement.

    Haack TB, Jackson CB, Murayama K, Kremer LS, Schaller A et al.

    Annals of clinical and translational neurology 2015;2;5;492-509

  • Ascl1 Coordinately Regulates Gene Expression and the Chromatin Landscape during Neurogenesis.

    Raposo AASF, Vasconcelos FF, Drechsel D, Marie C, Johnston C et al.

    Cell reports 2015;10;9;1544-1556

  • Whole-genome sequence-based analysis of thyroid function.

    Taylor PN, Porcu E, Chew S, Campbell PJ, Traglia M et al.

    Nature communications 2015;6;5681

  • A rare variant in APOC3 is associated with plasma triglyceride and VLDL levels in Europeans.

    Timpson NJ, Walter K, Min JL, Tachmazidou I, Malerba G et al.

    Nature communications 2014;5;4871

  • Genomic and phenotypic characterization of a wild medaka population: towards the establishment of an isogenic population genetic resource in fish.

    Spivakov M, Auer TO, Peravali R, Dunham I, Dolle D et al.

    G3 (Bethesda, Md.) 2014;4;3;433-45

  • Phenotypic spectrum of eleven patients and five novel MTFMT mutations identified by exome sequencing and candidate gene screening.

    Haack TB, Gorza M, Danhauser K, Mayr JA, Haberberger B et al.

    Molecular genetics and metabolism 2014;111;3;342-52

  • A calibrated human Y-chromosomal phylogeny based on resequencing.

    Wei W, Ayub Q, Chen Y, McCarthy S, Hou Y et al.

    Genome research 2013;23;2;388-95

  • Jdp2 downregulates Trp53 transcription to promote leukaemogenesis in the context of Trp53 heterozygosity.

    van der Weyden L, Rust AG, McIntyre RE, Robles-Espinoza CD, del Castillo Velasco-Herrera M et al.

    Oncogene 2013;32;3;397-402

  • Seventy-five genetic loci influencing the human red blood cell.

    van der Harst P, Zhang W, Mateo Leach I, Rendon A, Verweij N et al.

    Nature 2012;492;7429;369-75

  • An integrated map of genetic variation from 1,092 human genomes.

    1000 Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA et al.

    Nature 2012;491;7422;56-65

  • Insights into hominid evolution from the gorilla genome sequence.

    Scally A, Dutheil JY, Hillier LW, Jordan GE, Goodhead I et al.

    Nature 2012;483;7388;169-75

  • Genes contributing to pain sensitivity in the normal population: an exome sequencing study.

    Williams FM, Scollen S, Cao D, Memari Y, Hyde CL et al.

    PLoS genetics 2012;8;12;e1003095

  • New gene functions in megakaryopoiesis and platelet formation.

    Gieger C, Radhakrishnan A, Cvejic A, Tang W, Porcu E et al.

    Nature 2011;480;7376;201-8

  • Path to fracture in granular flows: dynamics of contact networks.

    Herrera M, McCarthy S, Slotterback S, Cephas E, Losert W and Girvan M

    Physical review. E, Statistical, nonlinear, and soft matter physics 2011;83;6 Pt 1;061303

  • A novel function of the proneural factor Ascl1 in progenitor proliferation identified by genome-wide characterization of its targets.

    Castro DS, Martynoga B, Parras C, Ramesh V, Pacary E et al.

    Genes & development 2011;25;9;930-45

  • Large-scale analysis of the regulatory architecture of the mouse genome with a transposon-associated sensor.

    Ruf S, Symmons O, Uslu VV, Dolle D, Hot C et al.

    Nature genetics 2011;43;4;379-86

  • The light responsive transcriptome of the zebrafish: function and regulation.

    Weger BD, Sahinbas M, Otto GW, Mracek P, Armant O et al.

    PloS one 2011;6;2;e17080

  • NPC-db, a Niemann-Pick type C disease gene variation database.

    Runz H, Dolle D, Schlitter AM and Zschocke J

    Human mutation 2008;29;3;345-50