Arthur Gilly

Principal Bioinformatician


This person is a member of Sanger Institute Alumni.

Arthur worked in the Human Genetics department from August 2013 to November 2018 in Ele Zeggini's lab. He focused on the development of new methods to empower the discovery of actionable targets for the treatment of common human diseases. He managed analytical projects, designed pipelines, developed new statistical methods and performed bioinformatic follow-up of association discoveries.

His main focus was on the analysis of sequencing data from the HELIC study (ca. 3000 participants). First, he worked on very low-depth (1x) data, whose interpretation posed numerous analytical challenges due to the high level of missingness and low signal-to-noise ratio among variant calls. He established a quality control and imputation pipeline for very low-depth sequence data, and carried out association analysis with over 60 quantitative traits. He performed a thorough evaluation of the quality of such sequencing by comparing it to other genotyping methods, in order to guide future study design choices.

Then, he transitioned to high-depth (>15x) sequencing data on the same samples, which opened up new analytical perspectives and challenges. In particular, high depth allows reliable calling of rare variants, whose contribution to the aetiology of complex traits is not currently known. With Daniel Suveges, he developed a rare variant association study (RVAS) analysis pipeline and applied it to genome-wide gene-based tests on the HELIC high-depth WGS data. This provided evidence for a combined role of rare and low-frequency exonic and regulatory variants.

Software development was an important part of Arthur’s responsibilities and interests. He wrote software with Daniel Suveges and applied it to the large pool of phenotypes available within the HELIC project.

I am interested in every step of the omics data lifecycle, from acquisition and quality control, to statistical analysis and translation of results into clinical applications.

My background is in applied mathematics, so my main area of expertise is developing and applying statistical methods to analyse large amounts of data. Recently, I developed a piece of software that corrects for overlapping samples in large-scale meta-analyses of genetic association studies, as well as a copy number variant caller that works on SNP genotypes (as opposed to sequencing read data). Feel free to browse the team’s GitHub account or my own for interesting data visualisation and analysis tools.
