Alumni

This person is a member of Sanger Institute Alumni.

Worked at the Wellcome Sanger Institute from January 10th 2022 to November 28th 2024

Bioinformatics scientist involved in the development and automation of bioinformatics pipelines for processing and analysing genomic data.

Background

Undergraduate
I completed my undergraduate degree in Biomedical Science at the University of Lincoln, where for my final-year thesis project I investigated the impact of interleukin-6 polymorphisms on stroke risk. This involved genotyping a large number of samples for particular single nucleotide polymorphisms; each sample had an associated ultrasound measurement of carotid intima-media thickness. I then applied a series of statistical models to test for associations between these genotypes, pack years of smoking, and measured carotid intima-media thickness, correcting for multiple comparisons and other related factors (e.g. age). This project greatly encouraged me to take my further studies in the direction of exploring data.

Postgraduate
I completed my Master’s degree in Bioinformatics at the University of Bristol, where for my thesis project I developed a bioinformatics pipeline to detect, and assess the significance of, changes in mRNA abundance, poly-A tail length and epitranscriptomic modifications in long-read nanopore sequencing datasets from yeast (wild type & modified to synthesise beta-carotene). This is where my main interest in working with bioinformatics pipelines was sparked, as well as my fascination with long-read sequencing.

Bioinformatician at the Wellcome Sanger Institute

Whilst working at the Wellcome Sanger Institute, I became heavily involved in the development of scalable Nextflow bioinformatics pipelines and analysis codebases for malaria parasite and vector genomic surveillance. In addition to this, I contributed to the testing, optimisation and maintenance of these pipelines, which were routinely used to produce large volumes of data for partners. My role eventually expanded to include analytical responsibilities, such as validation of data produced by our codebases, as well as ad hoc data analysis tasks for wet & dry lab scientists.

Key Achievements:

Here is a summary of some of my key achievements whilst working as a bioinformatician at the Wellcome Sanger Institute:

  • Contributing to the process of writing, optimising, testing, validating, & operationalising Nextflow pipelines that:
    • Produce malaria parasite (Plasmodium falciparum & Plasmodium vivax) alignments, variant calls, & genetic report cards which detail various anti-malarial drug resistance phenotypes from amplicon sequencing data.
    • Produce high quality malaria parasite (Plasmodium falciparum & Plasmodium vivax) variant calls & alignments from whole genome sequencing data.
    • Produce high quality malaria vector (Anopheles gambiae & Anopheles funestus) variant calls & alignments from whole genome sequencing data.
    • Identify viruses in bait capture samples (SARS-CoV-2, Influenza, and RSV in particular), & where possible generate high quality consensus sequences.
  • Greatly reducing the data turnaround time for the Plasmodium falciparum whole genome sequencing monthly release & the malaria amplicon sequencing genetic report card curation processes.
  • Carrying out analysis work comparing lineage calls & sequencing metrics, which resulted in SARS-CoV-2 sequencing being switched over to the Illumina MiSeq platform (saving £16,000 per month).
  • Using our codebases to regularly produce large volumes of data for the MalariaGEN Pf7 & Ag1000G projects, in addition to many curated resistance phenotype genetic report cards for partners.

My timeline