Fighting COVID-19 with data: Callum Saint

The work to track the transmission of COVID-19 across the UK, and monitor for new variants of the SARS-CoV-2 virus that causes it, is a major undertaking that draws on a wide range of skills, experience and expertise.

Thousands of COVID samples, and their associated data, arrive at the Sanger Institute every day. They flow through the Institute, their genomes are sequenced and the resulting data are cleaned, checked, analysed and uploaded to public databases, together with associated metadata, within days.

Teams – many brand new – including data engineers, curators, managers, analysts and software developers, keep that data flowing.

To find out more about how our work contributes to the national COVID-19 effort, we spoke to a number of people at the Sanger Institute about their roles, their experience of working at the Sanger Institute, and the plans for their teams as data becomes one of the most important weapons in the fight against SARS-CoV-2.

Callum Saint: Senior Data Curation and Distribution Manager

Callum joined the Sanger Institute in November 2020 to support the COVID-19 genomic surveillance work.

Tell us about your role

“My team is responsible for the COVID sample metadata.

“I am building systems to bring all the metadata together, to make sure that people across the Institute can easily access the data they need and it’s ‘analysis ready’ for researchers. I work with the public health agencies of the UK, NHS Digital, the Department of Health and COG-UK where I’m trying to coordinate all of the incoming and outgoing data sources. I’m putting in infrastructure and laying the groundwork at the moment to support these processes.

“My other remit is in data governance.”

What technologies are you using?

“We’re using a variety of technologies through the teams.

“We primarily work with virtual machines in Linux environments along with multiple open source and licensed products. These include Elasticsearch, MySQL, the Apache ecosystem, MongoDB and more.

“Depending on the data involved we would pick one technology over another, such as genetic data vs metadata.”

What are the opportunities at the Sanger Institute?

“As we expand the teams, we’re putting in place roles where people will be able to progress. There will be training, and I want people to be credited.”

What motivates you?

“Part of our work, alongside the COG-UK consortium, is identifying variants of concern. We only know these exist and can monitor them because of our genomic surveillance capabilities.

“The UK is world leading in this area with Sanger at the forefront and, because we have this in-built capacity, we almost take it for granted.

“Building near real-time genomic surveillance systems that are helping the pandemic response and, in turn, saving lives is very rewarding.”

What attracted you to the Sanger Institute?

“A few things, but firstly it was the COVID work, definitely. The work we’re doing has helped identify variants of concern.

“It may sound crude but it’s waking up in the morning, hearing the news and knowing that we really are making a difference. Our work is directly feeding into Public Health England, parts of the NHS and the Department of Health to help the pandemic response and save lives.

“The other thing is the Campus. I only started a few months ago and I’m working remotely, but I can’t wait to get there, it looks lovely! Also, it’s the people. Everyone is really nice – they have a job to do, and it’s obvious they are passionate about it.

“I’m passionate about my work too, so I knew it was going to be a good place for me.”