Institute researcher is sequence-squeezing champ
James Bonfield wins worldwide competition to speed up access to genetic data
Wellcome Trust Sanger Institute researcher James Bonfield has won the $15,000 Pistoia Alliance Sequence Squeeze prize for creating the most efficient ways to compress genetic data. His work will help to speed the sharing of genetic information around the world, and he couldn’t have done it without the help of his competitors.
“My programs would have been substantially weaker had I not had the challenge of my fellow competitors. The mix of competition and open discussion really produced amazing results. But, perhaps the most exciting thought is where this work will go next. Several entrants shared ideas with each other and I suspect that we can produce an even better solution by combining the best parts from each of our entries.”
James Bonfield, Wellcome Trust Sanger Institute
This competition was created by the Pistoia Alliance – a precompetitive alliance of research groups, pharmaceutical companies and scientific societies seeking to improve worldwide genetic research by solving the problems that all researchers in the field face. The aim was to drive the creation of solutions to one of the most pressing problems in genetic research today: the storage and sharing of the vast volumes of genetic data that researchers need to find disease-causing gene variants.
“The latest high-speed sequencing machines are opening up the genetic study of disease and biological pathways in incredible depth because they allow hundreds or thousands of genes or genomes to be read and compared. However, this major leap forward is creating mountains of data that need to be stored and distributed around the world. Current storage solutions and internet transfer methods are struggling to cope, which is why James’ work is so vital. It literally reduces the size of the problem.”
Tony Cox, Head of Operation Production Software and Informatics at the Sanger Institute
The competition's open and interactive setup was itself a demonstration of a novel way to drive innovation. The Alliance encouraged continual improvement by posting an interactive leaderboard that showed, day by day, which entrant had produced the most efficient approach. In addition, the collaborative nature of the competition saw entrants sharing their problems and ideas on a variety of discussion forums.
“Seeing my entry being beaten by others spurred me on to improve my code again and again. Forums, such as encode.ru, had numerous and surprisingly open discussions on ideas, particularly from respected programmer Matt Mahoney, who went as far as to post code snippets. The views on that thread gave me ideas for improving my own program, so the final outcome was better than if I had worked purely in isolation.”
James Bonfield, Sanger Institute
Out of the more than 100 entries, James’ solutions were judged to be the best overall for compressing the avalanche of information produced by the latest high-speed sequencing machines into forms that can be easily stored and transferred across the internet. The judging panel evaluated the approaches on their ability to:
- Squeeze the data into the smallest possible space (have the highest compression ratio)
- Achieve this in the shortest possible time (fastest compression time)
- Allow others to unpack the compressed data as quickly as possible for use (fastest decompression time)
- Use the least amount of computing memory to compress and decompress the data
James’ algorithms scored highly on the first three criteria and ensured that alignment data was preserved to allow genetic sequences to be put together quickly and efficiently.
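To make the four judging criteria concrete, the sketch below measures them for a toy DNA string. It is only an illustration: it uses Python's general-purpose zlib compressor as a stand-in, not James' algorithms or the competition's scoring code, and the function name `measure`, the sample data and the memory estimate are all assumptions made for this example.

```python
import time
import tracemalloc
import zlib


def measure(data: bytes, level: int = 9) -> dict:
    """Report the four criteria for one compress/decompress round trip.

    zlib is a generic stand-in compressor here, not a contest entry.
    """
    tracemalloc.start()

    t0 = time.perf_counter()
    packed = zlib.compress(data, level)
    t1 = time.perf_counter()
    unpacked = zlib.decompress(packed)
    t2 = time.perf_counter()

    # Approximate only: peak size of allocations traced by tracemalloc.
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    # The compression must be lossless to be of any use.
    if unpacked != data:
        raise ValueError("round trip did not reproduce the input")

    return {
        "ratio": len(data) / len(packed),  # higher is better
        "compress_s": t1 - t0,             # lower is better
        "decompress_s": t2 - t1,           # lower is better
        "peak_bytes": peak_bytes,          # lower is better
    }


# Hypothetical sample: a short, highly repetitive run of bases.
sample = b"ACGTTGCA" * 5000
result = measure(sample)
print(result)
```

Real sequencing data is far less repetitive than this sample, so the ratio printed here says nothing about what the competition entries achieved.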
James will be giving half of his prize money to the British Heart Foundation.
“We are delighted for James that his work has been recognised in this way. We hope that his efforts will benefit the Institute and genetic researchers around the world for years to come.”
Emma Millican, Head of DNA Pipelines and responsible for sequencing at the Sanger Institute
The Pistoia Alliance is a global, not-for-profit, precompetitive alliance of life science companies, vendors, publishers, and academic groups that aims to lower barriers to innovation by improving the interoperability of R&D business processes.
The Wellcome Trust Sanger Institute is one of the world’s leading genome centres. Through its ability to conduct research at scale, it is able to engage in bold and long-term exploratory projects that are designed to influence and empower medical science globally. Institute research findings, generated through its own research programmes and through its leading role in international consortia, are being used to develop new diagnostics and treatments for human disease.
The Wellcome Trust is a global charitable foundation dedicated to achieving extraordinary improvements in human and animal health. We support the brightest minds in biomedical research and the medical humanities. Our breadth of support includes public engagement, education and the application of research to improve health. We are independent of both political and commercial interests.