Carla Stark / Wellcome Sanger Institute

New tool helps accurately assemble notoriously difficult bird genomes 

MicroFinder can be used to assemble bird genomes much more quickly and accurately than ever before.


A new computational programme, MicroFinder [1], enables faster and more accurate assembly of bird genomes. Improved genome assemblies will help researchers better understand bird biology and evolution, and will help inform conservation efforts.

Reported today (21 May) in GigaScience, MicroFinder uses protein markers to improve accuracy and quality of bird DNA sequences, particularly in their tiny and difficult microchromosomes – regions that normally make genome curation incredibly challenging and time-consuming. MicroFinder was created by researchers at the Wellcome Sanger Institute and has been made freely available to scientists worldwide.

A genome is the complete set of genetic instructions containing all the information for an organism to maintain life. In research, genome ‘curation’ is the process of taking assembled DNA sequences and creating a high-quality, accurate, chromosome-level genome assembly.

Bird genomes have long been difficult to assemble because of their unusual structure: a mixture of large chromosomes and very small ‘dot’ microchromosomes. The dot microchromosomes often contain repetitive genetic sequences, which makes them hard for assembly software to reconstruct correctly, so they tend to come out fragmented into many small pieces. Even high-quality genome projects often miss some of these dot microchromosomes, resulting in an incomplete genome.

To combat this problem, researchers at the Sanger Institute developed a computational programme called MicroFinder. The scientists used a set of well-curated bird genomes to identify common genes located on dot microchromosomes. These genes were grouped into families and used to design a set of protein markers specific to these tiny chromosomes.

The MicroFinder programme searches for these protein markers in draft genome assemblies that have not yet been organised into chromosomes. It identifies fragments of dot microchromosomes that are often scattered among small, repetitive sequences and moves them to the start of the assembly, making them much easier and faster for genome curators to locate and assemble correctly.
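
The reordering step described above can be illustrated with a minimal Python sketch. This is not MicroFinder's own code: the function name `prioritise_scaffolds`, the input format (a list of scaffold names plus marker-to-scaffold hit pairs) and the choice to rank flagged scaffolds by number of marker hits are all assumptions made for illustration. The actual tool, including how it searches for protein markers, is documented in the GitHub repository.

```python
from collections import Counter


def prioritise_scaffolds(scaffold_names, marker_hits):
    """Move scaffolds carrying marker hits to the front of the assembly order.

    scaffold_names: scaffold identifiers in their current assembly order.
    marker_hits: (marker_id, scaffold_name) pairs, one per marker match.
    Both inputs are hypothetical stand-ins for real search output.
    """
    # Count how many marker matches landed on each scaffold.
    hits = Counter(scaffold for _, scaffold in marker_hits)

    # Split the assembly into scaffolds with and without marker hits,
    # preserving the original relative order within each group.
    flagged = [name for name in scaffold_names if hits[name] > 0]
    others = [name for name in scaffold_names if hits[name] == 0]

    # Assumption: scaffolds with more hits are listed first. sorted() is
    # stable, so ties keep their original order.
    flagged = sorted(flagged, key=lambda name: -hits[name])
    return flagged + others


# Toy example: two small scaffolds carry dot-microchromosome markers.
names = ["scaffold_1", "scaffold_2", "scaffold_3", "scaffold_4"]
found = [("fam_A", "scaffold_4"), ("fam_B", "scaffold_4"), ("fam_C", "scaffold_2")]
print(prioritise_scaffolds(names, found))
# ['scaffold_4', 'scaffold_2', 'scaffold_1', 'scaffold_3']
```

In this toy case the two marker-bearing scaffolds move to the start of the list, which is the effect that makes candidate microchromosome fragments easier for a curator to find.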

By speeding up the heavily manual process of correcting and refining genome assemblies, MicroFinder could help researchers generate more reliable bird genomes. These improved datasets will support studies into bird evolution, species diversity, and the genetic basis of avian characteristics. They will also provide stronger genomic resources for future research into biodiversity and conservation.

“MicroFinder provides a simple but effective computational method to help scientists identify and assemble the smallest chromosomes in birds, improving the completeness and accuracy of bird DNA sequences. We hope this has a positive knock-on effect, supporting bird biology and ultimately research into conservation.”

Dr Thomas Mathers, first author at the Wellcome Sanger Institute

“Bird microchromosomes have long been a struggle to accurately curate. Within the Large White-Headed Gull clade, the rapid radiation of species has left us with remarkably syntenic genomes that are difficult to differentiate. By helping us resolve these elusive ‘dot’ chromosomes, the MicroFinder tool gives us the chance to investigate whether they contain the genomic differences we haven’t been able to see in the larger chromosomes, providing the more solid genomic foundation needed for our biodiversity efforts.”

Dr Elisa Ramos, post-doctoral researcher at Universität Basel

More information

Notes to Editors

  1. MicroFinder can be accessed through GitHub: https://github.com/sanger-tol/MicroFinder

Publication

T. Mathers, et al. (2026) ‘MicroFinder: conserved gene-set mapping and assembly ordering for manual curation of bird dot microchromosomes’. GigaScience. DOI: 10.1093/gigascience/giag036

Funding

This research was supported by Wellcome. A full list of acknowledgements can be found in the publication.