New methodology unifies single-cell data
CellHint can unify different single-cell data, creating harmonised, applicable datasets for the study of human health and disease.
A new methodology that allows for the categorisation and organisation of single-cell data has been launched. It can be used to create a harmonised dataset for the study of human health and disease.
Researchers at the Wellcome Sanger Institute, the University of Cambridge, EMBL’s European Bioinformatics Institute (EMBL-EBI), and collaborators developed the tool, known as CellHint. CellHint uses machine learning to unify data produced across the world, allowing it to be accessed by the wider research community, potentially driving new discoveries.
In a new study, published today (21 December) in Cell, researchers applied CellHint to reveal underexplored connections between healthy and diseased lung cell states. They looked at eight diseases, including interstitial lung disease and chronic obstructive pulmonary disease, and showed the possible benefits of this tool. They also applied CellHint to 12 tissues from 38 datasets, providing a deeply curated cross-tissue database with around 3.7 million cells.
CellHint is freely available worldwide and was created as part of the Human Cell Atlas initiative, which aims to map every cell type in the human body to transform understanding of health and disease.
Single-cell genomics enables the understanding of every cell in the context of the human body at high resolution. Currently, a challenge in assembling the diverse datasets produced by single-cell research is that there is no unified system for naming and organising data.
To address this, researchers from the Wellcome Sanger Institute and collaborators developed CellHint, which can unify cell types defined by independent laboratories. CellHint then places the data into a defined graph that shows the relationships between cell subtypes, giving a full picture of all the cells identified across different datasets.
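The idea of linking independently named cell types into a graph of relationships can be illustrated with a toy sketch. This is not CellHint's algorithm or API: the function names, the relation labels and the two-number "expression profiles" below are all made up for illustration. The sketch assumes that each laboratory's cell type is summarised by an average profile, and that labels are linked by nearest profile.

```python
from collections import Counter
from math import dist

# Hypothetical average profiles per cell type, as named by two laboratories.
# The numbers are invented for this example.
LAB_A = {"CD4 T cell": (1.0, 0.0), "CD8 T cell": (0.8, 0.3), "B cell": (0.0, 1.0)}
LAB_B = {"T cell": (0.9, 0.1), "B lymphocyte": (0.1, 0.9)}


def nearest(profile, centroids):
    """Return the label in `centroids` whose profile is closest to `profile`."""
    return min(centroids, key=lambda name: dist(profile, centroids[name]))


def harmonise(centroids_a, centroids_b):
    """Link every label in A to a label in B and name the relationship.

    same_as    : the two labels pick each other and nothing else maps to B's label
    subtype_of : several A labels map to the same, coarser B label
    unresolved : anything else
    """
    a_to_b = {a: nearest(v, centroids_b) for a, v in centroids_a.items()}
    b_to_a = {b: nearest(v, centroids_a) for b, v in centroids_b.items()}
    matches_per_b = Counter(a_to_b.values())

    edges = []
    for a, b in a_to_b.items():
        if matches_per_b[b] > 1:
            relation = "subtype_of"
        elif b_to_a[b] == a:
            relation = "same_as"
        else:
            relation = "unresolved"
        edges.append((a, relation, b))
    return edges


for source, relation, target in harmonise(LAB_A, LAB_B):
    print(source, relation, target)
```

In this toy case, the two T cell subsets from the first laboratory are both linked to the broader "T cell" label from the second, while "B cell" and "B lymphocyte" are treated as the same cell type under different names.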
The team applied CellHint to current data and revealed underexplored relationships between healthy and diseased lung cell states in eight diseases. The tool also identified cell types in the adult human hippocampus that could be of interest for future research.
The researchers also applied CellHint to 12 tissues from 38 datasets, providing a deeply curated cross-tissue database with around 3.7 million cells. Each cell was annotated, that is, labelled with particular information. They also showed how CellHint can be used to create various models for automatic cell annotation across human tissues.
“CellHint stands out from other tools because it makes full use of the often inconsistent but valuable cell annotation information from individual studies, to achieve biologically-driven data integration. We are excited that with CellHint, cells from independent laboratories can be re-annotated and researchers can utilise the resulting information to put each cell into different contexts beyond the original study. We hope that this tool will greatly facilitate the reuse of molecular and cellular data and information across laboratories, potentially driving new discoveries in biology.”
Dr Chuan Xu, first author from the Wellcome Sanger Institute
“The Human Cell Atlas is creating detailed reference maps of all cells in the human body to transform our understanding of biology, health and disease, and single-cell technologies underpin this hugely ambitious project. Global collaboration and open data sharing are vital to achieve the aim of a representative Human Cell Atlas that will benefit humanity worldwide. CellHint enables the unification and sharing of single-cell data, which allows the global research community to contribute to and benefit from the ongoing research that is happening around the world, and help drive advances in health and healthcare.”
Dr Sarah Teichmann, senior author from the Wellcome Sanger Institute and co-founder of the Human Cell Atlas
This study is part of the international Human Cell Atlas (HCA) consortium, which is aiming to map every cell type in the human body as a basis for both understanding human health and for diagnosing, monitoring, and treating disease. An open, scientist-led consortium, the HCA is a collaborative effort of researchers, institutes, and funders worldwide, with more than 3,100 members from 99 countries across the globe. The HCA is likely to impact every aspect of biology and medicine, propelling translational discoveries and applications and ultimately leading to a new era of precision medicine. More information can be found at https://www.humancellatlas.org/
CellHint can be found at https://github.com/Teichlab/cellhint
C. Xu, M. Prete, S. Webb, et al. (2023) Automatic cell-type harmonization and integration across Human Cell Atlas datasets. Cell. DOI: 10.1016/j.cell.2023.11.026
This research is part-funded by Wellcome and the Engineering and Physical Sciences Research Council (EPSRC).