Advance access to data: Researchers post genetic profiles of human and mouse cells on Human Cell Atlas online portal before publication

Prior to publishing their results, researchers compile and make raw data openly accessible on the preview version of the HCA Data Coordination Platform

A team of postdoctoral and research scientists at the Wellcome Sanger Institute and MRC Cancer Unit has made data sets of human and mouse immune and stromal cells openly accessible before the results have been published. The data sets are hosted on a preview site that provides initial access to data for the Human Cell Atlas initiative.

Both data sets are timecourse studies: one of cells from a widely used mouse tumour model, the other of cells from human spleen. Additional data* on immune cells from umbilical cord blood and adult bone marrow was provided by the Broad Institute of MIT and Harvard.

By making the data openly accessible before manuscript publication, the researchers have provided the broader scientific community with a valuable resource. The data sets can reveal basic biology, provide a reference for studying disease, and allow computational biologists to test new analysis tools.

One of the data sets came from researchers studying a mouse tumour model to discover how the microenvironment adapts and responds to a tumour over time, both locally and in the draining lymph node. Skin tumours were induced in mice using a model cell line – B16 melanoma. The supporting ‘normal’ cells, including immune and stromal populations, were collected from both the tumour site and the lymph nodes and analysed on different days after tumour induction.

In a large effort, the Teichmann group (Wellcome Sanger Institute) and the Shields group (MRC Cancer Unit, University of Cambridge) worked together to collect immune and stromal cell populations from multiple replicate mice over an 11-day timecourse. Each of the thousands of cells was sorted and sequenced to obtain full-length transcript profiles.

“Releasing these data early means they will be available immediately for the whole scientific community as a resource. This is a very popular mouse model in cancer studies, and we hope our data will aid research into cancer development.”

Mirjana Efremova in Sarah Teichmann’s group at the Wellcome Sanger Institute, who developed computational methods to dissect the cell populations and their intercellular communication

Another large data set came from a single donated human spleen, which was being studied to see how time without oxygen affected the transcriptomic profiles of single cells. This spleen was provided by the transplant team headed by Dr Kourosh Saeb-Parsy at the Cambridge Biorepository for Translational Medicine (CBTM). It was couriered directly to the Sanger Institute at 3am and processing started immediately. Philippa Harding and Anna Wilbrey-Clark from Kerstin Meyer’s team at the Sanger Institute processed and single-cell sequenced separate parts of the spleen at four different time points.

It is difficult to get spleen tissue from previously healthy donors, so the team needed to be on call, ready to process the organ whenever it became available.

“This is the first single-cell data set from the human spleen. We hope that making this unique data available before publication will encourage collaboration between scientists. Other researchers can look at the data from other angles, to mine it in different ways and use this valuable resource to test new computational methods.”

Anna Wilbrey-Clark from the Wellcome Sanger Institute

The data is now available at

More information

*Read a parallel story by the Broad Institute on the immune cell data


The mouse model work was supported through funding from Wellcome, MRC and CRUK, and the spleen work was funded by the Chan Zuckerberg Initiative (Grant 2017-174169).