Chan Zuckerberg Initiative supports Human Cell Atlas data platform
The Chan Zuckerberg Initiative (CZI) has announced financial and engineering support for the Human Cell Atlas, an ambitious international collaboration, led by the Wellcome Trust Sanger Institute and the Broad Institute, that is using sequencing technology to redefine every cell in the body. Support from CZI will enable the European Bioinformatics Institute (EMBL-EBI), the Broad Institute and the University of California Santa Cruz (UCSC) to set up an open, cloud-based Data Coordination Platform to check, share and analyse the vast amounts of diverse information generated in the initiative.
The field of human genetics has advanced so far and so fast in the past two decades that scientists believe it’s time to rethink human anatomy, starting with DNA. The Human Cell Atlas aims to do just that by creating a new, open, accessible reference map of the healthy human body.
“Anatomy textbooks as they are now were designed about 100 years ago by people assigning meaning according to how things look and function. Now, we’re using molecular tools to characterise what’s going on in organs and tissues and to get a deeper view of anatomy. That’s the Human Cell Atlas.”
Dr John Marioni, Research Group Leader at EMBL-EBI and a key member of the Data Coordination Platform
This international collaboration is using RNA sequencing technology to define cells in a whole new way. Such a highly specific, sequencing-based reference of healthy human function will be transformative for biomedical research in any number of ways. Building it will generate a lot of data.
“The scale of the Atlas will be in the tens of millions of datasets. Interoperability and transparency are essential for keeping so many moving parts working – we know this from our long experience collaborating with one another. We’ve designed the data architecture as open-source and modular from the get-go. That will make it easier for others to use and add to the Atlas in the future.”
Dr Sarah Teichmann, Head of Cellular Genetics at the Wellcome Trust Sanger Institute and joint leader of the Human Cell Atlas
Moving away from FTP-based file sharing, the new cloud-based pipeline will allow Human Cell Atlas partners to upload their datasets, analyse them jointly, and compare healthy and diseased tissues in meaningful ways. This will involve cloud technologies including OpenStack, Google and Amazon Web Services.
“The size and scope of this new data platform will require large-scale collaborations between informatics and genomics experts across academia and industry. That is why we are thrilled to bring together three of the world’s leading institutions in genomics, informatics, and data sharing to build this important new resource - and our own software engineers will help develop the tools and facilitate the collaboration. It is a great example of how we can help accelerate science by supporting collaborations across institutions and by bringing scientists and engineers together in new ways.”
Cori Bargmann, president of science at the Chan Zuckerberg Initiative
The raw data produced by Human Cell Atlas researchers will be stored and accessed at EMBL-EBI, transferred to platform partners in the US for cloud-based analysis and annotation, then returned to EMBL-EBI to be stored and shared in the public archives, making it available to the wider world.
“Science is truly international, and that is clear in the way the Human Cell Atlas partners work across continents. Each partner brings substantial experience building essential data services for the life sciences. CZI is not just funding the project - they’re a hands-on partner. So we know the Atlas will be built with the best engineering possible.”
Ewan Birney, Director of EMBL-EBI and Chair of the Global Alliance for Genomics and Health
“This contribution is for all the world’s biomedical scientists, because the Human Cell Atlas will be shared with everyone. CZI’s support will help us start to build a data platform for scientists around the world to see and analyse each other’s data, and to share the results of their work widely and openly. This will inspire others to ask new questions, and empower them to find the answers.”
Dr Aviv Regev, Chair of the Faculty at the Broad Institute and joint leader of the Human Cell Atlas
Building the platform is just the start of a colossal undertaking that will take many years to complete, during which technologies will inevitably change. The next step for the Data Coordination Platform is to plan for emerging technologies such as bioimaging, and for sustaining the public resource over the long term.