David Clode, Unsplash
Aquatic Symbiosis Project

Aquatic Symbiosis Genomics Project

An ambitious project to read the genomes of 1,000 freshwater and marine species, representing more than 500 symbiotic relationships, giving conservationists and biologists the genetic information they vitally need to understand how species evolve and live together.

Laying the foundations for future conservation

The Aquatic Symbiosis Project, jointly funded by the Wellcome Sanger Institute and the Gordon and Betty Moore Foundation, seeks to provide the genomic foundations needed by scientists to answer key questions about the ecology and evolution of symbiosis in marine and freshwater species, where at least one partner is a microbe.

By applying the latest genomic techniques and tools to 1,000 aquatic species, representing 500 symbiotic relationships, the project will generate data to guide future studies and help inform conservation efforts.

The key role of symbiosis in ecology, evolution and conservation

Symbiosis covers a spectrum of relationships, from temporary to lifelong, and from mutually beneficial – such as between coral and algae – to exploitative – between parasite and host. These relationships are hugely important and have evolved independently many times. For example, coral provides species-rich reef systems around the world, while almost one-third of all complex species are parasites.

Yet little is known about the underlying genetics of these complex relationships between species, how symbiotic partners adapt to one another over time, how resilient these partnerships are and how they respond to disruption. The aim of the Aquatic Symbiosis Project is to provide gold-standard reference genomes for these symbiotic species, enabling genomic exploration and a better understanding of how species adapt to survive.

Evolution is often characterised as a fight for survival, with individuals pitted against each other in the race to spawn the next generation. However, collaborations – where two distinct organisms build a relationship that is beneficial – are widespread. Every plant carries within it the functional remnants of a once free-living light-harvesting bacterium, and these chloroplasts are what allow plants to use sunlight to fix carbon. The chloroplast-plant collaboration is billions of years old, but new collaborations have evolved many times. Similarly, the collaboration between sessile jellyfish relatives and free-living photosynthetic algae that we call corals is the foundation of hyperdiverse reef ecosystems where tens of thousands of species thrive.

These symbioses – sym: together, biosis: living – are key to the functioning of the living world. Symbioses range from mutually beneficial to one-sided (also described as “parasitism”) and from temporary or facultative to essential. As two once-independent species evolve to collaborate (or be exploiter and exploited) they will adapt to communicate with each other, to exchange nutrients and to accommodate each other’s physiologies. This evolutionary change is delivered from and written in the genomes of the symbionts.

Collaboration and Openness

The Tree of Life programme at the Sanger Institute has embarked on an ambitious programme to sequence the genomes of species involved in a wide range of symbioses in marine and freshwater ecosystems. Funded by the Gordon and Betty Moore Foundation’s Symbiosis in Aquatic Systems Initiative and working with global partners, we will collect a wide range of symbiotic organisms (protists, plants and animals in the main), focused on specific scientific questions, and generate reference-quality genomes for all species in each symbiosis.

We will collaborate with up to ten Hubs, experts in the field of symbiosis, and they will nominate the species we will sequence. The sequencing will follow the lead of our Darwin Tree of Life project, using long-read and long-range data, but the assembly process will be focused on delivering multiple independent genomes from the same sample. Using transcriptome data we will annotate the genomes, and release these openly through the European Nucleotide Archive, which is developing a dedicated data portal for the project.

In collaboration with all project partners we will analyse the genomes to answer long-standing questions in symbiosis biology. To build the symbiosis genomics community we will develop and deliver a collaborative programme of training in genomics and bioinformatics for early career researchers to build capacity to fully exploit the genome sequences.

Just as the symbiotic corals are the essential foundations of flourishing reef ecosystems, we intend that these new genome resources will be the lasting foundation of a new, flourishing ecosystem of functional genomics and population genomics of aquatic symbioses.

Phase 1 - Four Pilot Aquatic Symbiosis Hubs

Phase One of the Aquatic Symbiosis Genomics project will create the essential research infrastructure, and build the capacity of the scientific community, to analyse genomic information in relation to the development and adaptation of symbiosis.

Four international teams of collaborators with expert knowledge in symbiosis (see below) are working with the Sanger Institute’s Tree of Life Programme to generate reference genomes and to develop new laboratory techniques and bioinformatic approaches.

Each collaborating hub will submit samples for around 100 genomes for sequencing and analysis by Sanger Institute scientists.

The first four Hubs are:

Sponges as symbiont communities

Ute Hentschel Humeida (GEOMAR Helmholtz Centre for Ocean Research, Kiel, Germany) and colleagues study sponges and their diverse associations with microbes, often based on mutual nutritional support. Genome sequencing will define the different sets of organisms involved in sponge-microbe symbiosis, and help understand how the collaborations have arisen, how they are maintained and how they contribute to major geochemical cycles in the oceans.

Photosymbiosis in marine animals

Jose Victor Lopez (Nova Southeastern University, Florida, USA) is the nucleus of a group interested in how animals have established symbiosis with light-harvesting algae and other microbes – photosymbiosis. Working on corals, molluscs, flatworms and others, they will use the complete genome sequences of these astonishing “plant-animals” to show how the animals have rearranged the way they live to rely on food directly from their partners.

Coral symbiosis sensitivity to environmental change

Michael Sweet (University of Derby, UK) and colleagues work on the sensitivity of corals to environmental change, in particular the global phenomenon of “bleaching”, where the coral loses its photosynthetic algal partner. The coral then turns bright white, and may ultimately die. Mass bleaching events are threatening the future of reef ecosystems worldwide. The Hub will compare bleaching-resistant colonies with more sensitive colonies of the same species, with the goal of shedding light on the symbiotic processes that underlie this natural variation in how corals respond to the same stress. A greater understanding may allow mitigation of the effects of climate change and assist reefs to evolve in the face of continuing and increasing stressors.

Evolution of new symbioses in single-celled eukaryotes

John Archibald (Dalhousie University, Canada) brings together a team working on symbioses between different kinds of single-celled organisms. The symbionts cooperate through photosymbiosis and nutritional or metabolic symbiosis. Understanding how these organisms work together will both illuminate dark areas of the tree of life and reveal how symbiosis can evolve multiple times.


Phase 2 - Open Call for a further six Hubs


The Tree of Life Programme at the Wellcome Sanger Institute has initiated a major project in the genomics of symbiosis of aquatic organisms, funded by the Gordon and Betty Moore Foundation’s Symbiosis in Aquatic Systems Initiative. We are opening a call for collaborators globally to work with us in generating and deciphering a large number of symbiont genomes and in building a community of symbiosis researchers.

The Aquatic Symbiosis Genomics project aims to generate high quality genome sequences for symbiotic organisms in marine and freshwater environments, where at least one partner is a microbe, driving new understanding of the origins, evolution and ecology of symbiotic associations. We are seeking new collaborators who will lead themed research hubs. Each hub will be empowered to nucleate international teams focussed on a system of interest. Each hub will collectively source up to 50 symbiotic taxa for reference genome sequencing at the Wellcome Sanger Institute. Sanger will generate and annotate genome sequences for all species in these symbioses and release them publicly for analysis by hub partners. The ASG project will build community through in-person and online training and support, including intensive “samples to genomes” and advanced bioinformatics workshops focused on aquatic symbiosis themes, as well as more general training in bioinformatics and genome analysis.


In this call, we are inviting prospective collaborators interested in leading a research hub, coordinating submission of samples of up to 50 symbiotic systems (i.e. approximately 100 species) for genome sequencing. We expect to recruit up to six hubs. We expect leaders will propose research hubs with:

  • An explicit theme/focus, likely on a particular taxonomic group, ecosystem, and/or symbiotic process. For example, “photosynthetic symbioses of sponges” would be appropriate, as would “animal chemosymbiosis in cold seep ecosystems”.
  • Clear scientific questions related to the genomic biology of the symbioses and patterns of symbiosis across the group and/or ecosystems, and articulation of how genomics will help understand the systems.
  • A collaborative team of researchers from several institutions who are aligned to the theme and able to supply samples from (for example) the field, laboratory or aquaria.
    • The research hub can have up to 10 named collaborators who will supply specific specimens or resources.
    • A hub could be open to onboarding of additional hub participants in the future.

How to Apply

Please fill in the Smartsheet application form to describe your proposed project. Forms are due by November 1, 2020 and applicants will be notified of a decision by mid-December 2020.

Please refer to the document Aquatic Symbiosis Genomics Sanger Project_Phase2_OpenCall for full details of how to apply, including a list of FAQs.

Please note: The Wellcome Sanger Institute is already sequencing several aquatic symbioses as part of the soft launch of this program. Please refer to the list of organisms already in the sequencing queue in the Aquatic Symbiosis Genomics _ Species list Phase 1. We encourage you to consider other species in your application to maximize representation across the tree of life. Please email masg@sanger.ac.uk for additional information or questions.

Ground rules

Please see the FAQs below for additional information. 

If you have specific questions, please email us: masg@sanger.ac.uk.

  • Proposals must be submitted by a researcher with a permanent appointment at a university or academic research institution. We are unable to accept applications from students, postdocs or contractors.
  • The project is time limited, and so potential collaborators should be mindful of the need to have their high quality biological samples with the Sanger genomics team as rapidly as possible – within 6 months of project initiation. 
  • Species and samples need to be ethically sourced and compliant with the Nagoya Protocol on Access and Benefit Sharing of the Convention on Biological Diversity (i.e. come with documentation of compliance). These permissions will need to include the ability to release the assembly and sequence data openly, and to destructively sample the supplied materials. Samples lacking comprehensive sampling metadata cannot be submitted to public databases, and so we will require extensive sample source metadata to accompany each sample.
  • All data will be made public on completion of primary validated assemblies.
    • Our data release policy is appended below. In summary, data will not be embargoed but we will publish Genome Notes naming all who contribute to the generation of the sequence data (including local coordinators, collectors and lab staff, etc.) to ensure credit is ascribed correctly. These Genome Notes will not preclude subsequent publication of integrated, comparative or “deep dive” analyses.
    • To maximise the future utility of the genome data generated, we strongly encourage collaborators to commit to making biological samples available to others where logistically feasible.

Open Data Release

The Moore Foundation-funded Aquatic Symbiosis Genomics Project is a project of the Tree of Life programme at The Wellcome Sanger Institute. We are working with a wide range of collaborators to generate reference genome data for many symbiotic organisms from marine and freshwater ecosystems.

All sequence data generated by the project will be openly available for reuse. All raw and assembled data will be deposited in the European Nucleotide Archive (ENA) public database and from there, into the other International Nucleotide Sequence Database Collaboration (INSDC) nodes: GenBank and the DNA Data Bank of Japan. In the spirit of collaboration and community-building, we strongly encourage research hubs to make biological materials available to others for post-genomic work. We expect collaborators will deposit samples relevant to the sequenced species and individuals into national and local collections (including cryorepositories). Where samples derive from cultured organisms, collaborators should, where feasible, make cultures/organisms available on request to other research labs.

The Sanger Institute project team encourages community reuse, and project data will be released freely for reuse for any purpose upon deposition in ENA. Our intention is to rapidly publish all submitted assemblies as Wellcome Open Research notes, which can be cited (see, for example, Daniel Mead, Kathryn Fingland, Rachel Cripps et al. [2020]. The genome sequence of the Eurasian red squirrel, Sciurus vulgaris Linnaeus 1758. Wellcome Open Research. DOI: 10.12688/wellcomeopenres.15679.1). We ask that scientists who use the genome sequence data give appropriate acknowledgement and citation in their own publications.

The Sanger Institute team will also make available for download intermediate data and assemblies via a project website. These data and assemblies are provided “as is” as a service to the community, and we make no assurances as to their completeness or quality. Please note that these assemblies will be improved before final submission to ENA and we cannot guarantee persistence or availability of intermediate files in the long term. We strongly recommend that published analyses are based on data and assemblies submitted to ENA/INSDC. The genome sequences submitted to ENA by the Sanger Institute will be presented through the EBI Ensembl database, and the annotations presented through Ensembl should be regarded as the official versions.

Sanger people

Professor Mark Blaxter

Programme Lead for Tree of Life Programme and Senior Group Leader

Dr Victoria Wright

Project Manager

Dr Sophie Potter

Research Administrator

Catherine McCarthy

Compliance Officer - Nagoya Protocol

External Contributors

Professor Ute Hentschel Humeida

GEOMAR Helmholtz Centre for Ocean Research, Kiel, Germany 

Jose Victor Lopez

Nova Southeastern University, Florida, USA

Michael Sweet

University of Derby, UK

John Archibald

Dalhousie University, Canada

External partners and funders


The Gordon and Betty Moore Foundation

The Gordon and Betty Moore Foundation fosters path-breaking scientific discovery, environmental conservation, patient care improvements and preservation of the special character of the Bay Area.


EMBL-EBI European Nucleotide Archive

The archive is developing a dedicated data portal for the project.