Data sharing policy

The Wellcome Trust Sanger Institute is dedicated to advancing genetic and genomic science to the benefit of all. Rapid and open data sharing strategically supports this mission by enabling research and accelerating translation. However, such policies are only sustainable if scientific credit is generated for all parties involved, and the Institute will play its part in developing a global research environment which rewards data sharing.

The following principles form the basis for data sharing at the Wellcome Trust Sanger Institute, and the Institute will not consider collaborations that do not adhere to this policy.


The Institute aims to provide rapid access to data sets of use to the research community and will place these in publicly accessible repositories when possible. The Institute will support data and interoperability standards to maximise access and ensure ease of integration with other global resources.

Ethical Considerations

Conducting genetic and genomic research carries responsibilities to protect confidentiality and the privacy of research participants. Access to certain data sets will therefore be carefully managed and granted in a transparent manner to all appropriately qualified researchers.

Rights of Data Providers

The Institute recognises the need for researchers to be appropriately credited for their scientific contribution and investment in data generation. It is therefore expected that all researchers both honour agreements in line with the Fort Lauderdale data sharing principles [1] and appropriately acknowledge the contributions of others [2].

Optimising Translation

The Institute recognises that, in specific instances, the use of intellectual property protection and attendant potential delays to data sharing may be necessary to prevent inappropriately exclusive claims by others and to ensure health benefits occur.


  1. Sharing Data from Large-scale Biological Research Projects: A System of Tripartite Responsibility. Report of a meeting organised by the Wellcome Trust and held on 14-15 January 2003 at Fort Lauderdale, USA.
  2. For information on acknowledging the use of data provided by the Institute, please see the Guidelines on the Use of Data in Publications provided below.

Download the pdf version of the Data Sharing Policy.

Guidelines on the Use of Data in Publications

Code of Access

Data sharing policies will only be sustainable if researchers commit to:

  • conducting appropriate research
  • protecting the confidentiality of managed access data sets
  • carefully communicating research results
  • respecting rights to first publication and to acknowledgement
  • sharing, in turn, resulting data and analysis with the research community

The Wellcome Trust Sanger Institute aims to provide rapid access to data sets of use to the research community. However, there has been considerable investment in acquiring these data, and we request that you follow these guidelines when using data provided by us.

If you have been referred here when accessing data from the public archives, please check the list of conditions of use below for relevant information.

Users of data from the Wellcome Trust Sanger Institute should not publish or otherwise disseminate the information without appropriate acknowledgment. Unless otherwise described below, the appropriate format is as follows: "These data were provided by the xxx group at the Wellcome Trust Sanger Institute and can be obtained from yyy."

(You can find the necessary information within the comments in the submission record or in the sections below.)

Accession numbers should be quoted when available.

If you have any questions regarding the data or their use in publications, please contact the appropriate address from the following table (or the appropriate group leader). We strongly encourage you to get in touch, so that we can let you know of any related data and inform you about progress, and we can ensure that published information relating to our data is properly reflected in our databases.

Project Contacts

Human Genome Project
Cancer Genome Project
Mouse Genomes Project and Sanger Mouse Exomes Project

Microbial and Protozoan Unpublished Sequence Data

This sequencing centre plans to publish the completed and annotated sequences in a peer-reviewed journal as soon as possible. Permission of the principal investigator should be obtained before publishing solely bioinformatic analyses (e.g. metabolic reconstruction, synteny) of unpublished sequence/open reading frames/genes on a chromosome- or genome-wide scale.

Catalogue of Somatic Mutations In Cancer (COSMIC) Data

The appropriate format for acknowledgment is as follows: 'The mutation data was obtained from the Sanger Institute Catalogue Of Somatic Mutations In Cancer web site, Bamford et al (2004) The COSMIC (Catalogue of Somatic Mutations in Cancer) database and website. Br J Cancer, 91,355-358.'

Data are made available before scientific publication with the understanding that we intend to publish the initial large-scale analysis of the dataset. This will include a summary detailing the data that have been curated and key features of the data. Any redistribution of the data should carry this notice. Please ensure that you use the latest available version of the data as we are continually adding information to COSMIC.

DECIPHER Consortium data

Authors who use data from the project must acknowledge DECIPHER using the following wording "This study makes use of data generated by the DECIPHER Consortium. A full list of centres who contributed to the generation of the data is available from and via email from Funding for the project was provided by the Wellcome Trust."

The author shall also declare in any published work that those who carried out the original analysis and collection of the Data bear no responsibility for the further analysis or interpretation of it by the author.

The author shall also contact the coordinator for the project/participating centre that entered the data on any individual whom they wish to specifically include in their report (whether identified or not) and offer appropriate, agreed recognition of their contribution. If the magnitude of the contribution warrants it, this may include co-authorship for at least one representative from the project/participating centre (possibly the member who submitted the patient data). Contact details can be obtained by email to

Mouse Genomes Project and Sanger Mouse Exomes Project

The Mouse Genomes Project and Sanger Mouse Exomes Project release sequence data, SNPs and other variant calls as a service to the research community.

These data are released in accordance with the Fort Lauderdale and Toronto agreements. As producers of these data we reserve the right to be the first to publish a genome-wide analysis of the data we have generated.

The pre-publication data that we release via this website are embargoed for publication, except for analyses of single chromosomes in single strains or single gene loci across multiple strains. We strongly encourage researchers to contact us if there are any queries about referencing or publishing analyses based on pre-publication data obtained via this website.

Zebrafish and Pig sequence data

Please note that some BAC and PAC clones were sequenced early at the request of researchers who provided us with the clones. If you are using these sequences for your analysis, the people who isolated and provided the clones should be properly acknowledged. You can find the necessary information within the comments in the EMBL record.

All sequence data are made available before scientific publication with the understanding that the groups involved in generating the data intend to publish the initial large-scale analyses of the dataset. This will include a summary detailing the data that have been generated and key features of the genome identified from genomic assembly and clone mapping/sequencing. Any redistribution of the data should carry this notice.

Vertebrate mapping data

If a map figure is reproduced in a publication, the figure legend should include an explicit statement concerning the source of the information.

Mapping data should be regarded as preliminary and so subject to regular update and change. It is recommended, therefore, that users carry out appropriate confirmatory checks.


The Wellcome Trust Sanger Institute provides these data in good faith, but makes no warranty, express or implied, nor assumes any legal liability or responsibility for any purpose for which the data are used.

Data Sharing Guidelines

The Data Sharing Guidelines provide details about implementing the Data Sharing Policy. Please download the Data Sharing Guidelines.

Publication Policy

The Publication Policy provides information for collaborators on how the Wellcome Trust Sanger Institute implements the Wellcome Trust open access policy. Please download the Publication Policy.

Human Genetics Data Security Policy

The Human Genetics Data Security Policy describes data security procedures for working with managed access data. Please download the Human Genetics Data Security Policy.
