Genomics and genome editing in the NHS

Evidence submitted by the Wellcome Sanger Institute to the Genomics and Genome Editing Inquiry of the House of Commons Select Committee on Science and Technology in 2017.

Oral evidence was given by Professor Sir Mike Stratton, Director, Wellcome Trust Sanger Institute, and Chief Executive Officer, Wellcome Genome Campus on 8 March 2017. It is available at:

Written evidence was also submitted by the Wellcome Trust Sanger Institute (GNH0003), see below:


The Wellcome Trust Sanger Institute uses genome sequences to advance the understanding of the biology of humans and pathogens to improve human health. We use science at scale to tackle the most challenging global health research questions.
Our evidence submitted to the previous inquiry is included at the end (see Annex 1 and 2).

Key Messages

The cost and speed of sequencing should not be viewed as a barrier to adoption. Historical evidence shows that, if the appropriate public health infrastructure is built, the cost and time of sequencing will drop rapidly as the technology develops and improves.

It is important that sequencing data is not locked into proprietary systems. NHS or Public Health England must be able to share data in order to support public health responses and continuing development and innovation.

Integration of genomics into healthcare requires:

  • Partnerships between researchers, the NHS (Health Advanced Research Programme – Life Sciences Strategy) and government.
  • Continuing education of healthcare professionals in genomics.
  • Addressing the skills shortages in bioinformatics and software development.
  • Support for entrepreneurship.

The recommendations from the Chief Medical Officer’s report, Generation Genome, should be adopted.

The Wellcome Trust Sanger Institute, EMBL-European Bioinformatics Institute and Genomics England make the UK a world leader in genomics and should be part of all strategy and decision making for UK genomics.

Pathogen Genomics

Whole genome sequencing (WGS) of pathogens has the near-term potential to be adopted by the NHS and Public Health England (PHE) for use in transmission tracking and predicting drug resistance in bacteria and viruses, particularly in TB and HIV. In the medium to long term, the technology has the potential to predict drug resistance in a range of pathogens and to identify individual bacterial and viral strains.

Successful implementation of WGS for pathogen surveillance in the NHS and PHE needs the correct data infrastructure with both local and centralised sequencing capacities and capabilities. Currently, introduction of pathogen sequencing is fragmented and coordination is required to ensure a joined-up system.

Rapid sharing of WGS data, with adequate metadata, is critical for ensuring sequencing efforts are translated into actionable information, particularly in the event of a national or international public health crisis. For this to work, data must be accessible and interoperable, with internationally agreed data standards developed by organisations such as the Global Alliance for Genomics and Health.

Solutions for integrating microbial sequencing into the NHS must be generic. Commercial products which offer integrated sequence databases, sequencing and analytical tools may appear attractive, but such proprietary systems may result in technological lock-in, and could hinder both routine surveillance and response to an epidemic if the data is not accessible to those outside the NHS/PHE. Initial cost savings may be cancelled out by subsequent reliance on the commercial supplier to provide all improvements and upgrades and the inability to implement alternative technologies. Open systems will allow other groups to develop new analytical tools which can then be adopted by the NHS or PHE.


Public Health England (PHE) is best placed to develop data generation and sharing infrastructure. They should be supported and encouraged to create and curate a database of microbial genomes and validated resistance markers that is accessible to both public health and research communities.


Local health outbreaks should be sequenced in regional centres with information shared with and by PHE. National or international level surveillance and outbreaks should be centrally handled by PHE.


Data standards should be mandated and raw data should be made accessible to research and the public health communities. Standards such as those developed by the Global Alliance for Genomics and Health should be adopted, and PHE and the NHS should work with the Alliance to develop future standards.

Cancer Genomics

Whole genome and whole exome sequencing have transformed our understanding of the development and progression of cancer. Although widely used in research, next generation sequencing is still in early-phase use in the NHS and there is a strong need to develop an appropriate infrastructure.

There is a critical need for platforms which support data sharing as well as standardisation of sample collection, sequencing and molecular marker panels. Genomics England has been instrumental in developing some of these processes and standards. Their knowledge and experience should be utilised in regional centres. Moreover, given the trajectory towards eventual whole genome sequencing in cancer patients, closer collaborations between Genomics England and those organisations with specific expertise in this complex field, including the Sanger Institute, would be prudent.

The use of next generation sequencing (NGS) in the NHS requires clinicians to be data-literate. Education of medical students and ongoing learning for healthcare professionals are critical. The data generated by NGS is complex and clinicians need to be adequately prepared. Master's courses such as the Master of Studies in Genomic Medicine, jointly run by the Wellcome Trust Sanger Institute and Cambridge Institute of Continuing Education, support professional development, but demand for such courses currently exceeds supply.


Harmonised sequencing services should be established at the 13 existing genetics centres across the UK, to create a semi-centralised system. Data should be archived at each site, but analysis should be done in a secure shared cloud environment, using harmonised analytics and a standardised list of genetic variants.


A national committee of clinicians should be established to identify those genes and variants of specific clinical utility for the different cancers, to be reported for sequenced samples together with their clinical implications as part of the report to the clinical teams. These should be regularly reviewed and updated as new variants are identified and confirmed as disease significant.

Industrial Strategy – Health Advanced Research Programme (HARP)

The Life Sciences Strategy’s proposal to bring basic genomics research and healthcare closer together is of key importance to keep the UK at the cutting edge of innovation and to deliver true societal benefit. We support the recommendations made in the Life Sciences Strategy on developing genomic medicine.

Fulfilling the vision of the Life Sciences Strategy and Generation Genome will require the Government and organisations to address skills shortages, particularly around bioinformatics. Staff increasingly require diverse skill sets across multiple disciplines, and apprenticeships and continuing professional education programmes should be expanded.

Alongside addressing the skills shortage there is a need to encourage and support entrepreneurship, particularly amongst academics. In addition, it is key that start-ups are able to access a range of funding, including patient capital.

The Wellcome Trust Sanger Institute, the EMBL-European Bioinformatics Institute and Genomics England together are the heart of the largest genomics hub in the world. Continued engagement with, and support for and investment in, these organisations will keep the UK at the forefront of genomics.

Annex 1


The Wellcome Trust Sanger Institute is a world-leading genomics Institute, whose mission is to use genome sequences to advance understanding of the biology of humans and pathogens in order to improve human health. The Institute was founded as the single largest contributor to the public Human Genome Project, and since that time has led the development of the field of genomics. The Institute welcomes the Committee’s inquiry into Genomics and Genome Editing and would be happy to provide additional evidence to the Committee.

The Sanger Institute is situated on the Wellcome Genome Campus, which also hosts the European Molecular Biology Laboratory – European Bioinformatics Institute (EMBL-EBI), Illumina, ELIXIR and Genomics England. The campus also provides incubator space for computational start-ups and spin-outs working in the fields of genomics and biodata.

Genomics is a relatively new discipline; however, in the last 30 years the cost and time to sequence a single human genome have gone from billions of dollars and years to under $1,000 and a few days. The scale and scope of genomics have expanded sufficiently rapidly that it is already in use in the clinic.

The development of genome editing tools, notably CRISPR-Cas9, has led to rapid advances in genome editing. The concept of genome editing raises a number of challenging ethical issues; however, the idea of altering the genomes of humans and other organisms, and the associated ethical and regulatory issues, has been thought about since the 1970s. Although CRISPR-Cas9 is new, the idea of altering the genome is not, and care should be taken to ensure that any regulation and legislation reflects the maturity of thinking around some of these issues.

Although genomics and genome editing have scientific and technological overlap, they represent two distinct areas of research and technology, and each has its own distinct regulatory and ethical challenges that do not easily lend themselves to consideration as one entity. We would strongly recommend considering genomics and genome editing as two distinct and separate issues.

Genomics in the Clinic – Genetics

The Sanger Institute uses genomics to advance understanding of human biology in order to improve human health. To this end the Institute has research programs studying cancer and aging, pathogens and pathogen surveillance, and malaria, as well as human development and population biology.

Although the Institute would broadly view much of its research as basic research, it has engaged in research programs with interactions with the clinic. Amongst the most notable of these, the Deciphering Developmental Disorders (DDD) program has provided diagnoses of rare developmental disorders for several thousand children with no agreed genetic diagnosis identifiable by conventional investigations. Although in the instance of developmental disorders a diagnosis almost certainly does not mean a cure, understanding the genetic basis of the disorder can mean more effective treatment of symptoms, allow parents to make informed reproductive choices regarding future children and lead to the formation of supportive communities around the disorder. The DDD program has recruited more than 13,500 families, sequenced more than 30,000 exomes and was the template for the developmental disorders arm of the 100,000 Genomes Project (Genomics England).

The DDD program has provided diagnoses for around 40% of the nearly 4,000 families analysed to date in the program. This is an exceptional diagnosis rate for rare developmental disorders that remain undiagnosed following conventional testing, and it demonstrates the power of well-implemented genomics in the clinic. The DDD program relies on a large and very carefully planned distributed network of clinical expertise and infrastructure within the NHS, and the clinical expertise required to deliver this program should not be underestimated. Creating a model for scalable implementation of genomics for diagnosing rare disorders has been a key part of the DDD project.

The Clinical Lead for the DDD project emphasised the importance of the DDD genomic network and the expertise created by it over the past 5 years, and highlighted the potential for reputational damage, patient harm and cost if the implementation of genomics is not carefully managed. Specifically, she raised concerns around the scientific and clinical expertise required to properly interpret genetic variants, many of which are novel, and the difficulties in determining whether a genetic variant is clinically significant in an individual patient. She also highlighted how the distributed network brought in knowledge, experience and ideas and significantly boosted the analytical power of the project.

Projects led by the Sanger Institute have demonstrated that there is a high level of genetic variation between individuals (approximately 4-5 million variants per genome). To add further complexity, even where genes have been linked to a disease, e.g. the BRCA1 gene and breast cancer, most genetic variations within that gene are not disease causing. Therefore identifying and classifying novel genetic variants correctly is highly challenging, and requires experience and expertise. Furthermore, this interpretation requires clinicians to be able to access and compare data from other clinical genetic testing centres and control data sets.

Unlike many other test results, a genetic test result is constant and unchanging. The impact of being given a diagnosis based on a genetic finding is lifelong and may affect not only the patient but also their family. Therefore, it is critical that appropriate investment is made to ensure that the genetic diagnosis is robust, even if this means funding orthogonal investigations to corroborate the diagnosis, e.g. biochemical tests or imaging. An incorrect diagnosis, based on a genetic finding, can have severe and profound consequences for a patient. Incorrect identification of a genetic variant as potentially disease-causing is not only distressing for the patient and their family, but can lead to invalid testing and management for their relatives, thus amplifying the error. For high-risk genetic conditions such as sudden cardiac death or inherited cancer syndromes, this poses a real risk to patients through false reassurance or ill-founded diagnosis, with subsequent management that may involve invasive procedures and emotional burden.

Genomics has the potential to dramatically improve patient care by improving specificity of diagnosis and helping stratify management and treatment. However, care must be taken to ensure genetic and genomic tests are only mainstreamed into broad clinical practice when the genetic variant in question is properly understood and there is an evidence base with which to inform subsequent management. Rushing the implementation of genomics, without (i) respecting the boundaries between research and clinical practice and (ii) investing in the network of clinical expertise required to ensure that genetic diagnoses are robust, risks significant harm to individuals and their families and inappropriate expense to the health service. By contrast, the carefully considered and phased implementation of genomics has the potential to transform diagnostic practice in rare disease.

Careful and timely implementation of genomics has the demonstrable ability to transform patients’ care. The power of a diagnosis not only brings the possibility of treatment but offers patients and families reproductive and lifestyle choices, provides support and community and can bring some degree of certainty to what are often highly challenging and potentially charged situations.

Genomics in the Clinic – Pathogen Surveillance and Microbiomics

Genomics is used in the clinic not only to provide genetic testing but also, increasingly, for pathogen surveillance to inform the decisions of hospitals, clinicians and public health officials. In 2011, the Sanger Institute worked with the Special Care Baby Unit (SCBU) at an NHS Foundation Trust in Cambridge to isolate and monitor transmission of an MRSA outbreak. This proof-of-concept work not only allowed the NHS Trust to identify the source of the MRSA and prevent further spread, but also demonstrated that genomics can allow hospitals to differentiate between an outbreak spread between individuals and a spontaneous but unrelated cluster. Understanding whether there is a relationship between cases is a crucial distinction to make when tackling the spread of MRSA (or other infectious diseases), but one that is difficult to determine based on case numbers alone.
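
As a purely illustrative sketch of how sequence data can separate a transmission chain from a coincidental cluster, the Python example below counts single-nucleotide differences between aligned isolate sequences and groups together isolates that fall within a chosen distance of one another. The isolate names, the toy sequences and the threshold of 2 differences are hypothetical and are not drawn from the Cambridge work; real investigations rely on whole-genome alignments, phylogenetic analysis and epidemiological context rather than a single cut-off.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def linked_clusters(isolates, threshold):
    """Group isolates joined by pairwise distances <= threshold (single linkage)."""
    clusters = {name: {name} for name in isolates}
    for a, b in combinations(isolates, 2):
        if snp_distance(isolates[a], isolates[b]) <= threshold:
            merged = clusters[a] | clusters[b]
            for name in merged:
                clusters[name] = merged
    # Collapse the per-isolate view into a list of distinct groups.
    unique = []
    for group in clusters.values():
        if group not in unique:
            unique.append(group)
    return unique


# Hypothetical toy data: three closely related isolates and one distant one.
isolates = {
    "patient_1": "ACGTACGTAC",
    "patient_2": "ACGTACGTAT",
    "patient_3": "ACGAACGTAT",
    "patient_4": "TCGTTCGAAC",
}

# The threshold of 2 is an assumption made for this example only.
for group in linked_clusters(isolates, threshold=2):
    print(sorted(group))
```

In this toy data the first three isolates differ from one another by at most two positions and are grouped as a plausible transmission chain, while the fourth differs by at least three positions from every other isolate and stands alone, even though all four cases would look identical in a simple case count.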

More recently, the Institute co-founded the Centre for Genomic Pathogen Surveillance with Imperial College. The Centre aims to provide data and tools for local, national and international health organisations through the surveillance of pathogens using whole genome sequencing, improved understanding of the emergence and spread of drug resistance, and the provision of actionable data.

The Sanger Institute has an established research program studying the microbiota (the naturally occurring bacterial population in the body) and the relationship between the microbiota and human health. Determining the composition of the bacterial population in the body is allowing our researchers to identify new potential therapeutics and novel drug targets for a range of diseases. The work in this area has been so promising that two of our faculty have founded a new spin-out company, Microbiotica, which seeks to use combinations of bacteria themselves as therapeutics that target the microbiota and change and alter its composition for the treatment of human disease.

Pathogen surveillance and microbiomics are examples of clinical uses of genomics where the data generated by genomics allows hospitals, clinicians, public health officials and even policymakers, to make informed, evidence-based decisions. The information is frequently less complex than genetic diagnoses with respect to determination of significance and can often work with existing processes, to inform decision making, such as how to handle an infectious outbreak in a clinic.

Genomics England

The Genomics England sequencing facility is based on the Wellcome Genome Campus alongside the Sanger Institute. Part of the Genomics England bioinformatics team has recently taken space in the new Biodata Innovation Centre, a dry-lab incubator space on Campus for spin-out and start-up companies.

Genomics England is viewed by the Sanger Institute as a ground-breaking, world-leading genomics initiative which seeks to bring genomics into the clinical setting. Since its launch many other countries have launched similar initiatives which seek to model themselves on Genomics England and use the lessons learned by this visionary initiative. As is often the case with first-movers, there is a danger that those countries playing “catch-up” to Genomics England will ultimately overtake it. It is therefore important to ensure that Genomics England continues to be outward looking and is properly resourced and governed in order to keep it a world-leader in genomics for healthcare.

Although an excellent program, there are areas where recurring concerns have been raised. The first is around a lack of transparency and clarity in the existing program and the second is around how the current operational model can be transformed into a workable, large-scale model for the NHS.

In order to ensure Genomics England’s legacy it is critical that the genomic data it generates becomes centralised within the NHS and that this centralised NHS genomic data be made accessible, with appropriate data security and governance, to bona fide researchers to enable the benefits for patients to be maximised. These are not theoretical benefits but real-world proven benefits, as demonstrated by the past 12 years of DECIPHER, the database created by the DDD project.

Currently, there is a lack of clarity around data sharing rules and the tools available to those who may be interested in accessing Genomics England’s data. However, Genomics England is working with the Global Alliance for Genomics and Health (GA4GH) to establish standards for data sharing that conform to best practice and facilitate the amalgamation of datasets to increase the interpretive power of those datasets. GA4GH is an international consortium whose mission is to accelerate progress in human health by helping to establish a common framework of harmonized approaches to enable effective and responsible sharing of genomic and clinical data. Such efforts to ensure bona fide researchers can access Genomics England’s genomic data in a properly governed manner should be encouraged and supported at a political level.

The choice of diseases and why they were chosen has not been effectively communicated to stakeholders. There is still confusion over what diseases/disease areas Genomics England is working on compared to what was initially proposed, e.g. initial press releases included a work stream on infectious diseases that was later dropped without explanation.

Genomics England has worked hard to engage the public, as well as scientists and clinicians. There are a number of very good examples of educational videos and public engagement events which have sought to increase awareness of genomics. The inclusion of social science research into the public’s view and understanding of genomics is important and forward-thinking. However, there has been an apparent lack of follow-up on progress, with limited information being given out on the numbers of genomes sequenced, the progress of the different work streams, the ability to provide clinically relevant information and the progress of the Genomic Clinical Interpretation Partnerships (GeCIPs).

The aim of the GeCIPs has also not always been clear, and in public they have been presented as having a remit that is both challenging and potentially paradoxical. The GeCIPs are meant to support this first iteration of Genomics England but they are also supposed to be working with Genomics England to understand and capture its barriers, pitfalls and successes and develop the processes for consent through to analytics and beyond. Being a pioneer/first mover project necessarily means mistakes will have been made; it is critical that the GeCIPs and other key groups learn from those mistakes and are in a position to develop a scalable, functioning system for the next phase of Genomics England.

The current protocols and processes devised by Genomics England have been designed for the delivery of a large, time-limited project and will need to be substantially adapted for delivering an efficient, sustainable clinical service. Genomics England will need to demonstrate considerable flexibility if it is to leave the desired legacy. Genomics England has a high political profile and was personally supported by the previous Prime Minister. It is important that fear of political ramifications does not inhibit necessary changes.

The future of Genomics England and how genomics will go forward in the NHS is uncertain with no clear indications from Government on their priorities regarding Genomics England. The UK has always been a global leader in Genomics, and the establishment of Genomics England further established the UK’s global dominance. It has been the first serious initiative to attempt to move genomics from research into mainstream healthcare and should be considered a national asset.

Genome Editing

Researchers at the Sanger Institute use genome editing, primarily in the form of CRISPR-Cas9, as a tool across the gamut of research undertaken by the Institute. The Institute feels that it is important to note that genome editing is an immensely powerful research tool, and not all uses of genome editing are ethically contentious or require any additional regulation. The use of genome editing to create cell-lines that better mimic human disease or allow high-throughput screening of drugs against cancer-causing mutations are examples where genome editing is providing highly impactful research but has no direct consequences for human health or reproduction nor any impact on the ecosystem.

The use of genome editing in animals for research does pose ethical concerns, particularly around numbers of animals used and quality of research. New genome editing technologies such as CRISPR-Cas9 may encourage an increase in the number of animals used in research as the cost and difficulty of creating a genetically modified (GM) animal continues to decrease. However, at the Sanger Institute the use of genome editing has thus far led to a reduction in the number of animals (mice) being used. In large part this has been through a reduction in the number of animals needed for breeding in the creation of genetically altered animals. However, one researcher was able to end his research using mice as CRISPR-Cas9 allowed him to accurately introduce genetic changes into induced Pluripotent Stem (iPS) cells resulting in the cells being a better disease model than the mouse models he had been using.

It is important, as gene editing technologies become more widely used, that the quality of research using animals does not decrease or create problems with animal welfare through poor experimental design or inappropriate use of the technology. Ethical use of genome editing in animals for research requires proper experimental design, with appropriate statistical and experimental controls, and should only be undertaken in centres of excellence, such as the Sanger Institute.

The Institute does not currently engage in research that seeks to introduce a genetically altered organism into the market or the wild nor does it have plans to do so. However, the Institute notes that as the UK exits the EU the question of legislation on the creation, use, import and sale of genetically modified organisms will arise. Genome editing adds a new dimension to this debate as genome editing is a different process from genetic engineering. Notably, while genetic engineering often involves the insertion of foreign DNA into an organism’s genome, changes introduced into the genome using genome editing tools may more closely resemble changes that arise “naturally” through breeding and reproduction. Like “naturally” occurring changes, these edits may not easily be traced through subsequent generations.

Germany and Australia have begun to attempt to tackle regulation of genome editing. Both countries have taken a similar approach of dividing the types of gene editing into three categories based on how large a change is made in the underlying DNA and how similar the change is to those that might naturally arise. Each category has its own regulation. While the categorisation has a scientific basis, we are concerned that without proper public dialogue this might be too nuanced an approach, one that ignores the fact that much of the current public objection to genetically modified organisms centres on the idea of artificially changing DNA. The extent of the change is not necessarily the critical factor in the public’s views. Whilst categorising genome editing in such a manner provides a proportionate risk-based approach to regulation, meaningful engagement with the public must be undertaken before any such regulation is implemented, and the regulation must reflect reasonable public concerns. It is important that there is not a repeat of the introduction of genetically modified products, which could lead to a total rejection of all genome edited products despite their potential societal and global benefits.

Although editing human genomes raises clear ethical challenges, it is worth noting that introducing heritable changes into the human genome is already illegal and that drug development and the use of reproductive technologies are both highly regulated. The UK has a regulatory framework for the use of reproductive technologies that is considered to be amongst the best in the world.

Lastly, genome editing and genomics should be considered distinct in terms of the ethical, social and legal issues, legislation, and implementation into healthcare. To provide an analogy, nuclear medicine and nuclear energy both have their basis in the same scientific principles, but their uses, benefits and potential harms are significantly different. It is hard to think of a sphere in which the Government operates, from the economy to healthcare to energy, where they would be linked and investigated together.

Industrial Strategy

The UK is, and always has been, a world leader in Genomics. The Sanger Institute (then the Sanger Centre) was set up as part of the Human Genome Project, a global consortium which sought to sequence the first human genome. The Sanger Institute was the single largest contributor to the project, sequencing one-third of the completed sequence.

The Sanger Institute started from a position of global dominance in genomics and has gone on to form the basis of a burgeoning, world-class genomics hub with strong links to the wider scientific hub of Cambridge. The Sanger Institute is situated on the Wellcome Genome Campus, alongside EMBL-EBI, which undertakes computational and informatics research and provides bioinformatics support to the Sanger Institute, and ELIXIR, which provides biodata infrastructure for Europe. The Campus has recently expanded and now hosts Illumina and Genomics England in a state-of-the-art sequencing centre shared with the Sanger Institute. In addition, the campus provides incubator space for computational start-ups and spin-outs working in the fields of genomics and biodata. Annex 1 provides an overview of the activities and organisations on campus.

The Campus, which centres on the Sanger Institute and EMBL-EBI, very recently opened its Biodata Innovation Centre which offers dry-lab incubator space to start-ups and Institute spin-outs. The Centre has a very narrow gateway for entry and companies wishing to take space must work on genomics and biodata and offer complementarity with the Institutes and other organisations already on Campus. Despite this narrow gateway, and the Centre having been open for less than a year, the Centre is almost at maximum occupancy and has attracted local companies, such as Eagle Genomics, and international companies from India (Global Gene Corp) and the west coast, USA (Specific Technologies). In addition, the Centre hosts two Sanger Institute spin-outs, Microbiotica and Congenica. Sanger Institute spin-outs are not given preference when applying for space in the Centre.

Open Targets is a highly innovative pre-competitive collaboration between the Sanger Institute, EMBL-EBI, GSK and Biogen. Using genomics to identify potential new therapeutic targets, Open Targets has created a freely-available, open access platform that allows users to identify potential new targets for their disease of interest or discover what diseases a target of interest might affect.

Genomics is on the cusp of entering mainstream healthcare. It is supporting entirely new sectors within the pharmaceutical/healthcare system and is the basis for a number of innovative technologies. On the Wellcome Genome Campus alone, there are projects tracking global spread of pathogens including Ebola and Cholera, analytical tool development, development of cell models and platforms for identification of new drug targets, ancestry testing for Asian populations, diagnostics for infections, and development of therapeutics, as well as continuing innovations in genome editing, stem-cell production and sequencing. The Wellcome Genome Campus, through its almost globally unique partnerships between academia, industry and government, is helping the UK be the global leader in both the science and the translation of genomics. We would strongly encourage the UK Government to continue and build on its support for the genomics industry, in part through the Industrial Strategy.

Concluding Comments

In the last 25 years genomics has become a scalable and affordable technology which is now on the verge of entering mainstream clinical practice. There have been significant advances, which have made it a highly attractive technology for healthcare. However, its use in the clinic for genetic diagnosis is not straightforward and serious consideration needs to be given to the necessary clinical expertise and infrastructure needed to implement it properly.

Genomics is used in the clinic not only for genetic diagnoses but also for public health/infectious outbreak management and in new fields such as microbiotics. Although use of genomics in these settings is not necessarily straightforward, there may be fewer difficulties in interpreting the significance of genetic variants and their diagnostic impact, and these technologies provide powerful and actionable information.

Genomics England is a visionary genomics programme that puts the UK in the lead on genomics for the clinic. However, a lack of transparency and clarity around decision making has affected its relationship with some stakeholders, and the lack of certainty around its future is unhelpful and risks undermining the good work done thus far.

Genomics and genome editing are two distinct technologies, both of which have wide-ranging and frequently distinct potential uses. Although there are scientific overlaps between genomics and genome editing, the ethical, legal and societal implications of each should be considered in isolation from the other.

The development of CRISPR-Cas9-based technologies has made the potential of editing genomes at will a reality. Although there are obvious and serious ethical issues with genome editing, it is important to remember that there is already legislation and regulation in place that will cover some of the more controversial uses of the technology, particularly for uses in humans and animals in research.

Finally, the UK has always been a world leader in genomics. With an already established genomics hub at the Wellcome Genome Campus and a strong pharmaceutical industry, the UK is ideally positioned to lead the global transition of genomics into the clinic, if given the necessary support and investment.

Annex 2

Lessons from the DDD project for the clinical implementation of diagnostic sequencing for rare genetic disorders


The Deciphering Developmental Disorders (DDD) Study is a national collaboration between the NHS and the Wellcome Trust Sanger Institute. The dual aims of the DDD study are:

  • Discovery – defining the genetic causes of developmental disorders and sharing knowledge globally through publication and via online databases such as DECIPHER
  • Diagnosis – transferring knowledge into the NHS to address unmet clinical need by improving diagnosis for patients with rare disease and up-skilling the NHS genetics workforce.

Clinical geneticists across the UK NHS recruited patients with severe rare paediatric diseases that were apparent at birth or in early childhood, often affected multiple body systems, and for which a genetic diagnosis had not proved possible using conventional testing in the NHS. Scientists at the Wellcome Trust Sanger Institute sequenced and analysed all the genes in these patients and their parents.

Data analysis is divided into focused ‘quasi’ clinical reporting and more flexible research investigations. Clinical reporting entails filtering against a panel of known developmental disorder genes (~1,450) with candidate diagnostic variants communicated to the submitting genetics service for clinical and laboratory validation. Research analyses focus on identifying new causes of developmental disorders by looking at genes with an excess of mutations in patients sharing similar clinical features. To date, the DDD project has identified >30 new genes for developmental disorders, resulting in >200 diagnoses.
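The panel-filtering step in clinical reporting can be illustrated with a minimal Python sketch. This is not the DDD pipeline: the `Variant` record, the `filter_to_panel` function and the gene names are hypothetical placeholders chosen for illustration, and a real panel would hold the ~1,450 known developmental disorder genes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A simplified variant record (hypothetical structure)."""
    gene: str
    chrom: str
    pos: int
    ref: str
    alt: str


def filter_to_panel(variants, panel_genes):
    """Keep only the variants that fall in a gene on the panel."""
    panel = {gene.upper() for gene in panel_genes}
    return [v for v in variants if v.gene.upper() in panel]


# Placeholder gene names, not real panel content.
PANEL = {"GENE_A", "GENE_B"}

variants = [
    Variant("GENE_A", "1", 1000, "A", "G"),
    Variant("GENE_X", "2", 2000, "C", "T"),
    Variant("gene_b", "3", 3000, "G", "A"),
]

# Candidate variants would then go to the submitting genetics
# service for clinical and laboratory validation.
candidates = filter_to_panel(variants, PANEL)
print([v.gene for v in candidates])
```

In this sketch the variant in `GENE_X` is set aside from clinical reporting, but it would remain available to the more flexible research analyses.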

Project outline

Timeline –

DDD Phase 1

  • Oct’10-Mar’11: Project set-up
  • Apr’11-Apr’15: Recruitment of ~13,500 families from NHS genetics clinics
  • Apr’13-Oct’17: Analysis and reporting of results from Phase 1

DDD Phase 2

  • Oct’17-Oct’21: Ongoing research for discovery and iterative diagnosis based on new findings.


Funding (~£15M for Phase 1) –

  • ~£9M from Health Innovation Challenge Fund (HICF), a joint funding partnership between the Wellcome Trust and the UK Department of Health;
  • ~£1M from the NIHR;
  • and ~£5M from Wellcome Trust Sanger Institute.

Funding for ongoing research in DDD Phase 2 is currently from the Wellcome Trust Sanger Institute.


DDD was designed to be minimally disruptive to participating families and genetics services. This approach resulted in excellent recruitment (we overshot our recruitment target by ~1,500 due to the popularity of the study) and a high percentage of complete family trios recruited (~88% with the child, mother and father). We based the clinical data on information routinely collected in the genetics clinic supplemented, where necessary, by additional information obtained by telephone interviews with parents. To ensure high-quality annotation of clinical features, we requested that this step be done by the patient’s own consultant, and designed an online system with a restricted vocabulary of >10,000 terms that was quick, simple to use and worked on NHS computers.

DDD was also designed to make full use of the expertise at the Wellcome Trust Sanger Institute in sample handling and genome sequencing as well as managing and analysing large, complex datasets.

Management structure

DDD is a large study that has enabled a world-class research institute (Sanger) to partner with all 23 of the UK NHS regional genetics services. The project has obtained >40,000 DNA samples from:

  • ~13,500 UK families with a young family member affected by a rare disease where it had not been possible to identify a diagnosis using NHS testing.

Delivering the project involved:

  • ~200 Consultant Clinical Geneticists (NHS)
  • ~100 Clinical laboratory scientists (NHS)
  • ~50 Genetic counsellors and research nurses (NHS)
  • ~20 scientists and bioinformaticians in the core team (Sanger)
  • 6 members of the management committee (clinicians, scientists and an ethicist)

Scale and Complexity

The wealth of variants in every human genome (4-5 million) poses a huge challenge in identifying the genetic cause of disease. The DDD study has developed and deployed a wide range of algorithms to identify the broad range of disease-causing genetic variants (from single base changes through to large chromosomal changes) that underlie rare disease and the diverse range of inheritance mechanisms (including new de novo mutations as well as inherited variants). Some of the analyses in the DDD study require huge data-processing capacity, taking several weeks to run on the Sanger Institute high-performance computing cluster (one of the largest computing facilities in the UK).
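The distinction between de novo and inherited variants in a family trio can be sketched in a few lines of Python. This is a deliberately simplified illustration, not one of the DDD algorithms: the function names and the toy variant calls are hypothetical, and real analyses must also account for sequencing error, read depth and variant quality.

```python
def find_de_novo(child, mother, father):
    """Variants seen in the child but in neither parent."""
    return child - mother - father


def find_inherited(child, mother, father):
    """Variants in the child that are also seen in at least one parent."""
    return child & (mother | father)


# Each variant is keyed as (chromosome, position, ref, alt).
# Toy calls for illustration only.
mother = {("1", 1000, "A", "G"), ("2", 2000, "C", "T")}
father = {("3", 3000, "G", "A")}
child = {
    ("1", 1000, "A", "G"),   # inherited from the mother
    ("3", 3000, "G", "A"),   # inherited from the father
    ("4", 4000, "T", "C"),   # absent from both parents
}

de_novo = find_de_novo(child, mother, father)
inherited = find_inherited(child, mother, father)
print(sorted(de_novo))
print(sorted(inherited))
```

The set comparison only works when samples from both parents are available, which is one reason the high proportion of complete trios matters for this kind of analysis.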

Choice of technology

The DDD study used the most cost-efficient genome-wide technologies available to enable us to discover the greatest number of genes and diagnose the maximum number of patients possible within the study budget. The main technology deployed was whole exome sequencing (sequencing the coding sequence of the genome, i.e. all the genes), comprising in total ~2% of the genome, in children and their parents. Using this technology, we were able to sequence all the genes of all the children and complete family trios in the study, which would not have been possible with whole genome sequencing (even at 2017 prices). For further details, please see extended data fig. 6 in the accompanying DDD paper doi:10.1038/nature21062.


In addition to the core research team at the Wellcome Trust Sanger Institute, DDD initiated a system of Complementary Analysis Projects to stimulate and enable research and advanced training in the participating NHS genetics services and their academic partners (local universities and research institutes). More than 200 projects have been established and ~50 of the DDD publications to date have been delivered by this distributed research network, rather than the core team. Many of these publications have fostered international collaboration and catalyzed new diagnoses and discoveries.


The DDD study has had major impact in the following domains:

Science – 59 publications (to date) in the peer-reviewed literature including two flagship papers in the top journal Nature. Developing and implementing a prototype clinical bioinformatics pipeline for analysing genome-wide data from rare diseases. Discovering more than 30 new genes for developmental disorders. Generating data to estimate that developmental disorders caused by de novo mutations have an average prevalence of 1 in 300 births, depending on parental age. Given current global demographics, this equates to almost 400,000 children born per year.

Genetics services – diagnoses for their patients. Access to cutting-edge technologies. Access to research data, funding and publications. Advanced training. A strong evidence base for implementing improved services. Motivating nation-wide data sharing among clinical services to improve diagnosis.

Support organisations for families – information on novel developmental disorders written by clinical genetic experts (Unique).

Families – a molecular diagnosis for their child’s developmental disorder in a third of cases. Access to information about genetics and genomics. Contacting other families through SWAN and Unique. Participation in a ground-breaking and world-leading study.