Wellcome Sanger Institute

Martin Group

Medical and population genomics

We analyse large-scale genetic and electronic health record data to explore fine-scale population structure, its impact on disease risk, and the genetic architecture of both rare and complex diseases. We have a particular focus on populations in which parental relatedness (consanguinity) is common. 

Our current projects use data from the following studies:

  • Three British birth cohorts: the Avon Longitudinal Study of Parents and Children (ALSPAC), the Millennium Cohort Study, and Born in Bradford. All three cohorts have rich phenotypic data on a multitude of traits, linkage to educational and health records, and existing genotype-chip data. We are generating exome-sequence data on ~25k probands plus a subset of their parents, in order to study the impact of rare and common variants on cognitive development and behavioural traits relevant to neurodevelopmental disorders.
  • Deciphering Developmental Disorders (DDD) is a study of >13,000 patients with rare, severe paediatric disorders who have been exome-sequenced and genotyped to find diagnoses, discover new genes and understand the genetic architecture of these conditions.
  • Genomics England – 100K Genomes Project (GEL) is a clinical whole-genome sequencing project embedded within the National Health Service, from which the data have been made available for research. We are particularly focusing on the ~20,000 families with rare paediatric disease, of whom ~10% are South Asian.
  • Genes & Health (formerly known as East London Genes & Health) is a population-based cohort of British South Asians with high rates of cardiometabolic disease and of parental relatedness (current N~50,000, and growing to 100,000). Electronic health record and genotype data are available on all individuals, and exome-sequence data on ~5,500. We are currently exome-sequencing the full 50,000 individuals, in an exciting collaboration with pharma partners.

A key feature of our research is working in partnership with the individuals and populations we study. We uphold strict data security and confidentiality procedures, and work closely with the cohorts on community engagement and the dissemination of our scientific findings.


Genetic architecture of rare neurodevelopmental disorders

Rare and de novo large-effect exonic variants are known to play a major role in rare neurodevelopmental disorders (NDDs) such as intellectual disability and epilepsy. However, it is becoming increasingly clear that polygenic background also plays a role. We are trying to characterise that role: to understand the interplay between common and rare variants, whether polygenic background has a direct and/or indirect effect, and, if there is an indirect effect, how it is mediated (see this recent preprint).

Genetics of cognitive and behavioural traits

Results from published (this paper and this one) and ongoing work in the DDD study suggest an important role for incompletely penetrant rare variants in NDDs, and that common variants affecting NDD risk also impact cognitive ability in the general population. In a recent collaboration led by the Hurles group, we demonstrated that damaging rare variants in highly constrained genes reduce the number of offspring individuals have, and that this is likely to be via their effects on cognitive and behavioural traits which affect individuals’ likelihood of finding a partner. Following this work, in collaboration with the Hurles group, we are beginning to explore the effect of rare and common variants on cognitive and behavioural traits at different life stages using large birth cohorts.

Patterns and rates of de novo mutations and their causes

De novo mutations make a major contribution to neurodevelopmental disorders. Parental age is the major factor contributing to the de novo mutation rate, but we are exploring whether rare and common variants also contribute. This is collaborative work with the groups of John Perry, Raheleh Rahbari and Aylwyn Scally. A recent preprint is here.

The genetic basis of complex disease in South Asian populations

Most studies of the genetics of complex disease have taken place in European-ancestry populations, and it is well established that the results do not necessarily translate well into other populations. British South Asians have particularly high rates of type 2 diabetes and ischaemic heart disease, but our understanding of the genetic contribution to these conditions in this population is lagging behind that for European-ancestry individuals. We are exploring the portability of polygenic scores for various traits from Europeans into South Asians, the best methods for doing this to maximise prediction accuracy, and the clinical utility of these scores in the Genes & Health cohort (see our recent work on this here and here). We are also analysing genotype and electronic health record data from Genes & Health to discover new associations and to aid trans-ethnic fine-mapping, and using the results to improve genetic prediction in South Asians. Additionally, we are exploring the contribution of autozygosity due to recent consanguinity to various complex traits, and investigating the mechanisms driving this.

Human knockouts

Parental relatedness (consanguinity) is common in some South Asian communities, and increases the probability of an individual inheriting two copies of the same rare variant. Hence, consanguineous individuals are highly enriched for rare homozygous loss-of-function variants (knockouts), which can be informative about gene function and new drug targets. We are investigating the effects of knockouts in the Born in Bradford (BiB) and East London Genes & Health (ELGH) projects through association analysis with various phenotypes, including transcriptional profiles observed in single-cell RNAseq data, and through recall-by-genotype studies such as the one described in this paper.

Population structure and history in British South Asians

South Asia has an immensely complex population history, with thousands of anthropologically different groups, many of them characterised by hundreds or even thousands of years of endogamy. South Asians also represent one of the largest immigrant groups in the UK. We are currently characterising the population structure and history of British Pakistanis and Bangladeshis using data from the Born in Bradford and East London Genes and Health projects. Specifically, we are using genotype-chip data and self-reported information on ethnic, tribal and biraderi (patrilineal kinship) groups to explore genetic differences and similarities between these groups as well as historical population size changes and patterns of consanguinity. We are then comparing this with historical and ethnographic accounts of group dynamics. Furthermore, we are exploring the extent to which historical bottlenecks have produced founder effects on Mendelian disease-causing variants. You can read about our recent population genetic work from the Born in Bradford project in this paper.

Recent publications and preprints (since 2020)

Huang QQ*, Wigdor E*, …, Hurles ME, Martin HC. Dissecting the contribution of common variants to risk of rare neurodevelopmental conditions, in revision and available on medRxiv (2024).

Heng TH, …, Martin HC. Widespread recessive effects on common diseases in a cohort of 44,000 British Pakistanis and Bangladeshis with high autozygosity, under review and available on medRxiv (2024).

Koko Musa M, …, Martin HC. Contribution of autosomal rare and de novo variants to sex differences in autism, submitted and available on medRxiv (2024).

Wigdor E, …, Martin HC. Investigating the role of common cis-regulatory variants in modifying penetrance of putatively damaging, inherited variants in severe neurodevelopmental disorders, Scientific Reports (2024).

Malawsky D*, van Walree E*, …, O’Connell J, Martin HC. Influence of autozygosity on common disease risk across the phenotypic spectrum, Cell (2023).

Chundru K, …, Ustach VD`, Martin HC`. Federated analysis of the contribution of recessive coding variants to 29,745 developmental disorder patients from diverse populations, in revision and available on medRxiv (2023).

Liu T*, Sankareswaran A*, Paterson G*, …, Chandak GR`, Martin HC`, Finer S`. Investigating misclassification of type 1 diabetes in a population-based cohort of British Pakistanis and Bangladeshis using polygenic risk scores, submitted and available on medRxiv (2023).

Lord J, …, Baralle D`, Martin HC`, Whiffin N`. Non-coding variants are a rare cause of recessive developmental disorders in trans with coding variants, in revision and available on medRxiv (2023).

Stankovic S*, Shekari *, Huang QQ*, Gardner EJ*, …, Martin HC, Perry JRB`, Murray A`. Genetic susceptibility to earlier ovarian ageing increases de novo mutation rate in offspring, under review and available on medRxiv (2022).

Warrier V, Stauffer E, Huang QQ, Wigdor E, (13 other authors), Won H*, Martin HC*, Bullmore ET*, Bethlehem RAI. The genetics of cortical organisation and development: a study of 2,347 neuroimaging phenotypes, Nature Genetics (2023).

Huang QQ*, Salah N*, …, Lumbers T`, Martin HC`, Kuchenbaecker K`. Transferability of genetic loci and polygenic scores for cardiometabolic traits in British Pakistanis and Bangladeshis, Nature Communications (2022).

Hodgson S*, Huang QQ*, …, Martin HC`, Finer S`. Integrating polygenic risk scores in the prediction of type 2 diabetes risk and subtypes in British Pakistanis and Bangladeshis: A population-based cohort study, PLoS Medicine (2022).

Warrier V, (14 other authors), Martin HC, Bourgeron T, Baron-Cohen S. Genetic correlates of phenotypic heterogeneity in autism, Nature Genetics (2022).

Arciero E, Dogra SA*, Malawsky D*, Mezzavilla M*, …, Iles MM`, Martin HC`. Fine-scale population structure and demographic history of British Pakistanis, Nature Communications (2021).

Martin HC, …, Hurles ME. The contribution of X-linked coding variation to severe developmental disorders, Nature Communications (2021).

Lam BYH*, Williamson A*, Finer S*, (24 other authors), Martin HC, Coll AP, Rowitch DH, Wareham NJ, van Heel DA, Timpson N, Simerly RB, Ong KK, Cone RD, Langenberg C, Perry JRB, Yeo GS, O’Rahilly S. MC3R links nutritional state to childhood growth and the timing of puberty, Nature (2021).

Uffelmann E, Huang QQ, Munung NS, DeVries J, Okada Y, Martin A, Martin HC, Lappalainen T, Posthuma D. Genome-wide association studies, Nature Reviews Methods Primers (2021).

Almarri MA, (4 other authors), Martin HC, Xue Y, Tyler-Smith C. The Genomic History of the Middle East, Cell (2021).

Gardner EJ, (7 other authors), Martin HC, (3 other authors), Hurles ME. Detecting cryptic clinically relevant structural variation in exome-sequencing data increases diagnostic yield for developmental disorders, AJHG (2021).

Kaplanis J*, Samocha KE*, Wiel L*, Zhang Z*, (4 other authors), Martin HC, (21 other authors), Hurles ME`, Gilissen C`, Retterer K`. Evidence for 28 genetic disorders discovered by combining healthcare and research data, Nature (2020).

Minikel EV, Karczewski KJ, Martin HC, (11 other authors), MacArthur DG. Evaluating drug targets through human loss-of-function genetic variation, Nature (2020).

Finer S, Martin HC, (13 other authors), van Heel DA. Cohort Profile: East London Genes & Health (ELGH), a community-based population genomics and health study in British Bangladeshi and British Pakistani people, International Journal of Epidemiology (2020).

* and ` indicate joint authorships.

Core team

Dr Qinqin Huang

Staff Scientist

Georgios Kalantzis

Postdoctoral Fellow

Mahmoud Koko Musa

Postdoctoral Fellow

Emma Wade

MPhil Student

Klaudia Walter

Senior Staff Scientist

Dr Olivia Wootton

Postdoctoral Fellow

Previous team members

Mohamed Almarri

PhD Student

Elena Arciero

Postdoctoral Fellow

Dr Patrick Campbell

Visiting Scientist

Daniel Malawsky

PhD Student

Mari Niemi

Research Associate


The Martin Group collaborates with the following groups at the Sanger Institute:


East London Genes and Health (ELGH) project

East London Genes & Health is one of the world’s largest community-based genetics studies, aiming to improve health among people of Pakistani and Bangladeshi heritage in East London by analysing the genes and health of 100,000 local people.


