Wellcome Sanger Institute

Martin Group

Medical and population genomics

We analyse large-scale genetic and electronic health record data to explore fine-scale population structure, its impact on disease risk, and the genetic architecture of both rare and complex diseases. We have a particular focus on populations in which parental relatedness (consanguinity) is common. 

We currently have several open positions in the group. If you are interested in joining as a postdoc, bioinformatician or staff scientist, please get in touch and send your CV (hcm at sanger.ac.uk). There are opportunities to work on sex differences in autism risk, the genetic basis of complex diseases in South Asians, and the projects mentioned below.

Our current projects use data from the following studies:

  • Genes & Health (formerly known as East London Genes & Health) is a population-based cohort of British South Asians with high rates of cardiometabolic disease and of parental relatedness (current N~50,000, and growing to 100,000). Electronic health record and genotype data are available on all individuals, and exome-sequence data on ~5,500. We are currently exome-sequencing the full 50,000 individuals, in an exciting collaboration with pharma partners.
  • Born in Bradford (BiB) is a birth cohort with data on ~11,000 mothers and their children, of whom about half have Pakistani ancestry. It was established to investigate determinants of child and adult disease, and includes rich phenotypic and environmental data as well as genetic (genotype and whole-exome) and metabolomic data.
  • Deciphering Developmental Disorders (DDD) is a study of >13,000 patients with rare, severe paediatric disorders who have been exome-sequenced and genotyped to find diagnoses, discover new genes and understand the genetic architecture of these conditions.
  • Genomics England – 100K Genomes Project (GEL) is a clinical whole-genome sequencing project embedded within the National Health Service, from which the data have been made available for research. We are particularly focusing on the ~20,000 families with rare paediatric disease, of whom ~10% are South Asian.

A key feature of our research is working in partnership with the individuals and populations we study. We uphold strict data security and confidentiality procedures, and work closely with the cohorts we study on community engagement and the dissemination of our scientific findings.


Population structure and history in British South Asians

South Asia has an immensely complex population history, with thousands of anthropologically different groups, many of them characterised by hundreds or even thousands of years of endogamy. South Asians also represent one of the largest immigrant groups in the UK. We are currently characterising the population structure and history of British Pakistanis and Bangladeshis using data from the Born in Bradford and East London Genes and Health projects. Specifically, we are using genotype-chip data and self-reported information on ethnic, tribal and biraderi (patrilineal kinship) groups to explore genetic differences and similarities between these groups as well as historical population size changes and patterns of consanguinity. We are then comparing this with historical and ethnographic accounts of group dynamics. Furthermore, we are exploring the extent to which historical bottlenecks have produced founder effects on Mendelian disease-causing variants. You can read about our recent population genetic work from the Born in Bradford project in this preprint.

The genetic basis of complex disease in South Asian populations

Most studies of the genetics of complex disease have taken place in European-ancestry populations, and it is well established that the results do not necessarily translate well into other populations. British South Asians have particularly high rates of type 2 diabetes and ischaemic heart disease, but our understanding of the genetic contribution to these conditions in this population is lagging behind that for European-ancestry individuals. We are exploring the portability of polygenic scores for various traits from Europeans into South Asians, the best methods for doing this to maximise prediction accuracy, and the clinical utility of these scores in the Genes & Health cohort. We are also analysing genotype and electronic health record data from Genes & Health to discover new associations and to aid trans-ethnic fine-mapping, and using these results to improve genetic prediction in South Asians. Additionally, we are exploring the contribution of autozygosity due to recent consanguinity to various complex traits, and investigating the mechanisms driving this.

Human knockouts

Parental relatedness (consanguinity) is common in some South Asian communities, and increases the probability of an individual inheriting two copies of the same rare variant. Hence, consanguineous individuals are highly enriched for rare homozygous loss-of-function variants (knockouts), which can be informative about gene function and new drug targets. We are investigating the effects of knockouts in the BiB and ELGH projects through association analysis with various phenotypes including metabolite levels and transcriptional profiles observed in single-cell RNAseq data, and through recall-by-genotype studies such as the one described in this paper.
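The enrichment described above can be illustrated with a short calculation. This is a minimal sketch, not the group's own analysis: it assumes the standard population-genetic expression for the probability of being homozygous for an allele of frequency q in an individual with inbreeding coefficient F, namely q^2(1 - F) + qF, and uses F = 1/16 for the offspring of first cousins. The function name and the example allele frequency are hypothetical and chosen only for illustration.

```python
def homozygous_probability(allele_freq: float, inbreeding_coeff: float) -> float:
    """Probability of carrying two copies of an allele.

    Uses the standard expression q^2 * (1 - F) + q * F, where q is the
    allele frequency and F is the individual's inbreeding coefficient.
    """
    q, f = allele_freq, inbreeding_coeff
    return q * q * (1.0 - f) + q * f


# Assumption: F = 1/16 is the expected inbreeding coefficient for the
# offspring of first cousins.
FIRST_COUSIN_F = 1 / 16

# Hypothetical rare variant with a frequency of 0.1%.
q = 0.001

outbred = homozygous_probability(q, 0.0)
cousin = homozygous_probability(q, FIRST_COUSIN_F)
enrichment = cousin / outbred

print(f"Unrelated parents:    {outbred:.3e}")
print(f"First-cousin parents: {cousin:.3e}")
print(f"Fold enrichment:      {enrichment:.1f}")
```

Under these assumptions, the rarer the variant, the larger the relative enrichment of homozygotes among the offspring of related parents, which is why such cohorts are informative for studying knockouts.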

Genetic architecture of rare neurodevelopmental disorders

Rare and de novo large-effect exonic variants are known to play a major role in rare neurodevelopmental disorders (NDDs) such as intellectual disability and epilepsy. However, it is becoming increasingly clear that polygenic background also plays a role. We are investigating how common variants impact the penetrance and expressivity of rare variants, particularly in the context of NDDs and cognition, examining both polygenic background and the role of eQTLs modifying penetrance in cis. We are also exploring whether there is a role for autozygosity in NDDs beyond the mechanism of simple monogenic recessive inheritance. We have other projects on identifying pathogenic large-effect noncoding variants in the Genomics England project, and on the transcriptional effects of pathogenic variants in spliceosome genes.

Genetics of cognitive and behavioural traits

Results from published (this paper and this one) and ongoing work in the DDD study suggest an important role for incompletely penetrant rare variants in NDDs, and that common variants affecting NDD risk also impact cognitive ability in the general population. In a recent collaboration led by the Hurles group, we demonstrated that damaging rare variants in highly constrained genes reduce the number of offspring individuals have, and that this is likely to be via their effects on cognitive and behavioural traits which affect individuals’ likelihood of finding a partner. Following this work, in collaboration with the Hurles group, we are beginning to explore the effect of rare and common variants on cognitive and behavioural traits at different life stages using large birth cohorts.

Recent publications and preprints

Led by members of the Martin group:

Hodgson S*, Huang QQ*, …., Martin HC`, Finer S`. Harnessing the power of polygenic risk scores to predict type 2 diabetes and its subtypes in a high-risk population of British Pakistanis and Bangladeshis in a routine healthcare setting, submitted and available on medrxiv (2021).

Huang QQ*, Salah N*, …., Lumbers T`, Martin HC`, Kuchenbaecker K`. Transferability of genetic loci and polygenic scores for cardiometabolic traits in British Pakistanis and Bangladeshis, under review and available on medrxiv (2021).

Arciero E, Dogra SA*, Malawsky D*, Mezzavilla M*, ….., Iles MM`, Martin HC`. Fine-scale population structure and demographic history of British Pakistanis, under review and available on biorxiv (2020).

Martin HC, …., Hurles ME. The contribution of X-linked coding variation to severe developmental disorders, Nat Commun (2021).


Gardner EJ, (6 other authors), Martin HC, Hurles ME. Sex-biased reduction in reproductive success drives selective constraint on human genes, under review and available on biorxiv (2020).

Uffelmann E, Huang QQ, Munung NS, DeVries J, Okada Y, Martin A, Martin HC, Lappalainen T, Posthuma D. Genome-wide association studies, in press at Nature Reviews Methods Primers (2021).

Almarri MA, (4 other authors), Martin HC, Xue Y, Tyler-Smith C. The Genomic History of the Middle East, in press at Cell (2020).

Gardner EJ, (7 other authors), Martin HC, (3 other authors), Hurles ME. Detecting cryptic clinically-relevant structural variation in exome sequencing data increases diagnostic yield for developmental disorders, under review and available on medrxiv (2021).

Chen MH*, Raffield LM*, Mousas A*, (many authors), Huang QQ, (many authors), Martin HC, (many authors), Lettre G. Trans-ethnic and Ancestry-Specific Blood-Cell Genetics in 746,667 Individuals from 5 Global Populations, Cell (2020).

Kaplanis J*, Samocha KE*, Wiel L*, Zhang Z*, (4 other authors), Martin HC, (21 other authors), Hurles ME`, Gilissen C`, Retterer K`. Evidence for 28 genetic disorders discovered by combining healthcare and research data, Nature (2020).

Minikel EV, Karczewski KJ, Martin HC, (11 other authors), MacArthur DG. Evaluating drug targets through human loss-of-function genetic variation, Nature (2020).

Finer S, Martin HC, (13 other authors), van Heel DA. Cohort Profile: East London Genes & Health (ELGH), a community-based population genomics and health study in British Bangladeshi and British Pakistani people, International Journal of Epidemiology (2020).


* and ` indicate joint authorships.

Core team

Mohamed Almarri

PhD Student

Kartik Chundru

Postdoctoral Fellow

Dr Qinqin Huang

Postdoctoral Fellow

Daniel Malawsky

MPhil Student

Emilie Wigdor

PhD Student

Previous team members

Elena Arciero

Postdoctoral Fellow


The Martin Group collaborates with the following groups at the Sanger Institute:


East London Genes and Health (ELGH) project

East London Genes & Health is one of the world’s largest community-based genetics studies, aiming to improve health among people of Pakistani and Bangladeshi heritage in East London by analysing the genes and health of 100,000 local people.


