Malaria programme: Kwiatkowski group

The Malaria programme uses genomic and genetic approaches to discover molecular mechanisms of host-parasite interactions that may lead to new biological insights and improved strategies for disease prevention.

Within this programme, the Kwiatkowski group is investigating the biological consequences of natural variation in the human and Plasmodium genomes.

More information on the Malaria Programme.

[Susana Campino, Genome Research Limited]

Background

At the core of our research is the question: Why, in areas where people are repeatedly infected with malaria, do some become gravely ill, while others show no signs of disease at all?

Underpinning this paradox are several important and fascinating issues in biology, evolution, and medicine: What differences in individuals' innate immune system confer resistance or susceptibility to malaria? Do genetic differences make some Plasmodium populations more virulent in nature? And can we detect changes in the parasite genome that confer resistance to anti-malarial drugs?

The Kwiatkowski group is combining large-scale epidemiological studies with high-throughput analysis of genome variation to systematically search the human and Plasmodium genomes for novel alleles that affect disease progression. Our goal is to use natural genomic diversity to discover the molecular mechanisms of phenotypes such as protective immunity in the host or drug resistance in the pathogen. These insights will be critical for the control of antimalarial drug resistance, the development of new drugs, and, ideally, an effective vaccine against this disease.

Our multidisciplinary team is divided between the Sanger Institute and the Wellcome Trust Centre for Human Genetics at the University of Oxford, and includes expertise in malaria biology, epidemiology, statistics, informatics, ethics, and programme management. At the Sanger Institute we are using the Institute's cutting-edge genome technologies and expertise, while building informatics and experimental tools to analyse and share the large volumes of data generated, in order to understand how natural genetic variation affects malaria disease.

Research

Child with severe cerebral malaria.

Human genetic resistance to malaria

We currently have two main approaches to understanding the impact of genetic variation on host susceptibility to malaria. The first is large-scale genome-wide association (GWA) studies, which test known genetic markers across the human genome for association with resistance or susceptibility to malaria. We are further examining how signals of genetic association are affected by the diversity of African population structure, and identifying regions of the genome under recent positive selection in malaria-endemic populations by accurate haplotype construction from family-based GWA data. The second approach involves deep resequencing of candidate resistance genes to identify new genetic variants that correlate with innate immunity to malaria within and between African populations.
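As a rough illustration of what testing a single marker for association can look like, the sketch below runs a Pearson chi-square test on a 2x2 table of allele counts in cases and controls. It is a generic textbook test, not the group's pipeline; the function name `allelic_chi_square` and the counts in the usage note are hypothetical.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    Returns (chi2, p_value, odds_ratio). All four counts are assumed to be
    non-zero; a real pipeline would need to handle sparse tables.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the chi-square upper tail is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    return chi2, p_value, odds_ratio
```

For example, `allelic_chi_square(30, 70, 50, 50)` compares a hypothetical allele seen 30 times among 100 case chromosomes with 50 times among 100 control chromosomes. In a genome-wide scan the same test would be repeated for every marker, so the resulting p-values need correction for multiple testing and for population structure.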

This work involves close collaboration with the Sanger Institute's genotyping teams led by Panos Deloukas and the Medical Re-sequencing team led by Aarno Palotie.

These studies are being performed as part of the MalariaGEN Consortium, a network of scientists in more than 20 countries, many in the most affected regions of the world, who share their clinical samples as well as their expertise about malaria. MalariaGEN is funded by the Grand Challenges in Global Health programme of the Bill and Melinda Gates Foundation. MalariaGEN projects include multi-centre case-control and family-based association studies of severe malaria, as well as large cohort studies examining the natural evolution of infection and immunity. The consortium also has ongoing genetic linkage studies in a number of populations, as well as investigations of ethnic groups that naturally have a high level of resistance to malaria.

Malaria parasite invading a red blood cell.

Biological consequences of natural variation in the Plasmodium falciparum genome

Knowledge of the natural genomic diversity and population genetics of a single species of Plasmodium is crucial for understanding the parasite's extraordinary ability to evade the immune system and to develop resistance to anti-malarial drugs.

To date, Plasmodium genome sequencing at the Sanger Institute and elsewhere has focused on laboratory-adapted parasites. We are now developing the experimental, epidemiological, and analytical tools to characterise natural genome diversity in Plasmodium falciparum isolates from multiple malaria-endemic regions in Africa and Southeast Asia. Using Solexa/Illumina high-throughput sequencing technology, we hope to develop shotgun genotyping as a cost-effective method for genome-wide analysis of natural variation in Plasmodium falciparum.

Understanding the complex population genetic structures that arise under different conditions of malaria transmission will advance malaria biology in two ways: it will serve as the foundation for large-scale epidemiological studies of genotype-phenotype correlation (for example, for drug resistance, immune evasion, and other parasite phenotypes), and it will inform malaria monitoring and control strategies in the field. We will also use natural P. falciparum variation data to inform functional analysis of parasite biology in the laboratory.

This work is being done in close collaboration with Chris Newbold (of Oxford University, and honorary Sanger Faculty) and the Parasite Genomics group led by Matt Berriman, who head the resequencing and reannotation of the reference Plasmodium falciparum genome, 3D7.

Principal Components Analysis of Affymetrix 500K SNP chip data reveals genetic signatures of Gambian ethnic sub-populations (as indicated by colour).

Statistical analysis of genome-wide association and short-read sequence data

The biological framework of our research programme is underpinned by cutting-edge statistical and informatics solutions for the analysis, handling, and sharing of large-scale sequence and genotype data.

New high-throughput sequencing and large-scale genotyping technologies drive the need for novel statistical methods for genetic data analysis. This need is further underscored by the complexities of the population genetic structures we study: for example, the rich haplotypic diversity and low linkage disequilibrium (LD) of African populations pose unique challenges for GWA study design and analysis. Likewise, new statistical tools will be required to use short-read sequence data to identify, with confidence, polymorphisms and structural variants, patterns of LD, and differences between Plasmodium populations. This is particularly challenging because of the low LD and AT-rich nature of the parasite genome, as well as the presence of multiple parasite genomes in clinical samples.
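To make the notion of LD concrete, the sketch below computes the standard r² statistic between two biallelic SNPs from phased haplotypes. It is a minimal illustration under the assumption that phase is already known; the function name `ld_r_squared` and the 0/1 allele encoding are choices made for this example, not part of any tool described here.

```python
def ld_r_squared(haplotypes):
    """Return r^2 between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, one per phased chromosome, where each
    value is 0 or 1 for the allele carried at SNP A and SNP B.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b                      # coefficient of disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom
```

When r² between neighbouring markers is low, as the text notes for African populations, a genotyped SNP predicts nearby untyped variants poorly, so more markers are needed to cover the same region.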

Our epidemiological studies of human resistance and susceptibility to malaria include case-control and family trio designs, with sample sizes currently exceeding 12,000 and 2,000, respectively. For GWA analysis we have implemented pipelines that run from converting chip intensities to genotype calls through to testing for association and positive selection, correcting for population artifacts, in order to discover putative variants for follow-up in the laboratory. This process has involved developing and applying methods to call genotypes (Illuminus), to understand the relationship of population structure to ethnicity within and across study sites, and to determine strategies for the selection of tagging SNPs and determination of genotype in populations with low LD.
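The first pipeline step, turning two-channel chip intensities into a genotype call, can be sketched with a deliberately simple fixed-threshold rule. This is not the Illuminus algorithm, which is a statistical model described in the publication listed below; the function name `call_genotype` and the thresholds `het_band` and `min_signal` are hypothetical values chosen only for illustration.

```python
def call_genotype(x, y, het_band=0.4, min_signal=0.2):
    """Toy genotype call from two normalised allele intensities.

    x, y: non-negative intensities for allele A and allele B.
    Returns "AA", "AB", "BB", or "NN" (no call when the signal is too weak).
    """
    total = x + y
    if total < min_signal:
        return "NN"                    # too little signal to call
    contrast = (x - y) / total         # +1 means all A signal, -1 all B
    if contrast > het_band:
        return "AA"
    if contrast < -het_band:
        return "BB"
    return "AB"
```

Fixed thresholds like these break down when cluster positions differ between SNPs or batches, which is one reason model-based callers fit the clusters for each SNP across many samples.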

Screenshot of LookSeq, a browser-based read alignment viewer. LookSeq is a web-based application for alignment visualization, browsing and analysis of genome sequence data.

Informatics solutions for analysis and sharing of large-scale sequence and genotype data

A major activity of our team is the development of informatic technologies to manage and analyse the remarkable volume of data generated by resequencing and genotyping projects, and equally, to establish effective ways to share these complex epidemiological and genetic datasets across the malaria research community.

We have produced an improved algorithm for genotype calling from the Illumina BeadArray platform (Illuminus), as well as methods for detecting positive selection from haplotype information. We are also developing browser-based software packages for simplified presentation and browsing of linkage disequilibrium along chromosomes (Marker3) and for SNP discovery and analysis in short-read sequence data (LookSeq).

Our team is a partner in the WorldWide Antimalarial Resistance Network (WARN), a global network of malaria researchers aiming to build a web-based global antimalarial efficacy and resistance database to track resistance to malaria drugs. The proposed database will provide free access to web-based, linked sets of data, as well as tools to help analyse and publish the data.

Resources

Publicly available data

LookSeq

LookSeq is a web-based application for alignment visualisation, browsing and analysis of genome sequence data.

LookSeq supports multiple sequencing technologies, alignment sources, and viewing modes; low or high-depth read pileups; and easy visualisation of putative single nucleotide and structural variation. The visible range, from whole chromosome to single base resolution, can be set manually or by scrolling or zooming the display with fast, on-the-fly rendering from the server-side alignment database. LookSeq uses a universal database for alignments of different sequencing technologies and algorithms. Sequence data from multiple sources can be viewed separately or aligned in a single display, facilitating direct comparison between datasets. LookSeq can also link to relevant external sites such as PubMed and other online analysis tools, via buttons or double-clicking on the displayed sequence annotation.

LookSeq requires no setup or installation, and is very intuitive to use.

Collaborations

MalariaGEN Genomic Epidemiology Network

MalariaGEN brings together research groups with different projects and scientific objectives to work together on large-scale investigations that depend on samples, data and expertise from multiple investigators.

Selected Publications

  • A genotype calling algorithm for the Illumina BeadArray platform.

    Teo YY, Inouye M, Small KS, Gwilliam R, Deloukas P, Kwiatkowski DP and Clark TG

    Bioinformatics (Oxford, England) 2007;23(20):2741-6

* quick link - http://q.sanger.ac.uk/mal-kwia