Wellcome Sanger Institute

Archived

Hemberg Group

Quantitative models of gene expression

Archive Page

This page is maintained as a historical record and is no longer being updated.

The Hemberg Group moved to the Evergrande Center for Immunologic Diseases in February 2021. https://evergrande.hms.harvard.edu/home

Although every cell in an organism contains the same DNA, there is a great variety of cell types (e.g. skin, muscle, kidney) because different genes are transcribed in each. The number of transcripts, or RNA molecules, made from a specific gene can be measured in the cell and is referred to as the expression level of the gene. Understanding how, why, when and where genes are turned on and off is crucial for understanding many biological processes, ranging from development to a variety of diseases, including cancer and autism.

Recent technological advances have made it possible to analyze gene expression and other related properties in a high-throughput manner, and this has resulted in a wealth of data. However, the experimental data is typically large, high-dimensional and noisy. We are interested in developing computational methods that will make it possible to extract as much information as possible from the data.

Some of the ongoing research projects are:

Inference of gene regulatory networks from single-cell RNA-seq data. Thanks to extensive annotation efforts, we have an almost complete catalogue of protein coding genes in humans and model organisms. Much less is known about how genes interact. To infer a network, one must have expression data from multiple conditions, e.g. mutants or a time-series. Due to the high levels of noise in the data and the limited number of conditions, existing methods for bulk RNA-seq have a limited ability to detect causal regulatory relations. With single-cell RNA-seq data, a more powerful approach is possible since each cell can be considered as an individual replicate. Knowing which genes interact is also key for understanding development and many diseases such as cancer and autism.
Identification of the molecular mechanisms involved in transgenerational epigenetic inheritance. Together with the Miska lab, we are studying C. elegans to learn more about how gene expression profiles can be stably inherited. Several lines of evidence have suggested the existence of such effects, but no mechanism has been identified for endogenous genes. The short generation time and the small genome make C. elegans a powerful model system for investigating this phenomenon.
Identification and characterization of non-canonical secondary structures in DNA. Mutations outside of coding regions remain poorly understood and their importance in cancer and other diseases is unknown. We are investigating non-coding mutations from cancer samples to find out whether the disruption of secondary DNA structures could play an important role.
Virtual Reality technology for visualizing genomic data. We are collaborating with HammerheadVR, a leading VR development studio, to develop a novel genome browser for Virtual Reality technologies.

Our people

Previous group lead

Dr Martin Hemberg, PhD

Former CDF Group Leader

Martin Hemberg is a Career Development Fellow Group Leader and his research interests are centered around quantitative models of gene expression and gene regulation. He is particularly interested in stochastic models and analysis of single-cell data. Another line of research involves analyzing the role of non-coding transcripts and sequences.

Previous core team members

Tallulah S. Andrews

Postdoctoral Fellow

Dr Jimmy Tsz Hang Lee

Senior Data Scientist

Nicholas Lee

PhD Student

Guillermo Parada

PhD Student

Associated research

Tools & software

Tool

Discrete Distributional Differential Expression (D3E)

D3E is a method for identifying differentially expressed genes from single-cell RNA-seq experiments. D3E compares the full distribution between two sample ...

Tool

MPRAnator

A tool for the design of high-throughput massively parallel reporter assays (MPRAs)

Tool

Single-cell Consensus Clustering (SC3)

SC3 is a method for unsupervised clustering of single-cell RNA-seq data. In addition to a graphical user-interface, SC3 provides additional ...

Tool

scRNA-seq analysis course

Teaching material for the Hemberg group's course on computational analysis of single-cell RNA-seq data

Related groups

Science group

Cellular Genomics Informatics

Cellular Genomics

Our team provides efficient access to cutting-edge analysis methods, environments and pipelines for the Cellular Genetics programme, which leads and is involved ...

Science group

Lawniczak Group

Evolutionary genetics

Our research group uses genomics to investigate insect biodiversity and malaria transmission.

Science group

Miska Group

Non-coding RNA and epigenetics

We are interested in all aspects of gene regulation by non-coding RNA. Current research themes include: miRNA biology and pathology, miRNA ...

Science group

Nik-Zainal Group

Signatures of mutagenesis in somatic cells

Until they moved to the University of Cambridge in 2017, the Signatures of mutagenesis in somatic cells group explored patterns of ...

Science group

Reik Group

Epigenetic reprogramming

Single cell epigenomics applied to development and ageing

Wellcome Sanger Institute

Programmes and Facilities

Programme

Open Targets

Founded in 2014, Open Targets is an innovative public-private partnership that uses human genetics and genomics data for systematic drug target ...

Programme

Cellular Genomics

Cellular Genomics is a diverse and inclusive programme that aims to decode and recode the tissue ecosystems of the human body. ...

Partners

We are interested in working in close collaboration with experimentalists, as it provides us with direct access to the people who generated the data and have a deep understanding of the underlying biology. This approach makes it easier for us to develop mathematical models, and it also gives us a better understanding of what type of computational tools are needed. Some of the people that we have worked with in the past or are currently working with are listed below.

External
