Functional sequences
Detecting human functional sequences with microarrays
We were inspired by recent work in our laboratories using microarrays to study DNA copy number (Fiegler et al. 2003), replication timing and chromatin modifications at a range of resolutions: from 400 bp in a ~200 kb pilot region, through ~75 kb across the q arm of chromosome 22, to 1 Mb across the human genome. We aim to contribute microarray-based approaches to the ENCODE consortium to provide experimental evidence of DNA elements involved in gene regulation and replication, as well as the status of chromatin, across the pilot 1% of the genome. Specifically, we are:
- Developing two sets of genomic microarrays covering the 1% of the genome targeted in the ENCODE project. The first is a low-resolution microarray of genomic clones (predominantly BACs, but also PACs, cosmids and fosmids) taken from the genomic sequence tile path. The second is an array of 22,000 PCR fragments of 1.25 kb each, designed from the DNA sequence and covering ~85% of the targeted regions.
- Using these microarrays to assay DNA samples enriched for sequences involved in specific biological processes and functions, by methods including flow-sorting, pulse-labeling and chromatin immunoprecipitation (ChIP), to develop maps at genomic-clone and 1.25 kb resolution of the following:
- Replication timing
- Replication origins
- DNA methylation
- Modified histones/active and inactive chromatin
- Transcription factor binding sites
We will correlate these maps with genomic DNA features including C+G content, genes/exons, repeat elements, and SNP density. In addition, we will correlate the elements we map with regions of conserved DNA sequence identified by the multi-species comparative sequencing being undertaken in the laboratory of Eric Green, and with maps of transcriptional activity generated as part of the consortium.
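As an illustration of correlating an array-derived map with a sequence feature, the sketch below computes C+G content in consecutive 1.25 kb tiles and its Pearson correlation with a per-tile signal. It is a minimal example under stated assumptions, not the project's analysis pipeline: the function names, the non-overlapping tiling and the toy values are all hypothetical.

```python
from math import sqrt


def gc_fraction(seq):
    """Fraction of C and G among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0


def tile_gc(seq, tile_size=1250):
    """C+G fraction for consecutive, non-overlapping tiles (assumed layout)."""
    return [gc_fraction(seq[i:i + tile_size])
            for i in range(0, len(seq), tile_size)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Toy 60 bp "region" split into 20 bp tiles so the result is easy to check.
region = "GC" * 10 + "AT" * 10 + "GCAT" * 5
gc_per_tile = tile_gc(region, tile_size=20)   # [1.0, 0.0, 0.5]

# Hypothetical per-tile array signal (for example, a log ratio per fragment).
signal = [2.0, 0.5, 1.2]
r = pearson(gc_per_tile, signal)
```

The same `pearson` helper could be applied to any other per-tile feature, such as exon or repeat coverage, provided the feature is summarised on the same tiles as the array signal.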
Team
- Ian Dunham PI
- David Vetrie Co-PI
- Nigel Carter Co-PI
References
- A genome annotation-driven approach to cloning the human ORFeome. Genome biology 2004;5;10;R84. PUBMED: 15461802; PMC: 545604; DOI: 10.1186/gb-2004-5-10-r84
- DNA microarrays for comparative genomic hybridization based on DOP-PCR amplification of BAC and PAC clones. Genes, chromosomes & cancer 2003;36;4;361-74. PUBMED: 12619160; DOI: 10.1002/gcc.10155
- Reevaluating human gene annotation: a second-generation analysis of chromosome 22. Genome research 2003;13;1;27-36. PUBMED: 12529303; PMC: 430954; DOI: 10.1101/gr.695703
Genetic variation
Identification of functionally variable regulatory regions in the human genome
One of the main reasons to annotate the human genome is to interpret the phenotypic consequences of genetic variation within functional genomic regions. We are using a novel approach for the selective identification of functionally variable regulatory sequences of the human genome. We are detecting correlations between variation in gene expression and nucleotide polymorphisms near those genes to identify regulatory regions and their variants that contribute to gene expression variation. This approach uses naturally occurring genomic variation (nucleotide polymorphism) and phenotypic variation (transcript levels) to detect significant associations (Figure 1). Polymorphisms associated with phenotypic variation will likely be in linkage disequilibrium with functional regulatory polymorphisms nearby, thereby identifying segments of the genome containing sequences that regulate gene expression.
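The association idea can be sketched as a regression of transcript level on genotype at a nearby SNP. The text does not specify which statistical test is used, so the example below is an assumption: it codes genotypes as minor-allele counts (0, 1, 2), fits a least-squares line, and estimates significance by permutation. All names and data values are hypothetical.

```python
import random


def linear_fit(x, y):
    """Least-squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("genotypes and expression must both vary")
    slope = sxy / sxx
    return slope, my - slope * mx, (sxy * sxy) / (sxx * syy)


def permutation_p_value(genotypes, expression, n_perm=1000, seed=0):
    """Share of shuffled datasets whose r^2 is at least the observed r^2."""
    rng = random.Random(seed)           # fixed seed for reproducibility
    observed = linear_fit(genotypes, expression)[2]
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)           # break any genotype-expression link
        if linear_fit(genotypes, shuffled)[2] >= observed - 1e-12:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: nine individuals, one SNP, one gene.
genotypes = [0, 0, 0, 1, 1, 1, 2, 2, 2]               # minor-allele counts
expression = [1.0, 1.1, 0.9, 1.5, 1.6, 1.4, 2.0, 2.1, 1.9]

slope, intercept, r_squared = linear_fit(genotypes, expression)
p_value = permutation_p_value(genotypes, expression)
```

In a screen over many genes and SNPs, the resulting p-values would need correction for multiple testing. A significant SNP marks a genomic segment rather than the causal variant itself, since it may only be in linkage disequilibrium with the functional polymorphism.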
Our experimental design is to use Illumina technology both to screen for gene expression variation and to genotype relevant SNPs for the association analysis. We have designed an Illumina bead array that contains approximately 350 genes from the ENCODE regions, all the human chromosome 21 genes and 100 genes from a 10 Mb genomic region of human chromosome 20. An example of a hybridized array is shown in Figure 2. The technology is highly sensitive and accurate. Figure 3a shows the regression of two replicates from the same RNA pool, and Figure 3b the regression of two different individuals. Note the wider spread in Figure 3b, which reflects differences in transcript levels between the two individuals.
We view this project as readily scalable to a whole human genome screen for gene expression variation and association with nucleotide polymorphism.
It will provide three types of information:
- Genomic regions that contain variable regulatory polymorphisms
- Structure of regulatory variation in the human genome and determination of how it is associated with disease susceptibility
- Large dataset of genes that exhibit variation of expression within populations, in a manner similar to the way the HapMap project will provide the haplotype structure of the human genome
Team
- Manolis Dermitzakis PI
- Panos Deloukas Co-PI
- Stylianos E. Antonarakis, University of Geneva Co-PI
- Andrew G. Clark, Cornell University Co-PI
Data access
ENCODE - Data Access (pilot phase)
Parameter key
Amount of antibody in assay (µg) : Formaldehyde concentration (%) : Cross-linking time (minutes), e.g. 5 : 1 : 10
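For anyone handling these keys programmatically, the sketch below parses a colon-separated key into named fields. It assumes that "5 : 1 : 10" is an example value meaning 5 µg of antibody, 1% formaldehyde and 10 minutes of cross-linking. The class and field names are hypothetical and are not part of any ENCODE data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipParameters:
    """ChIP assay conditions encoded by a parameter key (hypothetical names)."""
    antibody_ug: float          # amount of antibody in assay, micrograms
    formaldehyde_pct: float     # formaldehyde concentration, percent
    crosslink_minutes: float    # cross-linking time, minutes


def parse_parameter_key(key):
    """Parse 'antibody : formaldehyde : time' into a ChipParameters object."""
    parts = [p.strip() for p in key.split(":")]
    if len(parts) != 3:
        raise ValueError(f"expected three colon-separated values, got {key!r}")
    # float() raises ValueError on non-numeric fields, which is what we want.
    antibody, formaldehyde, minutes = (float(p) for p in parts)
    return ChipParameters(antibody, formaldehyde, minutes)


params = parse_parameter_key("5 : 1 : 10")
```

Values are stored as floats so that fractional amounts, if they occur in the data, parse without a special case.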
Contributors
Large-scale data contributors
Contact
- Detecting human functional sequences with microarrays: Nigel Carter
- Identification of functionally variable regulatory regions in the human genome: Panos Deloukas





