Browse data
Browse Data by Individual
How to browse a sample
Choose the sample, the chromosome, the genome assembly and the browser that you would like to use, then click on "Browse".
You will be redirected to the first 1 Mb of the chosen chromosome.
UCSC
The UCSC browser will show up to four tracks:
- The clones on the WGTP array (Track "Sanger_WGTP_clones")
- The log2ratio of copy number at each clone (Track "DyeSwap Ratios for ...")
- The clones called as losses relative to the reference DNA (Track "Loss calls ...")
- The clones called as gains relative to the reference DNA (Track "Gain calls ...")
To visualise the distribution of log2ratios across all 270 HapMap samples for a given clone, click on that clone in the track "Sanger_WGTP_clones" to open its details page, then click on the link entitled "Outside Link".
For help with using the UCSC browser, please visit UCSC help.
Ensembl
Choosing to display the CNVs for an individual in the Ensembl browser will show four tracks:
- The clones on the WGTP array (Track "Sanger_WGTP_clones")
- A histogram track of the probe/clone intensity log2ratios.
- The clones called as losses relative to the reference DNA (Track "cnvs_loss_NAxxxxx")
- The clones called as gains relative to the reference DNA (Track "cnvs_gain_NAxxxxx")
- By clicking on any clone feature in the Sanger_WGTP_clones track, you can follow a link (click 'details') to view the distribution of log2ratio scores for that clone across the whole dataset of 270 HapMap individuals.
Note that due to software issues with the Ensembl website, when viewing CNVs on the NCBI35 assembly the histogram track of log2ratios will appear as a track of vertical bars. When you mouse over these the log2ratio will be displayed as a tool tip. Ideally, use the NCBI36 assembly display as it is much more illustrative.
For help with using the Ensembl browser, please visit Ensembl help.
Browse Copy Number Variable regions identified within the 270 HapMap samples
Choose one of the following HapMap CNV sets, then the genome assembly and the browser:
- WGTP_CNVs - CNVs identified using the WGTP array among all 270 HapMap samples
- 500KEA_CNVs - CNVs identified using the Affymetrix 500k GeneChip Early Access arrays among all 270 HapMap samples
- Redon_CNVs - merged CNVs from both the 500KEA and WGTP platforms for the 270 HapMap samples
Data download
Data access summary
- Gene Expression Omnibus - access raw data from the 500K EA as well as the 500K commercial arrays, with accession numbers GSE5013 and GSE5173
- ArrayExpress - access WGTP data, with accession number E-TABM-107
- Database of Genomic Variants - access CNV calls integrated with all other CNV data
The data can be downloaded in four formats:
- A single text file containing the mean dye-swap intensity for each clone in each individual
- Raw extracted intensities for each image (BlueFuse format) in Excel-compatible text files
- Normalized extracted intensities, with low-intensity spots removed and log2ratios calculated
- Some sample image files for dye-swap experiment and a mapping file for use with the image files
Download data
The data were generated using the WGTP array with dye-swap for 269 HapMap individuals, using a single male reference: HapMap individual NA10851.
WGTP Intensities Data
You can download all the pre- and post-processed intensities for all 269 individuals:
Or you can download the intensities for an individual of interest:
Download WGTP Individual Sample CNVs
Download Copy Number Variable regions identified within the 270 HapMap samples
To browse data within a genome browser, please visit the 'Browse Data' tab.
Related files
Validation data
Two types of experiments are available: replicates and add_in experiments.
Their pre-processed and post-processed intensities (as defined in the data description) are fully available. Only a sample of the raw scanner image files is available; to download other raw scanner images, please contact Matthew Hurles.
Validation data:
All the validation data files are in the same format as those provided for the WGTP array.
| Pre- and post-processed intensity data | | | |
|---|---|---|---|
| Replicates experiments | All replicates data (~322 MB) | FTP | |
| Add_in experiments | Human chromosome validation (~380 MB) | Hamster, Mouse & self experiments validation (~118 MB) | FTP |
| Replicates Raw Images | |||||
|---|---|---|---|---|---|
| NA15510A | reference_red | reference_green | sample_green | sample_red | FTP |
| NA12144A | reference_red | reference_green | sample_green | sample_red | FTP |
| self_A | reference_red | reference_green | sample_green | sample_red | FTP |
| Add_in experiments Raw Images | ||
|---|---|---|
| chrom10 | FTP | |
| chrom10 exp 1489-16 | chr10_red_1489-16 | chr10_green_1489-16 |
| chrom10 exp 1489-17 | chr10_red_1489-17 | chr10_green_1489-17 |
| chrom10 exp 1489-18 | chr10_red_1489-18 | chr10_green_1489-18 |
| chrom10 exp 1489-19 | chr10_red_1489-19 | chr10_green_1489-19 |
| chrom11 | FTP | |
| chrom11 exp 1489-20 | chr11_red_1489-20 | chr11_green_1489-20 |
| chrom11 exp 1489-21 | chr11_red_1489-21 | chr11_green_1489-21 |
| chrom11 exp 1489-22 | chr11_red_1489-22 | chr11_green_1489-22 |
| chrom11 exp 1489-23 | chr11_red_1489-23 | chr11_green_1489-23 |
Data release
Sample
The sample is the HapMap individual that was used on the WGTP array.
Mapping files
The two GAL files provide the mapping for the WGTP array.
Use WGTP_array_map1.gal for experiments run up to 23/09/2005 and WGTP_array_map2.gal for experiments run from 27/09/2005 onwards.
The text file WGTP_clone_map_NCBIxx.txt contains the mapping of the human clones (for both the NCBI35 (May 2004) and NCBI36 (March 2006) assemblies).
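The choice of GAL file by experiment date can be sketched as below. This is only an illustration: the function name `gal_file_for` and the constants are hypothetical, and because the text gives no rule for dates between 23/09/2005 and 27/09/2005, the sketch refuses to guess for that gap.

```python
from datetime import date

# Cut-off dates taken from the text above (DD/MM/YYYY in the original).
MAP1_LAST_DAY = date(2005, 9, 23)
MAP2_FIRST_DAY = date(2005, 9, 27)


def gal_file_for(experiment_date: date) -> str:
    """Return the GAL file name to use for an experiment run on the given date."""
    if experiment_date <= MAP1_LAST_DAY:
        return "WGTP_array_map1.gal"
    if experiment_date >= MAP2_FIRST_DAY:
        return "WGTP_array_map2.gal"
    # Assumption: dates in the gap are not covered by either file.
    raise ValueError(f"No GAL file is documented for {experiment_date.isoformat()}")
```

For example, `gal_file_for(date(2005, 10, 1))` selects the second map file.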
Log2ratio intensities for the 269 HapMap individuals
This single text file contains the mean dye-swap intensity for each clone in each individual.
All experimental artefacts, as provided in this list, have been removed from this file.
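As an illustration of what a mean dye-swap value might look like, the sketch below combines the two hybridisations of a dye-swap pair. It assumes the common convention that the dyes are exchanged in the second hybridisation, so its log2 ratio is sign-flipped before averaging. The exact formula used for this file is not stated here, and the function names are hypothetical.

```python
import math


def log2_ratio(channel1: float, channel2: float) -> float:
    """log2 of the channel 1 intensity over the channel 2 intensity."""
    return math.log2(channel1 / channel2)


def mean_dye_swap_log2ratio(forward: float, swapped: float) -> float:
    """Average a dye-swap pair of log2 ratios for one clone.

    forward: log2(channel1 / channel2) with the sample in channel 1.
    swapped: log2(channel1 / channel2) with the sample in channel 2,
             so its sign is inverted before averaging (assumed convention).
    """
    return (forward - swapped) / 2.0


# A clone with twice the reference signal in both hybridisations.
forward = log2_ratio(200.0, 100.0)   # sample / reference
swapped = log2_ratio(100.0, 200.0)   # reference / sample
combined = mean_dye_swap_log2ratio(forward, swapped)
```

Averaging the pair in this way cancels dye-specific bias that affects both hybridisations in the same direction.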
Pre-processed (raw) intensities (from BlueFuse)
For each individual there are 2 files: one per intensity signal (red, green).
Fluorescence intensities and log2 ratio values were extracted from the scanner (raw) images using the BlueFuse software (BlueGnome Ltd).
Post-processed intensities (from BlueFuse)
For each individual there are 2 files: one per intensity signal (red, green).
These Excel files are derived from the raw-intensity Excel files by post-processing in the BlueFuse software.
Post-processing consisted of the following steps:
- Any spot giving low signal intensities ("amplitude"<100 in both channels) or inconsistent fluorescence patterns ("confidence" < 0.5 or "quality" = 0) was excluded from further analysis.
- Log2 ratio values were then normalised by median block values, again using BlueFuse.
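The two post-processing steps above can be sketched as follows. This is a minimal illustration, not BlueFuse's implementation: the `Spot` record and its field names are hypothetical, and subtracting each block's median log2 ratio is an assumption about what normalisation "by median block values" means.

```python
from collections import defaultdict
from dataclasses import dataclass, replace
from statistics import median


@dataclass(frozen=True)
class Spot:
    """Hypothetical per-spot record; field names are illustrative only."""
    clone: str
    block: int
    amp_ch1: float      # "amplitude" in channel 1
    amp_ch2: float      # "amplitude" in channel 2
    confidence: float
    quality: int
    log2ratio: float


def passes_filter(spot: Spot) -> bool:
    """Apply the exclusion thresholds quoted in the text."""
    low_signal = spot.amp_ch1 < 100 and spot.amp_ch2 < 100
    inconsistent = spot.confidence < 0.5 or spot.quality == 0
    return not (low_signal or inconsistent)


def normalise_by_block_median(spots: list[Spot]) -> list[Spot]:
    """Drop excluded spots, then centre each block on its median log2 ratio."""
    kept = [s for s in spots if passes_filter(s)]
    ratios_by_block = defaultdict(list)
    for s in kept:
        ratios_by_block[s.block].append(s.log2ratio)
    block_median = {b: median(r) for b, r in ratios_by_block.items()}
    # Assumption: normalisation subtracts the block median from each spot.
    return [replace(s, log2ratio=s.log2ratio - block_median[s.block]) for s in kept]
```

Note that a spot is treated as low signal only when both channels fall below 100, matching the "in both channels" wording.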
Raw images green/red
For each individual there are 2 files: one per intensity signal (red, green).
These images are the raw output from the laser scanner (Agilent Technologies) after the experiment.
Examples of raw scanner WGTP image files:
| Sample | Raw scanner images | | | | Directory |
|---|---|---|---|---|---|
| NA07019 | sample_green | reference_red | sample_red | reference_green | FTP |
| NA07022 | sample_green | reference_red | sample_red | reference_green | FTP |
| NA07029 | sample_green | reference_red | sample_red | reference_green | FTP |
| NA12057 | reference_red | sample_green | reference_green | sample_red | FTP |
| NA18500 | reference_red | sample_green | reference_green | sample_red | FTP |
| NA18501 | reference_red | sample_green | reference_green | sample_red | FTP |
| NA18502 | reference_red | sample_green | reference_green | sample_red | FTP |
| NA18547 | reference_red | sample_green | reference_green | sample_red | FTP |
| NA18558 | reference_red | sample_green | reference_green | sample_red | FTP |
| NA18555 | reference_red | sample_green | reference_green | sample_red | FTP |
To download other image files, please contact Matthew Hurles.
Validation data
All the validation data files are in the same format as those provided for the WGTP array.
Contact
This project was a collaborative effort of the groups of:
- Nigel Carter
- Matthew Hurles
- Chris Tyler-Smith
- Charles Lee, Harvard Medical School
- Steve Scherer, Hospital for Sick Children
Acknowledgements
Nigel Carter, Richard Redon, Heike Fiegler, Lyndal Montgomery, Matthew Hurles, Chris Tyler-Smith, Tatiana Zerjal, Daniel Andrews, Armand Valsesia, Fengtang Yang, Dimitrios Kalaitzopoulos, Charles Lee, Steve Scherer
Funding was provided by the Wellcome Trust.
Publication
- Global variation in copy number in the human genome.
Nature 2006;444;7118;444-54
PUBMED: 17122850; PMC: 2669898; DOI: 10.1038/nature05329


