Genome-Wide Human SNP Array 6.0 analysis




This array contains over 906,600 SNPs together with 946,000 copy number probes, interrogating over 1.85 million loci in a single experiment. Each SNP on the array is represented by six or eight features: three or four features for each of allele A and allele B. The features for each allele are technical replicates. Of the non-polymorphic copy number probes, 202,000 target known copy number variations and the remaining 744,000 are evenly distributed across the genome. Each of these loci is represented by a single feature. A more detailed description of the array design can be obtained from the Affymetrix website.
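As a quick consistency check on the figures quoted above, the following minimal sketch simply adds up the probe counts. The variable names are illustrative only, and the SNP count is treated as a lower bound because the text says "over 906,600".

```python
# Probe counts as quoted in the text (the SNP count is a lower bound).
snp_probes = 906_600
cn_probes_known_cnv = 202_000   # copy number probes targeting known CNVs
cn_probes_even = 744_000        # copy number probes spread evenly across the genome

cn_probes = cn_probes_known_cnv + cn_probes_even
total_loci = snp_probes + cn_probes

# Three or four replicate features per allele, two alleles per SNP.
features_per_snp = [2 * per_allele for per_allele in (3, 4)]

print(cn_probes)         # 946000
print(total_loci)        # 1852600, i.e. just over 1.85 million
print(features_per_snp)  # [6, 8]
```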


An algorithm, PICNIC (Predicting Integral Copy Numbers In Cancer), has been written specifically for use with the Affymetrix SNP6 data. It provides a more refined analysis than has previously been applied to the Affymetrix 10K data. This includes improved normalisation of the data, together with determination of the underlying copy number for each segment by genome-wide analysis of allele ratio and signal strength data. The data is subsequently rescaled to its predicted underlying integer copy number, plotted, and segmented (it should be noted that rescaling the raw data to the underlying absolute copy number can affect the spread of the data points).


Analysis of the data in this way also allows a genotype to be assigned to each SNP. Because these genotypes are based on the ratio of the two alleles, they can be more complex than the traditional AA, AB and BB assignments, and may include genotypes such as AAB. Regions of loss of heterozygosity (LOH) can also be determined.

Three plots are available for SNP6 data from the CGH Viewer webpage:
  1. Absolute copy number: This plot shows the normalised data (grey dots) for each genomic locus on the array together with segmentation information. The normalised data is rescaled to the underlying copy number with dark blue lines indicating total copy number for each genomic region and light blue giving the predicted copy number of the minor allele. Minor allele values of zero are indicative of loss of heterozygosity (LOH).

  2. Probability: This plot shows the probability of a change in state for copy number, heterozygosity or both.

  3. Genotype intensity: This plot shows the ratio of the two allelic intensities for SNPs on the array. Balanced heterozygotes give a ratio value of 0.5, while homozygous calls give values of ~0.8 (AA) and ~0.2 (BB). Skewed allele ratios can result in up to four bands on the genotype intensity plot. The data is again segmented, with black lines indicating regions of heterozygosity and red lines indicating regions of homozygosity (loss of heterozygosity, LOH).
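The way the total and minor allele copy numbers in the plots above relate to heterozygosity can be sketched in a few lines. This is only an illustration of the rules stated in the plot descriptions, not part of PICNIC itself; the function name `classify_segment` and the label strings are hypothetical.

```python
def classify_segment(total_cn, minor_cn):
    """Label a segment from its total and minor allele copy numbers.

    Hypothetical helper: it only encodes the rule that a minor allele
    copy number of zero indicates loss of heterozygosity (LOH).
    """
    if minor_cn < 0 or 2 * minor_cn > total_cn:
        raise ValueError("minor_cn must lie between 0 and total_cn / 2")
    if total_cn == 0:
        # Both parental alleles lost.
        return "homozygous deletion"
    if minor_cn == 0:
        # Only one parental allele remains, in one or more copies.
        return "LOH"
    return "heterozygous"


print(classify_segment(2, 1))  # heterozygous
print(classify_segment(2, 0))  # LOH
print(classify_segment(0, 0))  # homozygous deletion
```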

For example, the following plot represents chromosome 9 of sample CMK. Four illustrative segments, labelled A, B, C and D, are described below.



Segment A: 0 - 5 Mb, Total copy number (dark blue) 4, Minor copy number (light blue) 2. That is, each parental allele has been duplicated. The state change probability plot indicates the end of the segment. There are three black lines in the genotype intensity plot. SNPs with points near these lines have genotypes AAAA, AABB and BBBB respectively, from the top of the plot to the bottom.


Segment B: 5 - 21 Mb, Total copy number 2, Minor copy number 0. That is, one parental allele has been lost (LOH) and the other has been duplicated. The genotype intensities have two lines, corresponding to genotypes AA and BB.


Segment C: 21 - 23 Mb, Total copy number 0, Minor copy number 0. That is, both parental alleles have been lost resulting in a homozygous deletion. The genotype intensity has a single line at 0.5, resulting from equal signal intensity from both alleles due to background hybridisation.


Segment D: 24 - 27 Mb, Total copy number 6, Minor copy number 2. That is, one parental allele is present in two copies and the other in four copies, giving a total copy number of six. The genotype intensities have four lines, corresponding to genotypes AAAAAA, AAAABB, AABBBB and BBBBBB.
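The genotypes listed for segments A to D follow directly from the total and minor copy numbers. The sketch below enumerates them under a simple assumption that is not spelled out in the text: each SNP was AA, AB or BB in the germline, and each parental chromosome is now present in either the major or the minor number of copies. The function name `genotype_bands` is hypothetical, and this is not the PICNIC genotype caller.

```python
def genotype_bands(total_cn, minor_cn):
    """List the genotypes expected in a segment, most A-rich first.

    Assumes (for illustration) that one parental chromosome is present in
    `total_cn - minor_cn` copies and the other in `minor_cn` copies.
    """
    if minor_cn < 0 or 2 * minor_cn > total_cn:
        raise ValueError("minor_cn must lie between 0 and total_cn / 2")
    if total_cn == 0:
        # Homozygous deletion: no alleles remain, so no genotype is defined.
        return []
    major_cn = total_cn - minor_cn
    # Possible numbers of A alleles: germline AA, A on the major chromosome,
    # A on the minor chromosome, and germline BB. The set removes duplicates.
    a_counts = {total_cn, major_cn, minor_cn, 0}
    return ["A" * a + "B" * (total_cn - a) for a in sorted(a_counts, reverse=True)]


print(genotype_bands(4, 2))  # ['AAAA', 'AABB', 'BBBB']                  (segment A)
print(genotype_bands(2, 0))  # ['AA', 'BB']                              (segment B)
print(genotype_bands(0, 0))  # []                                        (segment C)
print(genotype_bands(6, 2))  # ['AAAAAA', 'AAAABB', 'AABBBB', 'BBBBBB']  (segment D)
```

The number of genotypes returned matches the number of lines described for each segment, except for segment C, where the single line at 0.5 arises from background hybridisation rather than from any genotype.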


A detailed description of the algorithm has been submitted for publication:

Greenman, C.D. et al. (submitted)

Last Modified Tue Sep 16 16:39:11 2008
