CCRaVAT (Case-Control Rare Variant Analysis Tool) & QuTie (Quantitative Trait)

Enabling the analysis of rare variants in large-scale case-control and quantitative trait association studies.

CCRaVAT (Case-Control Rare Variant Analysis Tool) and QuTie (Quantitative Trait) are software packages that enable efficient large-scale analysis of rare variants across specific regions or genome-wide.

These programs implement a rare variant super-locus or collapsing method that investigates the accumulation of rare variant alleles in either a case-control or quantitative trait study design.


Introduction

Recent advances in high-throughput genotyping have made large-scale genetic association studies possible. Genome-wide association scans (GWAS) for complex disease have met with unprecedented success in identifying common susceptibility variants. However, the common single nucleotide polymorphism (SNP) associations discovered so far leave a large proportion of the genetic component of disease unexplained. The field is now turning to low-frequency and rare variants (i.e. those with minor allele frequency (MAF) ≤ 0.05) to find the missing heritability in complex disease etiology (Bodmer and Bonilla, 2008; Manolio et al., 2009). While current sample sizes are large enough for a well-powered GWAS of common variants, they do not provide sufficient power for the single-point analysis of rare variants with small to moderate effect sizes (Morris and Zeggini, 2009).

We have developed rare variant analysis software, CCRaVAT and QuTie, which allows the large-scale analysis of low-MAF polymorphisms by pooling rare variants within defined regions and treating them as a single 'super-locus'. This method helps identify regions in which rare minor alleles are carried by a significantly higher proportion of disease cases than controls (or vice versa), or in which carriers and non-carriers have significantly different quantitative trait means. Collapsing multiple rare minor alleles into a single locus across pre-defined regions (either genes or sliding windows of defined sequence length) can substantially increase power for detecting association (Li and Leal, 2008; Morris and Zeggini, 2009). This approach, implemented in CCRaVAT and QuTie, can be applied to data from the targeted examination of specific regions or at the genome-wide scale.
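The collapsing step described above can be illustrated with a minimal Python sketch. This is not the CCRaVAT or QuTie source code (the tools themselves are Perl scripts); the function names, the genotype coding (minor-allele counts of 0, 1 or 2 per individual) and the toy data are all assumptions made for illustration. Only the MAF ≤ 0.05 cutoff and the idea of sliding windows come from the text.

```python
from typing import Iterator, List, Tuple

# Hypothetical data layout: (base-pair position, minor-allele count per individual)
Variant = Tuple[int, List[int]]


def minor_allele_frequency(genotypes: List[int]) -> float:
    """MAF from minor-allele counts (0, 1 or 2) in diploid individuals."""
    return sum(genotypes) / (2 * len(genotypes))


def sliding_windows(first: int, last: int, size: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) windows of `size` bases, advancing by `step`."""
    start = first
    while start <= last:
        yield start, min(start + size - 1, last)
        start += step


def collapse_region(variants: List[Variant], start: int, end: int,
                    maf_threshold: float = 0.05) -> List[int]:
    """Return 1 for each individual carrying at least one rare minor allele
    inside [start, end], and 0 otherwise (the 'super-locus' indicator)."""
    if not variants:
        return []
    carriers = [0] * len(variants[0][1])
    for position, genotypes in variants:
        if not (start <= position <= end):
            continue  # variant lies outside this region
        maf = minor_allele_frequency(genotypes)
        if 0 < maf <= maf_threshold:  # keep only rare, polymorphic variants
            for i, count in enumerate(genotypes):
                if count > 0:
                    carriers[i] = 1
    return carriers


# Toy data for 12 individuals (invented for this example)
variants = [
    (150,  [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),  # rare
    (420,  [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]),  # rare
    (610,  [1, 1, 0, 2, 1, 0, 1, 0, 1, 1, 0, 1]),  # common, so ignored
    (1300, [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]),  # rare
]

for start, end in sliding_windows(1, 2000, size=1000, step=500):
    print(start, end, collapse_region(variants, start, end))
```

Each window yields one carrier indicator per individual, so the downstream test sees a single binary 'locus' per region regardless of how many rare variants the region contains.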

The statistical properties of the rare variant super-locus (collapsing) method that we have implemented are described by Li and Leal (2008) and Morris and Zeggini (2009). The first step is to define the regions in which rare variant minor alleles are collapsed. These chromosomal regions can be either sliding windows of predefined length or genic regions, defined by intervals on either side of the transcriptional start and stop sites of genes. CCRaVAT and QuTie use the same approach up to this point; they differ in the study design analyzed and in how significance is determined.

CCRaVAT analyzes case-control data. For each region it constructs a 2x2 contingency table of the presence or absence of rare variant minor alleles in cases and controls. Differences in the proportion of cases and controls carrying rare variant minor alleles are tested using Pearson's chi-squared test, or Fisher's exact test when cell counts are small. CCRaVAT also allows users to generate empirical p values by permuting case-control status a predefined number of times and repeating the analysis for each replicate.

QuTie analyzes quantitative traits. Within each defined region it compares the trait mean of individuals carrying at least one rare variant minor allele with that of individuals carrying none. The quantitative trait values in the two groups are compared using linear regression and a Student's t-test.
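The tests described above can be sketched in Python using only the standard library. This is an illustrative sketch, not the programs' actual code: all function names are hypothetical, the rule of switching to Fisher's exact test when any expected cell count falls below 5 is an assumed convention (the text only says "when cell counts are small"), and the add-one form of the empirical p value is likewise a choice made here. The quantitative-trait function returns the pooled-variance t statistic, which equals the t statistic for the slope when the trait is regressed on a binary carrier indicator; the p value lookup is omitted because the standard library has no t distribution.

```python
import math
import random
from typing import List, Optional, Tuple


def carrier_table(carriers: List[int], is_case: List[bool]) -> Tuple[int, int, int, int]:
    """2x2 counts: (case carriers, case non-carriers, control carriers, control non-carriers)."""
    a = sum(1 for c, k in zip(carriers, is_case) if k and c)
    b = sum(1 for c, k in zip(carriers, is_case) if k and not c)
    c_ = sum(1 for c, k in zip(carriers, is_case) if not k and c)
    d = sum(1 for c, k in zip(carriers, is_case) if not k and not c)
    return a, b, c_, d


def pearson_chi2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-squared statistic (no continuity correction) and its 1-df p value."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:  # an empty row or column: nothing to test
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / margins
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-squared, 1 df


def fisher_exact(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p value: sum over tables no more likely than the observed one."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x: int) -> float:  # hypergeometric probability of x case carriers
        return math.comb(col1, x) * math.comb(n - col1, row1 - x) / math.comb(n, row1)

    observed = prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    return min(1.0, sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= observed * (1 + 1e-9)))


def region_p_value(carriers: List[int], is_case: List[bool], min_expected: float = 5.0) -> float:
    """Chi-squared p value, or Fisher's exact when an expected count is below min_expected (assumed rule)."""
    a, b, c, d = carrier_table(carriers, is_case)
    n = a + b + c + d
    if n == 0:
        return 1.0
    smallest = min(r * k / n for r in (a + b, c + d) for k in (a + c, b + d))
    return fisher_exact(a, b, c, d) if smallest < min_expected else pearson_chi2(a, b, c, d)[1]


def empirical_p_value(carriers: List[int], is_case: List[bool], n_perm: int = 1000, seed: int = 0) -> float:
    """Permute case-control labels n_perm times and count replicates at least as extreme."""
    rng = random.Random(seed)
    observed = region_p_value(carriers, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if region_p_value(carriers, labels) <= observed * (1 + 1e-9):
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids a p value of zero


def carrier_t_statistic(carriers: List[int], trait: List[float]) -> Optional[Tuple[float, int]]:
    """Pooled-variance t statistic and degrees of freedom for carriers vs non-carriers."""
    g1 = [y for c, y in zip(carriers, trait) if c]
    g0 = [y for c, y in zip(carriers, trait) if not c]
    n1, n0 = len(g1), len(g0)
    if n1 == 0 or n0 == 0 or n1 + n0 < 3:
        return None  # not enough data in one of the groups
    m1, m0 = sum(g1) / n1, sum(g0) / n0
    pooled = (sum((y - m1) ** 2 for y in g1) + sum((y - m0) ** 2 for y in g0)) / (n1 + n0 - 2)
    if pooled == 0:
        return None  # no within-group variation
    return (m1 - m0) / math.sqrt(pooled * (1 / n1 + 1 / n0)), n1 + n0 - 2


# Toy data (invented): 10 cases with 6 carriers, then 10 controls with 1 carrier
carriers = [1] * 6 + [0] * 4 + [1] + [0] * 9
is_case = [True] * 10 + [False] * 10
print(carrier_table(carriers, is_case))  # (6, 4, 1, 9)
print(region_p_value(carriers, is_case), empirical_p_value(carriers, is_case, n_perm=200))
```

In this toy table the smallest expected count is 3.5, so the sketch falls back to Fisher's exact test, which is the situation rare variant data will often produce in practice.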

Downloads

System Requirements

The software should run on any UNIX or GNU/Linux system with Perl 5.8 or higher.

Download CCRaVAT & QuTie (FTP)

After downloading the tar archive from the above link, execute the following commands from the Unix command line:

:> tar -xvzf rare_variant_analysis_software.tar.gz
:> unzip genes-b35.zip
:> unzip genes-b36.zip

License

CCRaVAT and QuTie are free software, distributed under the terms of the GNU General Public License.

Citation

  • CCRaVAT and QuTie - enabling analysis of rare variants in large-scale case control and quantitative trait association studies.

    Lawrence R, Day-Williams AG, Elliott KS, Morris AP and Zeggini E

    BMC bioinformatics 2010;11;527

Selected publications

  • An evaluation of statistical approaches to rare variant analysis in genetic association studies.

    Morris AP and Zeggini E

    Genetic epidemiology 2010;34;2;188-93

  • Finding the missing heritability of complex diseases.

    Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A, Cho JH, Guttmacher AE, Kong A, Kruglyak L, Mardis E, Rotimi CN, Slatkin M, Valle D, Whittemore AS, Boehnke M, Clark AG, Eichler EE, Gibson G, Haines JL, Mackay TF, McCarroll SA and Visscher PM

    Nature 2009;461;7265;747-53

  • Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data.

    Li B and Leal SM

    American journal of human genetics 2008;83;3;311-21

  • Common and rare variants in multifactorial susceptibility to common diseases.

    Bodmer W and Bonilla C

    Nature genetics 2008;40;6;695-701

Contact

If you have any problems running CCRaVAT or QuTie, please contact either Robert Lawrence or Eleftheria Zeggini.

For bugs or problems running the scripts, the recommended contact is the author of CCRaVAT and QuTie, Robert Lawrence.
