Genevar (GENe Expression VARiation)

Genevar is a database and web-services platform designed for the integrative analysis and visualization of SNP-gene associations in eQTL (expression quantitative trait loci) studies.

Through an interactive Java interface, Genevar allows researchers to investigate eQTL association patterns within a genetic region of interest instantly. The database can be installed on a standard computer in database mode to work with unpublished data, or on a secure server to share discoveries with affiliated groups or the broader community via web-services protocols.

The default server at the Sanger Institute contains sequence variation and gene expression profiling data from the following datasets (PubMed IDs in parentheses):

  • Three tissue types (adipose, LCL and skin) collected from 856 healthy female twins of the MuTHER resource (PMID: 22941192);
  • Lymphoblastoid cell lines from 726 HapMap3 CEU, CHB, GIH, JPT, LWK, MEX, MKK and YRI individuals (PMID: 22532805);
  • Three tissue types (adipose, LCL and skin) derived from a subset of ~160 MuTHER healthy female twins (PMID: 21304890);
  • Three cell types (fibroblast, LCL and T-cell) derived from umbilical cords of 75 Geneva GenCord individuals (PMID: 19644074).

[Genome Research Limited]

Featured

cis-eQTL - Gene

Gene-centric analysis (cis-eQTL - Gene)

Identify eQTLs centred on the gene of interest across tissues/populations:

  • Observed eQTLs are visualized in regional plots where a dotted line represents the user-defined P value threshold
  • Significant eSNPs are listed in a table and can be exported to tab-delimited output files
  • Regional plots can be saved (right-click) as PNG images for further presentation
  • Query results are organized in the tree node structure and can be easily browsed
  • External links to the three major genome browsers
  • Quick Start: cis-eQTL - Gene (new to Genevar?)
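The thresholding and export steps listed above can be sketched in a few lines. This is a minimal illustration, not Genevar's own code: the record layout, the output column names, the SNP rs0003 and all P values are hypothetical, and treating a P value equal to the threshold as significant is an assumption.

```python
import csv
import io
import math


def significant_esnps(associations, p_threshold):
    """Keep (snp_id, position, p_value) records at or below the threshold,
    sorted from most to least significant."""
    hits = [record for record in associations if record[2] <= p_threshold]
    return sorted(hits, key=lambda record: record[2])


def to_tab_delimited(hits):
    """Render the hits as tab-delimited text with a header row.

    The minus_log10_P column is the quantity usually drawn on the y-axis of a
    regional plot; it is undefined for a P value of exactly 0.
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["SNP_ID", "Position", "P_value", "minus_log10_P"])
    for snp_id, position, p_value in hits:
        writer.writerow([snp_id, position, p_value, round(-math.log10(p_value), 3)])
    return out.getvalue()


# Hypothetical association results for one gene.
associations = [
    ("rs0001", 12345, 2.0e-6),
    ("rs0002", 56789, 0.03),
    ("rs0003", 60000, 4.0e-4),
]
hits = significant_esnps(associations, p_threshold=0.001)
print(to_tab_delimited(hits))
```

On a regional plot, the dotted threshold line would sit at -log10 of the chosen threshold, so every record kept here would appear on or above it.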

cis-eQTL - SNP

SNP-centric analysis (cis-eQTL - SNP)

Investigate SNP-gene associations surrounding eSNPs/lead SNPs among tissues/populations:

  • Observed eQTLs are displayed in line charts where a dotted line represents the P value threshold
  • Significant genes located within the query region will be shown in a table
  • Regional plots can be saved as PNG format images
  • Query results are organized in the tree node structure and can be easily browsed
  • Quick Start: cis-eQTL - SNP

eQTL - SNP-Gene

SNP-gene association analysis (eQTL - SNP-Gene)

Illustrate SNP-probe associations across cell types/populations:

  • Individuals are plotted on strip charts, with the observed and permutation P values shown above each plot
  • SNP-probe association plots can be saved (right-click) as PNG format images
  • Quick Start: eQTL - SNP-Gene
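To make the idea of a permutation P value concrete, here is a minimal sketch. It assumes, purely for illustration, that the association statistic is Spearman's rank correlation between genotype allele counts (0, 1 or 2) and probe expression; this page does not state which statistic Genevar uses, and the function names and example values are hypothetical.

```python
import random


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_p(genotypes, expression, n_perm=1000, seed=0):
    """Empirical two-sided P value: shuffle expression values across
    individuals and count how often |rho| reaches the observed |rho|."""
    rng = random.Random(seed)
    observed = abs(spearman(genotypes, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(genotypes, shuffled)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data for six individuals.
genotypes = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.1, 3.0, 3.3]
print(spearman(genotypes, expression), permutation_p(genotypes, expression))
```

Shuffling the expression values breaks any real link between genotype and expression while keeping both distributions intact, which is why the shuffled statistics approximate the null distribution.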

Launch application

Multiple web-services connections

Genevar web-services mode

Genevar 3.2.0 Launch

  • Java Web Start will automatically download and install the latest version of Genevar when you click on launch.
  • After launching, Genevar starts in web-services mode, connected to the Sanger Institute server.
  • There is no need to install a local database or load any datasets prior to your analysis in this mode.

Connecting to multiple services

  • This system design allows users to switch between different servers on the same interface (see figure).
  • Separate, password-protected servers enable researchers to share unpublished data among collaborators in a secure fashion.

System requirements

  • Java SE 6 or above is required; download Java if it is not already installed.
  • Genevar is a cross-platform application and has been tested successfully on Linux, Mac and Windows machines.

License

Genevar and the database are freely available under the terms of the GNU Lesser General Public License.

Database installation

Genevar local database mode

  • Users are able to load, manage and analyze their studies locally in this mode.
  • This system design allows users to switch between public servers and a local database on the same interface (see figure).
  • Genevar can run completely offline in this mode as there is no communication between the interface and our servers.

Installing MySQL on Windows

The easiest way to run the Genevar database is to download the noinstall, portable package (with empty template tables):

  • Download the compressed file genevar-mysql-noinstall-5.1.46-win32.tar.gz and extract it into any directory you prefer (the package is portable);
  • Double-click startup.bat in the package directory to start your MySQL instance;
  • Connect to your MySQL database based on the version downloaded (see figure); and
  • Double-click shutdown.bat to stop MySQL when you have finished your analysis.

Installing MySQL on Windows, Mac or Linux from scratch

If MySQL is already installed, or if you are not using Windows:

  • Download a suitable package from MySQL Downloads and install MySQL (see the installation guidance in the online manual);
  • Change to the MySQL installation directory (represented by BASEDIR) and start your MySQL instance:
    shell> cd BASEDIR
    shell> bin/mysqld --standalone 
    
  • Create the Genevar database and the user tpy (password ypt), then grant the SELECT, INSERT, UPDATE, CREATE, DROP and ALTER privileges to tpy:
    shell> bin/mysql -u root -p
    mysql> CREATE DATABASE tpy_team16_genevar_2_0_0;
    mysql> CREATE USER 'tpy'@'localhost' IDENTIFIED BY 'ypt';
    mysql> GRANT SELECT, INSERT, UPDATE, CREATE, DROP, ALTER ON tpy_team16_genevar_2_0_0.* TO 'tpy'@'localhost';
    
  • Download the backup file (with empty template tables) tpy_team16_genevar_2_0_0.sql.tar.gz and extract it into BASEDIR;
  • Restore the Genevar database:
    shell> bin/mysql -u tpy -p tpy_team16_genevar_2_0_0 < tpy_team16_genevar_2_0_0_20100519.sql 
    
  • Stop MySQL when you have finished your analysis:
    shell> bin/mysqladmin -u root -p shutdown
    

File formats

Users are able to load, manage and analyze their studies locally in the local database mode

Genevar local database mode currently supports loading data in the following formats.

Expression profiling data

Illumina HumanRef-8 v3, HumanWG-6 v2, v3 and HumanHT-12 v3 BeadChips:

  • TXT example (tab-delimited; including header)
    PROBE_ID         SMP001    SMP002
    ILMN_0000001     7.26      7.1
    ILMN_0000002     12.73     11.58
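
Before uploading, it can help to check that an expression file really has this shape. The following is a minimal sketch of such a check, using the example rows above; the function name and the returned structure are hypothetical and are not part of Genevar.

```python
import csv
import io


def read_expression(handle):
    """Parse a tab-delimited expression matrix with a header row.

    Returns (samples, table), where table maps probe ID to
    {sample ID: expression value}.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]
    table = {}
    for line_number, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(
                "line %d has %d columns, expected %d"
                % (line_number, len(row), len(header))
            )
        table[row[0]] = {s: float(v) for s, v in zip(samples, row[1:])}
    return samples, table


example = (
    "PROBE_ID\tSMP001\tSMP002\n"
    "ILMN_0000001\t7.26\t7.1\n"
    "ILMN_0000002\t12.73\t11.58\n"
)
samples, table = read_expression(io.StringIO(example))
print(samples, table["ILMN_0000001"])
```

For a real file, pass an open file handle instead of the `io.StringIO` wrapper. A non-numeric cell raises `ValueError` from `float`, which points at the kind of problem that would otherwise only surface after upload.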
    

Genotype data

PLINK formats:

  • BIM example (not MAP; how to create *.bim)
    7     rs0001     0     12345     A     T
    7     rs0002     0     56789     C     G
    
  • PED example
    FAM001     SMP001     0     0     0     0     A A     G G
    FAM002     SMP002     0     0     0     0     A T     C G
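
As a rough illustration of how the two PLINK files fit together, the sketch below reads the example BIM and PED lines and counts, for each SNP and sample, the copies of the second allele listed in the BIM file. The function names are hypothetical, counting the second allele is an arbitrary choice, and returning None for calls containing any other symbol (such as PLINK's 0 for missing) is an assumption rather than Genevar's documented behaviour.

```python
def read_bim(lines):
    """Return (chrom, snp_id, position, allele1, allele2) per BIM line."""
    snps = []
    for line in lines:
        fields = line.split()  # accepts tab- or space-delimited input
        if not fields:
            continue
        chrom, snp_id, _cm, position, allele1, allele2 = fields[:6]
        snps.append((chrom, snp_id, int(position), allele1, allele2))
    return snps


def read_ped(lines, n_snps):
    """Return {sample_id: [(allele, allele), ...]} from PED lines.

    The first six PED columns are family ID, individual ID, paternal ID,
    maternal ID, sex and phenotype; allele pairs follow.
    """
    genotypes = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        sample_id, alleles = fields[1], fields[6:]
        if len(alleles) != 2 * n_snps:
            raise ValueError("%s: expected %d alleles" % (sample_id, 2 * n_snps))
        genotypes[sample_id] = [
            (alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)
        ]
    return genotypes


def allele_counts(snps, genotypes):
    """Count copies of the second BIM allele; None if a call has other symbols."""
    counts = {}
    for i, (_chrom, snp_id, _pos, allele1, allele2) in enumerate(snps):
        per_sample = {}
        for sample_id, calls in genotypes.items():
            pair = calls[i]
            if all(a in (allele1, allele2) for a in pair):
                per_sample[sample_id] = pair.count(allele2)
            else:
                per_sample[sample_id] = None
        counts[snp_id] = per_sample
    return counts


bim_lines = ["7 rs0001 0 12345 A T", "7 rs0002 0 56789 C G"]
ped_lines = ["FAM001 SMP001 0 0 0 0 A A G G", "FAM002 SMP002 0 0 0 0 A T C G"]
snps = read_bim(bim_lines)
genotypes = read_ped(ped_lines, n_snps=len(snps))
print(allele_counts(snps, genotypes))
```

The order of SNPs in the BIM file defines the order of allele pairs in each PED row, which is why `read_ped` checks the number of alleles against the number of SNPs.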
    

Team16 merged format:

  • TXT example (tab-delimited; including header)
    Chr   Position  SNP_ID     Alleles  SMP001  SMP002
    7     12345     rs0001     A/T      AA      AT
    7     56789     rs0002     C/G      GG      CG
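
The merged layout carries the same information as a BIM/PED pair, with one row per SNP and one column per sample. The sketch below builds the example rows above from in-memory SNP and genotype records equivalent to the PLINK example. It is an illustration inferred from the examples on this page: the function names are hypothetical, and keeping the allele order of each call as given is an assumption.

```python
def merged_rows(snps, genotypes):
    """Build merged-format rows (header first) from SNP and genotype records.

    snps: list of (chrom, snp_id, position, allele1, allele2)
    genotypes: {sample_id: [(allele, allele), ...]} in the same SNP order
    """
    samples = list(genotypes)
    rows = [["Chr", "Position", "SNP_ID", "Alleles"] + samples]
    for i, (chrom, snp_id, position, allele1, allele2) in enumerate(snps):
        calls = ["".join(genotypes[sample][i]) for sample in samples]
        rows.append([chrom, str(position), snp_id, allele1 + "/" + allele2] + calls)
    return rows


def to_text(rows):
    """Join rows into tab-delimited text, one row per line."""
    return "\n".join("\t".join(row) for row in rows) + "\n"


# The same two SNPs and two samples as in the PLINK example.
snps = [("7", "rs0001", 12345, "A", "T"), ("7", "rs0002", 56789, "C", "G")]
genotypes = {
    "SMP001": [("A", "A"), ("G", "G")],
    "SMP002": [("A", "T"), ("C", "G")],
}
rows = merged_rows(snps, genotypes)
print(to_text(rows))
```

Note the column order: the merged format puts Position before SNP_ID, whereas a BIM line puts the SNP ID before the position.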
    

Development

Multi-tier architectural design

Develop


Release notes

3.2.0, released on 26 July 2011

[What's new]

  • Upgraded Commons Math library to version 3.0
  • Improved table sorting in Manager panels
  • Allowed preloaded data to be accessed via the same user interface

3.1.1, released on 15 December 2011

[What's new]

  • Added a new Database 3.0.0 option in the "New Data Connection" menu
  • Added a new "No limit" option in the P-value threshold drop-down list
  • Improved table sorting in Java SE 6
  • Fixed several incompatibility issues when connecting to Database 2.0.0
  • Fixed "Export List" in Cis-eQTL - Gene

3.1.0, released on 7 December 2011

[What's new]

  • Added support for external algorithms and pre-uploaded results
  • Added support for space-delimited PLINK formats
  • Added auto selection for "Reference" drop-down list when a study is chosen
  • Fixed a "Data not found" error message that appeared when QTL results were null

[Known issues]

  • The "Reference" drop-down list is not displayed properly when a study is chosen in Database 2.0.0

3.0.2, released on 15 July 2011

[What's new]

  • Fixed several upload issues in database mode.

3.0.1 (was 3.5.0), released on 14 February 2011

[Known issues]

If you have ever encountered a "Data not found exception" error message whilst accessing your local database, this issue might affect you:

  • When data were uploaded via the interface to both the expression and genotype tables (of a local v2.0.0 database), the summary panel reported a successful upload, but the two tables actually remained empty.
  • If you used this version to upload your studies, please re-load your files via the latest 3.0.2 interface (3.5.0 became 3.0.1 internally). Uploads made before February with older versions are unaffected by the bug.
  • It is not necessary to re-create all your study names and attributes at this stage. Follow the same steps in both Genotype Manager and Expression Manager to specify your previous uploads and submit. When the confirmation dialogue asks "Study already exists. Would you like to re-submit again?", select Yes to replace the previously unsuccessful uploads with the new ones.

Publications

Citations

  • Genevar: a database and Java application for the analysis and visualization of SNP-gene associations in eQTL studies.

    Yang TP, Beazley C, Montgomery SB, Dimas AS, Gutierrez-Arcelus M, Stranger BE, Deloukas P and Dermitzakis ET

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1HH, UK.

    Genevar (GENe Expression VARiation) is a database and Java tool designed to integrate multiple datasets, and provides analysis and visualization of associations between sequence variation and gene expression. Genevar allows researchers to investigate expression quantitative trait loci (eQTL) associations within a gene locus of interest in real time. The database and application can be installed on a standard computer in database mode and, in addition, on a server to share discoveries among affiliations or the broader community over the Internet via web services protocols. AVAILABILITY: http://www.sanger.ac.uk/resources/software/genevar.

    Funded by: Wellcome Trust

    Bioinformatics (Oxford, England) 2010;26;19;2474-6

Data release

  • Mapping cis- and trans-regulatory effects across multiple tissues in twins.

    Grundberg E, Small KS, Hedman ÅK, Nica AC, Buil A, Keildson S, Bell JT, Yang TP, Meduri E, Barrett A, Nisbett J, Sekowska M, Wilk A, Shin SY, Glass D, Travers M, Min JL, Ring S, Ho K, Thorleifsson G, Kong A, Thorsteindottir U, Ainali C, Dimas AS, Hassanali N, Ingle C, Knowles D, Krestyaninova M, Lowe CE, Di Meglio P, Montgomery SB, Parts L, Potter S, Surdulescu G, Tsaprouni L, Tsoka S, Bataille V, Durbin R, Nestle FO, O'Rahilly S, Soranzo N, Lindgren CM, Zondervan KT, Ahmadi KR, Schadt EE, Stefansson K, Smith GD, McCarthy MI, Deloukas P, Dermitzakis ET, Spector TD and Multiple Tissue Human Expression Resource (MuTHER) Consortium

    Wellcome Trust Sanger Institute, Hinxton, UK.

    Sequence-based variation in gene expression is a key driver of disease risk. Common variants regulating expression in cis have been mapped in many expression quantitative trait locus (eQTL) studies, typically in single tissues from unrelated individuals. Here, we present a comprehensive analysis of gene expression across multiple tissues conducted in a large set of mono- and dizygotic twins that allows systematic dissection of genetic (cis and trans) and non-genetic effects on gene expression. Using identity-by-descent estimates, we show that at least 40% of the total heritable cis effect on expression cannot be accounted for by common cis variants, a finding that reveals the contribution of low-frequency and rare regulatory variants with respect to both transcriptional regulation and complex trait susceptibility. We show that a substantial proportion of gene expression heritability is trans to the structural gene, and we identify several replicating trans variants that act predominantly in a tissue-restricted manner and may regulate the transcription of many genes.

    Funded by: Wellcome Trust: 081917/Z/07/Z, 085235, 090532

    Nature genetics 2012;44;10;1084-9

  • Patterns of cis regulatory variation in diverse human populations.

    Stranger BE, Montgomery SB, Dimas AS, Parts L, Stegle O, Ingle CE, Sekowska M, Smith GD, Evans D, Gutierrez-Arcelus M, Price A, Raj T, Nisbett J, Nica AC, Beazley C, Durbin R, Deloukas P and Dermitzakis ET

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK.

    The genetic basis of gene expression variation has long been studied with the aim to understand the landscape of regulatory variants, but also more recently to assist in the interpretation and elucidation of disease signals. To date, many studies have looked in specific tissues and population-based samples, but there has been limited assessment of the degree of inter-population variability in regulatory variation. We analyzed genome-wide gene expression in lymphoblastoid cell lines from a total of 726 individuals from 8 global populations from the HapMap3 project and correlated gene expression levels with HapMap3 SNPs located in cis to the genes. We describe the influence of ancestry on gene expression levels within and between these diverse human populations and uncover a non-negligible impact on global patterns of gene expression. We further dissect the specific functional pathways differentiated between populations. We also identify 5,691 expression quantitative trait loci (eQTLs) after controlling for both non-genetic factors and population admixture and observe that half of the cis-eQTLs are replicated in one or more of the populations. We highlight patterns of eQTL-sharing between populations, which are partially determined by population genetic relatedness, and discover significant sharing of eQTL effects between Asians, European-admixed, and African subpopulations. Specifically, we observe that both the effect size and the direction of effect for eQTLs are highly conserved across populations. We observe an increasing proximity of eQTLs toward the transcription start site as sharing of eQTLs among populations increases, highlighting that variants close to TSS have stronger effects and therefore are more likely to be detected across a wider panel of populations. 
    Together these results offer a unique picture and resource of the degree of differentiation among human populations in functional regulatory variation and provide an estimate for the transferability of complex trait variants across populations.

    Funded by: Wellcome Trust

    PLoS genetics 2012;8;4;e1002639

  • The architecture of gene regulatory variation across multiple human tissues: the MuTHER study.

    Nica AC, Parts L, Glass D, Nisbet J, Barrett A, Sekowska M, Travers M, Potter S, Grundberg E, Small K, Hedman AK, Bataille V, Tzenova Bell J, Surdulescu G, Dimas AS, Ingle C, Nestle FO, di Meglio P, Min JL, Wilk A, Hammond CJ, Hassanali N, Yang TP, Montgomery SB, O'Rahilly S, Lindgren CM, Zondervan KT, Soranzo N, Barroso I, Durbin R, Ahmadi K, Deloukas P, McCarthy MI, Dermitzakis ET, Spector TD and MuTHER Consortium

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom.

    While there have been studies exploring regulatory variation in one or more tissues, the complexity of tissue-specificity in multiple primary tissues is not yet well understood. We explore in depth the role of cis-regulatory variation in three human tissues: lymphoblastoid cell lines (LCL), skin, and fat. The samples (156 LCL, 160 skin, 166 fat) were derived simultaneously from a subset of well-phenotyped healthy female twins of the MuTHER resource. We discover an abundance of cis-eQTLs in each tissue similar to previous estimates (858 or 4.7% of genes). In addition, we apply factor analysis (FA) to remove effects of latent variables, thus more than doubling the number of our discoveries (1,822 eQTL genes). The unique study design (Matched Co-Twin Analysis--MCTA) permits immediate replication of eQTLs using co-twins (93%-98%) and validation of the considerable gain in eQTL discovery after FA correction. We highlight the challenges of comparing eQTLs between tissues. After verifying previous significance threshold-based estimates of tissue-specificity, we show their limitations given their dependency on statistical power. We propose that continuous estimates of the proportion of tissue-shared signals and direct comparison of the magnitude of effect on the fold change in expression are essential properties that jointly provide a biologically realistic view of tissue-specificity. Under this framework we demonstrate that 30% of eQTLs are shared among the three tissues studied, while another 29% appear exclusively tissue-specific. However, even among the shared eQTLs, a substantial proportion (10%-20%) have significant differences in the magnitude of fold change between genotypic classes across tissues. Our results underline the need to account for the complexity of eQTL tissue-specificity in an effort to assess consequences of such variants for complex traits.

    Funded by: Wellcome Trust: 077016/Z/05/Z, 085235

    PLoS genetics 2011;7;2;e1002003

  • Common regulatory variation impacts gene expression in a cell type-dependent manner.

    Dimas AS, Deutsch S, Stranger BE, Montgomery SB, Borel C, Attar-Cohen H, Ingle C, Beazley C, Gutierrez Arcelus M, Sekowska M, Gagnebin M, Nisbett J, Deloukas P, Dermitzakis ET and Antonarakis SE

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, CB10 1HH, Cambridge, UK.

    Studies correlating genetic variation to gene expression facilitate the interpretation of common human phenotypes and disease. As functional variants may be operating in a tissue-dependent manner, we performed gene expression profiling and association with genetic variants (single-nucleotide polymorphisms) on three cell types of 75 individuals. We detected cell type-specific genetic effects, with 69 to 80% of regulatory variants operating in a cell type-specific manner, and identified multiple expressive quantitative trait loci (eQTLs) per gene, unique or shared among cell types and positively correlated with the number of transcripts per gene. Cell type-specific eQTLs were found at larger distances from genes and at lower effect size, similar to known enhancers. These data suggest that the complete regulatory variant repertoire can only be uncovered in the context of cell-type specificity.

    Funded by: Wellcome Trust: 077011, 077046

    Science (New York, N.Y.) 2009;325;5945;1246-50

Contact

  • For questions and requests about Genevar, please contact Professor Emmanouil Dermitzakis
  • Please report any bugs or error messages you observe to Genevar developer Tsun-Po Yang
  • We thank former Team16 members for their consistent support of this project