Genevar (GENe Expression VARiation)

Genevar is a database and web-services platform for the integration, analysis and visualization of SNP-gene associations in eQTL (expression quantitative trait loci) studies.

With its interactive user interface and client-server platform, Genevar allows researchers to instantly investigate eQTL association patterns within a genetic region of interest. The database can be installed on a standard computer in database mode to utilize unpublished data, as well as on a secure server to share research findings among collaborators or with the broader community via web-services protocols.

The default public server at the Sanger Institute currently hosts (i) genetic variation and gene expression profiling data from the following studies (PubMed IDs in parentheses):

  • Three tissue types (adipose, LCL and skin) collected from 856 healthy female twins of the MuTHER resource (22941192)
  • Lymphoblastoid cell lines from 726 HapMap3 CEU, CHB, GIH, JPT, LWK, MEX, MKK and YRI individuals (22532805)
  • Three tissue types (adipose, LCL and skin) derived from a subset of ~160 MuTHER healthy female twins (21304890)
  • Three cell types (fibroblast, LCL and T-cell) derived from umbilical cords of 75 Geneva GenCord individuals (19644074)

and (ii) genetic variation and DNA methylation profiling data from the following study:

  • Adipose tissue collected from 856 healthy female twins of the MuTHER resource (24183450)

Features

Gene-centric analysis (cis-eQTL - Gene)

Discover eQTL SNPs (eSNPs) centred on the gene of interest in different tissue types/populations:

  • Observed eSNPs will be visualized as a regional plot, where a dotted line represents the user-defined P value threshold.
  • Significant eSNPs will be listed in a table and can be exported to a tab-delimited file.
  • Regional plots can be saved as PNG images by right-clicking on the plot.
  • Query results are organized in a tree node structure and can be easily browsed.
  • External links to three major genome browsers (Ensembl, HapMap and UCSC).
  • Quick start: cis-eQTL - Gene (new to Genevar?)
  • User tips: Does the top eSNP also associate with other genes? Use the SNP-centric analysis module to investigate further!
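
The threshold filtering and tab-delimited export described above can be sketched as follows. This is only an illustration, not Genevar's own code: the record layout, the function names `significant_esnps` and `write_tsv`, the inclusive threshold and the P values themselves are all assumptions made up for the example.

```python
import csv
import io
import math

# Hypothetical association records for one gene; the values are invented.
ASSOCIATIONS = [
    {"snp": "rs0001", "pos": 12345, "p_value": 1e-5},
    {"snp": "rs0002", "pos": 56789, "p_value": 0.2},
    {"snp": "rs0003", "pos": 60000, "p_value": 1e-3},
]


def significant_esnps(associations, p_threshold):
    """Keep SNPs at or below the threshold, most significant first."""
    hits = [a for a in associations if a["p_value"] <= p_threshold]
    return sorted(hits, key=lambda a: a["p_value"])


def write_tsv(rows, handle):
    """Write the hits as a tab-delimited table with a header row."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["SNP_ID", "Position", "P_value", "neg_log10_P"])
    for row in rows:
        neg_log10 = -math.log10(row["p_value"])
        writer.writerow([row["snp"], row["pos"], row["p_value"], f"{neg_log10:.2f}"])


hits = significant_esnps(ASSOCIATIONS, p_threshold=1e-3)
buffer = io.StringIO()
write_tsv(hits, buffer)
print(buffer.getvalue())
```

The -log10(P) column corresponds to the usual y-axis of a regional plot, where the threshold appears as a horizontal line.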

SNP-centric analysis (cis-eQTL - SNP)

Identify eQTL genes surrounding lead GWAS SNPs/top eSNPs in different tissue types/populations:

  • Observed eQTLs will be displayed in a line chart, where a dotted line represents the P value threshold.
  • Significant eQTL genes within the query region will be detailed in a table.
  • Regional plots can be saved as PNG images by right-clicking on the plot.
  • Query results are organized in a tree node structure and can be easily browsed.
  • Quick start: cis-eQTL - SNP
  • User tips: Is the SNP of interest the most significant eSNP of the observed eQTL genes? Or is the eQTL signal actually driven by other eSNPs? Use the gene-centric module to find out!

SNP-gene association analysis (eQTL - SNP-Gene)

Illustrate SNP-probe associations across different tissue types/populations:

  • Individual genotypes will be plotted on a strip chart, where observed and permuted P values are labeled.
  • SNP-probe association plots can be saved as PNG images by right-clicking on the plot.
  • Quick start: eQTL - SNP-Gene
  • User tips: If the genotyping/imputation data is not available due to data release restrictions in some particular studies (e.g. Grundberg et al.), the association plot function will be disabled accordingly.
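
To make the idea of an observed versus a permuted P value concrete, here is a minimal sketch of a permutation test for one SNP-probe pair. This page does not say which test statistic Genevar uses, so the Pearson correlation on allele dosage below is a stand-in; the function names, the data and the number of permutations are assumptions for illustration only.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(dosages, expression, n_perm=1000, seed=0):
    """Two-sided permutation P value: shuffle expression values across samples
    and count how often the shuffled |r| reaches the observed |r|."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(pearson_r(dosages, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(dosages, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Invented example: allele dosage (0, 1 or 2) and probe expression per sample.
dosages = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
expression = [6.1, 6.3, 6.0, 7.2, 7.0, 7.4, 7.1, 8.3, 8.0, 8.2]

print(pearson_r(dosages, expression))
print(permutation_p_value(dosages, expression))
```

Shuffling breaks any real link between genotype and expression, so the permuted P value estimates how often an association this strong would arise by chance in the same samples.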

mQTL (methylation quantitative trait loci) analysis

New CpG-centric and SNP-centric functionalities:

  • Discover mQTL SNPs centred on the CpG probe/CpG island/gene of interest.
  • Identify mQTLs surrounding the SNP of interest/top mSNPs.
  • User tips: In both Grundberg et al. studies, which use an external algorithm, A1 does not denote the minor allele; it is the allele used as the reference/predictor allele in the test. A positive beta therefore indicates that the allele listed as A1 is associated with increased expression/methylation of the probe.
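
The sign convention in the tip above can be sketched as follows, assuming an additive model in which re-expressing the effect for the other allele simply flips the sign of beta. The helper name `beta_for_allele` and the numbers are hypothetical.

```python
def beta_for_allele(beta_a1, a1, a2, effect_allele):
    """Re-express a beta reported for A1 in terms of the chosen allele.

    Under an additive model, counting the other allele instead of A1
    reverses the direction of the effect and keeps its magnitude.
    """
    if effect_allele == a1:
        return beta_a1
    if effect_allele == a2:
        return -beta_a1
    raise ValueError(f"{effect_allele!r} is not one of the alleles {a1}/{a2}")


# Invented example: beta = 0.3 reported for A1 = "A" at an A/T SNP.
print(beta_for_allele(0.3, "A", "T", "A"))
print(beta_for_allele(0.3, "A", "T", "T"))
```

This matters when comparing results across studies that may not report the same allele as A1.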

Launch application

[Figure: Multiple web-services connections]

Genevar web-services mode

Genevar 3.3.0 Launch

  • After you click Launch, Java Web Start automatically downloads and installs the latest version of the Genevar interface on your local machine.
  • Once launched, Genevar starts in web-services mode, connected to the Sanger Institute server.
  • It is not necessary to install a local database or to pre-load any datasets prior to analysis in this mode.
  • If Java Web Start blocks Genevar (Java 7 update 45, released October 2013, and later), add the host site (http://www.sanger.ac.uk) to the Exception Site List (Java Control Panel / Security / Edit Site List / Add) to meet the new Java security requirements.

Connecting to multiple servers

  • Genevar provides a single interface for users to switch between different servers (see figure).
  • Separate, password-protected servers enable researchers to share unpublished data among collaborators in a secure fashion.

System requirements

  • Java SE 6 or above is required; download Java if it is not already installed.
  • Genevar is a cross-platform application and has been tested thoroughly on Linux, Mac and Windows systems.

License

Genevar and the database are freely available under the terms of the GNU Lesser General Public License.

Database installation

[Figure: Genevar local database mode]

Genevar local database mode

  • Users are able to load, manage and analyze their datasets locally in this mode.
  • The Genevar system design provides a single interface for users to switch between public servers and local database (see figure).
  • Genevar can be run completely offline in this mode, as there is no communication between the interface and any servers when connected to a local database.

Installing Genevar database on MySQL server

Previous experience with MySQL is desirable.

  • Download a suitable package from MySQL Downloads and install a MySQL server on your local machine (see Installation Guidance).
  • Change to the MySQL installation directory (BASEDIR represents the location below) and start your MySQL instance:
    shell> cd BASEDIR
    shell> bin/mysqld --standalone
    
  • Connect to the database as superuser and create a new database:
    shell> bin/mysql -u root -p
    mysql> CREATE DATABASE tpy_team16_genevar_2_0_0;
    
  • Create a new user tpy (password ypt) and grant the SELECT, INSERT, UPDATE, CREATE, DROP and ALTER privileges to this user:
    mysql> CREATE USER 'tpy'@'localhost' IDENTIFIED BY 'ypt';
    mysql> GRANT SELECT, INSERT, UPDATE, CREATE, DROP, ALTER ON tpy_team16_genevar_2_0_0.* TO 'tpy'@'localhost';
    
  • Download the SQL-format dump file tpy_team16_genevar_2_0_0.sql.tar.gz (which contains template tables) and extract it to BASEDIR.
  • Restore the Genevar database; it is then ready to use in database mode:
    shell> bin/mysql -u tpy -p tpy_team16_genevar_2_0_0 < tpy_team16_genevar_2_0_0_20100519.sql
    
  • To stop the MySQL instance (enter the root password when prompted), type:
    shell> bin/mysqladmin -u root -p shutdown
    

File formats

Genevar local database mode currently supports the following data formats.

Expression profiling data

Illumina HumanRef-8 v3, HumanWG-6 v2, v3 and HumanHT-12 v3 BeadChips:

  • TXT example (tab-delimited; including header)
    PROBE_ID         SMP001    SMP002
    ILMN_0000001     7.26      7.1
    ILMN_0000002     12.73     11.58
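
As a quick sanity check before uploading, a file in this layout can be read with a few lines of Python. This is a minimal sketch under the stated format (tab-delimited, header row, probe ID in the first column); the function name `read_expression_matrix` is hypothetical and is not part of Genevar.

```python
import csv
import io


def read_expression_matrix(handle):
    """Parse a tab-delimited expression file into {probe_id: {sample_id: value}}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # the first column holds the probe ID
    matrix = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        probe_id, values = row[0], row[1:]
        if len(values) != len(samples):
            raise ValueError(
                f"{probe_id}: expected {len(samples)} values, got {len(values)}"
            )
        matrix[probe_id] = {s: float(v) for s, v in zip(samples, values)}
    return samples, matrix


# The example from this page, written with explicit tab characters.
EXAMPLE = (
    "PROBE_ID\tSMP001\tSMP002\n"
    "ILMN_0000001\t7.26\t7.1\n"
    "ILMN_0000002\t12.73\t11.58\n"
)

samples, matrix = read_expression_matrix(io.StringIO(EXAMPLE))
print(samples)
print(matrix["ILMN_0000002"])
```

Checking that every row has one value per sample catches the most common formatting problem, a missing or extra tab, before the data reach the database.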
    

Genotype data

PLINK formats:

  • BIM example (not MAP; how to create *.bim)
    7     rs0001     0     12345     A     T
    7     rs0002     0     56789     C     G
    
  • PED example
    FAM001     SMP001     0     0     0     0     A A     G G
    FAM002     SMP002     0     0     0     0     A T     C G
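
The relationship between the two PLINK files can be sketched as below: each BIM row names one SNP, and each PED row carries six leading columns followed by one allele pair per SNP, in BIM order. The function names are hypothetical, and splitting on any whitespace is an assumption that accepts both tab- and space-delimited input.

```python
def parse_bim_line(line):
    """Split one BIM line: chromosome, SNP ID, genetic distance, position, two alleles."""
    chrom, snp_id, distance, position, allele1, allele2 = line.split()
    return {
        "chr": chrom,
        "snp": snp_id,
        "distance": float(distance),
        "pos": int(position),
        "alleles": (allele1, allele2),
    }


def parse_ped_line(line, n_snps):
    """Split one PED line into its sample ID and one allele pair per SNP."""
    fields = line.split()
    meta, alleles = fields[:6], fields[6:]
    if len(alleles) != 2 * n_snps:
        raise ValueError(f"{meta[1]}: expected {2 * n_snps} alleles, got {len(alleles)}")
    pairs = [(alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)]
    return {"family": meta[0], "sample": meta[1], "genotypes": pairs}


def load_genotypes(bim_text, ped_text):
    """Return {sample_id: {snp_id: (allele, allele)}} by pairing PED columns with BIM rows."""
    snps = [parse_bim_line(l) for l in bim_text.splitlines() if l.strip()]
    genotypes = {}
    for line in ped_text.splitlines():
        if not line.strip():
            continue
        record = parse_ped_line(line, len(snps))
        genotypes[record["sample"]] = {
            snp["snp"]: pair for snp, pair in zip(snps, record["genotypes"])
        }
    return genotypes


BIM_EXAMPLE = "7 rs0001 0 12345 A T\n7 rs0002 0 56789 C G\n"
PED_EXAMPLE = "FAM001 SMP001 0 0 0 0 A A G G\nFAM002 SMP002 0 0 0 0 A T C G\n"

genotypes = load_genotypes(BIM_EXAMPLE, PED_EXAMPLE)
print(genotypes["SMP002"])
```

Because the PED file has no SNP names of its own, the order of rows in the BIM file determines which allele pair belongs to which SNP.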
    

Team16 merged format:

  • TXT example (tab-delimited; including header)
    Chr   Position  SNP_ID     Alleles  SMP001  SMP002
    7     12345     rs0001     A/T      AA      AT
    7     56789     rs0002     C/G      GG      CG
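
A file in the merged format can be read in a similar way. The sketch below also converts each genotype string to an allele dosage, defined here (as an assumption for illustration) as the number of copies of the second allele listed in the Alleles column; the function name `read_merged_genotypes` is hypothetical.

```python
import csv
import io

FIXED_COLUMNS = ("Chr", "Position", "SNP_ID", "Alleles")


def read_merged_genotypes(handle):
    """Parse the merged format into one record per SNP with per-sample dosages."""
    reader = csv.DictReader(handle, delimiter="\t")
    samples = [name for name in reader.fieldnames if name not in FIXED_COLUMNS]
    records = []
    for row in reader:
        first, second = row["Alleles"].split("/")
        for sample in samples:
            if set(row[sample]) - {first, second}:
                raise ValueError(f"{row['SNP_ID']}: unexpected genotype {row[sample]!r}")
        records.append({
            "chr": row["Chr"],
            "pos": int(row["Position"]),
            "snp": row["SNP_ID"],
            "alleles": (first, second),
            # Dosage = copies of the second listed allele (an assumption).
            "dosage": {s: row[s].count(second) for s in samples},
        })
    return records


# The example from this page, written with explicit tab characters.
EXAMPLE = (
    "Chr\tPosition\tSNP_ID\tAlleles\tSMP001\tSMP002\n"
    "7\t12345\trs0001\tA/T\tAA\tAT\n"
    "7\t56789\trs0002\tC/G\tGG\tCG\n"
)

records = read_merged_genotypes(io.StringIO(EXAMPLE))
for record in records:
    print(record["snp"], record["dosage"])
```

Unlike the PLINK pair of files, this format keeps SNP annotation and genotypes in one table, with one column per sample.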
    

Development

[Figure: Multi-tier architectural design]

Develop

Release notes

New Java security settings (see details), released on 17 April 2014

[Known issues]

If Java Web Start blocks Genevar (Java 7 update 45, released October 2013, and later), add the host site (http://www.sanger.ac.uk) to the Exception Site List (Java Control Panel / Security / Edit Site List / Add) to meet the new Java security requirements.

  • Added additional manifest attributes (Permissions, Codebase and Application-Name).
  • Migrated the application JAR files from the original FTP site to an HTTP website.
  • Genevar is a self-signed application (blocked by default since Java 7 update 45) owing to the open-source nature of the project. Please add the trusted host site to the Exception Site List.

3.3.0, released on 30 September 2013

[What's new]

  • Added a new cis-mQTL - CpG module
  • Added a new cis-mQTL - SNP module

3.2.0, released on 26 July 2012

[What's new]

  • Upgraded Commons Math library to version 3.0
  • Improved table sorting in Manager panels
  • Allowed preloaded data to be accessed via the same user interface

3.1.1, released on 15 December 2011

[What's new]

  • Added a new Database 3.0.0 option in the "New Data Connection" menu
  • Added a new "No limit" option in the P-value threshold drop-down list
  • Improved table sorting in Java SE 6
  • Fixed several incompatibility issues when connecting to Database 2.0.0
  • Fixed "Export List" in Cis-eQTL - Gene

3.1.0, released on 7 December 2011

[What's new]

  • Added support for external algorithms and pre-uploaded results
  • Added support for space-delimited PLINK formats
  • Added automatic selection in the "Reference" drop-down list when a study is chosen
  • Fixed a "Data not found" error message shown when QTL results are null

[Known issues]

  • The "Reference" drop-down list is not displayed properly when a study is chosen in Database 2.0.0

3.0.2, released on 15 July 2011

[What's new]

  • Fixed several upload issues in database mode.

3.0.1 (was 3.5.0), released on 14 February 2011

[Known issues]

If you have ever encountered a "Data not found exception" error message while accessing your local database, this issue might affect you:

  • When data are uploaded via the interface to the expression and genotype tables of a local v2.0.0 database, the summary panel reports a successful upload, but both tables actually remain empty.
  • If you used this version to upload your studies, please re-load your files via the latest 3.0.2 interface (3.5.0 became 3.0.1 internally). Uploads made before February with older versions are unaffected by the bug.
  • It is not necessary to re-create your study names and attributes at this stage. Follow the same steps in both Genotype Manager and Expression Manager to specify your previous uploads and submit. When the confirmation dialogue asks "Study already exists. Would you like to re-submit again?", select Yes to replace the previous unsuccessful uploads with the new ones.

Publications

Citation

  • Genevar: a database and Java application for the analysis and visualization of SNP-gene associations in eQTL studies.

    Yang TP, Beazley C, Montgomery SB, Dimas AS, Gutierrez-Arcelus M, Stranger BE, Deloukas P and Dermitzakis ET

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1HH, UK.

    Genevar (GENe Expression VARiation) is a database and Java tool designed to integrate multiple datasets, and provides analysis and visualization of associations between sequence variation and gene expression. Genevar allows researchers to investigate expression quantitative trait loci (eQTL) associations within a gene locus of interest in real time. The database and application can be installed on a standard computer in database mode and, in addition, on a server to share discoveries among affiliations or the broader community over the Internet via web services protocols.

    Availability: http://www.sanger.ac.uk/resources/software/genevar.

    Funded by: Wellcome Trust

    Bioinformatics (Oxford, England) 2010;26;19;2474-6

Data releases

  • Global analysis of DNA methylation variation in adipose tissue from twins reveals links to disease-associated variants in distal regulatory elements.

    Grundberg E, Meduri E, Sandling JK, Hedman AK, Keildson S, Buil A, Busche S, Yuan W, Nisbet J, Sekowska M, Wilk A, Barrett A, Small KS, Ge B, Caron M, Shin SY, Multiple Tissue Human Expression Resource Consortium, Lathrop M, Dermitzakis ET, McCarthy MI, Spector TD, Bell JT and Deloukas P

    Wellcome Trust Sanger Institute, CB101SA Hinxton, UK; Department of Twin Research and Genetic Epidemiology, King's College London, SE17EH London, UK. Electronic address: elin.grundberg@mcgill.ca.

    Epigenetic modifications such as DNA methylation play a key role in gene regulation and disease susceptibility. However, little is known about the genome-wide frequency, localization, and function of methylation variation and how it is regulated by genetic and environmental factors. We utilized the Multiple Tissue Human Expression Resource (MuTHER) and generated Illumina 450K adipose methylome data from 648 twins. We found that individual CpGs had low variance and that variability was suppressed in promoters. We noted that DNA methylation variation was highly heritable (h(2)median = 0.34) and that shared environmental effects correlated with metabolic phenotype-associated CpGs. Analysis of methylation quantitative-trait loci (metQTL) revealed that 28% of CpGs were associated with nearby SNPs, and when overlapping them with adipose expression quantitative-trait loci (eQTL) from the same individuals, we found that 6% of the loci played a role in regulating both gene expression and DNA methylation. These associations were bidirectional, but there were pronounced negative associations for promoter CpGs. Integration of metQTL with adipose reference epigenomes and disease associations revealed significant enrichment of metQTL overlapping metabolic-trait or disease loci in enhancers (the strongest effects were for high-density lipoprotein cholesterol and body mass index [BMI]). We followed up with the BMI SNP rs713586, a cg01884057 metQTL that overlaps an enhancer upstream of ADCY3, and used bisulphite sequencing to refine this region. Our results showed widespread population invariability yet sequence dependence on adipose DNA methylation but that incorporating maps of regulatory elements aid in linking CpG variation to gene regulation and disease risk in a tissue-dependent manner.

    Funded by: Canadian Institutes of Health Research: EP1-120608; Wellcome Trust: 081917/Z/07/Z, 083270/Z/07/Z, 090532, 098051, 100140

    American journal of human genetics 2013;93;5;876-90

  • Mapping cis- and trans-regulatory effects across multiple tissues in twins.

    Grundberg E, Small KS, Hedman ÅK, Nica AC, Buil A, Keildson S, Bell JT, Yang TP, Meduri E, Barrett A, Nisbett J, Sekowska M, Wilk A, Shin SY, Glass D, Travers M, Min JL, Ring S, Ho K, Thorleifsson G, Kong A, Thorsteindottir U, Ainali C, Dimas AS, Hassanali N, Ingle C, Knowles D, Krestyaninova M, Lowe CE, Di Meglio P, Montgomery SB, Parts L, Potter S, Surdulescu G, Tsaprouni L, Tsoka S, Bataille V, Durbin R, Nestle FO, O'Rahilly S, Soranzo N, Lindgren CM, Zondervan KT, Ahmadi KR, Schadt EE, Stefansson K, Smith GD, McCarthy MI, Deloukas P, Dermitzakis ET, Spector TD and Multiple Tissue Human Expression Resource (MuTHER) Consortium

    Wellcome Trust Sanger Institute, Hinxton, UK.

    Sequence-based variation in gene expression is a key driver of disease risk. Common variants regulating expression in cis have been mapped in many expression quantitative trait locus (eQTL) studies, typically in single tissues from unrelated individuals. Here, we present a comprehensive analysis of gene expression across multiple tissues conducted in a large set of mono- and dizygotic twins that allows systematic dissection of genetic (cis and trans) and non-genetic effects on gene expression. Using identity-by-descent estimates, we show that at least 40% of the total heritable cis effect on expression cannot be accounted for by common cis variants, a finding that reveals the contribution of low-frequency and rare regulatory variants with respect to both transcriptional regulation and complex trait susceptibility. We show that a substantial proportion of gene expression heritability is trans to the structural gene, and we identify several replicating trans variants that act predominantly in a tissue-restricted manner and may regulate the transcription of many genes.

    Funded by: Medical Research Council: G0900339, G9815508; Wellcome Trust: 081917/Z/07/Z, 085235, 090532, 092731

    Nature genetics 2012;44;10;1084-9

  • Patterns of cis regulatory variation in diverse human populations.

    Stranger BE, Montgomery SB, Dimas AS, Parts L, Stegle O, Ingle CE, Sekowska M, Smith GD, Evans D, Gutierrez-Arcelus M, Price A, Raj T, Nisbett J, Nica AC, Beazley C, Durbin R, Deloukas P and Dermitzakis ET

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK.

    The genetic basis of gene expression variation has long been studied with the aim to understand the landscape of regulatory variants, but also more recently to assist in the interpretation and elucidation of disease signals. To date, many studies have looked in specific tissues and population-based samples, but there has been limited assessment of the degree of inter-population variability in regulatory variation. We analyzed genome-wide gene expression in lymphoblastoid cell lines from a total of 726 individuals from 8 global populations from the HapMap3 project and correlated gene expression levels with HapMap3 SNPs located in cis to the genes. We describe the influence of ancestry on gene expression levels within and between these diverse human populations and uncover a non-negligible impact on global patterns of gene expression. We further dissect the specific functional pathways differentiated between populations. We also identify 5,691 expression quantitative trait loci (eQTLs) after controlling for both non-genetic factors and population admixture and observe that half of the cis-eQTLs are replicated in one or more of the populations. We highlight patterns of eQTL-sharing between populations, which are partially determined by population genetic relatedness, and discover significant sharing of eQTL effects between Asians, European-admixed, and African subpopulations. Specifically, we observe that both the effect size and the direction of effect for eQTLs are highly conserved across populations. We observe an increasing proximity of eQTLs toward the transcription start site as sharing of eQTLs among populations increases, highlighting that variants close to TSS have stronger effects and therefore are more likely to be detected across a wider panel of populations. 
    Together these results offer a unique picture and resource of the degree of differentiation among human populations in functional regulatory variation and provide an estimate for the transferability of complex trait variants across populations.

    Funded by: Wellcome Trust

    PLoS genetics 2012;8;4;e1002639

  • The architecture of gene regulatory variation across multiple human tissues: the MuTHER study.

    Nica AC, Parts L, Glass D, Nisbet J, Barrett A, Sekowska M, Travers M, Potter S, Grundberg E, Small K, Hedman AK, Bataille V, Tzenova Bell J, Surdulescu G, Dimas AS, Ingle C, Nestle FO, di Meglio P, Min JL, Wilk A, Hammond CJ, Hassanali N, Yang TP, Montgomery SB, O'Rahilly S, Lindgren CM, Zondervan KT, Soranzo N, Barroso I, Durbin R, Ahmadi K, Deloukas P, McCarthy MI, Dermitzakis ET, Spector TD and MuTHER Consortium

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom.

    While there have been studies exploring regulatory variation in one or more tissues, the complexity of tissue-specificity in multiple primary tissues is not yet well understood. We explore in depth the role of cis-regulatory variation in three human tissues: lymphoblastoid cell lines (LCL), skin, and fat. The samples (156 LCL, 160 skin, 166 fat) were derived simultaneously from a subset of well-phenotyped healthy female twins of the MuTHER resource. We discover an abundance of cis-eQTLs in each tissue similar to previous estimates (858 or 4.7% of genes). In addition, we apply factor analysis (FA) to remove effects of latent variables, thus more than doubling the number of our discoveries (1,822 eQTL genes). The unique study design (Matched Co-Twin Analysis--MCTA) permits immediate replication of eQTLs using co-twins (93%-98%) and validation of the considerable gain in eQTL discovery after FA correction. We highlight the challenges of comparing eQTLs between tissues. After verifying previous significance threshold-based estimates of tissue-specificity, we show their limitations given their dependency on statistical power. We propose that continuous estimates of the proportion of tissue-shared signals and direct comparison of the magnitude of effect on the fold change in expression are essential properties that jointly provide a biologically realistic view of tissue-specificity. Under this framework we demonstrate that 30% of eQTLs are shared among the three tissues studied, while another 29% appear exclusively tissue-specific. However, even among the shared eQTLs, a substantial proportion (10%-20%) have significant differences in the magnitude of fold change between genotypic classes across tissues. Our results underline the need to account for the complexity of eQTL tissue-specificity in an effort to assess consequences of such variants for complex traits.

    Funded by: Medical Research Council: G0900339; Wellcome Trust: 077016/Z/05/Z, 085235

    PLoS genetics 2011;7;2;e1002003

  • Common regulatory variation impacts gene expression in a cell type-dependent manner.

    Dimas AS, Deutsch S, Stranger BE, Montgomery SB, Borel C, Attar-Cohen H, Ingle C, Beazley C, Gutierrez Arcelus M, Sekowska M, Gagnebin M, Nisbett J, Deloukas P, Dermitzakis ET and Antonarakis SE

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, CB10 1HH, Cambridge, UK.

    Studies correlating genetic variation to gene expression facilitate the interpretation of common human phenotypes and disease. As functional variants may be operating in a tissue-dependent manner, we performed gene expression profiling and association with genetic variants (single-nucleotide polymorphisms) on three cell types of 75 individuals. We detected cell type-specific genetic effects, with 69 to 80% of regulatory variants operating in a cell type-specific manner, and identified multiple expressive quantitative trait loci (eQTLs) per gene, unique or shared among cell types and positively correlated with the number of transcripts per gene. Cell type-specific eQTLs were found at larger distances from genes and at lower effect size, similar to known enhancers. These data suggest that the complete regulatory variant repertoire can only be uncovered in the context of cell-type specificity.

    Funded by: Wellcome Trust: 077011, 077046

    Science (New York, N.Y.) 2009;325;5945;1246-50

Contact

  • For collaboration requests, please contact Prof Emmanouil Dermitzakis.
  • For data and replication requests, please contact the authors of the corresponding dataset publication.
  • For comments, questions and bug reports, please contact the Genevar developer, Tsun-Po Yang.
  • We thank the members of Sanger Teams 16 and 147 for their consistent support during this project.