GxGrare

Gene-gene interaction analysis method for rare variants from high-throughput next-generation sequencing (NGS) data


About GxGrare

With the rapid advancement of array-based genotyping techniques, genome-wide association studies (GWAS) have successfully identified common genetic variants associated with common complex diseases. However, only a small proportion of the genetic etiology of complex diseases can be explained by the genetic factors identified through GWAS. This missing heritability could possibly be explained by gene-gene interaction (epistasis) and rare variants. Gene-gene interaction analysis for common variants has grown rapidly in both methodological development and practical application, and recent advances in high-throughput sequencing technologies have made rare variant analysis possible. However, little progress has been made in gene-gene interaction analysis for rare variants.

GxGrare is a new gene-gene interaction method for rare variants in the framework of multifactor dimensionality reduction (MDR) analysis. GxGrare consists of three steps: 1) collapsing the rare variants, 2) performing MDR analysis on the collapsed rare variants, and 3) detecting the top candidate interaction pairs. The first step collapses the rare variants according to their biological characteristics, such as allele frequency or functional region; it uses known biological information to recode the given genotypes as a more biologically meaningful categorical variable. For example, a gene with no exonic rare variants is given a value close to 0, and 1 otherwise, since non-exonic variants have weak or no effect on the function of the gene. The second step performs MDR analysis on the collapsed rare variants. The last step uses several evaluation measures to detect the top candidate interaction pairs. GxGrare can detect not only gene-gene interactions but also interactions within a single gene.
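
The collapsing and MDR steps can be illustrated with a minimal Python sketch. This is not the GxGrare implementation: the function names `collapse_gene` and `mdr_balanced_accuracy`, the simple "carries any minor allele" collapsing rule, and the toy data are all assumptions made for illustration, and the weighting schemes are left out.

```python
from collections import Counter


def collapse_gene(genotypes, variant_idx):
    """Step 1 (assumed rule): 1 if a sample carries any minor allele
    at the gene's rare variants, else 0."""
    return [int(any(row[j] > 0 for j in variant_idx)) for row in genotypes]


def mdr_balanced_accuracy(x1, x2, is_case):
    """Step 2: call a genotype cell high-risk when its case:control ratio
    exceeds the overall ratio, then score the calls by balanced accuracy."""
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    cases, ctrls = Counter(), Counter()
    for a, b, c in zip(x1, x2, is_case):
        (cases if c else ctrls)[(a, b)] += 1
    true_pos = true_neg = 0
    for cell in set(cases) | set(ctrls):
        if cases[cell] * n_ctrl > ctrls[cell] * n_case:  # high-risk cell
            true_pos += cases[cell]
        else:                                            # low-risk cell
            true_neg += ctrls[cell]
    return 0.5 * (true_pos / n_case + true_neg / n_ctrl)


# Toy data: 4 rare variants, gene A = columns 0-1, gene B = columns 2-3.
genotypes = [
    [1, 0, 0, 1], [0, 1, 1, 0], [0, 0, 0, 0],   # cases
    [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0],   # controls
]
is_case = [True, True, True, False, False, False]

gene_a = collapse_gene(genotypes, [0, 1])
gene_b = collapse_gene(genotypes, [2, 3])
ba = mdr_balanced_accuracy(gene_a, gene_b, is_case)
print(gene_a, gene_b, round(ba, 3))
```

In this sketch, the third step would amount to computing such a score for every candidate pair and ranking the pairs by it.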

Download software

Simulation data

Simulation     Weight                     Effect model                   Conditions      Data
simulation 1   No weight                  Only interaction effect        Unidirectional  sim1.tar.gz
simulation 2   No weight                  Interaction + marginal effect  Unidirectional  sim2.tar.gz
simulation 3   No weight                  Only interaction effect        Bidirectional   sim3.tar.gz
simulation 4   MAF weight                 Only interaction effect        Unidirectional  sim4.tar.gz
simulation 5   Conservation weight (0.5)  Only interaction effect        Unidirectional  sim5.tar.gz
simulation 6   Conservation weight (0.8)  Only interaction effect        Unidirectional  sim6.tar.gz
simulation 7   Conservation weight (1.0)  Only interaction effect        Unidirectional  sim7.tar.gz
simulation 8   Conservation weight (0.5)  Interaction + marginal effect  Unidirectional  sim8.tar.gz
simulation 9   Conservation weight (0.8)  Interaction + marginal effect  Unidirectional  sim9.tar.gz
simulation 10  Conservation weight (1.0)  Interaction + marginal effect  Unidirectional  sim10.tar.gz

Manual



Usage: Run GxGrare
    gxgrare --in [input file] --out [output file] --score [score file] --perm [permutation number]
    ex) ./gxgrare --in example_genotype.txt --score example_score.txt --out example_result.csv --perm 1000

    Parameter
        --in : input genotype file path + name 

        --score : score file path + name

        --out : result file path + name

        --perm : permutation number
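
When GxGrare is run from a script, the command line above can be assembled programmatically. The following is a small sketch that uses only the four options listed here; the helper name `build_gxgrare_command` and the default binary path `./gxgrare` are assumptions for illustration.

```python
import shlex


def build_gxgrare_command(genotype_path, score_path, out_path, n_perm,
                          binary="./gxgrare"):
    """Return the argument list for one GxGrare run (hypothetical helper)."""
    if n_perm <= 0:
        raise ValueError("permutation number must be positive")
    return [
        binary,
        "--in", genotype_path,
        "--score", score_path,
        "--out", out_path,
        "--perm", str(n_perm),
    ]


cmd = build_gxgrare_command(
    "example_genotype.txt", "example_score.txt", "example_result.csv", 1000
)
print(shlex.join(cmd))
# To actually launch the run: subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when file paths contain spaces.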

Input genotype file format (tab-delimited file) : the first column has phenotype class (0:case and 1:control). 
    [example_genotype.txt]
    ------------------------------------
    pheno    SNP1   SNP2    SNP3    SNP4
    1   1   0   0   2
    1   0   2   0   0
    1   0   1   0   0
    0   0   0   1   0
    0   1   0   0   0
    ------------------------------------
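
A file in this layout can be checked before a run with a short Python sketch such as the one below. The function name `read_genotypes` is hypothetical, and the check that genotypes are coded 0, 1, or 2 is an assumption inferred from the example rather than a documented requirement.

```python
import csv
import io


def read_genotypes(handle):
    """Parse a tab-delimited genotype file: first column is the phenotype,
    remaining columns are SNP genotypes (assumed to be coded 0/1/2)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    snp_names = header[1:]
    phenotypes, genotypes = [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        values = [int(v) for v in row]
        if len(values) != len(header):
            raise ValueError("row length does not match header")
        if any(g not in (0, 1, 2) for g in values[1:]):
            raise ValueError("genotype outside the assumed 0/1/2 coding")
        phenotypes.append(values[0])
        genotypes.append(values[1:])
    return snp_names, phenotypes, genotypes


example = io.StringIO(
    "pheno\tSNP1\tSNP2\tSNP3\tSNP4\n"
    "1\t1\t0\t0\t2\n"
    "1\t0\t2\t0\t0\n"
    "1\t0\t1\t0\t0\n"
    "0\t0\t0\t1\t0\n"
    "0\t1\t0\t0\t0\n"
)
snp_names, phenotypes, genotypes = read_genotypes(example)
print(snp_names, phenotypes, genotypes[0])
```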

Score file format: the score file has the effect scores (0.0~1.0) for each SNP.
    [example_score.txt]
    ------------------------------------
    score
    0.2
    0.68
    0.23
    0.124
    ------------------------------------
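
The score file can be validated in the same way. This sketch assumes one score per line under a `score` header, listed in the same order as the SNP columns of the genotype file; that ordering is inferred from the example and is not stated explicitly. The function name `read_scores` is hypothetical.

```python
import io


def read_scores(handle, n_snps):
    """Read effect scores and check count and range (0.0 to 1.0)."""
    lines = [ln.strip() for ln in handle if ln.strip()]
    if not lines or lines[0] != "score":
        raise ValueError("expected a 'score' header line")
    scores = [float(v) for v in lines[1:]]
    if len(scores) != n_snps:
        raise ValueError("number of scores does not match number of SNPs")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("score outside the range 0.0 to 1.0")
    return scores


scores = read_scores(io.StringIO("score\n0.2\n0.68\n0.23\n0.124\n"), n_snps=4)
print(scores)
```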

Output result format (comma-delimited file) :
    [example_result.csv]

    - MDRcol_MAF(IG) : permutation p-value of information gain (IG) using MAF-based collapsing
    - MDRcol_MAF(BA) : permutation p-value of balanced accuracy (BA) using MAF-based collapsing
    - MDRcol_func(IG) : permutation p-value of IG using functional region-based collapsing
    - MDRcol_func(BA) : permutation p-value of BA using functional region-based collapsing
    - MDRcol_effect(IG) : permutation p-value of IG using effect-based collapsing
    - MDRcol_effect(BA) : permutation p-value of BA using effect-based collapsing
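
As a rough guide to how a permutation p-value of this kind is obtained, the sketch below shuffles the phenotype labels and counts how often the shuffled statistic is at least as large as the observed one. It is a generic illustration, not GxGrare's code: the function name `permutation_p_value`, the add-one correction, and the toy statistic `mean_gap` are assumptions, and GxGrare may compute its p-values differently.

```python
import random


def permutation_p_value(statistic, labels, n_perm, seed=0):
    """Estimate P(statistic under shuffled labels >= observed statistic)."""
    rng = random.Random(seed)
    observed = statistic(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)            # break the label/genotype link
        if statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy statistic standing in for IG or BA: gap between group means.
values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def mean_gap(lab):
    g1 = [v for v, l in zip(values, lab) if l == 1]
    g0 = [v for v, l in zip(values, lab) if l == 0]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))


p = permutation_p_value(mean_gap, labels, n_perm=999)
print(p)
```

A larger `--perm` value gives a finer-grained p-value at the cost of a proportionally longer run, since the statistic is recomputed for every shuffle.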