Skip to contents

Step 0: Data Preparation

Load demo TWAS data, which consists of 3 genes and 5 tissues (cell types).

data("twas_data")
str(twas_data)
#> List of 4
#>  $ E     :List of 5
#>   ..$ : num [1:715, 1:3] -0.0744 -0.0187 -0.0881 -0.0381 -0.0197 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : NULL
#>   .. .. ..$ : chr [1:3] "E0" "" ""
#>   ..$ : num [1:715, 1:3] -0.05206 -0.02378 -0.01437 -0.04873 -0.00659 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : NULL
#>   .. .. ..$ : chr [1:3] "E0" "" ""
#>   ..$ : num [1:715, 1:3] -0.15363 -0.0181 0.00329 -0.00135 0.13717 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : NULL
#>   .. .. ..$ : chr [1:3] "E0" "" ""
#>   ..$ : num [1:715, 1:3] -0.05677 0.02944 0.00605 -0.12619 -0.19318 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : NULL
#>   .. .. ..$ : chr [1:3] "E0" "" ""
#>   ..$ : num [1:715, 1:3] 0.00259 -0.03566 0.01043 0.01939 0.03691 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : NULL
#>   .. .. ..$ : chr [1:3] "E0" "" ""
#>  $ dat   :List of 2
#>   ..$ bed: int [1:715, 1:509] 0 0 0 0 0 0 0 0 1 0 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : NULL
#>   .. .. ..$ : chr [1:509] "chr22_16390411_G_A_b38_A" "chr22_16396408_C_G_b38_G" "chr22_16406147_A_G_b38_G" "chr22_16412132_G_A_b38_A" ...
#>   ..$ bim:Classes 'data.table' and 'data.frame': 509 obs. of  6 variables:
#>   .. ..$ V1: int [1:509] 22 22 22 22 22 22 22 22 22 22 ...
#>   .. ..$ V2: chr [1:509] "rs131538" "rs140378" "rs7287144" "rs5748662" ...
#>   .. ..$ V3: int [1:509] 0 0 0 0 0 0 0 0 0 0 ...
#>   .. ..$ V4: int [1:509] 16390411 16396408 16406147 16412132 16437595 16461056 16472237 16551808 16573830 16577726 ...
#>   .. ..$ V5: chr [1:509] "A" "G" "G" "A" ...
#>   .. ..$ V6: chr [1:509] "G" "C" "A" "G" ...
#>   .. ..- attr(*, ".internal.selfref")=<externalptr> 
#>  $ E.info:Classes 'data.table' and 'data.frame': 3 obs. of  4 variables:
#>   ..$ chr  : int [1:3] 22 22 22
#>   ..$ start: int [1:3] 15528192 15690026 16590751
#>   ..$ end  : int [1:3] 15529139 15721631 16592810
#>   ..$ gene : chr [1:3] "OR11H1" "POTEH" "CCT8L2"
#>   ..- attr(*, ".internal.selfref")=<externalptr> 
#>  $ pos   :List of 3
#>   ..$ : int [1:7] 1 2 3 4 5 6 7
#>   ..$ : int [1:37] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ : int [1:509] 1 2 3 4 5 6 7 8 9 10 ...

Step 1: Train Weights

Firstly, detect cross-tissue eQTLs,

ct.eQTL = select.ct.eQTL(twas_data, verbose = F, ncores = 1)
ct.eQTL
#> $OR11H1
#> numeric(0)
#> 
#> $POTEH
#> numeric(0)
#> 
#> $CCT8L2
#> numeric(0)

One can specify ncores for parallel computing.

Then fix those ct.eQTLs, and select the tissue-specific eQTLs,

list.ts.eQTL = select.ts.eQTL(twas_data, ct.eQTL = ct.eQTL, ncores = 1)
list.ts.eQTL
#> [[1]]
#> [[1]]$OR11H1
#> [[1]]$OR11H1$BIC
#> [1] -677.0032
#> 
#> [[1]]$OR11H1$best_term
#> [1] 5
#> 
#> [[1]]$OR11H1$coef
#> (Intercept)        term 
#> -0.01445075  0.01531436 
#> 
#> [[1]]$OR11H1$single
#> [1] 5
#> 
#> [[1]]$OR11H1$single.snp
#> [1] "rs2027653"
#> 
#> [[1]]$OR11H1$common.snp
#> character(0)
#> 
#> 
#> [[1]]$POTEH
#> [[1]]$POTEH$BIC
#> [1] 737.335
#> 
#> [[1]]$POTEH$best_term
#> [1] 36
#> 
#> [[1]]$POTEH$coef
#>  (Intercept)         term 
#> -0.009329623  0.027192333 
#> 
#> [[1]]$POTEH$single
#> [1] 36
#> 
#> [[1]]$POTEH$single.snp
#> [1] "rs5746879"
#> 
#> [[1]]$POTEH$common.snp
#> character(0)
#> 
#> 
#> [[1]]$CCT8L2
#> [[1]]$CCT8L2$BIC
#> [1] 1312.692
#> 
#> [[1]]$CCT8L2$best_term
#> [1] 498
#> 
#> [[1]]$CCT8L2$coef
#> (Intercept)        term 
#>  0.12004208 -0.08698112 
#> 
#> [[1]]$CCT8L2$single
#> [1] 498
#> 
#> [[1]]$CCT8L2$single.snp
#> [1] "rs174345"
#> 
#> [[1]]$CCT8L2$common.snp
#> character(0)
#> 
#> 
#> 
#> [[2]]
#> [[2]]$OR11H1
#> [[2]]$OR11H1$BIC
#> [1] -64.34403
#> 
#> [[2]]$OR11H1$best_term
#> [1] 6
#> 
#> [[2]]$OR11H1$coef
#> (Intercept)        term 
#>  0.02199233 -0.03487156 
#> 
#> [[2]]$OR11H1$single
#> [1] 6
#> 
#> [[2]]$OR11H1$single.snp
#> [1] "rs4384884"
#> 
#> [[2]]$OR11H1$common.snp
#> character(0)
#> 
#> 
#> [[2]]$POTEH
#> [[2]]$POTEH$BIC
#> [1] 451.0904
#> 
#> [[2]]$POTEH$best_term
#> [1] 29
#> 
#> [[2]]$POTEH$coef
#> (Intercept)        term 
#>  0.06072520 -0.04569124 
#> 
#> [[2]]$POTEH$single
#> [1] 29
#> 
#> [[2]]$POTEH$single.snp
#> [1] "rs2845371"
#> 
#> [[2]]$POTEH$common.snp
#> character(0)
#> 
#> 
#> [[2]]$CCT8L2
#> [[2]]$CCT8L2$BIC
#> [1] 1117.912
#> 
#> [[2]]$CCT8L2$best_term
#> [1] 268
#> 
#> [[2]]$CCT8L2$coef
#> (Intercept)        term 
#> -0.02949304  0.07353065 
#> 
#> [[2]]$CCT8L2$single
#> [1] 268
#> 
#> [[2]]$CCT8L2$single.snp
#> [1] "rs5994176"
#> 
#> [[2]]$CCT8L2$common.snp
#> character(0)
#> 
#> 
#> 
#> [[3]]
#> [[3]]$OR11H1
#> [[3]]$OR11H1$BIC
#> [1] -552.0486
#> 
#> [[3]]$OR11H1$best_term
#> [1] 3
#> 
#> [[3]]$OR11H1$coef
#>  (Intercept)         term 
#>  0.010976617 -0.004856457 
#> 
#> [[3]]$OR11H1$single
#> [1] 3
#> 
#> [[3]]$OR11H1$single.snp
#> [1] "rs7287144"
#> 
#> [[3]]$OR11H1$common.snp
#> character(0)
#> 
#> 
#> [[3]]$POTEH
#> [[3]]$POTEH$BIC
#> [1] 570.0594
#> 
#> [[3]]$POTEH$best_term
#> [1] 32
#> 
#> [[3]]$POTEH$coef
#> (Intercept)        term 
#> -0.01171048 -0.05688065 
#> 
#> [[3]]$POTEH$single
#> [1] 32
#> 
#> [[3]]$POTEH$single.snp
#> [1] "rs2845377"
#> 
#> [[3]]$POTEH$common.snp
#> character(0)
#> 
#> 
#> [[3]]$CCT8L2
#> [[3]]$CCT8L2$BIC
#> [1] 892.6655
#> 
#> [[3]]$CCT8L2$best_term
#> [1] 392
#> 
#> [[3]]$CCT8L2$coef
#> (Intercept)        term 
#> -0.02638992  0.07828622 
#> 
#> [[3]]$CCT8L2$single
#> [1] 392
#> 
#> [[3]]$CCT8L2$single.snp
#> [1] "rs5747087"
#> 
#> [[3]]$CCT8L2$common.snp
#> character(0)
#> 
#> 
#> 
#> [[4]]
#> [[4]]$OR11H1
#> [[4]]$OR11H1$BIC
#> [1] -105.8922
#> 
#> [[4]]$OR11H1$best_term
#> [1] 7
#> 
#> [[4]]$OR11H1$coef
#>   (Intercept)          term 
#> -2.022243e-06  1.071937e-02 
#> 
#> [[4]]$OR11H1$single
#> [1] 7
#> 
#> [[4]]$OR11H1$single.snp
#> [1] NA
#> 
#> [[4]]$OR11H1$common.snp
#> character(0)
#> 
#> 
#> [[4]]$POTEH
#> [[4]]$POTEH$BIC
#> [1] 594.8341
#> 
#> [[4]]$POTEH$best_term
#> [1] 22
#> 
#> [[4]]$POTEH$coef
#> (Intercept)        term 
#>  0.01896703 -0.04184011 
#> 
#> [[4]]$POTEH$single
#> [1] 22
#> 
#> [[4]]$POTEH$single.snp
#> [1] "rs5993671"
#> 
#> [[4]]$POTEH$common.snp
#> character(0)
#> 
#> 
#> [[4]]$CCT8L2
#> [[4]]$CCT8L2$BIC
#> [1] 1266.084
#> 
#> [[4]]$CCT8L2$best_term
#> [1] 79
#> 
#> [[4]]$CCT8L2$coef
#> (Intercept)        term 
#> -0.02550013  0.13037882 
#> 
#> [[4]]$CCT8L2$single
#> [1] 79
#> 
#> [[4]]$CCT8L2$single.snp
#> [1] "rs2072466"
#> 
#> [[4]]$CCT8L2$common.snp
#> character(0)
#> 
#> 
#> 
#> [[5]]
#> [[5]]$OR11H1
#> [[5]]$OR11H1$BIC
#> [1] -102.3713
#> 
#> [[5]]$OR11H1$best_term
#> [1] 2
#> 
#> [[5]]$OR11H1$coef
#> (Intercept)        term 
#> 0.002950217 0.039894448 
#> 
#> [[5]]$OR11H1$single
#> [1] 2
#> 
#> [[5]]$OR11H1$single.snp
#> [1] "rs140378"
#> 
#> [[5]]$OR11H1$common.snp
#> character(0)
#> 
#> 
#> [[5]]$POTEH
#> [[5]]$POTEH$BIC
#> [1] 97.10953
#> 
#> [[5]]$POTEH$best_term
#> [1] 27
#> 
#> [[5]]$POTEH$coef
#> (Intercept)        term 
#>  0.03624239 -0.03725128 
#> 
#> [[5]]$POTEH$single
#> [1] 27
#> 
#> [[5]]$POTEH$single.snp
#> [1] "rs361973"
#> 
#> [[5]]$POTEH$common.snp
#> character(0)
#> 
#> 
#> [[5]]$CCT8L2
#> [[5]]$CCT8L2$BIC
#> [1] 865.5521
#> 
#> [[5]]$CCT8L2$best_term
#> [1] 353
#> 
#> [[5]]$CCT8L2$coef
#> (Intercept)        term 
#>  -0.0392554   0.0883380 
#> 
#> [[5]]$CCT8L2$single
#> [1] 353
#> 
#> [[5]]$CCT8L2$single.snp
#> [1] "rs5994253"
#> 
#> [[5]]$CCT8L2$common.snp
#> character(0)

By default, it returns the list of ts-eQTLs for each tissue. You can also specify the index of tissue to get the corresponding ts-eQTL,

select.ts.eQTL(twas_data, ct.eQTL = ct.eQTL, ncores = 1, id.tissue = 1)

Step 2: Association Test

Load GWAS summary statistics for some phenotype

data("summary_stats")
str(summary_stats)
#> 'data.frame':    1000 obs. of  5 variables:
#>  $ rsid: chr  "rs7287144" "rs5748662" "rs4010554" "rs4010558" ...
#>  $ ref : chr  "G" "A" "A" "A" ...
#>  $ alt : chr  "A" "G" "C" "G" ...
#>  $ chr : int  22 22 22 22 22 22 22 22 22 22 ...
#>  $ z   : num  0.584 0.638 0.857 0.835 0.29 ...

Then perform an association test in a tissue-specific manner to retain the tissue specificity of the trait-gene associations.

twas.single.trait(summary_stats, twas_data, list.ts.eQTL)
#> $zstat
#>         tissue1 tissue2  tissue3   tissue4  tissue5
#> OR11H1  0.00000       0 0.584223  0.000000 0.000000
#> POTEH   0.00000       0 0.000000 -0.873516 0.000000
#> CCT8L2 -2.04794       0 1.375810 -0.134001 0.806425
#> 
#> $pval
#>           tissue1 tissue2   tissue3   tissue4   tissue5
#> OR11H1 1.00000000       1 0.5590703 1.0000000 1.0000000
#> POTEH  1.00000000       1 1.0000000 0.3823819 1.0000000
#> CCT8L2 0.04056588       1 0.1688805 0.8934018 0.4199978

Alternatively, one can pass a file name to the summary statistics,

twas.single.trait(twas_data, "/tmp/gwas_summary_stats.txt", list.ts.eQTL)

For multiple traits, one can pass a list of summary statistics or files, e.g.,

twas.multiple.trait(twas_data, paste0("/tmp/trait", 1:10, "_gwas_summary_stats.txt"), list.ts.eQTL)