Step 0: Data Preparation
Load demo TWAS data, which consists of 3 genes and 5 tissues (cell types).
data("twas_data")
str(twas_data)
#> List of 4
#> $ E :List of 5
#> ..$ : num [1:715, 1:3] -0.0744 -0.0187 -0.0881 -0.0381 -0.0197 ...
#> .. ..- attr(*, "dimnames")=List of 2
#> .. .. ..$ : NULL
#> .. .. ..$ : chr [1:3] "E0" "" ""
#> ..$ : num [1:715, 1:3] -0.05206 -0.02378 -0.01437 -0.04873 -0.00659 ...
#> .. ..- attr(*, "dimnames")=List of 2
#> .. .. ..$ : NULL
#> .. .. ..$ : chr [1:3] "E0" "" ""
#> ..$ : num [1:715, 1:3] -0.15363 -0.0181 0.00329 -0.00135 0.13717 ...
#> .. ..- attr(*, "dimnames")=List of 2
#> .. .. ..$ : NULL
#> .. .. ..$ : chr [1:3] "E0" "" ""
#> ..$ : num [1:715, 1:3] -0.05677 0.02944 0.00605 -0.12619 -0.19318 ...
#> .. ..- attr(*, "dimnames")=List of 2
#> .. .. ..$ : NULL
#> .. .. ..$ : chr [1:3] "E0" "" ""
#> ..$ : num [1:715, 1:3] 0.00259 -0.03566 0.01043 0.01939 0.03691 ...
#> .. ..- attr(*, "dimnames")=List of 2
#> .. .. ..$ : NULL
#> .. .. ..$ : chr [1:3] "E0" "" ""
#> $ dat :List of 2
#> ..$ bed: int [1:715, 1:509] 0 0 0 0 0 0 0 0 1 0 ...
#> .. ..- attr(*, "dimnames")=List of 2
#> .. .. ..$ : NULL
#> .. .. ..$ : chr [1:509] "chr22_16390411_G_A_b38_A" "chr22_16396408_C_G_b38_G" "chr22_16406147_A_G_b38_G" "chr22_16412132_G_A_b38_A" ...
#> ..$ bim:Classes 'data.table' and 'data.frame': 509 obs. of 6 variables:
#> .. ..$ V1: int [1:509] 22 22 22 22 22 22 22 22 22 22 ...
#> .. ..$ V2: chr [1:509] "rs131538" "rs140378" "rs7287144" "rs5748662" ...
#> .. ..$ V3: int [1:509] 0 0 0 0 0 0 0 0 0 0 ...
#> .. ..$ V4: int [1:509] 16390411 16396408 16406147 16412132 16437595 16461056 16472237 16551808 16573830 16577726 ...
#> .. ..$ V5: chr [1:509] "A" "G" "G" "A" ...
#> .. ..$ V6: chr [1:509] "G" "C" "A" "G" ...
#> .. ..- attr(*, ".internal.selfref")=<externalptr>
#> $ E.info:Classes 'data.table' and 'data.frame': 3 obs. of 4 variables:
#> ..$ chr : int [1:3] 22 22 22
#> ..$ start: int [1:3] 15528192 15690026 16590751
#> ..$ end : int [1:3] 15529139 15721631 16592810
#> ..$ gene : chr [1:3] "OR11H1" "POTEH" "CCT8L2"
#> ..- attr(*, ".internal.selfref")=<externalptr>
#> $ pos :List of 3
#> ..$ : int [1:7] 1 2 3 4 5 6 7
#> ..$ : int [1:37] 1 2 3 4 5 6 7 8 9 10 ...
#> ..$ : int [1:509] 1 2 3 4 5 6 7 8 9 10 ...
Step 1: Train Weights
Firstly, detect cross-tissue eQTLs,
ct.eQTL = select.ct.eQTL(twas_data, verbose = F, ncores = 1)
ct.eQTL
#> $OR11H1
#> numeric(0)
#>
#> $POTEH
#> numeric(0)
#>
#> $CCT8L2
#> numeric(0)
One can specify ncores
for parallel computing.
Then fix those ct.eQTLs, and select the tissue-specific eQTLs,
list.ts.eQTL = select.ts.eQTL(twas_data, ct.eQTL = ct.eQTL, ncores = 1)
list.ts.eQTL
#> [[1]]
#> [[1]]$OR11H1
#> [[1]]$OR11H1$BIC
#> [1] -677.0032
#>
#> [[1]]$OR11H1$best_term
#> [1] 5
#>
#> [[1]]$OR11H1$coef
#> (Intercept) term
#> -0.01445075 0.01531436
#>
#> [[1]]$OR11H1$single
#> [1] 5
#>
#> [[1]]$OR11H1$single.snp
#> [1] "rs2027653"
#>
#> [[1]]$OR11H1$common.snp
#> character(0)
#>
#>
#> [[1]]$POTEH
#> [[1]]$POTEH$BIC
#> [1] 737.335
#>
#> [[1]]$POTEH$best_term
#> [1] 36
#>
#> [[1]]$POTEH$coef
#> (Intercept) term
#> -0.009329623 0.027192333
#>
#> [[1]]$POTEH$single
#> [1] 36
#>
#> [[1]]$POTEH$single.snp
#> [1] "rs5746879"
#>
#> [[1]]$POTEH$common.snp
#> character(0)
#>
#>
#> [[1]]$CCT8L2
#> [[1]]$CCT8L2$BIC
#> [1] 1312.692
#>
#> [[1]]$CCT8L2$best_term
#> [1] 498
#>
#> [[1]]$CCT8L2$coef
#> (Intercept) term
#> 0.12004208 -0.08698112
#>
#> [[1]]$CCT8L2$single
#> [1] 498
#>
#> [[1]]$CCT8L2$single.snp
#> [1] "rs174345"
#>
#> [[1]]$CCT8L2$common.snp
#> character(0)
#>
#>
#>
#> [[2]]
#> [[2]]$OR11H1
#> [[2]]$OR11H1$BIC
#> [1] -64.34403
#>
#> [[2]]$OR11H1$best_term
#> [1] 6
#>
#> [[2]]$OR11H1$coef
#> (Intercept) term
#> 0.02199233 -0.03487156
#>
#> [[2]]$OR11H1$single
#> [1] 6
#>
#> [[2]]$OR11H1$single.snp
#> [1] "rs4384884"
#>
#> [[2]]$OR11H1$common.snp
#> character(0)
#>
#>
#> [[2]]$POTEH
#> [[2]]$POTEH$BIC
#> [1] 451.0904
#>
#> [[2]]$POTEH$best_term
#> [1] 29
#>
#> [[2]]$POTEH$coef
#> (Intercept) term
#> 0.06072520 -0.04569124
#>
#> [[2]]$POTEH$single
#> [1] 29
#>
#> [[2]]$POTEH$single.snp
#> [1] "rs2845371"
#>
#> [[2]]$POTEH$common.snp
#> character(0)
#>
#>
#> [[2]]$CCT8L2
#> [[2]]$CCT8L2$BIC
#> [1] 1117.912
#>
#> [[2]]$CCT8L2$best_term
#> [1] 268
#>
#> [[2]]$CCT8L2$coef
#> (Intercept) term
#> -0.02949304 0.07353065
#>
#> [[2]]$CCT8L2$single
#> [1] 268
#>
#> [[2]]$CCT8L2$single.snp
#> [1] "rs5994176"
#>
#> [[2]]$CCT8L2$common.snp
#> character(0)
#>
#>
#>
#> [[3]]
#> [[3]]$OR11H1
#> [[3]]$OR11H1$BIC
#> [1] -552.0486
#>
#> [[3]]$OR11H1$best_term
#> [1] 3
#>
#> [[3]]$OR11H1$coef
#> (Intercept) term
#> 0.010976617 -0.004856457
#>
#> [[3]]$OR11H1$single
#> [1] 3
#>
#> [[3]]$OR11H1$single.snp
#> [1] "rs7287144"
#>
#> [[3]]$OR11H1$common.snp
#> character(0)
#>
#>
#> [[3]]$POTEH
#> [[3]]$POTEH$BIC
#> [1] 570.0594
#>
#> [[3]]$POTEH$best_term
#> [1] 32
#>
#> [[3]]$POTEH$coef
#> (Intercept) term
#> -0.01171048 -0.05688065
#>
#> [[3]]$POTEH$single
#> [1] 32
#>
#> [[3]]$POTEH$single.snp
#> [1] "rs2845377"
#>
#> [[3]]$POTEH$common.snp
#> character(0)
#>
#>
#> [[3]]$CCT8L2
#> [[3]]$CCT8L2$BIC
#> [1] 892.6655
#>
#> [[3]]$CCT8L2$best_term
#> [1] 392
#>
#> [[3]]$CCT8L2$coef
#> (Intercept) term
#> -0.02638992 0.07828622
#>
#> [[3]]$CCT8L2$single
#> [1] 392
#>
#> [[3]]$CCT8L2$single.snp
#> [1] "rs5747087"
#>
#> [[3]]$CCT8L2$common.snp
#> character(0)
#>
#>
#>
#> [[4]]
#> [[4]]$OR11H1
#> [[4]]$OR11H1$BIC
#> [1] -105.8922
#>
#> [[4]]$OR11H1$best_term
#> [1] 7
#>
#> [[4]]$OR11H1$coef
#> (Intercept) term
#> -2.022243e-06 1.071937e-02
#>
#> [[4]]$OR11H1$single
#> [1] 7
#>
#> [[4]]$OR11H1$single.snp
#> [1] NA
#>
#> [[4]]$OR11H1$common.snp
#> character(0)
#>
#>
#> [[4]]$POTEH
#> [[4]]$POTEH$BIC
#> [1] 594.8341
#>
#> [[4]]$POTEH$best_term
#> [1] 22
#>
#> [[4]]$POTEH$coef
#> (Intercept) term
#> 0.01896703 -0.04184011
#>
#> [[4]]$POTEH$single
#> [1] 22
#>
#> [[4]]$POTEH$single.snp
#> [1] "rs5993671"
#>
#> [[4]]$POTEH$common.snp
#> character(0)
#>
#>
#> [[4]]$CCT8L2
#> [[4]]$CCT8L2$BIC
#> [1] 1266.084
#>
#> [[4]]$CCT8L2$best_term
#> [1] 79
#>
#> [[4]]$CCT8L2$coef
#> (Intercept) term
#> -0.02550013 0.13037882
#>
#> [[4]]$CCT8L2$single
#> [1] 79
#>
#> [[4]]$CCT8L2$single.snp
#> [1] "rs2072466"
#>
#> [[4]]$CCT8L2$common.snp
#> character(0)
#>
#>
#>
#> [[5]]
#> [[5]]$OR11H1
#> [[5]]$OR11H1$BIC
#> [1] -102.3713
#>
#> [[5]]$OR11H1$best_term
#> [1] 2
#>
#> [[5]]$OR11H1$coef
#> (Intercept) term
#> 0.002950217 0.039894448
#>
#> [[5]]$OR11H1$single
#> [1] 2
#>
#> [[5]]$OR11H1$single.snp
#> [1] "rs140378"
#>
#> [[5]]$OR11H1$common.snp
#> character(0)
#>
#>
#> [[5]]$POTEH
#> [[5]]$POTEH$BIC
#> [1] 97.10953
#>
#> [[5]]$POTEH$best_term
#> [1] 27
#>
#> [[5]]$POTEH$coef
#> (Intercept) term
#> 0.03624239 -0.03725128
#>
#> [[5]]$POTEH$single
#> [1] 27
#>
#> [[5]]$POTEH$single.snp
#> [1] "rs361973"
#>
#> [[5]]$POTEH$common.snp
#> character(0)
#>
#>
#> [[5]]$CCT8L2
#> [[5]]$CCT8L2$BIC
#> [1] 865.5521
#>
#> [[5]]$CCT8L2$best_term
#> [1] 353
#>
#> [[5]]$CCT8L2$coef
#> (Intercept) term
#> -0.0392554 0.0883380
#>
#> [[5]]$CCT8L2$single
#> [1] 353
#>
#> [[5]]$CCT8L2$single.snp
#> [1] "rs5994253"
#>
#> [[5]]$CCT8L2$common.snp
#> character(0)
By default, it returns the list of ts-eQTLs for each tissue. You can also specify the index of tissue to get the corresponding ts-eQTL,
select.ts.eQTL(twas_data, ct.eQTL = ct.eQTL, ncores = 1, id.tissue = 1)
Step 2: Association Test
Load GWAS summary statistics for some phenotype
data("summary_stats")
str(summary_stats)
#> 'data.frame': 1000 obs. of 5 variables:
#> $ rsid: chr "rs7287144" "rs5748662" "rs4010554" "rs4010558" ...
#> $ ref : chr "G" "A" "A" "A" ...
#> $ alt : chr "A" "G" "C" "G" ...
#> $ chr : int 22 22 22 22 22 22 22 22 22 22 ...
#> $ z : num 0.584 0.638 0.857 0.835 0.29 ...
Then perform an association test in a tissue-specific manner to retain the tissue specificity of the trait-gene associations.
twas.single.trait(summary_stats, twas_data, list.ts.eQTL)
#> $zstat
#> tissue1 tissue2 tissue3 tissue4 tissue5
#> OR11H1 0.00000 0 0.584223 0.000000 0.000000
#> POTEH 0.00000 0 0.000000 -0.873516 0.000000
#> CCT8L2 -2.04794 0 1.375810 -0.134001 0.806425
#>
#> $pval
#> tissue1 tissue2 tissue3 tissue4 tissue5
#> OR11H1 1.00000000 1 0.5590703 1.0000000 1.0000000
#> POTEH 1.00000000 1 1.0000000 0.3823819 1.0000000
#> CCT8L2 0.04056588 1 0.1688805 0.8934018 0.4199978
Alternatively, one can pass a file name to the summary statistics,
twas.single.trait(twas_data, "/tmp/gwas_summary_stats.txt", list.ts.eQTL)
For multiple traits, one can pass a list of summary statistics or files, e.g.,
twas.multiple.trait(twas_data, paste0("/tmp/trait", 1:10, "_gwas_summary_stats.txt"), list.ts.eQTL)