Computes progeny pathway scores and assesses their significance based on permutations

Usage

progenyPerm(
  df,
  weight_matrix,
  k = 10000,
  z_scores = TRUE,
  get_nulldist = FALSE
)

Arguments

df: A data.frame of dimension n x (m+1), where n is the number of omic features to be considered and m is the number of samples/contrasts. The first column should contain the identifiers of the omic features. These identifiers must be consistent with the identifiers used in the weight matrix.
weight_matrix: A progeny coefficient matrix. The first column should contain the identifiers of the omic features, and these should be consistent with the identifiers provided in df.
k: The number of permutations performed to generate the null distribution used to estimate the significance of progeny scores. The default value is 10000.
z_scores: If TRUE, z-scores are returned for the pathway activity estimations. Otherwise, the function returns a normalized score between -1 and 1.
get_nulldist: If TRUE, the null score distribution used for normalization is returned along with the actual normalized score data frame.
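
To make the role of k and z_scores more concrete, the following is a minimal base R sketch of a permutation-based null distribution for a single sample and a single pathway. It is an illustration only, not the package's internal code: the toy inputs, the column names (`ID`, `sample1`, `pathwayA`), and the choice to shuffle the feature values are all assumptions.

```r
set.seed(42)

# Hypothetical toy inputs: 50 features, one sample, one pathway
features <- paste0("gene", 1:50)
df <- data.frame(ID = features, sample1 = rnorm(50))
weight_matrix <- data.frame(ID = features, pathwayA = rnorm(50))

k <- 1000  # number of permutations (assumed smaller than the default for speed)

# Align features and weights on the shared identifier column
merged <- merge(df, weight_matrix, by = "ID")

# Observed score: weighted sum of feature values
observed <- sum(merged$sample1 * merged$pathwayA)

# Null distribution: recompute the score after shuffling the feature values
null_scores <- replicate(k, sum(sample(merged$sample1) * merged$pathwayA))

# z-score of the observed score relative to the null distribution
z <- (observed - mean(null_scores)) / sd(null_scores)
z
```

A larger k gives a smoother null distribution at the cost of run time.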

Value

This function returns a list of two elements. The first element is a data frame of dimension p x (m+1), where p is the number of progeny pathways and m is the number of samples/contrasts. Each cell represents the significance of a progeny pathway score for one sample/contrast. The significance ranges between -1 and 1 and is equal to x*2-1, where x is the quantile of the progeny pathway score with respect to the null distribution. This significance can therefore be interpreted as the equivalent of 1-p.value (two-sided test over an empirical distribution), with the sign indicating the direction of the regulation. The second element is the null distribution list (a null distribution is generated for each sample/contrast).
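
The x*2-1 transformation can be illustrated with a short base R sketch. It assumes that the quantile x is the fraction of null scores less than or equal to the observed score, computed here with `ecdf`; the names `null_scores` and `observed` are hypothetical, and the package may compute the quantile differently.

```r
set.seed(1)

# Hypothetical null distribution of pathway scores for one sample/contrast
null_scores <- rnorm(10000)

# Hypothetical observed pathway score
observed <- 1.5

# Empirical quantile x of the observed score within the null distribution
x <- ecdf(null_scores)(observed)

# Signed significance in [-1, 1]
significance <- x * 2 - 1

# Equivalent two-sided empirical p-value: 1 - |significance|
p_two_sided <- 1 - abs(significance)
```

A score in the upper tail of the null distribution gives a significance close to 1, a score in the lower tail gives a value close to -1, and a score near the median gives a value close to 0.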

Examples

# use example gene expression matrix
gene_expression <- as.matrix(read.csv(
  system.file("extdata", "human_input.csv", package = "progeny"),
  row.names = 1
))

# calculate pathway activities
progeny(gene_expression, scale=TRUE, organism="Human", top=100, perm=10000)
#>            Androgen   EGFR Estrogen Hypoxia JAK.STAT   MAPK    NFkB    p53
#> SRR1039508  -0.8380 0.1996  -0.3470  0.9996   0.7718 0.4066 -0.5552 0.9240
#> SRR1039509  -0.4918 0.0608  -0.6360  0.9998   0.4858 0.3284 -0.4426 0.8778
#> SRR1039512  -0.7836 0.2158  -0.4054  0.9996   0.6992 0.4132 -0.4322 0.9230
#> SRR1039513  -0.4372 0.0790  -0.6722  1.0000   0.6724 0.4352 -0.4490 0.7916
#> SRR1039516  -0.7578 0.2614  -0.2944  0.9992   0.6132 0.4200 -0.3042 0.9276
#> SRR1039517  -0.5104 0.5540  -0.6566  1.0000   0.4672 0.7292 -0.3776 0.8090
#> SRR1039520  -0.8056 0.3332  -0.3886  0.9994   0.6094 0.5606 -0.6314 0.9022
#> SRR1039521  -0.3562 0.1490  -0.6650  0.9994   0.4996 0.5776 -0.6296 0.7778
#>               PI3K   TGFb    TNFa   Trail    VEGF     WNT
#> SRR1039508  0.1660 0.5416 -0.5988 -0.3570  0.0432 -0.1070
#> SRR1039509  0.0984 0.7620 -0.4628 -0.3012  0.0508 -0.1712
#> SRR1039512  0.1418 0.6930 -0.4706 -0.3462  0.0842 -0.1586
#> SRR1039513 -0.0706 0.8240 -0.4828 -0.2668  0.0958 -0.3054
#> SRR1039516  0.1706 0.6926 -0.3564 -0.3356  0.2734 -0.3268
#> SRR1039517  0.3604 0.8662 -0.4248 -0.3586  0.2530 -0.4108
#> SRR1039520  0.1788 0.6872 -0.6702 -0.2790 -0.0824 -0.1208
#> SRR1039521  0.0350 0.8092 -0.6476 -0.2630 -0.0724 -0.3158