Calculates regulatory activities using ORA.
Arguments
- mat
Matrix to evaluate (e.g. expression matrix). Target nodes in rows and conditions in columns.
rownames(mat) must have at least one intersection with the elements in the network .target column.
- network
Tibble or dataframe with edges and their associated metadata.
- .source
Column with source nodes.
- .target
Column with target nodes.
- n_up
Integer indicating the number of top targets to slice from mat.
- n_bottom
Integer indicating the number of bottom targets to slice from mat.
- n_background
Integer indicating the background size of the sliced targets. If not specified, the number of background targets is determined by the total number of unique targets in the union of mat and network.
- with_ties
Should ties be kept together? The default, TRUE, may return more rows than you request. Use FALSE to ignore ties and return the first n rows.
- seed
A single value, interpreted as an integer, or NULL for random number generation.
- minsize
Integer indicating the minimum number of targets per source.
- ...
Arguments passed on to stats::fisher.test
  - workspace
  An integer specifying the size of the workspace used in the network algorithm, in units of 4 bytes. Only used for non-simulated p-values in tables larger than \(2 \times 2\). Since R version 3.5.0, this also increases the internal stack size, which allows larger problems to be solved, although these sometimes need hours. In such cases, simulate.p.value = TRUE may be more reasonable.
  - hybrid
  A logical. Only used for tables larger than \(2 \times 2\), in which case it indicates whether the exact probabilities (default) or a hybrid approximation thereof should be computed.
  - hybridPars
  A numeric vector of length 3, by default describing “Cochran's conditions” for the validity of the chi-squared approximation; see ‘Details’ in the stats::fisher.test documentation.
  - control
  A list with named components for low-level algorithm control. At present the only one used is "mult", a positive integer \(\ge 2\) with default 30, used only for tables larger than \(2 \times 2\). This says how many times as much space should be allocated to paths as to keys: see file fexact.c in the sources of this package.
  - or
  The hypothesized odds ratio. Only used in the \(2 \times 2\) case.
  - alternative
  Indicates the alternative hypothesis and must be one of "two.sided", "greater" or "less". You can specify just the initial letter. Only used in the \(2 \times 2\) case.
  - conf.int
  Logical indicating if a confidence interval for the odds ratio in a \(2 \times 2\) table should be computed (and returned).
  - conf.level
  Confidence level for the returned confidence interval. Only used in the \(2 \times 2\) case and if conf.int = TRUE.
  - simulate.p.value
  A logical indicating whether to compute p-values by Monte Carlo simulation, in tables larger than \(2 \times 2\).
  - B
  An integer specifying the number of replicates used in the Monte Carlo test.
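As a rough illustration of how n_up, with_ties and the default n_background described above might behave, the following base R sketch uses a made-up matrix and network. The toy values and the use of rank() and order() are assumptions chosen for illustration; this is not decoupleR's internal code.

```r
# Hypothetical toy inputs (not taken from decoupleR's test data).
mat <- matrix(
  c(5, 3, 3, 1),
  ncol = 1,
  dimnames = list(c("A", "B", "C", "D"), "S01")
)
network <- data.frame(
  source = c("T1", "T1", "T2"),
  target = c("A", "B", "E")
)

# Default background size as documented: number of unique targets
# in the union of mat's row names and the network's targets.
n_background <- length(union(rownames(mat), network$target))

# Slice the top n_up targets of one condition.
n_up <- 2
x <- mat[, "S01"]

# Keeping ties together: every value tied for the n_up-th place is returned,
# so more than n_up names may come back.
top_with_ties <- names(x)[rank(-x, ties.method = "min") <= n_up]

# Ignoring ties: take exactly the first n_up names after sorting.
top_no_ties <- names(x)[order(-x)][seq_len(n_up)]
```

Here B and C share the second-highest value, so the tie-keeping variant returns three names while the other returns exactly two.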
Value
A long format tibble of the enrichment scores for each source across the samples. Resulting tibble contains the following columns:
- statistic: Indicates which method is associated with which score.
- source: Source nodes of network.
- condition: Condition representing each column of mat.
- score: Regulatory activity (enrichment score).
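Because the result is in long format, it can be convenient to reshape it into a source-by-condition matrix. The sketch below does this with base R on a hypothetical data frame that only mimics the documented columns; the values are invented for illustration and are not output of run_ora.

```r
# Hypothetical long-format result with the documented columns.
res <- data.frame(
  statistic = "ora",
  source    = c("T1", "T1", "T2", "T2"),
  condition = c("S01", "S02", "S01", "S02"),
  score     = c(3.82, 0, 1.5, 0.2)
)

# One score per (source, condition) pair, so taking the first value per cell
# simply moves each score into a sources x conditions matrix.
wide <- tapply(
  res$score,
  list(source = res$source, condition = res$condition),
  function(v) v[1]
)
```

Rows of wide are sources and columns are conditions, which is a common shape for heatmaps or downstream comparisons.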
Details
ORA measures the overlap between the target feature set and a list of the most
altered molecular features in mat. The most altered molecular features can
be selected from the top and/or bottom of the molecular readout distribution;
by default, they are the top 5% of positive values. From these, a contingency table
is built and a one-tailed Fisher’s exact test is computed to determine whether a
regulator’s set of features is over-represented in the selected features
from the data. The resulting score, ora, is the minus log10 of the
obtained p-value.
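The procedure described above can be sketched in base R for a single source and a single condition. This is a minimal illustration under stated assumptions, not decoupleR's implementation: the readout values, the target set and the choice of background (all features in the toy readout) are hypothetical.

```r
# Toy readout for one condition (hypothetical): G1 is highest, G100 lowest.
stat <- setNames(seq(100, 1), paste0("G", 1:100))

# Hypothetical target set of a single source.
targets <- c("G1", "G2", "G3", "G50", "G60")

# Select the top 5% of features by value.
n_up <- round(0.05 * length(stat))
selected <- names(sort(stat, decreasing = TRUE))[seq_len(n_up)]

# Build the 2 x 2 contingency table: target membership vs. selection.
# Assumption: the background is every feature in the toy readout.
background  <- names(stat)
in_target   <- background %in% targets
in_selected <- background %in% selected
tab <- table(
  target   = factor(in_target,   levels = c(TRUE, FALSE)),
  selected = factor(in_selected, levels = c(TRUE, FALSE))
)

# One-tailed Fisher's exact test for over-representation,
# then the score as minus log10 of the p-value.
p_value <- fisher.test(tab, alternative = "greater")$p.value
score <- -log10(p_value)
```

For a \(2 \times 2\) table, the one-tailed test with alternative = "greater" is equivalent to the upper tail of a hypergeometric distribution, so a smaller p-value (stronger over-representation) gives a larger score.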
Examples
inputs_dir <- system.file("testdata", "inputs", package = "decoupleR")
mat <- readRDS(file.path(inputs_dir, "mat.rds"))
net <- readRDS(file.path(inputs_dir, "net.rds"))
run_ora(mat, net, minsize=0)
#> # A tibble: 72 × 5
#>    statistic source condition score  p_value
#>    <chr>     <chr>  <chr>     <dbl>    <dbl>
#>  1 ora       T1     S01        3.82 0.000150
#>  2 ora       T1     S02        3.82 0.000150
#>  3 ora       T1     S03        3.82 0.000150
#>  4 ora       T1     S04        3.82 0.000150
#>  5 ora       T1     S05        0    1
#>  6 ora       T1     S06        0    1
#>  7 ora       T1     S07        3.82 0.000150
#>  8 ora       T1     S08        3.82 0.000150
#>  9 ora       T1     S09        3.82 0.000150
#> 10 ora       T1     S10        3.82 0.000150
#> # ℹ 62 more rows