Evaluate multiple statistics with the same input data

Calculate the source activity per sample from a gene expression matrix by coupling a regulatory network with a variety of statistics.

Usage

decouple(
  mat,
  network,
  .source = source,
  .target = target,
  statistics = NULL,
  args = list(NULL),
  consensus_score = TRUE,
  consensus_stats = NULL,
  include_time = FALSE,
  show_toy_call = FALSE,
  minsize = 5
)

Arguments

mat: Matrix to evaluate (e.g. expression matrix). Target nodes in rows and conditions in columns. rownames(mat) must share at least one element with the .target column of network.
network: Tibble or dataframe with edges and their associated metadata.
.source: Column with source nodes.
.target: Column with target nodes.
statistics: Statistical methods to be run sequentially. If none are provided, only the top-performing methods are run (mlm, ulm and wsum).
args: A list of argument-lists the same length as statistics (or length 1). The default argument, list(NULL), will be recycled to the same length as statistics, and will call each function with no arguments (apart from mat, network, .source and .target).
consensus_score: Boolean indicating whether to compute a consensus score across methods.
consensus_stats: List of estimate names to use for the calculation of the consensus score. This is used to filter out extra estimates returned by some methods; for example, wsum returns wsum, corr_wsum and norm_wsum. If none are provided and no statistics were provided either, only the top-performing methods are used (mlm, ulm and norm_wsum). Otherwise, all estimates available after running every method in the statistics argument are used.
include_time: Boolean indicating whether to report the execution time of each statistic.
show_toy_call: Boolean indicating whether to show the call made for each statistic.
minsize: Integer indicating the minimum number of targets per source.
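
The input requirements above can be sketched in base R without calling decouple() itself. This is only an illustration, not the package's internal code: the gene, sample and source names and all numeric values are invented, the mor column name is borrowed from the example on this page, and the sketch assumes that minsize counts only the targets that are present in mat, which the argument description does not state.

```r
# Toy expression matrix: target nodes in rows, conditions in columns.
# All values are invented for illustration.
mat <- matrix(
  c(2.1, 0.3,
    1.8, 0.5,
    0.2, 1.9,
    0.4, 2.2,
    1.0, 1.1,
    0.9, 0.7),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("G", 1:6), c("S1", "S2"))
)

# Toy network: one row per edge, plus a "mor" weight column as metadata.
network <- data.frame(
  source = c("TF1", "TF1", "TF1", "TF2", "TF2", "TF3"),
  target = c("G1", "G2", "G5", "G3", "G4", "G9"),
  mor    = c(1, 1, -1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# rownames(mat) must share at least one element with the target column.
shared <- intersect(rownames(mat), network$target)
stopifnot(length(shared) > 0)

# Count, per source, the targets that are present in mat
# (assumption: this is the count that minsize is compared against).
in_mat <- network[network$target %in% rownames(mat), ]
n_targets <- table(in_mat$source)

# Sources that would satisfy a given minsize threshold.
minsize <- 2
kept <- names(n_targets)[as.vector(n_targets) >= minsize]
```

Under these assumptions TF3 is dropped because its only target is absent from mat, and raising minsize to 3 would leave only TF1.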

Value

A long-format tibble of the enrichment scores for each source across the samples. The resulting tibble contains the following columns:

run_id: Indicates the order in which the methods have been executed.
statistic: Indicates which method is associated with which score.
source: Source nodes of network.
condition: Condition representing each column of mat.
score: Regulatory activity (enrichment score).
statistic_time: Execution time of the statistic; present only if requested with include_time.
p_value: p-value (if available) of the obtained score.
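
Because the result is in long format, a common follow-up is to spread the scores of one statistic into a source-by-condition matrix. The sketch below uses a hand-built data frame as a stand-in for the returned tibble; it has the columns listed above, but every value in it is invented, and base R tapply() is just one way to do the reshaping.

```r
# Toy stand-in for a decouple() result; all values are invented.
res <- data.frame(
  statistic = rep(c("ulm", "mlm"), each = 4),
  source    = rep(c("TF1", "TF1", "TF2", "TF2"), times = 2),
  condition = rep(c("S1", "S2"), times = 4),
  score     = c(1.5, -0.5, 0.2, 2.0, 1.0, -1.0, 0.0, 3.0),
  p_value   = c(0.04, 0.60, 0.80, 0.01, 0.10, 0.30, 0.90, 0.02),
  stringsAsFactors = FALSE
)

# Keep the rows of a single statistic.
ulm <- res[res$statistic == "ulm", ]

# Spread into a source x condition matrix. Each cell holds exactly one
# score here, so mean() simply returns that value.
wide <- tapply(ulm$score, list(ulm$source, ulm$condition), mean)
```

Here wide has one row per source and one column per condition, which is a convenient shape for heatmaps or for comparing sources across samples.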

Examples

if (FALSE) {
    # Locate the example inputs shipped with the package's test data
    inputs_dir <- system.file("testdata", "inputs", package = "decoupleR")

    mat <- readRDS(file.path(inputs_dir, "mat.rds"))
    net <- readRDS(file.path(inputs_dir, "net.rds"))

    # Run several statistics on the same matrix and network
    decouple(
        mat = mat,
        network = net,
        .source = "source",
        .target = "target",
        statistics = c("gsva", "wmean", "wsum", "ulm", "aucell"),
        # Extra arguments forwarded to the individual methods
        args = list(
            gsva = list(verbose = FALSE),
            wmean = list(.mor = "mor", .likelihood = "likelihood"),
            wsum = list(.mor = "mor"),
            ulm = list(.mor = "mor")
        ),
        # A minimum of 0 targets keeps every source
        minsize = 0
    )
}
