Multivariate Decision Trees (MDT)

Calculates regulatory activities using MDT.

Usage

run_mdt(
  mat,
  network,
  .source = source,
  .target = target,
  .mor = mor,
  .likelihood = likelihood,
  sparse = FALSE,
  center = FALSE,
  na.rm = FALSE,
  trees = 10,
  min_n = 20,
  nproc = availableCores(),
  seed = 42,
  minsize = 5
)

Arguments

mat: Matrix to evaluate (e.g. expression matrix). Target nodes in rows and conditions in columns. rownames(mat) must share at least one element with the .target column of network.
network: Tibble or dataframe with edges and their associated metadata.
.source: Column with source nodes.
.target: Column with target nodes.
.mor: Column with edge mode of regulation (i.e. mor).
.likelihood: Deprecated argument. Now it will always be set to 1.
sparse: Deprecated parameter.
center: Logical value indicating if mat must be centered by base::rowMeans().
na.rm: Should missing values (including NaN) be omitted from the calculations of base::rowMeans()?
trees: An integer for the number of trees contained in the ensemble.
min_n: An integer for the minimum number of data points in a node that are required for the node to be split further.
nproc: Number of cores to use for computation.
seed: A single value, interpreted as an integer, or NULL for random number generation.
minsize: Integer indicating the minimum number of targets per source.
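
The effect of center and na.rm can be illustrated with a minimal base R sketch. The toy matrix below is hypothetical, and the subtraction is only an assumption about how row centering with base::rowMeans() is typically done, not the package's internal code:

```r
# Hypothetical toy matrix: targets in rows, conditions in columns.
mat <- matrix(
  c(1, 2, NA,
    4, 5, 6),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("G1", "G2"), c("S01", "S02", "S03"))
)

# Subtract each row's mean; na.rm = TRUE ignores missing values
# when computing the mean, but the NA entries themselves remain NA.
centered <- mat - rowMeans(mat, na.rm = TRUE)
centered
```

With na.rm = FALSE, the mean of any row containing a missing value is NA, so that entire row would become NA after centering.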

Value

A long format tibble of the enrichment scores for each source across the samples. The resulting tibble contains the following columns:

statistic: Indicates which method is associated with which score.
source: Source nodes of network.
condition: Condition representing each column of mat.
score: Regulatory activity (enrichment score).
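
If a source-by-condition matrix is more convenient than the long format, the result can be reshaped. The sketch below uses base R on a hand-made data frame with the same four columns; the score values are invented for illustration and are not output of run_mdt():

```r
# Hypothetical result in the long format described above (made-up scores).
res <- data.frame(
  statistic = "mdt",
  source    = rep(c("T1", "T2"), each = 2),
  condition = rep(c("S01", "S02"), times = 2),
  score     = c(0.1, 0.2, 0.3, 0.4)
)

# One row per source, one column per condition.
# Each (source, condition) pair occurs once, so sum() just picks the value.
wide <- tapply(res$score, list(res$source, res$condition), sum)
wide
```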

Details

MDT fits a multivariate regression random forest for each sample, where the observed molecular readouts in mat are the response variable and the regulator weights in network are the covariates. Target features with no associated weight are set to zero. The feature importances obtained from the fitted model are the activities (mdt) of the regulators in network.
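
The procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's implementation: the toy network and matrix are hypothetical, the ranger package is assumed as the random forest backend, and the mapping of trees to num.trees and min_n to min.node.size is likewise an assumption. The forest step is skipped if ranger is not installed.

```r
# Hypothetical toy network: two sources regulating three targets.
network <- data.frame(
  source = c("T1", "T1", "T2", "T2"),
  target = c("G1", "G2", "G2", "G3"),
  mor    = c(1, -1, 1, 1)
)

# Hypothetical toy matrix: targets in rows, conditions in columns.
mat <- matrix(
  c(2.0, -1.0,  0.5, 0.0,
    1.0,  0.5, -2.0, 0.3),
  nrow = 4,
  dimnames = list(c("G1", "G2", "G3", "G4"), c("S01", "S02"))
)

# Targets x sources weight matrix; targets without an edge keep weight zero.
sources <- unique(network$source)
mor_mat <- matrix(
  0, nrow = nrow(mat), ncol = length(sources),
  dimnames = list(rownames(mat), sources)
)
keep <- network$target %in% rownames(mat)
mor_mat[cbind(network$target[keep], network$source[keep])] <- network$mor[keep]

# One forest per condition: readouts are the response, weights the covariates.
# Assumption: ranger is the backend and impurity importance is the score.
if (requireNamespace("ranger", quietly = TRUE)) {
  scores <- sapply(colnames(mat), function(cond) {
    fit <- ranger::ranger(
      x = mor_mat, y = mat[, cond],
      num.trees = 10, min.node.size = 20,
      importance = "impurity", seed = 42, num.threads = 1
    )
    fit$variable.importance
  })
  print(scores)  # sources in rows, conditions in columns
}
```

Note that in this sketch G4 has no edges, so its row in mor_mat is all zeros, which corresponds to the statement that target features with no associated weight are set to zero.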

Examples

inputs_dir <- system.file("testdata", "inputs", package = "decoupleR")

mat <- readRDS(file.path(inputs_dir, "mat.rds"))
net <- readRDS(file.path(inputs_dir, "net.rds"))

run_mdt(mat, net, minsize=0)
#> # A tibble: 72 × 4
#>    statistic source condition score
#>    <chr>     <chr>  <chr>     <dbl>
#>  1 mdt       T1     S01           0
#>  2 mdt       T1     S02           0
#>  3 mdt       T1     S03           0
#>  4 mdt       T1     S04           0
#>  5 mdt       T1     S05           0
#>  6 mdt       T1     S06           0
#>  7 mdt       T1     S07           0
#>  8 mdt       T1     S08           0
#>  9 mdt       T1     S09           0
#> 10 mdt       T1     S10           0
#> # ℹ 62 more rows
