Calculates regulatory activities using Multivariate Decision Trees (MDT).
Usage
run_mdt(
  mat,
  network,
  .source = source,
  .target = target,
  .mor = mor,
  .likelihood = likelihood,
  sparse = FALSE,
  center = FALSE,
  na.rm = FALSE,
  trees = 10,
  min_n = 20,
  nproc = availableCores(),
  seed = 42,
  minsize = 5
)
Arguments
- mat
Matrix to evaluate (e.g. expression matrix). Target nodes in rows and conditions in columns. rownames(mat) must have at least one intersection with the elements in the network .target column.
- network
Tibble or dataframe with edges and their associated metadata.
- .source
Column with source nodes.
- .target
Column with target nodes.
- .mor
Column with edge mode of regulation (i.e. mor).
- .likelihood
Deprecated argument. Now it will always be set to 1.
- sparse
Deprecated parameter.
- center
Logical value indicating if mat must be centered by base::rowMeans().
- na.rm
Should missing values (including NaN) be omitted from the calculations of base::rowMeans()?
- trees
An integer for the number of trees contained in the ensemble.
- min_n
An integer for the minimum number of data points in a node that are required for the node to be split further.
- nproc
Number of cores to use for computation.
- seed
A single value, interpreted as an integer, or NULL for random number generation.
- minsize
Integer indicating the minimum number of targets per source.
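As a rough illustration of what the center and na.rm arguments describe, the sketch below subtracts row means from a small made-up matrix using base R only. The names toy_mat and centered and all values are hypothetical, and run_mdt may carry out this step differently internally.

```r
# Hypothetical toy matrix: targets in rows, conditions in columns.
toy_mat <- matrix(
  c(1, 2, 3,
    4, NA, 8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("G1", "G2"), c("S01", "S02", "S03"))
)

# Subtract each row's mean from that row.
# na.rm = TRUE makes rowMeans() ignore the missing value in G2.
centered <- toy_mat - rowMeans(toy_mat, na.rm = TRUE)
centered
```

With na.rm = FALSE, the mean of G2 would be NA and the whole row would become NA after centering.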
Value
A long format tibble of the enrichment scores for each source across the samples. Resulting tibble contains the following columns:
- statistic: Indicates which method is associated with which score.
- source: Source nodes of network.
- condition: Condition representing each column of mat.
- score: Regulatory activity (enrichment score).
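Because the result is in long format, it may be convenient to pivot it into a source by condition matrix. The sketch below does this with base R on a hand-made data frame that only mimics the column layout described above; toy_res and its scores are hypothetical and are not output of run_mdt.

```r
# Hypothetical long-format result with the same columns as the returned tibble.
toy_res <- data.frame(
  statistic = "mdt",
  source    = c("T1", "T1", "T2", "T2"),
  condition = c("S01", "S02", "S01", "S02"),
  score     = c(0.1, 0.4, 0.3, 0.2),
  stringsAsFactors = FALSE
)

# One score per (source, condition) pair, so sum() just returns that score.
wide <- tapply(toy_res$score, list(toy_res$source, toy_res$condition), sum)
wide
```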
Details
MDT fits a multivariate regression random forest for each sample, where the
observed molecular readouts in mat are the response variable and the
regulator weights in network are the covariates. Target features with no
associated weight are set to zero. The feature importances obtained from the
fitted model are the activities (mdt) of the regulators in network.
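To make the covariate structure concrete, the following base R sketch builds a target by source weight matrix from a small edge list, leaving zeros where a target has no edge to a source. The names toy_net, targets and weights are hypothetical, the forest fitting itself is not shown, and this is not necessarily how run_mdt constructs its inputs.

```r
# Hypothetical toy network: two sources regulating three targets.
toy_net <- data.frame(
  source = c("T1", "T1", "T2"),
  target = c("G1", "G2", "G3"),
  mor    = c(1, -1, 1),
  stringsAsFactors = FALSE
)
targets <- c("G1", "G2", "G3", "G4")  # G4 has no edge in toy_net
sources <- unique(toy_net$source)

# Start from zeros so that targets without an edge keep a weight of 0.
weights <- matrix(
  0,
  nrow = length(targets), ncol = length(sources),
  dimnames = list(targets, sources)
)

# Fill in the mode of regulation for each (target, source) edge.
weights[cbind(toy_net$target, toy_net$source)] <- toy_net$mor
weights
```

In this picture, each column of mat would serve as the response for one model, and the columns of weights would be the covariates whose importances are reported as activities.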
Examples
inputs_dir <- system.file("testdata", "inputs", package = "decoupleR")
mat <- readRDS(file.path(inputs_dir, "mat.rds"))
net <- readRDS(file.path(inputs_dir, "net.rds"))
run_mdt(mat, net, minsize = 0)
#> # A tibble: 72 × 4
#> statistic source condition score
#> <chr> <chr> <chr> <dbl>
#> 1 mdt T1 S01 0
#> 2 mdt T1 S02 0
#> 3 mdt T1 S03 0
#> 4 mdt T1 S04 0
#> 5 mdt T1 S05 0
#> 6 mdt T1 S06 0
#> 7 mdt T1 S07 0
#> 8 mdt T1 S08 0
#> 9 mdt T1 S09 0
#> 10 mdt T1 S10 0
#> # ℹ 62 more rows