Calculates regulatory activities using a Univariate Decision Tree (UDT).
Usage
run_udt(
  mat,
  network,
  .source = source,
  .target = target,
  .mor = mor,
  .likelihood = likelihood,
  sparse = FALSE,
  center = FALSE,
  na.rm = FALSE,
  min_n = 20,
  seed = 42,
  minsize = 5
)
Arguments
- mat
Matrix to evaluate (e.g. expression matrix). Target nodes in rows and conditions in columns. rownames(mat) must have at least one intersection with the elements in the .target column of network.
- network
Tibble or dataframe with edges and their associated metadata.
- .source
Column with source nodes.
- .target
Column with target nodes.
- .mor
Column with edge mode of regulation (i.e. mor).
- .likelihood
Deprecated argument. Now it will always be set to 1.
- sparse
Deprecated parameter.
- center
Logical value indicating whether mat must be centered by base::rowMeans().
- na.rm
Should missing values (including NaN) be omitted from the calculations of base::rowMeans()?
- min_n
An integer for the minimum number of data points in a node that are required for the node to be split further.
- seed
A single value, interpreted as an integer, or NULL for random number generation.
- minsize
Integer indicating the minimum number of targets per source.
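The input requirements above can be checked by hand before calling run_udt(). The following is a minimal base R sketch on hypothetical toy inputs (the gene, sample, and source names are made up). It assumes that minsize is applied to the targets that are actually present in mat, which this page does not state explicitly.

```r
# Hypothetical toy inputs: 3 target genes (rows) x 2 conditions (columns).
mat <- matrix(
  c(1, 2,
    3, 4,
    NA, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("G1", "G2", "G3"), c("S1", "S2"))
)
network <- data.frame(
  source = c("T1", "T1", "T2"),
  target = c("G1", "G2", "G9"),
  mor    = c(1, -1, 1)
)

# rownames(mat) must share at least one element with the target column.
shared <- intersect(rownames(mat), network$target)
stopifnot(length(shared) > 0)

# What center = TRUE with na.rm = TRUE amounts to: subtract each row mean.
centered <- mat - rowMeans(mat, na.rm = TRUE)

# Number of targets per source that are present in mat
# (assumption: this is the count that minsize is compared against).
targets_per_source <- table(network$source[network$target %in% rownames(mat)])
```

Here T2 has no target in mat, so it does not appear in targets_per_source at all and would not pass any positive minsize.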
Value
A long format tibble of the enrichment scores for each source across the samples. Resulting tibble contains the following columns:
- statistic: Indicates which method is associated with which score.
- source: Source nodes of network.
- condition: Condition representing each column of mat.
- score: Regulatory activity (enrichment score).
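For downstream work it is often convenient to reshape the long format into a source-by-condition matrix. The sketch below uses base R only, on a hypothetical result table with the four columns listed above. The scores are invented for illustration and are not output of run_udt().

```r
# Hypothetical long-format result with the documented columns.
res <- data.frame(
  statistic = "udt",
  source    = rep(c("T1", "T2"), each = 2),
  condition = rep(c("S01", "S02"), times = 2),
  score     = c(0.1, 0.4, 0.0, 0.3)
)

# One row per source, one column per condition.
# Each (source, condition) pair occurs once, so sum() just returns the score.
wide <- with(res, tapply(score, list(source, condition), sum))
wide
```

If several statistics were present in the same table, it would be safer to subset on the statistic column before reshaping.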
Details
UDT fits a single regression decision tree for each sample and regulator,
where the observed molecular readouts in mat are the response variable and
the regulator weights in network are the explanatory variable. Target features
with no associated weight are set to zero. The feature importance obtained
from the fitted model is the activity score (udt) of the given regulator.
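The procedure described here can be sketched for one sample and one regulator. This is an illustration under stated assumptions, not the package's internal code: it assumes rpart as the tree implementation, maps min_n to rpart's minsplit control, and uses simulated toy data.

```r
library(rpart)

set.seed(42)

# Hypothetical toy data: 40 target features, one sample, one regulator.
n_targets <- 40
# Regulator weights (mode of regulation); 0 marks features with no weight.
weight <- c(rep(1, 10), rep(-1, 10), rep(0, 20))
# Simulated readouts that depend on the weights plus noise.
expr <- 2 * weight + rnorm(n_targets, sd = 0.5)

df <- data.frame(expr = expr, weight = weight)

# Regression tree: readouts as response, weights as the single predictor.
# minsplit = 20 plays the role of min_n here (assumption).
fit <- rpart(
  expr ~ weight,
  data = df,
  method = "anova",
  control = rpart.control(minsplit = 20)
)

# Importance of the single predictor. rpart stores no variable.importance
# when the tree has no split, so fall back to 0 in that case.
importance <- fit$variable.importance
score <- if (is.null(importance)) 0 else unname(importance[["weight"]])
score
```

Under this sketch, a tree that never splits, for example because fewer than minsplit data points are available, yields a score of 0.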
Examples
inputs_dir <- system.file("testdata", "inputs", package = "decoupleR")
mat <- readRDS(file.path(inputs_dir, "mat.rds"))
net <- readRDS(file.path(inputs_dir, "net.rds"))
run_udt(mat, net, minsize = 0)
#> # A tibble: 72 × 4
#>    statistic source condition score
#>    <chr>     <chr>  <chr>     <dbl>
#>  1 udt       T1     S01           0
#>  2 udt       T1     S02           0
#>  3 udt       T1     S03           0
#>  4 udt       T1     S04           0
#>  5 udt       T1     S05           0
#>  6 udt       T1     S06           0
#>  7 udt       T1     S07           0
#>  8 udt       T1     S08           0
#>  9 udt       T1     S09           0
#> 10 udt       T1     S10           0
#> # ℹ 62 more rows