Trains multi-view models for all target markers and estimates, for each target marker, the performance, the contributions of the view-specific models, and the importance of predictor markers.
Usage
run_misty(
views,
results.folder = "results",
seed = 42,
target.subset = NULL,
bypass.intra = FALSE,
cv.folds = 10,
cached = FALSE,
append = FALSE,
model.function = random_forest_model,
...
)
Arguments
- views
view composition.
- results.folder
path to the top level folder to store raw results.
- seed
seed used for random sampling to ensure reproducibility.
- target.subset
subset of targets to train models for. If
NULL
, models will be trained for markers in the intraview.- bypass.intra
a
logical
indicating whether to train a baseline model using the intraview data (see Details).- cv.folds
number of cross-validation folds to consider for estimating the performance of the multi-view models
- cached
a
logical
indicating whether to cache the trained models and to reuse previously cached ones if they already exist for this sample.- append
a
logical
indicating whether to append the performance and coefficient files in theresults.folder
. Consider setting toTRUE
when rerunning a workflow with differenttarget.subset
parameters.- model.function
a function which is used to model each view, default model is
random_forest_model
. Other models included in mistyR aregradient_boosting_model
,bagged_mars_model
,mars_model
,linear_model
,svm_model
,mlp_model
- ...
all additional parameters are passed to the chosen ML model for training the view-specific models
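As a rough illustration of how target.subset and append can work together, the sketch below trains models for two groups of markers in separate calls that write to the same results folder. The folder name results_subset, the choice of markers, and cv.folds = 5 are arbitrary choices for this example, and passing target.subset as marker names is an assumption of this sketch.

```r
library(mistyR)
library(dplyr)

# Same setup as in the Examples section
data("synthetic")
expr <- synthetic[[1]] %>% select(-c(row, col, type))
pos <- synthetic[[1]] %>% select(row, col)
misty.views <- create_initial_view(expr) %>% add_paraview(pos, l = 10)

# Hypothetical output location and target groups, chosen for illustration
out.dir <- file.path(tempdir(), "results_subset")
targets <- colnames(expr)

# First pass: train models for the first two markers only
run_misty(misty.views,
  results.folder = out.dir,
  target.subset = targets[1:2], cv.folds = 5
)

# Second pass: train models for the next two markers and append the
# performance and coefficient files instead of overwriting them
run_misty(misty.views,
  results.folder = out.dir,
  target.subset = targets[3:4], cv.folds = 5, append = TRUE
)
```

Without append = TRUE, the second call would be expected to replace the performance and coefficient files written by the first one.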
Details
If bypass.intra is set to TRUE, all variables in the intraview will be treated as targets only. The baseline intraview model in this case is a trivial model that predicts the average of each target. If the intraview has only one variable, this switch is automatically set to TRUE.
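A minimal sketch of bypassing the intraview model is shown below. It reuses the synthetic data from the Examples section; the folder name results_bypass is a hypothetical choice for this example.

```r
library(mistyR)
library(dplyr)

data("synthetic")
expr <- synthetic[[1]] %>% select(-c(row, col, type))
pos <- synthetic[[1]] %>% select(row, col)
misty.views <- create_initial_view(expr) %>% add_paraview(pos, l = 10)

# Intraview markers are used as targets only; the baseline is the
# trivial model that predicts the average of each target
bypass.dir <- run_misty(misty.views,
  results.folder = file.path(tempdir(), "results_bypass"),
  bypass.intra = TRUE
)
```

With this setting, any reported improvement reflects what the other views (here, the paraview) explain relative to predicting the mean, rather than relative to an intraview model.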
The default model for training the view-specific models is a Random Forest model based on ranger() --
run_misty(views, model.function = random_forest_model)
The following parameters are the default configuration: num.trees = 100, importance = "impurity", num.threads = 1, seed = seed.
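The defaults listed above can be overridden through the ... argument. The following sketch, which assumes the ranger package is installed, increases the number of trees and threads; the values 500 and 2 and the folder name results_rf are arbitrary choices for illustration.

```r
library(mistyR)
library(dplyr)

data("synthetic")
expr <- synthetic[[1]] %>% select(-c(row, col, type))
pos <- synthetic[[1]] %>% select(row, col)
misty.views <- create_initial_view(expr) %>% add_paraview(pos, l = 10)

# num.trees and num.threads are forwarded to the Random Forest model
rf.dir <- run_misty(misty.views,
  results.folder = file.path(tempdir(), "results_rf"),
  model.function = random_forest_model,
  num.trees = 500, num.threads = 2
)
```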
Gradient boosting is an alternative for modeling each view. The algorithm is based on xgb.train() --
run_misty(views, model.function = gradient_boosting_model)
The following parameters are the default configuration: booster = "gbtree", rounds = 10, objective = "reg:squarederror". Set booster to "gblinear" for linear boosting.
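As a sketch of switching to linear boosting, the call below passes booster through the ... argument. It assumes the xgboost package is installed; the folder name results_gblinear is a hypothetical choice.

```r
library(mistyR)
library(dplyr)

data("synthetic")
expr <- synthetic[[1]] %>% select(-c(row, col, type))
pos <- synthetic[[1]] %>% select(row, col)
misty.views <- create_initial_view(expr) %>% add_paraview(pos, l = 10)

# booster is forwarded to the gradient boosting model
gb.dir <- run_misty(misty.views,
  results.folder = file.path(tempdir(), "results_gblinear"),
  model.function = gradient_boosting_model,
  booster = "gblinear"
)
```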
Bagged MARS is an alternative for modeling each view using an ensemble of MARS (multivariate adaptive regression splines) models trained on bootstrap samples and aggregated. The algorithm is based on earth() --
run_misty(views, model.function = bagged_mars_model)
The following parameters are the default configuration: degree = 2. Furthermore, 50 base learners are used by default (pass n.bags as a parameter via ... to change this value).
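A short sketch of changing the number of base learners is given below. It assumes the earth package is installed; the value 20 and the folder name results_bmars are arbitrary choices for illustration.

```r
library(mistyR)
library(dplyr)

data("synthetic")
expr <- synthetic[[1]] %>% select(-c(row, col, type))
pos <- synthetic[[1]] %>% select(row, col)
misty.views <- create_initial_view(expr) %>% add_paraview(pos, l = 10)

# n.bags is passed via ... and sets the number of bagged MARS learners
bm.dir <- run_misty(misty.views,
  results.folder = file.path(tempdir(), "results_bmars"),
  model.function = bagged_mars_model,
  n.bags = 20
)
```

Fewer bags should shorten training time at the cost of a less stable ensemble.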
MARS is an alternative for modeling each view using a single multivariate adaptive regression splines model. The algorithm is based on earth() --
run_misty(views, model.function = mars_model)
The following parameters are the default configuration: degree = 2.
Linear model is an alternative for modeling each view using a simple linear model. The algorithm is based on lm() --
run_misty(views, model.function = linear_model)
SVM is an alternative for modeling each view using a support vector machine. The algorithm is based on ksvm() --
run_misty(views, model.function = svm_model)
The following parameters are the default configuration: kernel = "vanilladot" (linear kernel), C = 1, type = "eps-svr".
MLP is an alternative for modeling each view using a multi-layer perceptron. The algorithm is based on mlp() --
run_misty(views, model.function = mlp_model)
The following parameters are the default configuration: size = c(10) (meaning 1 hidden layer with 10 units).
See also
create_initial_view() for starting a view composition.
Examples
# Create a view composition of an intraview and a paraview with radius 10 then
# run MISTy for a single sample.
library(dplyr)
# get the expression data
data("synthetic")
expr <- synthetic[[1]] %>% select(-c(row, col, type))
# get the coordinates for each cell
pos <- synthetic[[1]] %>% select(row, col)
# compose
misty.views <- create_initial_view(expr) %>% add_paraview(pos, l = 10)
#>
#> Generating paraview
# run with default parameters
run_misty(misty.views)
#>
#> Training models
#> [1] "/tmp/RtmpcrfSZ8/file2484477ca03e/reference/results"
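As the output above suggests, run_misty() returns the path of the results folder. A possible next step, sketched below, is to load the raw results with collect_results() from mistyR; the folder name results_collect is a hypothetical choice for this example.

```r
library(mistyR)
library(dplyr)

data("synthetic")
expr <- synthetic[[1]] %>% select(-c(row, col, type))
pos <- synthetic[[1]] %>% select(row, col)
misty.views <- create_initial_view(expr) %>% add_paraview(pos, l = 10)

# Keep the returned path so the results can be collected afterwards
res.dir <- run_misty(misty.views,
  results.folder = file.path(tempdir(), "results_collect")
)

# Read the raw result files into a list of summary tables
misty.results <- collect_results(res.dir)
names(misty.results)
```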