Collect and aggregate performance, contribution and importance estimates from a set of raw results produced by run_misty().

Usage

collect_results(folders)

Arguments

folders

Paths to folders containing the raw results from run_misty().

Value

List of collected performance, contributions and importances per sample, performance and contribution statistics and aggregated importances.

improvements

Long format tibble with performance measurements for each target and each sample. The available measures are the RMSE and the variance explained (R2) of the model containing only the intrinsic view (intra.RMSE, intra.R2) and of the model with all views (multi.RMSE, multi.R2), together with the gain of the multi-view model over the intrinsic model: gain.RMSE is the relative decrease of RMSE in percent, while gain.R2 is the absolute increase of variance explained in percent. Each value represents the mean performance across folds (k-fold cross-validation). The p values of a one-sided t-test of improvement of performance (p.RMSE, p.R2) are also available as measures.
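As a worked illustration of the two gain definitions above, the following minimal sketch computes them from made-up numbers. The variable names and values are hypothetical, and the sketch does not claim to reproduce how the package handles individual folds internally.

```r
# Hypothetical fold-averaged values for one target in one sample
intra_rmse <- 0.10
multi_rmse <- 0.09
intra_r2 <- 92.0 # variance explained, in percent
multi_r2 <- 93.5 # variance explained, in percent

# gain.RMSE: relative decrease of RMSE of the multi-view model, in percent
gain_rmse <- 100 * (intra_rmse - multi_rmse) / intra_rmse

# gain.R2: absolute increase of variance explained, in percent
gain_r2 <- multi_r2 - intra_r2

print(c(gain.RMSE = gain_rmse, gain.R2 = gain_r2))
```

With these inputs the relative RMSE decrease is about 10 percent and the absolute R2 increase is 1.5.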

improvements.stats

Long format tibble with summary statistics (mean, standard deviation and coefficient of variation) for all performance measures for each target over all samples.
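The summary described above can be approximated with a grouped dplyr summary. This is only a sketch on a hypothetical toy table; the sample names and values are invented, and taking the coefficient of variation as sd / mean is an assumption (it agrees with the example output on this page).

```r
library(dplyr)

# Hypothetical improvements table: one target, one measure, three samples
improvements <- tibble(
  target = "ECM",
  sample = c("s1", "s2", "s3"),
  measure = "gain.R2",
  value = c(1, 2, 3)
)

# Summarise each measure per target over all samples
improvements_stats <- improvements %>%
  group_by(target, measure) %>%
  summarise(
    mean = mean(value),
    sd = sd(value),
    cv = sd / mean, # assumed definition of the coefficient of variation
    .groups = "drop"
  )

print(improvements_stats)
```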

contributions

Long format tibble with the values of the coefficients for each view in the meta-model, for each target and each sample. The p values of the coefficient for each view, under the null hypothesis of zero contribution to the meta-model, are also available.

contributions.stats

Long format tibble with summary statistics for all views per target over all samples, including the mean coefficient value, the fraction of contribution, and the mean and standard deviation of the p values.
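One plausible reading of the fraction of contribution is the mean coefficient of a view divided by the sum of the mean coefficients of all views for the same target. That reading is an assumption, although the example output on this page is consistent with it (for ECM, 0.989 / (0.989 + 0.245) is roughly 0.80). A minimal sketch on hypothetical values:

```r
library(dplyr)

# Hypothetical mean coefficients per view for a single target
view_means <- tibble(
  target = "ECM",
  view = c("intra", "para.10"),
  mean = c(0.75, 0.25)
)

# Assumed definition: share of each view in the sum of mean coefficients
contribution_fractions <- view_means %>%
  group_by(target) %>%
  mutate(fraction = mean / sum(mean)) %>%
  ungroup()

print(contribution_fractions)
```

Under this assumed definition the fractions of all views sum to 1 for each target.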

importances

List of view-specific predictor-target importance tables per sample. The importances in each table are standardized per target and weighted by the quantile of the coefficient for the target in that view. Columns other than Predictor represent target markers.

importances.aggregated

A list of aggregated view-specific predictor-target importance tables. Aggregation is performed by taking the mean over all samples.
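The aggregation by mean can be sketched as a grouped summary, assuming the long layout shown in the example output (columns sample, view, Predictor, Target, Importance). The toy values and sample names are hypothetical, and the package may handle missing values differently.

```r
library(dplyr)

# Hypothetical per-sample importances for one predictor-target pair in one view
importances <- tibble(
  sample = c("s1", "s2", "s3"),
  view = "intra",
  Predictor = "ligA",
  Target = "ECM",
  Importance = c(0.2, 0.4, 0.6)
)

# Average over samples and record how many samples contributed
importances_aggregated <- importances %>%
  group_by(view, Predictor, Target) %>%
  summarise(
    nsamples = n(),
    Importance = mean(Importance),
    .groups = "drop"
  )

print(importances_aggregated)
```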

See also

run_misty() to train models and generate results.

Examples

# Train and collect results for 3 samples in synthetic

library(dplyr)
library(purrr)

data("synthetic")

misty.results <- synthetic[seq_len(3)] %>%
  imap_chr(~ create_initial_view(.x %>% select(-c(row, col, type))) %>%
    add_paraview(.x %>% select(row, col), l = 10) %>%
    run_misty(paste0("results/", .y))) %>%
  collect_results()
#> 
#> Generating paraview
#> 
#> Training models
#> 
#> Generating paraview
#> 
#> Training models
#> 
#> Generating paraview
#> 
#> Training models
#> 
#> Collecting improvements
#> 
#> Collecting contributions
#> 
#> Collecting importances
#> 
#> Aggregating
str(misty.results)
#> List of 6
#>  $ improvements          : tibble [264 × 4] (S3: tbl_df/tbl/data.frame)
#>   ..$ target : chr [1:264] "ECM" "ECM" "ECM" "ECM" ...
#>   ..$ sample : chr [1:264] "/tmp/RtmpcrfSZ8/file2484477ca03e/reference/results/synthetic1" "/tmp/RtmpcrfSZ8/file2484477ca03e/reference/results/synthetic1" "/tmp/RtmpcrfSZ8/file2484477ca03e/reference/results/synthetic1" "/tmp/RtmpcrfSZ8/file2484477ca03e/reference/results/synthetic1" ...
#>   ..$ measure: chr [1:264] "intra.RMSE" "intra.R2" "multi.RMSE" "multi.R2" ...
#>   ..$ value  : num [1:264] 0.1058 92.4635 0.0988 93.4217 0.0101 ...
#>  $ contributions         : tibble [198 × 4] (S3: tbl_df/tbl/data.frame)
#>   ..$ target: chr [1:198] "ECM" "ECM" "ECM" "ECM" ...
#>   ..$ sample: chr [1:198] "/tmp/RtmpcrfSZ8/file2484477ca03e/reference/results/synthetic1" "/tmp/RtmpcrfSZ8/file2484477ca03e/reference/results/synthetic1" "/tmp/RtmpcrfSZ8/file2484477ca03e/reference/results/synthetic1" "/tmp/RtmpcrfSZ8/file2484477ca03e/reference/results/synthetic1" ...
#>   ..$ view  : chr [1:198] "intercept" "intra" "para.10" "p.intercept" ...
#>   ..$ value : num [1:198] -0.074 0.991 0.259 NA 0 ...
#>  $ importances           : tibble [726 × 5] (S3: tbl_df/tbl/data.frame)
#>   ..$ sample    : chr [1:726] "/tmp/RtmpcrfSZ8/file2484477ca03e/reference/results/synthetic1" "/tmp/RtmpcrfSZ8/file2484477ca03e/reference/results/synthetic1" "/tmp/RtmpcrfSZ8/file2484477ca03e/reference/results/synthetic1" "/tmp/RtmpcrfSZ8/file2484477ca03e/reference/results/synthetic1" ...
#>   ..$ view      : chr [1:726] "intra" "intra" "intra" "intra" ...
#>   ..$ Predictor : chr [1:726] "ECM" "ECM" "ECM" "ECM" ...
#>   ..$ Target    : chr [1:726] "ECM" "ligA" "ligB" "ligC" ...
#>   ..$ Importance: num [1:726] NA -0.621 0.305 -0.375 -0.598 ...
#>  $ improvements.stats    : tibble [66 × 5] (S3: tbl_df/tbl/data.frame)
#>   ..$ target : chr [1:66] "ECM" "ECM" "ECM" "ECM" ...
#>   ..$ measure: chr [1:66] "gain.R2" "gain.RMSE" "intra.R2" "intra.RMSE" ...
#>   ..$ mean   : num [1:66] 0.861 6.202 92.807 0.101 93.668 ...
#>   ..$ sd     : num [1:66] 0.10931 0.77475 0.38239 0.00582 0.37607 ...
#>   ..$ cv     : num [1:66] 0.12699 0.12491 0.00412 0.05781 0.00401 ...
#>  $ contributions.stats   : tibble [22 × 6] (S3: tbl_df/tbl/data.frame)
#>   ..$ target  : chr [1:22] "ECM" "ECM" "ligA" "ligA" ...
#>   ..$ view    : chr [1:22] "intra" "para.10" "intra" "para.10" ...
#>   ..$ mean    : num [1:22] 0.989 0.245 0.996 0.066 0.985 ...
#>   ..$ fraction: num [1:22] 0.8012 0.1988 0.9379 0.0621 0.7747 ...
#>   ..$ p.mean  : num [1:22] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ p.sd    : num [1:22] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ importances.aggregated: tibble [242 × 5] (S3: tbl_df/tbl/data.frame)
#>   ..$ view      : chr [1:242] "intra" "intra" "intra" "intra" ...
#>   ..$ Predictor : chr [1:242] "ECM" "ECM" "ECM" "ECM" ...
#>   ..$ Target    : chr [1:242] "ECM" "ligA" "ligB" "ligC" ...
#>   ..$ Importance: num [1:242] NA -0.616 0.345 -0.388 -0.615 ...
#>   ..$ nsamples  : int [1:242] 3 3 3 3 3 3 3 3 3 3 ...