“Method execution in the SummarizedBenchmark framework is handled by the buildBench() function. Occasionally, errors occur when running several methods on a new data set. This vignette describes the approach to error handling implemented in buildBench(), along with the option to disable it. SummarizedBenchmark package version: 2.20.0”
When running a large benchmark study, it is not uncommon for a single method or a small subset of methods to fail during execution. This may be the result of misspecified parameters, an underlying bug in the software, or any number of other reasons. By default, errors thrown by methods which fail during buildBench() or updateBench() (see Feature: Iterative Benchmarking for details on updateBench()) are caught and handled in a user-friendly way. As long as at least one method executes without any errors, a SummarizedBenchmark object is returned as usual, with the assay columns of failed methods set to NA. Additionally, the corresponding error messages are stored in the metadata of the object for reference.
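The catch-and-record behavior described above can be illustrated with a minimal base R sketch. This is not the package's internal implementation; the helper name runAll and the toy methods are hypothetical and only show the general pattern of catching an error per method, leaving the failed column as NA, and keeping the message for later reference.

```r
# Hypothetical helper illustrating the pattern; not SummarizedBenchmark code
runAll <- function(methods, n) {
    # one column per method, pre-filled with NA
    out <- matrix(NA_real_, nrow = n, ncol = length(methods),
                  dimnames = list(NULL, names(methods)))
    errors <- list()
    for (m in names(methods)) {
        # return the condition object instead of aborting the loop
        res <- tryCatch(methods[[m]](), error = function(e) e)
        if (inherits(res, "error")) {
            # failed method: column stays NA, message is kept
            errors[[m]] <- conditionMessage(res)
        } else {
            out[, m] <- res
        }
    }
    list(assay = out, errors = errors)
}

# Toy methods (made up): one succeeds, one always fails
toy <- list(good = function() rnorm(5),
            bad  = function() stop("misspecified parameter"))
res <- runAll(toy, n = 5)
```

Here res$assay keeps a full column for good and an all-NA column for bad, while res$errors$bad holds the message of the caught error.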
As an example, consider a case where we benchmark two simple methods. The first, slowMethod, draws 5 random normal samples after waiting 5 seconds, and the second, fastMethod, draws 5 random normal samples immediately. Each method is then passed through two post-processing functions: keepSlow and makeSlower for slowMethod, and keepFast and makeSlower for fastMethod. This results in three partially overlapping assays, keepSlow, keepFast and makeSlower. With this example, we also demonstrate how mismatched assays are handled across methods.
library("SummarizedBenchmark")
library("magrittr")  # provides the %>% pipe

# tdat: example data set shipped with SummarizedBenchmark
data(tdat)

bdslow <- BenchDesign(data = tdat) %>%
    addMethod("slowMethod", function() { Sys.sleep(5); rnorm(5) },
              post = list(keepSlow = identity,
                          makeSlower = function(x) { Sys.sleep(5); x })) %>%
    addMethod("fastMethod", function() { rnorm(5) },
              post = list(keepFast = identity,
                          makeSlower = function(x) { Sys.sleep(5); x }))
We run these methods in parallel using parallel = TRUE and specify a timeout limit of 1 second for the BPPARAM. Naturally, slowMethod will fail, and fastMethod will fail during the makeSlower post-processing function.
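A minimal sketch of such a run is given below. It assumes the BiocParallel package supplies the BPPARAM object and uses a SerialParam back-end for simplicity; both the choice of back-end and the helper name runWithTimeout are assumptions made here for illustration, not details stated in the text.

```r
library("SummarizedBenchmark")
library("BiocParallel")

# Hypothetical helper: run a BenchDesign with a time limit on each task
runWithTimeout <- function(bd, seconds = 1) {
    bpp <- SerialParam()        # assumed back-end; other BiocParallelParam objects could be used
    bptimeout(bpp) <- seconds   # time limit in seconds
    buildBench(bd, parallel = TRUE, BPPARAM = bpp)
}
```

With the design defined above, the call would be sbep <- runWithTimeout(bdslow), where sbep is simply a name chosen here for the returned object.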
Notice that during the execution process, errors caught by buildBench() are printed to the console along with the name of the failed method and, when appropriate, the post-processing function.
We can verify that a valid SummarizedBenchmark object is still returned with the remaining results.
## class: SummarizedBenchmark
## dim: 5 2
## metadata(1): sessions
## assays(3): keepSlow makeSlower keepFast
## rownames: NULL
## rowData names(3): keepSlow makeSlower keepFast
## colnames(2): slowMethod fastMethod
## colData names(4): func.pkg func.pkg.vers func.pkg.manual session.idx
We can also check the values of the assays.
## $keepSlow
## slowMethod fastMethod
## [1,] 0.1514252 NA
## [2,] -0.9416348 NA
## [3,] -0.1389935 NA
## [4,] 0.5562113 NA
## [5,] -1.0477796 NA
##
## $makeSlower
## slowMethod fastMethod
## [1,] 0.1514252 0.04441397
## [2,] -0.9416348 -0.49008633
## [3,] -0.1389935 1.23284845
## [4,] 0.5562113 1.15800784
## [5,] -1.0477796 -0.49721293
##
## $keepFast
## slowMethod fastMethod
## [1,] NA 0.04441397
## [2,] NA -0.49008633
## [3,] NA 1.23284845
## [4,] NA 1.15800784
## [5,] NA -0.49721293
Notice that some columns contain only NA values. These columns can correspond either to methods which returned errors or to methods missing a post-processing function, e.g. no keepSlow function was defined for the fastMethod method. The assays alone cannot distinguish between these two sources of NA values, but the source is documented in the sessions list of the SummarizedBenchmark metadata. Although sessions is a list containing information for all previous sessions, we are only interested in the current, first session. (For more details on why multiple sessions may be run, see the Feature: Iterative Benchmarking vignette.)
## [1] "methods" "results" "parameters" "sessionInfo"
In sessions, there is a "results" entry which includes a summary of the results for each combination of method and post-processing function (assay). The entries of results can take one of three values: "success", "missing", or an error message of class buildbench-error. The easiest way to view these results is by passing results to the base R function simplify2array().
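As a small illustration of what simplify2array() does with a nested list of this shape, the sketch below uses a hand-written stand-in for the "results" entry. The list mockResults and its values are made up for illustration; a real entry would be taken from the sessions metadata of the object.

```r
# Made-up stand-in: one inner list per method, one element per assay
mockResults <- list(
    slowMethod = list(keepFast = "missing", keepSlow = "success",
                      makeSlower = "success"),
    fastMethod = list(keepFast = "success", keepSlow = "missing",
                      makeSlower = "success")
)

# Columns become methods, rows become assays
tab <- simplify2array(mockResults)
print(tab)

# Look up a single method/assay combination
tab[["keepSlow", "fastMethod"]]
```

Row names are taken from the first inner list, so this layout relies on every method listing its assays in the same order.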
## slowMethod fastMethod
## keepFast "missing" "missing"
## keepSlow "success" "success"
## makeSlower "success" "success"
In the returned table, columns correspond to methods, and rows correspond to assays. We clearly see that many of the methods failed due to exceeding the specified time limit. If we check one of these entries more closely, we see that it is indeed a buildbench-error object that occurred ("origin") during the "main" function.
## [1] "success"
If this error handling is not wanted, and the user would like the benchmark experiment to terminate when an error is thrown, then the optional parameter catchErrors = FALSE can be specified to either buildBench() or updateBench(). Generally, this is advised against, as the outputs computed for all non-failing methods will also be lost. As a result, the entire benchmarking experiment will need to be re-executed.
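A hedged sketch of what disabling error handling might look like follows. The design bdfail and its two toy methods are hypothetical, and wrapping the call in tryCatch() is a choice made here so that the surrounding script can continue; it is not something the package requires.

```r
library("SummarizedBenchmark")
library("magrittr")

data(tdat)

# Hypothetical design: one method succeeds, one always fails
bdfail <- BenchDesign(data = tdat) %>%
    addMethod("goodMethod", function() { rnorm(5) },
              post = list(keep = identity)) %>%
    addMethod("badMethod", function() { stop("forced failure") },
              post = list(keep = identity))

# With catchErrors = FALSE the error is expected to propagate and end the
# run, so no SummarizedBenchmark object is returned for any method
res <- tryCatch(
    buildBench(bdfail, catchErrors = FALSE),
    error = function(e) {
        message("benchmark aborted: ", conditionMessage(e))
        NULL
    }
)
```

If the error propagates as described, res is NULL and the results of goodMethod are not available, which is the loss of non-failing outputs noted above.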