Bundling a model prepares it to be saved to a file and later restored for prediction in a new R session. See the 'Value' section for more information on bundles and their usage.
Arguments
- x
An object returned from modeling functions in the h2o package.
- id
A single character. The model_id entry in the leaderboard. Applies to AutoML output only. Supply only one of this argument or n.
- n
An integer giving the position in the leaderboard of the model to bundle. Applies to AutoML output only. Will be ignored if id is supplied.
- ...
Not used in this bundler and included for compatibility with the generic only. Additional arguments passed to this method will return an error.
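As a minimal sketch of how id and n might be used with AutoML output, the helper below bundles one model by leaderboard position and the same model by its model_id. The helper name is hypothetical and not part of bundle or h2o. It assumes both packages are installed, an H2O cluster is already running, and aml is an object returned by h2o::h2o.automl() whose leaderboard has at least two rows.

```r
# Hypothetical helper for illustration; not part of the bundle or h2o packages.
# `aml` is assumed to be an H2OAutoML object returned by h2o::h2o.automl(),
# with an H2O cluster already started via h2o::h2o.init().
bundle_from_leaderboard <- function(aml) {
  # The leaderboard is an H2OFrame; convert it to read the model_id column.
  leaderboard <- as.data.frame(aml@leaderboard)

  # Select by position: the second-ranked model in the leaderboard.
  by_position <- bundle::bundle(aml, n = 2)

  # Select by identifier: the same model, looked up through its model_id.
  second_id <- as.character(leaderboard$model_id[2])
  by_id <- bundle::bundle(aml, id = second_id)

  list(by_position = by_position, by_id = by_id)
}
```

Only one of id or n is passed in each call, in line with the argument descriptions above.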
Value
A bundle object with subclass bundled_h2o.
Bundles are a list subclass with two components:
- object
An R object. Gives the output of native serialization methods from the model-supplying package, sometimes with additional classes or attributes that aid portability. This is often a raw object.
- situate
A function. The situate() function is defined when bundle() is called, and is a loose analogue of an unbundle() S3 method for that object. Since the function is defined on bundle(), it has access to references and dependency information that can be saved alongside the object component. Calling unbundle() on a bundled object x calls x$situate(x$object), returning the unserialized version of object. situate() will also restore needed references, such as server instances and environment variables.
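The two-component layout described above can be illustrated with a toy version written in base R. This is only a sketch under stated assumptions: toy_bundle() and toy_unbundle() are hypothetical names, serialize() stands in for a package's native serialization method, and the actual bundle implementation may differ in its details.

```r
# Toy constructor (hypothetical name) mimicking the object/situate layout.
toy_bundle <- function(model) {
  # Serialize to a raw vector, standing in for a native serialization method.
  raw_model <- serialize(model, connection = NULL)

  # situate() is defined at bundling time, so a real method could capture
  # references or dependency information through this closure.
  situate <- function(object) {
    unserialize(object)
  }

  structure(
    list(object = raw_model, situate = situate),
    class = c("toy_bundle", "list")
  )
}

# Toy analogue of unbundle(): calls x$situate(x$object).
toy_unbundle <- function(x) {
  x$situate(x$object)
}

fit <- lm(mpg ~ wt, data = mtcars)
restored <- toy_unbundle(toy_bundle(fit))

# The restored model carries the same coefficients as the original fit.
all.equal(coef(fit), coef(restored))
```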
Bundles are R objects that represent a "standalone" version of their analogous model object. Thus, bundles are ready for saving to a file; saving with base::saveRDS() is our recommended serialization strategy for bundles, unless documented otherwise for a specific method.
To restore the original model object x in a new environment, load its bundle with base::readRDS() and run unbundle() on it. The output of unbundle() is a model object that is ready to predict() on new data, and other restored functionality (like plotting or summarizing) is supported as a side effect only.
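The save-and-restore workflow described above can be sketched with two small helpers. The helper names and the file name are hypothetical; only saveRDS(), readRDS(), and unbundle() come from the text. For h2o models, this sketch also assumes that an H2O cluster has been started in the new session before unbundling.

```r
# Hypothetical helper names for illustration.

# Write a bundle to disk with the recommended serialization strategy.
save_bundle <- function(bundle_obj, path) {
  saveRDS(bundle_obj, file = path)
  invisible(path)
}

# Read a bundle back and turn it into a model object ready for predict().
restore_model <- function(path) {
  bundle_obj <- readRDS(path)
  # unbundle() comes from the bundle package, assumed to be installed.
  bundle::unbundle(bundle_obj)
}

# Intended usage (not run here), with cars_bundle being a bundle created
# by bundle() on a fitted h2o model:
# save_bundle(cars_bundle, "cars_bundle.rds")
# ... later, in a fresh R session with an H2O cluster already started ...
# cars_unbundled <- restore_model("cars_bundle.rds")
```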
The bundle package wraps native serialization methods from model-supplying packages. Between versions, those model-supplying packages may change their native serialization methods, possibly introducing problems with re-loading objects serialized with previous package versions. The bundle package does not provide checks for these sorts of changes, and ought to be used in conjunction with tooling for managing and monitoring model environments like vetiver or renv.
See vignette("bundle") for more information on bundling and its motivation.
See also
These methods wrap h2o::h2o.save_mojo() and h2o::h2o.saveModel().
Other bundlers: bundle(), bundle.bart(), bundle.keras.engine.training.Model(), bundle.luz_module_fitted(), bundle.model_fit(), bundle.model_stack(), bundle.recipe(), bundle.step_umap(), bundle.train(), bundle.workflow(), bundle.xgb.Booster()
Examples
# fit model and bundle ------------------------------------------------
library(h2o)
#>
#> ----------------------------------------------------------------------
#>
#> Your next step is to start H2O:
#> > h2o.init()
#>
#> For H2O package documentation, ask for help:
#> > ??h2o
#>
#> After starting H2O, you can use the Web UI at http://localhost:54321
#> For more information visit https://docs.h2o.ai
#>
#> ----------------------------------------------------------------------
#>
#> Attaching package: ‘h2o’
#> The following objects are masked from ‘package:stats’:
#>
#> cor, sd, var
#> The following objects are masked from ‘package:base’:
#>
#> %*%, %in%, &&, apply, as.factor, as.numeric, colnames,
#> colnames<-, ifelse, is.character, is.factor, is.numeric, log,
#> log10, log1p, log2, round, signif, trunc, ||
set.seed(1)
h2o.init()
#>
#> H2O is not running yet, starting it now...
#>
#> Note: In case of errors look at the following log files:
#> /tmp/RtmpO5Qnjx/file21cb42f1ef39/h2o_runner_started_from_r.out
#> /tmp/RtmpO5Qnjx/file21cbc37ff23/h2o_runner_started_from_r.err
#>
#>
#> Starting H2O JVM and connecting: ...... Connection successful!
#>
#> R is connected to the H2O cluster:
#> H2O cluster uptime: 2 seconds 361 milliseconds
#> H2O cluster timezone: UTC
#> H2O data parsing timezone: UTC
#> H2O cluster version: 3.44.0.3
#> H2O cluster version age: 10 months and 22 days
#> H2O cluster name: H2O_started_from_R_runner_ydg016
#> H2O cluster total nodes: 1
#> H2O cluster total memory: 3.90 GB
#> H2O cluster total cores: 4
#> H2O cluster allowed cores: 4
#> H2O cluster healthy: TRUE
#> H2O Connection ip: localhost
#> H2O Connection port: 54321
#> H2O Connection proxy: NA
#> H2O Internal Security: FALSE
#> R Version: R version 4.4.2 (2024-10-31)
#> Warning:
#> Your H2O cluster version is (10 months and 22 days) old. There may be a newer version available.
#> Please download and install the latest version from: https://h2o-release.s3.amazonaws.com/h2o/latest_stable.html
#>
cars_h2o <- as.h2o(mtcars)
cars_fit <-
  h2o.glm(
    x = colnames(cars_h2o)[2:11],
    y = colnames(cars_h2o)[1],
    training_frame = cars_h2o
  )
cars_bundle <- bundle(cars_fit)
# then, after saveRDS + readRDS or passing to a new session ----------
cars_unbundled <- unbundle(cars_bundle)
predict(cars_unbundled, cars_h2o[, 2:11])
#> predict
#> 1 21.94826
#> 2 21.64605
#> 3 25.34547
#> 4 20.44883
#> 5 17.04492
#> 6 20.12585
#>
#> [32 rows x 1 column]
h2o.shutdown(prompt = FALSE)