parsnip 1.0.0
Model Specification Changes
Enable the use of case weights for models that support them.
show_model_info() now indicates which models can utilize case weights.
Model type functions will now message informatively if a needed parsnip extension package is not loaded (#731).
Refactored internals of model specification printing functions. These changes are non-breaking for extension packages, but the new print_model_spec() helper is exported for use in extensions if desired (#739).
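As a rough illustration of the case weights support noted above, the sketch below fits a weighted linear regression. It assumes that the hardhat package is installed and that the weights are supplied through the case_weights argument of fit(); the mtcars data and the particular weight values are arbitrary choices for the example, not taken from the release notes.

```r
library(parsnip)

# Illustrative importance weights, one per row (values are arbitrary).
wts <- hardhat::importance_weights(seq_len(nrow(mtcars)) / nrow(mtcars))

# A linear regression specification using the "lm" engine.
spec <- set_engine(linear_reg(), "lm")

# Case weights are passed at fit time via `case_weights`.
wt_fit <- fit(spec, mpg ~ wt + hp, data = mtcars, case_weights = wts)

wt_fit
```

show_model_info("linear_reg") can then be used to check which engines report case weight support.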
Bug fixes
Fixed a bug where previously set engine arguments would propagate through update() methods despite fresh = TRUE (#704).
Fixed a bug where an error would be thrown if arguments to model functions were namespaced (#745).
predict(type = "prob") will now provide an error if the outcome variable has a level called "class" (#720).
An inconsistency for probability type predictions for two-class GAM models was fixed (#708).
Fixed translated printing for null_model() (#752).
Other changes
Added a glm_grouped() function to convert long data to the grouped format required by glm() for logistic regression.
xgb_train() now allows for case weights.
Added ctree_train() and cforest_train() wrappers for the functions in the partykit package. Engines for these will be added to other parsnip extension packages.
Exported xgb_predict(), which wraps xgboost’s predict() method for use with parsnip extension packages (#688).
Added a developer function, .model_param_name_key, that translates names of tuning parameters.
parsnip 0.2.1
CRAN release: 2022-03-17
Fixed a major bug in spark models induced in the previous version (#671).
Updated the parsnip add-in with new models and engines.
Updated parameter ranges for some tunable() methods and added a missing engine argument for brulee models.
Added information about how to install the mixOmics package for PLS models (#680).
parsnip 0.2.0
CRAN release: 2022-03-09
Model Specification Changes
Bayesian additive regression trees (BART) were added via the bart() function.
Added the "glm" engine for linear_reg() for numeric outcomes (#624).
Added brulee engines for linear_reg(), logistic_reg(), multinom_reg() and mlp().
Bug fixes
A bug for class predictions of two-class GAM models was fixed (#541).
Fixed a bug for logistic_reg() with the LiblineaR engine (#552).
The list column produced when creating survival probability predictions is now always called .pred (with .pred_survival being used inside of the list column).
Fixed outcome type checking affecting a subset of regression models (#625).
Prediction using multinom_reg() with the nnet engine with a single row no longer fails (#612).
Other Changes
When the xy interface is used and the underlying model expects to use a matrix, a better warning is issued when predictors contain non-numeric columns (including dates).
The fit time is only calculated when the verbosity argument of control_parsnip() is 2L or greater. Also, the call to system.time() now uses gcFirst = FALSE (#611).
fit_control() is soft-deprecated in favor of control_parsnip().
New extract_parameter_set_dials() method to extract parameter sets from model specs.
New extract_parameter_dials() method to extract a single parameter from model specs.
Argument interval was added for prediction: for types "survival" and "quantile", estimates for the confidence or prediction interval can be added if available (#615).
set_dependency() now allows developers to create package requirements that are specific to the model’s mode (#604).
varying_args() is soft-deprecated in favor of tune_args().
An autoplot() method was added for glmnet objects, showing the coefficient paths versus the penalty values (#642).
parsnip is now more robust working with keras and tensorflow for a larger range of versions (#596).
xgboost engines now use the new iterationrange parameter instead of the deprecated ntreelimit (#656).
Developer
- Model information can be re-registered as long as the information being registered is the same. This is helpful for packages that add new engines and use devtools::load_all() (#653).
parsnip 0.1.7
CRAN release: 2021-07-21
Model Specification Changes
A model function (gen_additive_mod()) was added for generalized additive models.
Each model now has a default engine that is used when the model is defined. The default for each model is listed in the help documents. This also adds functionality to declare an engine in the model specification function. set_engine() is still required if engine-specific arguments need to be added (#513).
parsnip now checks for a valid combination of engine and mode (#529).
The default engine for multinom_reg() was changed to nnet.
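The default-engine behaviour described in this section can be sketched as follows. This is a minimal example assuming a current parsnip installation; the nlambda value is only a placeholder for an engine-specific argument and is not taken from the release notes.

```r
library(parsnip)

# With no engine given, the model's documented default is used.
default_spec <- multinom_reg()
default_spec$engine

# The engine can be declared directly in the model function.
lm_spec <- linear_reg(engine = "lm")
lm_spec$engine

# set_engine() is still needed to pass engine-specific arguments
# (nlambda is a glmnet argument, used here purely as an illustration).
glmnet_spec <- set_engine(linear_reg(penalty = 0.1), "glmnet", nlambda = 50)
glmnet_spec$engine
```

Only specifications are created here, so the engine packages themselves are not needed until fit() is called.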
Other Changes
The helper functions .convert_form_to_xy_fit(), .convert_form_to_xy_new(), .convert_xy_to_form_fit(), and .convert_xy_to_form_new() for converting between formula and matrix interface are now exported for developer use (#508).
Fixed a bug in augment() when non-predictor, non-outcome variables are included in data (#510).
New article “Fitting and Predicting with parsnip”, which contains examples for various combinations of model type and engine (#527).
parsnip 0.1.6
CRAN release: 2021-05-27
Model Specification Changes
A new linear SVM model svm_linear() is now available with the LiblineaR engine (#424) and the kernlab engine (#438), and the LiblineaR engine is available for logistic_reg() as well (#429). These models can use sparse matrices via fit_xy() (#447) and have a tidy method (#474).
For models with glmnet engines:
- A single value is required for penalty (either a single numeric value or a value of tune()) (#481).
- A special argument called path_values can be used to set the lambda path as a specific set of numbers (independent of the value of penalty). A pure ridge regression model (i.e., mixture = 0) will generate incorrect values if the path does not include zero. See issue #431 for discussion (#486).
The liquidSVM engine for svm_rbf() was deprecated due to that package’s removal from CRAN (#425).
The xgboost engine for boosted trees was translating mtry to xgboost’s colsample_bytree. We now map mtry to colsample_bynode since that is more consistent with how random forest works. colsample_bytree can still be optimized by passing it in as an engine argument. colsample_bynode was added to xgboost after the parsnip package code was written (#495).
For xgboost, mtry and colsample_bytree can be passed as integer counts or proportions, while subsample and validation should always be proportions. xgb_train() now has a new option counts (TRUE or FALSE) that states which scale for mtry and colsample_bytree is being used (#461).
Other Changes
Re-licensed package from GPL-2 to MIT. See consent from copyright holders here.
set_mode() now checks if mode is compatible with the model class, similar to new_model_spec() (@jtlandis, #467). Both set_mode() and set_engine() now error for NULL or missing arguments (#503).
Re-organized model documentation:
- update methods were moved out of the model help files (#479).
- Each model/engine combination has its own help page.
- The model help page has a dynamic bulleted list of the engines with links to the individual help pages.
generics::required_pkgs() was extended for parsnip objects.
Prediction functions now give a consistent error when a user uses an unavailable value of type (#489).
The augment() method was changed to avoid failing if the model does not enable class probabilities. The method now returns tibbles regardless of the input data class (#487) (#478).
xgboost engines now respect the event_level option for predictions (#460).
parsnip 0.1.5
CRAN release: 2021-01-19
An RStudio add-in is available that writes multiple parsnip model specifications to the source window. It can be accessed via the IDE addin menus or by calling parsnip_addin().
For xgboost models, users can now pass objective to set_engine("xgboost") (#403).
Changes to test for cases when CRAN cannot get xgboost to work on their Solaris configuration.
There is now an augment() method for fitted models. See augment.model_fit (#401).
Column names for x are now required when fit_xy() is used (#398).
There is now an event_level argument for the xgboost engine (#420).
New mode “censored regression” and new prediction types “linear_pred”, “time”, “survival”, “hazard” (#396).
Censored regression models cannot use fit_xy() (use fit()) (#442).
parsnip 0.1.4
CRAN release: 2020-10-27
show_engines() will provide information on the current set of engines for a model.
For three models (glmnet, xgboost, and ranger), enable sparse matrix use via fit_xy() (#373).
Some protections were added for function arguments that are dependent on the data dimensions (e.g., mtry, neighbors, min_n, etc). (#184)
Infrastructure was improved for running parsnip models in parallel using PSOCK clusters on Windows.
parsnip 0.1.2
CRAN release: 2020-07-03
Breaking Changes
- parsnip now has options to set specific types of predictor encodings for different models. For example, ranger models run using parsnip and workflows do the same thing by not creating indicator variables. These encodings can be overridden using the blueprint options in workflows. As a consequence, it is possible to get a different model fit than previous versions of parsnip. More details about specific encoding changes are below. (#326)
Other Changes
tidyr >= 1.0.0 is now required.
SVM models produced by kernlab now use the formula method (see breaking change notice above). This change was due to how ksvm() made indicator variables for factor predictors (with one-hot encodings). Since the ordinary formula method did not do this, the data are passed as-is to ksvm() so that the results are closer to what one would get if ksvm() were called directly.
MARS models produced by earth now use the formula method.
For xgboost, a one-hot encoding is used when indicator variables are created.
Under-the-hood changes were made so that non-standard data arguments in the modeling packages can be accommodated (#315).
New Features
A new main argument was added to boost_tree() called stop_iter for early stopping. The xgb_train() function gained arguments for early stopping and a percentage of data to leave out for a validation set.
If fit() is used and the underlying model uses a formula, the actual formula is passed to the model (instead of a placeholder). This makes the stored model call more informative.
A function named repair_call() was added. This can help change the underlying model's call object to better reflect what would have been obtained if the model function had been used directly (instead of via parsnip). This is only useful when the user chooses a formula interface and the model uses a formula interface. It will also be of limited use when a recipe is used to construct the feature set in workflows or tune.
The predict() function now checks to see if required modeling packages are installed. The packages are loaded (but not attached). (#249) (#308) (tidymodels/workflows#45)
The function req_pkgs() is a user interface to determining the required packages (#308).
parsnip 0.0.5
CRAN release: 2020-01-07
Other Changes
glmnet was removed as a dependency since the new version depends on R 3.6.0 or greater. Keeping it would constrain parsnip to that same requirement. All glmnet tests are run locally.
A set of internal functions are now exported. These are helpful when creating a new package that registers new model specifications.
New Features
- nnet was added as an engine to multinom_reg() (#209).
Breaking Changes
- There were some mis-mapped parameters (going between parsnip and the underlying model function) for spark boosted trees and some keras models. See 897c927.
parsnip 0.0.4
CRAN release: 2019-11-02
New Features
The time elapsed during model fitting is stored in the $elapsed slot of the parsnip model object, and is printed when the model object is printed.
Some default parameter ranges were updated for SVM, KNN, and MARS models.
The model update() methods gained a parameters argument for cases when the parameters are contained in a tibble or list.
fit_control() is soft-deprecated in favor of control_parsnip().
Fixes
A bug was fixed standardizing the output column types of multi_predict and predict for multinom_reg.
A bug was fixed related to using data descriptors and fit_xy().
A bug was fixed related to the column names generated by multi_predict(). The top-level tibble will always have a column named .pred and this list column contains tibbles across sub-models. The column names for these sub-model tibbles will have names consistent with predict() (which was previously incorrect). See 43c15db.
A bug was fixed standardizing the column names of nnet class probability predictions.
parsnip 0.0.3
CRAN release: 2019-07-31
Unplanned release based on CRAN requirements for Solaris.
Breaking Changes
The way that parsnip stores the model information has changed. Any custom models from previous versions will need to use the new method for registering models. The methods are detailed in ?get_model_env and the package vignette for adding models.
The mode needs to be declared for models that can be used for more than one mode prior to fitting and/or translation.
For surv_reg(), the engine that uses the survival package is now called survival instead of survreg.
For glmnet models, the full regularization path is always fit regardless of the value given to penalty. Previously, the model was fit by passing penalty to glmnet’s lambda argument and the model could only make predictions at those specific values. (#195)
New Features
add_rowindex() can add a column called .row to a data frame.
If a computational engine is not explicitly set, a default will be used. Each default is documented on the corresponding model page. A warning is issued at fit time unless verbosity is zero.
nearest_neighbor() gained a multi_predict method. The multi_predict() documentation is a little better organized.
A suite of internal functions were added to help with upcoming model tuning features.
A parsnip object now always saves the name(s) of the outcome variable(s) for proper naming of the predicted values.
parsnip 0.0.2
CRAN release: 2019-03-22
Small release driven by changes in sample() in the current r-devel.
New Features
A “null model” is now available that fits a predictor-free model (using the mean of the outcome for regression or the mode for classification).
fit_xy() can take a single column data frame or matrix for y without error.
Other Changes
varying_args() now has a full argument to control whether the full set of possible varying arguments is returned (as opposed to only the arguments that are actually varying).
fit_control() now returns an S3 method.
For classification models, an error occurs if the outcome data are not encoded as factors (#115).
The prediction modules (e.g. predict_class, predict_numeric, etc) were de-exported. These were internal functions that were not to be used by the users and the users were using them.
An event time data set (check_times) was included that is the time (in seconds) to run R CMD check using the “r-devel-windows-ix86+x86_64” flavor. Packages that errored are censored.
Bug Fixes
varying_args() now uses the version from the generics package. This means that the first argument, x, has been renamed to object to align with generics.
For the recipes step method of varying_args(), there is now error checking to catch if a user tries to specify an argument that cannot be varying as varying (for example, the id) (#132).
find_varying(), the internal function for detecting varying arguments, now returns correct results when a size 0 argument is provided. It can also now detect varying arguments nested deeply into a call (#131, #134).
For multinomial regression, the .pred_ prefix is now only added to prediction column names once (#107).
For multinomial regression using glmnet, multi_predict() now pulls the correct default penalty (#108).
Confidence and prediction intervals for logistic regression were only computed for a single level. Both are now computed (#156).
parsnip 0.0.0.9005
- The engine, and any associated arguments, are now specified using set_engine(). There is no engine argument.
parsnip 0.0.0.9004
- Arguments to modeling functions are now captured as quosures.
- others has been replaced by ...
- Data descriptor names have been changed and are now functions. The descriptor definitions for “cols” and “preds” have been switched.
parsnip 0.0.0.9003
- regularization was changed to penalty in a few models to be consistent with this change.
- If a mode is not chosen in the model specification, it is assigned at the time of fit (#51).
- The underlying modeling packages now are loaded by namespace. There will be some exceptions noted in the documentation for each model. For example, in some predict methods, the earth package will need to be attached to be fully operational.
parsnip 0.0.0.9002
- To be consistent with snake_case, newdata was changed to new_data.
- A predict_raw method was added.
parsnip 0.0.0.9000
- The fit interface was previously used to cover both the x/y interface as well as the formula interface. Now, fit() is the formula interface and fit_xy() is for the x/y interface.
- Added a NEWS.md file to track changes to the package.
- predict methods were overhauled to be consistent.
- MARS was added.
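
The split between the formula interface (fit()) and the x/y interface (fit_xy()) can be sketched as below. The example is written against the current API, so it uses set_engine() and new_data, which arrived in later development versions than 0.0.0.9000; the mtcars data and the choice of predictors are illustrative assumptions.

```r
library(parsnip)

spec <- set_engine(linear_reg(), "lm")

# Formula interface: outcome and predictors are named in a formula.
form_fit <- fit(spec, mpg ~ wt + hp, data = mtcars)

# x/y interface: predictors and outcome are passed as separate objects.
xy_fit <- fit_xy(spec, x = mtcars[, c("wt", "hp")], y = mtcars$mpg)

# predict() has the same shape of result for either fit: a tibble
# with one row per row of new_data and a .pred column.
form_preds <- predict(form_fit, new_data = mtcars[1:3, ])
xy_preds <- predict(xy_fit, new_data = mtcars[1:3, c("wt", "hp")])
```

Both calls fit the same underlying lm model here, so the two sets of predictions should agree.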
