nearest_neighbor() is a way to generate a specification of a model
before fitting and allows the model to be created using
different packages in R. The main arguments for the model are:
neighbors: The number of neighbors considered at each prediction.
weight_func: The type of kernel function that weights the
distances between samples.
dist_power: The parameter used when calculating the Minkowski
distance. This corresponds to the Manhattan distance with
dist_power = 1 and the Euclidean distance with dist_power = 2.
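To make the role of dist_power concrete, here is a minimal base R sketch of the Minkowski distance. The helper minkowski_distance() is hypothetical and written only for illustration; it is not part of parsnip or kknn, and the engine's own distance computation may differ in details such as scaling.

```r
# Hypothetical helper (not a parsnip or kknn function): Minkowski distance
# between two numeric vectors x and y for a given power p.
minkowski_distance <- function(x, y, p) {
  stopifnot(length(x) == length(y), p >= 1)
  sum(abs(x - y)^p)^(1 / p)
}

a <- c(0, 0)
b <- c(3, 4)

manhattan <- minkowski_distance(a, b, p = 1)  # sum of absolute differences
euclidean <- minkowski_distance(a, b, p = 2)  # straight-line distance

print(c(manhattan = manhattan, euclidean = euclidean))
```

With p = 1 the differences 3 and 4 are simply added; with p = 2 they are combined as in the Pythagorean theorem.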
These arguments are converted to their specific names at the
time that the model is fit. Other options and arguments can be
set using set_engine(). If left to their defaults
here (NULL), the values are taken from the underlying model
functions. If parameters need to be modified,
update() can be used
in lieu of recreating the object from scratch.
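As a rough illustration of that workflow, the sketch below builds a specification, passes an engine-specific option through set_engine(), and then changes one main argument with update(). It assumes the parsnip package is installed; the scale = FALSE option is only an example of an argument forwarded to kknn::train.kknn() and is not a recommendation.

```r
library(parsnip)

# Start from a specification with two main arguments set.
knn_spec <- nearest_neighbor(
  mode = "regression",
  neighbors = 5,
  weight_func = "triangular"
)

# Engine-specific options go through set_engine(); `scale` is assumed here
# to be forwarded unchanged to kknn::train.kknn().
knn_spec <- set_engine(knn_spec, "kknn", scale = FALSE)

# Change only `neighbors`; the other arguments are carried over.
knn_updated <- update(knn_spec, neighbors = 11)

print(knn_updated)
```

Nothing is fit at this point; both objects are still only specifications.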
nearest_neighbor( mode = "unknown", neighbors = NULL, weight_func = NULL, dist_power = NULL )
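mode: A single character string for the type of model.
Possible values for this model are "unknown", "regression", or "classification".
neighbors: A single integer for the number of neighbors
to consider (often called k). For kknn, a value of 5 is used if neighbors is not specified.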
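weight_func: A single character for the type of kernel function used
to weight distances between samples. Valid choices are: "rectangular", "triangular", "epanechnikov", "biweight", "triweight", "cos", "inv", "gaussian", "rank", or "optimal".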
dist_power: A single number for the parameter used in calculating Minkowski distance.
The model can be created using the
fit() function with the following engine:
R: "kknn" (the default)
Engines may have pre-set default arguments when executing the model fit call. For this type of model, the templates of the fit calls are below:
## K-Nearest Neighbor Model Specification (regression)
##
## Computational engine: kknn
##
## Model fit template:
## kknn::train.kknn(formula = missing_arg(), data = missing_arg(),
##     ks = min_rows(5, data, 5))

## K-Nearest Neighbor Model Specification (classification)
##
## Computational engine: kknn
##
## Model fit template:
## kknn::train.kknn(formula = missing_arg(), data = missing_arg(),
##     ks = min_rows(5, data, 5))
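Templates like these can be printed with translate(). The sketch below shows one way to do so for the regression case, assuming parsnip is installed; the exact printed layout may vary between parsnip versions.

```r
library(parsnip)

# Build a regression specification and attach the kknn engine.
reg_spec <- nearest_neighbor(mode = "regression")
reg_spec <- set_engine(reg_spec, "kknn")

# translate() resolves the engine-specific call; printing the result shows
# the specification together with its model fit template.
reg_template <- translate(reg_spec)
print(reg_template)
```

Switching mode to "classification" gives the second template.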
For kknn, the underlying modeling function used is a restricted
version of train.kknn() and not
kknn(). It is set up in this way so that
parsnip can utilize the underlying
predict.train.kknn method to
predict on new data. This also means that only a single value of that
function's kernel argument (a.k.a.
weight_func here) can be supplied.
For this engine, tuning over
neighbors is very efficient since the
same model object can be used to make predictions over multiple values
of neighbors.
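A minimal sketch of this, assuming the parsnip and kknn packages are installed and that multi_predict() accepts a neighbors argument for this engine: the model is fit once on part of the built-in mtcars data, and predictions for several values of neighbors are then requested from the same fitted object. The train/test split by row index is arbitrary and only for illustration.

```r
library(parsnip)

# Fit a single kknn regression model.
knn_spec <- nearest_neighbor(mode = "regression", neighbors = 5)
knn_spec <- set_engine(knn_spec, "kknn")

train_rows <- mtcars[1:25, ]
new_rows <- mtcars[26:32, ]

knn_fit <- fit(knn_spec, mpg ~ ., data = train_rows)

# Reuse the same fitted object to predict for several neighbor counts.
multi_preds <- multi_predict(knn_fit, new_data = new_rows,
                             neighbors = c(1, 3, 5))

# One row per new observation; the list column holds one prediction
# per requested value of `neighbors`.
print(multi_preds)
```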
The standardized parameter names in parsnip can be mapped to their original names in each engine that has main parameters. Each engine typically has a different default value (shown in parentheses) for each parameter.
show_engines("nearest_neighbor")
#> # A tibble: 2 x 2
#>   engine mode
#>   <chr>  <chr>
#> 1 kknn   classification
#> 2 kknn   regression

nearest_neighbor(neighbors = 11)
#> K-Nearest Neighbor Model Specification (unknown)
#>
#> Main Arguments:
#>   neighbors = 11
#>