`nearest_neighbor()` is a way to generate a *specification* of a model
before fitting and allows the model to be created using
different packages in R. The main arguments for the
model are:

`neighbors`

: The number of neighbors considered at each prediction.

`weight_func`

: The type of kernel function that weights the distances between samples.

`dist_power`

: The parameter used when calculating the Minkowski distance. This corresponds to the Manhattan distance with `dist_power = 1` and the Euclidean distance with `dist_power = 2`.
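To make the role of `dist_power` concrete, the following base R sketch computes the Minkowski distance directly. The helper `minkowski()` is a hypothetical function written for this illustration; it is not part of parsnip or kknn, and the engine's internal computation may differ in detail (for example, through predictor scaling).

```r
# Minkowski distance between two numeric vectors of equal length.
# `p` plays the role of `dist_power`. `minkowski()` is a hypothetical helper
# for illustration only, not a parsnip or kknn function.
minkowski <- function(x, y, p) {
  stopifnot(length(x) == length(y), p > 0)
  sum(abs(x - y)^p)^(1 / p)
}

a <- c(0, 0)
b <- c(3, 4)

manhattan <- minkowski(a, b, p = 1)  # |3| + |4| = 7
euclidean <- minkowski(a, b, p = 2)  # sqrt(3^2 + 4^2) = 5
```

With `p = 1` the coordinate-wise differences are summed, while `p = 2` gives the familiar straight-line distance.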

These arguments are converted to their specific names at the
time that the model is fit. Other options and arguments can be
set using `set_engine()`. If left to their defaults
here (`NULL`), the values are taken from the underlying model
functions. If parameters need to be modified, `update()` can be used
in lieu of recreating the object from scratch.
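As a minimal sketch of this workflow, the code below sets main arguments in `nearest_neighbor()`, passes an engine-level option through `set_engine()`, and then changes the specification with `update()`. The object names are illustrative, and `scale` is used only as an example of an argument of `kknn::train.kknn()`; it assumes parsnip is installed.

```r
library(parsnip)

# Main arguments are set in nearest_neighbor(); engine-specific options go
# through set_engine(). `scale` is an argument of kknn::train.kknn() and is
# shown here only as an example of an engine-level option.
knn_spec <- nearest_neighbor(mode = "regression", neighbors = 5) %>%
  set_engine("kknn", scale = TRUE)

# update() returns a modified copy of the specification, so the original
# object does not need to be rebuilt from scratch.
knn_spec_updated <- update(knn_spec, neighbors = 11, dist_power = 1)

knn_spec_updated
```

Arguments that are not named in the `update()` call keep the values they had in the original specification.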

nearest_neighbor(mode = "unknown", neighbors = NULL, weight_func = NULL, dist_power = NULL)

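Argument | Description |
---|---|
mode | A single character string for the type of model. Possible values for this model are "unknown", "regression", or "classification". |
neighbors | A single integer for the number of neighbors to consider (often called `k`). |
weight_func | A single character string for the type of kernel function used to weight distances between samples. |
dist_power | A single number for the parameter used in calculating Minkowski distance. |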

The model can be created using the `fit()` function with the
following *engines*:

R: `"kknn"` (the default)
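A short sketch of fitting and predicting with this engine is shown here. It assumes that both parsnip and kknn are installed; the built-in `mtcars` data and the choice of predictors are purely illustrative.

```r
library(parsnip)

# Specify the model, choose the engine, and fit with a formula.
# mtcars and the predictors wt and hp are illustrative choices only.
knn_fit <- nearest_neighbor(mode = "regression", neighbors = 5) %>%
  set_engine("kknn") %>%
  fit(mpg ~ wt + hp, data = mtcars)

# predict() on a parsnip fit takes `new_data` and returns a tibble;
# for regression the predictions are in the `.pred` column.
preds <- predict(knn_fit, new_data = mtcars[1:3, ])
preds
```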

Engines may have pre-set default arguments when executing the model fit call. For this type of model, the templates of the fit calls are shown below:

    nearest_neighbor() %>%
      set_engine("kknn") %>%
      set_mode("regression") %>%
      translate()

    ## K-Nearest Neighbor Model Specification (regression)
    ##
    ## Computational engine: kknn
    ##
    ## Model fit template:
    ## kknn::train.kknn(formula = missing_arg(), data = missing_arg(),
    ##     ks = min_rows(5, data, 5))

    nearest_neighbor() %>%
      set_engine("kknn") %>%
      set_mode("classification") %>%
      translate()

    ## K-Nearest Neighbor Model Specification (classification)
    ##
    ## Computational engine: kknn
    ##
    ## Model fit template:
    ## kknn::train.kknn(formula = missing_arg(), data = missing_arg(),
    ##     ks = min_rows(5, data, 5))

For `kknn`, the underlying modeling function used is a restricted
version of `train.kknn()` and not `kknn()`. It is set up in this way so
that `parsnip` can utilize the underlying `predict.train.kknn` method to
predict on new data. This also means that only a single value of that
function’s `kernel` argument (a.k.a. `weight_func` here) can be supplied.

For this engine, tuning over `neighbors` is very efficient since the
same model object can be used to make predictions over multiple values
of `neighbors`.
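One way to see this in practice is through `multi_predict()`, sketched below. This assumes that the installed parsnip version provides a `multi_predict()` method for kknn fits that accepts a `neighbors` argument; the data and the candidate values are illustrative.

```r
library(parsnip)

# Fit once with a single value of `neighbors`.
knn_fit <- nearest_neighbor(mode = "regression", neighbors = 5) %>%
  set_engine("kknn") %>%
  fit(mpg ~ wt + hp, data = mtcars)

# Request predictions for several values of `neighbors` from the same fit.
# Assumption: multi_predict() supports kknn fits in the installed parsnip.
multi_preds <- multi_predict(
  knn_fit,
  new_data = mtcars[1:3, ],
  neighbors = c(1, 3, 5)
)

# Each row of new_data gets a nested tibble of predictions in `.pred`.
multi_preds$.pred[[1]]
```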

The standardized parameter names in parsnip can be mapped to their original names in each engine that has main parameters. Each engine typically has a different default value (shown in parentheses) for each parameter.

parsnip | kknn |
---|---|
neighbors | ks |
weight_func | kernel (optimal) |
dist_power | distance (2) |
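The mapping can be inspected by setting all three main arguments and calling `translate()`, as in this sketch. The argument values are arbitrary examples, and the printed fit template depends on the installed parsnip version, so no output is reproduced here.

```r
library(parsnip)

# Set all three main arguments so that each one appears in the fit template
# under its engine-specific name (ks, kernel, distance).
translated <- nearest_neighbor(
  neighbors = 7,
  weight_func = "triangular",
  dist_power = 1
) %>%
  set_engine("kknn") %>%
  set_mode("regression") %>%
  translate()

translated
```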

    #> # A tibble: 2 x 2
    #>   engine mode
    #>   <chr>  <chr>
    #> 1 kknn   classification
    #> 2 kknn   regression

    nearest_neighbor(neighbors = 11)

    #> K-Nearest Neighbor Model Specification (unknown)
    #>
    #> Main Arguments:
    #>   neighbors = 11
    #>