train: Fit Predictive Models over Different Tuning Parameters in caret: Classification and Regression Training (2024)

train: R Documentation

Fit Predictive Models over Different Tuning Parameters

Description

This function sets up a grid of tuning parameters for a number of classification and regression routines, fits each model and calculates a resampling-based performance measure.

Usage

train(x, ...)

## Default S3 method:
train(
  x,
  y,
  method = "rf",
  preProcess = NULL,
  ...,
  weights = NULL,
  metric = ifelse(is.factor(y), "Accuracy", "RMSE"),
  maximize = ifelse(metric %in% c("RMSE", "logLoss", "MAE", "logLoss"), FALSE, TRUE),
  trControl = trainControl(),
  tuneGrid = NULL,
  tuneLength = ifelse(trControl$method == "none", 1, 3)
)

## S3 method for class 'formula'
train(form, data, ..., weights, subset, na.action = na.fail, contrasts = NULL)

## S3 method for class 'recipe'
train(
  x,
  data,
  method = "rf",
  ...,
  metric = ifelse(is.factor(y_dat), "Accuracy", "RMSE"),
  maximize = ifelse(metric %in% c("RMSE", "logLoss", "MAE"), FALSE, TRUE),
  trControl = trainControl(),
  tuneGrid = NULL,
  tuneLength = ifelse(trControl$method == "none", 1, 3)
)

Arguments

x

For the default method, x is an object where samples are in rows and features are in columns. This could be a simple matrix, data frame or other type (e.g. sparse matrix) but must have column names (see Details below). Preprocessing using the preProcess argument only supports matrices or data frames. When using the recipe method, x should be an unprepared recipe object that describes the model terms (i.e. outcome, predictors, etc.) as well as any pre-processing that should be done to the data. This is an alternative approach to specifying the model. Note that, when using the recipe method, any arguments passed to preProcess will be ignored. See the links and example below for more details on using recipes.

...

Arguments passed to the classification or regression routine (such as randomForest). Errors will occur if values for tuning parameters are passed here.

y

A numeric or factor vector containing the outcome for each sample.

method

A string specifying which classification or regression model to use. Possible values are found using names(getModelInfo()). See http://topepo.github.io/caret/train-models-by-tag.html. A list of functions can also be passed for a custom model function. See http://topepo.github.io/caret/using-your-own-model-in-train.html for details.

preProcess

A string vector that defines a pre-processing of the predictor data. Current possibilities are "BoxCox", "YeoJohnson", "expoTrans", "center", "scale", "range", "knnImpute", "bagImpute", "medianImpute", "pca", "ica" and "spatialSign". The default is no pre-processing. See preProcess and trainControl for the procedures and how to adjust them. Pre-processing code is only designed to work when x is a simple matrix or data frame.

weights

A numeric vector of case weights. This argument will only affect models that allow case weights.

metric

A string that specifies what summary metric will be used to select the optimal model. By default, possible values are "RMSE" and "Rsquared" for regression and "Accuracy" and "Kappa" for classification. If custom performance metrics are used (via the summaryFunction argument in trainControl), the value of metric should match the name of one of the metrics that function returns. If it does not, a warning is issued and the first metric given by the summaryFunction is used. (NOTE: If given, this argument must be named.)

maximize

A logical: should the metric be maximized or minimized?

trControl

A list of values that define how this function acts. See trainControl and http://topepo.github.io/caret/using-your-own-model-in-train.html. (NOTE: If given, this argument must be named.)

tuneGrid

A data frame with possible tuning values. The columns are named the same as the tuning parameters. Use getModelInfo to get a list of tuning parameters for each model or see http://topepo.github.io/caret/available-models.html. (NOTE: If given, this argument must be named.)

tuneLength

An integer denoting the amount of granularity in the tuning parameter grid. By default, this argument is the number of levels for each tuning parameter that should be generated by train. If trainControl has the option search = "random", this is the maximum number of tuning parameter combinations that will be generated by the random search. (NOTE: If given, this argument must be named.)

form

A formula of the form y ~ x1 + x2 + ...

data

Data frame from which variables specified in formula or recipe are preferentially to be taken.

subset

An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)

na.action

A function to specify the action to be taken if NAs are found. The default action is for the procedure to fail. An alternative is na.omit, which leads to rejection of cases with missing values on any required variable. (NOTE: If given, this argument must be named.)

contrasts

A list of contrasts to be used for some or all the factors appearing as variables in the model formula.
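As an illustration of how tuneGrid and tuneLength differ, here is a minimal sketch. It assumes the knn model, whose single tuning parameter is named k, and 5-fold cross-validation; the object names (knn_grid, fit_grid, fit_len) are hypothetical. The caret calls are wrapped in a requireNamespace() guard so that the grid construction still runs where caret is not installed.

```r
# Build an explicit grid of candidate values with base R.
# The column name must match the model's tuning parameter ("k" for knn).
knn_grid <- expand.grid(k = c(3, 5, 7))

if (requireNamespace("caret", quietly = TRUE)) {
  library(caret)

  # List the tuning parameters that method = "knn" exposes.
  print(modelLookup("knn"))

  set.seed(1)
  # Explicit candidates: tuneGrid is passed by name.
  fit_grid <- train(
    x = iris[, 1:4], y = iris$Species,
    method = "knn",
    tuneGrid = knn_grid,
    trControl = trainControl(method = "cv", number = 5)
  )

  set.seed(1)
  # Let train() generate the candidates: tuneLength levels per parameter.
  fit_len <- train(
    x = iris[, 1:4], y = iris$Species,
    method = "knn",
    tuneLength = 4,
    trControl = trainControl(method = "cv", number = 5)
  )
}
```

An explicit grid is usually the better choice when sensible parameter ranges are already known; tuneLength is convenient for a first pass.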

Details

train can be used to tune models by picking the complexity parameters that are associated with the optimal resampling statistics. For a particular model, a grid of parameters (if any) is created and the model is trained on slightly different data for each candidate combination of tuning parameters. Across each data set, the performance of held-out samples is calculated and the mean and standard deviation are summarized for each combination. The combination with the optimal resampling statistic is chosen as the final model and the entire training set is used to fit a final model.
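The procedure described above can be sketched in base R. This is only an illustration of the idea, not caret's actual implementation: it assumes bootstrap resampling, uses the degree of a polynomial in a linear model as a stand-in tuning parameter, and scores candidates by RMSE. All object names are hypothetical.

```r
set.seed(1)
dat <- mtcars[, c("mpg", "hp")]

# Candidate tuning values (polynomial degree is an illustrative choice).
grid <- data.frame(degree = 1:4)

# Each resample is a vector of row indices drawn with replacement,
# so every candidate is fit on slightly different data.
n_resamples <- 25
resamples <- lapply(seq_len(n_resamples), function(i) {
  sample(nrow(dat), replace = TRUE)
})

rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

# For each candidate, fit on each resample and score the held-out rows.
grid$RMSE <- NA_real_
grid$RMSESD <- NA_real_
for (g in seq_len(nrow(grid))) {
  d <- grid$degree[g]
  scores <- vapply(resamples, function(idx) {
    fit <- lm(mpg ~ poly(hp, d), data = dat[idx, ])
    held_out <- dat[-unique(idx), ]
    rmse(held_out$mpg, predict(fit, newdata = held_out))
  }, numeric(1))
  grid$RMSE[g] <- mean(scores)   # resampled performance estimate
  grid$RMSESD[g] <- sd(scores)   # its variability across resamples
}

# Pick the candidate with the best (lowest) mean RMSE ...
best <- grid[which.min(grid$RMSE), , drop = FALSE]
best_degree <- best$degree

# ... and refit it on the entire training set.
final_model <- lm(mpg ~ poly(hp, best_degree), data = dat)
print(grid)
```

The resulting grid data frame plays the role of the results component described under Value, and best corresponds to bestTune.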

The predictors in x can be almost any object as long as the underlying model fit function can deal with the object class. The function was designed to work with simple matrices and data frame inputs, so some functionality may not work (e.g. pre-processing). When using string kernels, the vector of character strings should be converted to a matrix with a single column.

More details on this function can be found at http://topepo.github.io/caret/model-training-and-tuning.html.

A variety of models are currently available and are enumerated by tag (i.e. their model characteristics) at http://topepo.github.io/caret/train-models-by-tag.html.

More details on using recipes can be found at http://topepo.github.io/caret/using-recipes-with-train.html. Note that case weights can be passed into train using a role of "case weight" for a single variable. Also, if there are non-predictor columns that should be used when determining the model's performance metrics, the role of "performance var" can be used with multiple columns and these will be made available during resampling to the summaryFunction function.

Value

A list is returned of class train containing:

method

The chosen model.

modelType

An identifier of the model type.

results

A data frame with the training error rate and values of the tuning parameters.

bestTune

A data frame with the final parameters.

call

The (matched) function call with dots expanded

dots

A list containing any ... values passed to the original call

metric

A string that specifies what summary metric will be used to select the optimal model.

control

The list of control parameters.

preProcess

Either NULL or an object of class preProcess

finalModel

A fit object using the best parameters

trainingData

A data frame

resample

A data frame with columns for each performance metric. Each row corresponds to each resample. If leave-one-out cross-validation or out-of-bag estimation methods are requested, this will be NULL. The returnResamp argument of trainControl controls how much of the resampled results are saved.

perfNames

A character vector of performance metrics that are produced by the summary function

maximize

A logical recycled from the function arguments.

yLimits

The range of the training set outcomes.

times

A list of execution times: everything is for the entire call to train, final for the final model fit and, optionally, prediction for the time to predict new samples (see trainControl)
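The components listed above can be inspected directly on the returned object. The following is a small sketch, assuming a knn classifier on iris with 5-fold cross-validation; knn_fit is a hypothetical name, and the caret calls are guarded so the snippet is a no-op where caret is not installed.

```r
if (requireNamespace("caret", quietly = TRUE)) {
  library(caret)
  set.seed(1)

  knn_fit <- train(
    x = iris[, 1:4], y = iris$Species,
    method = "knn",
    preProcess = c("center", "scale"),
    tuneLength = 3,
    trControl = trainControl(method = "cv", number = 5)
  )

  print(knn_fit$method)            # the chosen model
  print(knn_fit$modelType)         # classification or regression
  print(knn_fit$results)           # one row per candidate tuning value
  print(knn_fit$bestTune)          # the selected tuning value(s)
  print(knn_fit$perfNames)         # metric names from the summary function
  print(head(knn_fit$resample))    # per-resample metrics that were kept
  print(knn_fit$times$everything)  # timing for the whole train() call

  # finalModel is the fit on the full training set; calling predict() on
  # the train object applies the stored pre-processing first.
  print(head(predict(knn_fit, newdata = iris[, 1:4])))
}
```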

Author(s)

Max Kuhn (the guts of train.formula were based on Ripley's nnet.formula)

References

http://topepo.github.io/caret/

Kuhn (2008), “Building Predictive Models in R Using the caret Package” (doi:10.18637/jss.v028.i05)

https://topepo.github.io/recipes/

See Also

models, trainControl, update.train, modelLookup, createFolds, recipe

Examples

## Not run:

library(caret)

#######################################
## Classification Example

data(iris)
TrainData <- iris[, 1:4]
TrainClasses <- iris[, 5]

knnFit1 <- train(TrainData, TrainClasses,
                 method = "knn",
                 preProcess = c("center", "scale"),
                 tuneLength = 10,
                 trControl = trainControl(method = "cv"))

knnFit2 <- train(TrainData, TrainClasses,
                 method = "knn",
                 preProcess = c("center", "scale"),
                 tuneLength = 10,
                 trControl = trainControl(method = "boot"))

library(MASS)
nnetFit <- train(TrainData, TrainClasses,
                 method = "nnet",
                 preProcess = "range",
                 tuneLength = 2,
                 trace = FALSE,
                 maxit = 100)

#######################################
## Regression Example

library(mlbench)
data(BostonHousing)

lmFit <- train(medv ~ . + rm:lstat,
               data = BostonHousing,
               method = "lm")

library(rpart)
rpartFit <- train(medv ~ .,
                  data = BostonHousing,
                  method = "rpart",
                  tuneLength = 9)

#######################################
## Example with a custom metric

madSummary <- function (data, lev = NULL, model = NULL) {
  out <- mad(data$obs - data$pred, na.rm = TRUE)
  names(out) <- "MAD"
  out
}

robustControl <- trainControl(summaryFunction = madSummary)
marsGrid <- expand.grid(degree = 1, nprune = (1:10) * 2)

earthFit <- train(medv ~ .,
                  data = BostonHousing,
                  method = "earth",
                  tuneGrid = marsGrid,
                  metric = "MAD",
                  maximize = FALSE,
                  trControl = robustControl)

#######################################
## Example with a recipe

data(cox2)

cox2 <- cox2Descr
cox2$potency <- cox2IC50

library(recipes)

cox2_recipe <- recipe(potency ~ ., data = cox2) %>%
  ## Log the outcome
  step_log(potency, base = 10) %>%
  ## Remove sparse and unbalanced predictors
  step_nzv(all_predictors()) %>%
  ## Surface area predictors are highly correlated so
  ## conduct PCA just on these.
  step_pca(contains("VSA"), prefix = "surf_area_",
           threshold = .95) %>%
  ## Remove other highly correlated predictors
  step_corr(all_predictors(), -starts_with("surf_area_"),
            threshold = .90) %>%
  ## Center and scale all of the non-PCA predictors
  step_center(all_predictors(), -starts_with("surf_area_")) %>%
  step_scale(all_predictors(), -starts_with("surf_area_"))

set.seed(888)
cox2_lm <- train(cox2_recipe,
                 data = cox2,
                 method = "lm",
                 trControl = trainControl(method = "cv"))

#######################################
## Parallel Processing Example via multicore package

## library(doMC)
## registerDoMC(2)

## NOTE: don't run models from RWeka when using
## multicore. The session will crash.

## The code for train() does not change:
set.seed(1)
usingMC <- train(medv ~ .,
                 data = BostonHousing,
                 method = "glmboost")

## or use:
## library(doMPI) or
## library(doParallel) or
## library(doSMP) and so on

## End(Not run)
