
ISL - Ch.5 Resampling Methods


Resampling methods involve repeatedly drawing samples from a training set and refitting a model of interest on each sample in order to obtain additional information about the fitted model.

In this chapter, we discuss two of the most commonly used resampling methods, cross-validation and the bootstrap.

 

5.1 Cross-Validation

5.1.1 The Validation Set Approach

The validation set approach involves randomly dividing the available set of samples into two parts, a training set and a validation set (or hold-out set). The model is fit on the training set, and the fitted model is used to predict the responses for the observations in the validation set.
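The book's labs use R; as a rough illustration only, here is a minimal Python sketch (scikit-learn on simulated data, not from the book) of a 50/50 validation split with a linear regression:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))            # simulated predictor
y = 2 * X[:, 0] + rng.normal(size=100)   # simulated response

# Randomly split the observations into a training set and a hold-out set
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.5, random_state=1)

# Fit on the training set only, then estimate the test error on the validation set
model = LinearRegression().fit(X_train, y_train)
print(mean_squared_error(y_val, model.predict(X_val)))

The resulting validation MSE depends on which observations happen to land in the hold-out set, which is one motivation for the cross-validation methods below.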

 

5.1.2 Leave-one-out cross-validation (LOOCV)

 

Like the validation set approach, LOOCV involves splitting the set of observations into two parts. However, instead of creating two subsets of comparable size, a single observation (x1,y1) is used for the validation set, and the remaining observations {(x2, y2), . . . , (xn, yn)} make up the training set.
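As a sketch of the same idea (again in Python with simulated data, which is an assumption rather than the book's R code), each of the n iterations fits on n − 1 observations and validates on the single one left out, and the n squared errors are averaged:

import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 1))
y = 2 * X[:, 0] + rng.normal(size=50)

squared_errors = []
for train_idx, test_idx in LeaveOneOut().split(X):
    # Fit on the n - 1 remaining observations, validate on the single held-out one
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])[0]
    squared_errors.append((y[test_idx][0] - pred) ** 2)

# The LOOCV estimate of the test MSE is the average of the n squared errors
print(np.mean(squared_errors))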

 

5.1.3 k-Fold Cross-Validation

An alternative to LOOCV is k-fold CV. This approach involves randomly dividing the set of observations into k groups, or folds, of approximately equal size. The first fold is treated as a validation set, and the method is fit on the remaining k − 1 folds.
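A minimal Python sketch of 10-fold CV (scikit-learn on simulated data, as an illustration rather than the book's R lab): each fold serves once as the validation set, and the k fold-level MSE estimates are averaged.

import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 2 * X[:, 0] + rng.normal(size=100)

# k = 10 folds; each fold is used exactly once as the validation set
cv = KFold(n_splits=10, shuffle=True, random_state=1)
scores = cross_val_score(LinearRegression(), X, y, cv=cv,
                         scoring="neg_mean_squared_error")
print(-scores.mean())  # average the 10 fold-level MSE estimates

Note that LOOCV is the special case k = n.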

 

5.1.4 Bias-Variance Trade-Off for k-Fold Cross-Validation

The test error estimate resulting from LOOCV tends to have higher variance than does the test error estimate resulting from k-fold CV with k < n. In LOOCV we average the outputs of n fitted models, each trained on an almost identical set of observations; these outputs are highly correlated with one another, and the mean of many highly correlated quantities has higher variance than the mean of the less correlated quantities averaged in k-fold CV.

Typically, one performs k-fold cross-validation using k = 5 or k = 10.

 

5.2 The Bootstrap

Each bootstrap data set contains n observations, sampled with replacement from the original data set.
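As a small illustration (a NumPy sketch on simulated data; the statistic of interest here is the sample mean, chosen only as a stand-in), the bootstrap repeatedly resamples n observations with replacement and uses the spread of the resulting estimates to approximate the standard error:

import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=200)  # stands in for the original data set

B = 1000
boot_means = np.empty(B)
for b in range(B):
    # Each bootstrap data set has n observations drawn with replacement
    sample = rng.choice(data, size=data.shape[0], replace=True)
    boot_means[b] = sample.mean()

# Bootstrap estimate of the standard error of the sample mean
print(boot_means.std(ddof=1))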


 

Original post: http://www.cnblogs.com/sheepshaker/p/6664008.html
