Cross-Validation

Why

When a training set and a test set are available for model selection, one idea is to use the training set to estimate the parameters of each candidate model, and then choose the model that gives the lowest error on the test data.

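However, the choice of model is then itself an extra parameter fitted on the test set, so the test error of the chosen model underestimates its generalisation error. Cross-validation avoids this by making the selection on held-out parts of the training set and keeping the test set for the final estimate only.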
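
The optimism can be sketched in one inequality. The notation is an assumption made here for illustration: \hat{E}_{test}(m) is the test-set error of candidate model m, and E_{gen}(m) is its true generalisation error. If the test set is independent of the data used to fit each candidate, each test error is an unbiased estimate of the corresponding generalisation error, but the minimum over candidates is not:

```latex
\[
\mathbb{E}\Big[\min_{m} \hat{E}_{\text{test}}(m)\Big]
\;\le\;
\min_{m} \mathbb{E}\big[\hat{E}_{\text{test}}(m)\big]
\;=\;
\min_{m} E_{\text{gen}}(m)
\]
```

The left-hand side is what gets reported when the winner is picked on the test set; on average it is no larger than the generalisation error of even the best candidate.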

How

Types of cross-validation techniques

Example: cross-validation for model selection

  1. Determine the model options, e.g. the polynomial degree or the number of layers in a neural network.
  2. Select the model with the lowest cross-validation error, computed on the training set only.
  3. Estimate the generalisation error of the selected model on the test set.
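
The three steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the synthetic data (a noisy quadratic), the choice of k = 5 folds, the candidate degrees 1 to 5, and the helper names fit_polynomial, mse and cv_error are all assumptions made for this example.

```python
import random


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (A^T A) c = A^T y."""
    n = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [row + [rhs] for row, rhs in zip(ata, aty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution; coeffs[i] multiplies x ** i.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aug[i][n] - tail) / aug[i][i]
    return coeffs


def mse(coeffs, xs, ys):
    """Mean squared error of the polynomial on the given points."""
    preds = [sum(c * x ** i for i, c in enumerate(coeffs)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)


def cv_error(xs, ys, degree, k=5):
    """Average validation error over k folds of the training data."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        val_idx = set(range(fold, n, k))  # every k-th point forms one fold
        tr_x = [xs[i] for i in range(n) if i not in val_idx]
        tr_y = [ys[i] for i in range(n) if i not in val_idx]
        va_x = [xs[i] for i in sorted(val_idx)]
        va_y = [ys[i] for i in sorted(val_idx)]
        coeffs = fit_polynomial(tr_x, tr_y, degree)
        fold_errors.append(mse(coeffs, va_x, va_y))
    return sum(fold_errors) / k


# Synthetic data (assumed for illustration): y = 1 + 2x - 3x^2 + noise.
rng = random.Random(0)


def make_data(n):
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [1 + 2 * x - 3 * x * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


train_x, train_y = make_data(60)
test_x, test_y = make_data(30)

# Step 1: model options are the polynomial degrees.
degrees = [1, 2, 3, 4, 5]

# Step 2: pick the degree with the lowest cross-validation error (training set only).
cv_errors = {d: cv_error(train_x, train_y, d) for d in degrees}
best_degree = min(cv_errors, key=cv_errors.get)

# Step 3: refit on the full training set, then touch the test set exactly once.
final_coeffs = fit_polynomial(train_x, train_y, best_degree)
test_error = mse(final_coeffs, test_x, test_y)

print("CV errors:", cv_errors)
print("selected degree:", best_degree)
print("test error of selected model:", test_error)
```

Because the test set plays no part in choosing the degree, the final test error is not biased by the selection step.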