Cross-Validation
Technique to evaluate models by dividing the data into 'k' folds, training on k-1 of them and testing on the remaining fold, to obtain a more reliable estimate of performance.
Key Concepts
k-fold Cross-Validation
The most common form of cross-validation: the data is split into k equal-sized folds, and each fold serves once as the evaluation set.
Stratified k-fold Cross-Validation
A variation of k-fold cross-validation that preserves the class proportions of the full dataset in each fold, making it better suited to imbalanced datasets.
Leave-One-Out Cross-Validation
A type of cross-validation where the number of folds equals the number of instances in the dataset, so each model is evaluated on a single held-out example.
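The three splitters above can be compared with a minimal sketch using scikit-learn's model_selection module; the dataset shapes and labels here are illustrative, not from the original text.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold, LeaveOneOut

X = np.arange(20).reshape(10, 2)   # 10 instances, 2 features (toy data)
y = np.array([0] * 8 + [1] * 2)    # imbalanced labels: 8 vs 2

kf = KFold(n_splits=5)             # 5 folds of 2 instances each
skf = StratifiedKFold(n_splits=2)  # keeps the 8:2 class ratio inside each fold
loo = LeaveOneOut()                # one fold per instance

print(kf.get_n_splits(X))          # number of train/test rounds: 5
print(skf.get_n_splits(X, y))      # 2
print(loo.get_n_splits(X))         # equals the dataset size: 10
```

Note that LeaveOneOut produces as many folds as there are instances, which is why it is rarely used on large datasets.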
Detailed Explanation
Cross-validation is a technique for evaluating machine learning models by training several models on subsets of the available input data and evaluating them on the complementary subsets. Because each model is scored on data it was not trained on, cross-validation makes overfitting much easier to detect.
The most common type of cross-validation is k-fold cross-validation. In k-fold cross-validation, the data is divided into k folds. The model is then trained on k-1 folds and evaluated on the remaining fold. This process is repeated k times, with each fold being used as the evaluation set once. The final performance of the model is the average of the performance on the k folds.
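The k-fold procedure described above can be sketched as an explicit loop; this assumes a scikit-learn-style estimator (LogisticRegression on the built-in iris dataset is used purely for illustration).

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = load_iris(return_X_y=True)
k = 5
scores = []
for train_idx, test_idx in KFold(n_splits=k, shuffle=True, random_state=0).split(X):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])                  # train on k-1 folds
    scores.append(model.score(X[test_idx], y[test_idx]))   # evaluate on the held-out fold

mean_score = np.mean(scores)   # final estimate: the average over the k folds
print(f"mean accuracy over {k} folds: {mean_score:.3f}")
```

In practice this loop is usually replaced by a single call to sklearn's cross_val_score, which performs the same splitting, fitting, and averaging internally.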
Real-World Examples & Use Cases
Model Selection
Compare candidate models on the same folds and select the one with the best average score.
Hyperparameter Tuning
Score each candidate hyperparameter setting by cross-validation and keep the best-performing one.
Performance Estimation
Estimate how a model will perform on unseen data without setting aside a separate holdout set.
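The hyperparameter-tuning use case can be sketched with scikit-learn's GridSearchCV, which scores every candidate setting by k-fold cross-validation; the SVC model and the parameter grid below are illustrative choices, not prescribed by the text.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Each candidate value of C is scored by 5-fold cross-validation;
# the value with the best average score across folds is selected.
search = GridSearchCV(SVC(), param_grid={"C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X, y)

print(search.best_params_)            # the C value that scored best
print(round(search.best_score_, 3))   # its mean cross-validated accuracy
```

The same pattern covers model selection (put different estimators in the grid) and performance estimation (report the mean cross-validated score rather than a single train/test split).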