A supervised learning task where the model predicts a continuous numerical value — such as house prices, temperature, or stock returns — rather than assigning data to discrete categories.
In Depth
Regression is, alongside classification, one of the two fundamental supervised learning tasks. While classification predicts discrete labels (categories), regression predicts continuous numerical outputs. The goal is to learn a function that maps input features to a real-valued target variable while minimizing prediction error. The simplest example is linear regression, which fits a straight line through data points — but modern regression encompasses highly complex, nonlinear models including deep neural networks.
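To make the simplest case concrete, here is a minimal sketch of fitting a straight line by ordinary least squares with NumPy; the data is synthetic and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)            # single input feature
y = 3.0 * x + 2.0 + rng.normal(0, 1, 100)   # noisy linear target

# np.polyfit solves the least-squares problem for polynomial coefficients;
# degree 1 recovers the slope and intercept of the best-fit line.
w, b = np.polyfit(x, y, deg=1)
print(f"slope = {w:.2f}, intercept = {b:.2f}")  # close to 3.0 and 2.0
```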
Regression algorithms span a wide range of complexity. Linear Regression assumes a linear relationship between inputs and output. Polynomial Regression captures curved relationships. Ridge and Lasso Regression add regularization to prevent overfitting. Decision Tree Regression and Random Forest Regression handle nonlinear patterns by splitting data into regions. Neural network regression — including deep architectures — can model extremely complex, high-dimensional relationships but requires more data and compute.
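A sketch comparing several of the regressors named above, assuming scikit-learn is available; the sinusoidal target is chosen so the tree-based models visibly outperform the plain linear fit:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 500)  # nonlinear relationship

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),     # L2 regularization
    "lasso": Lasso(alpha=0.01),    # L1 regularization
    "tree": DecisionTreeRegressor(max_depth=5),
    "forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    # .score() reports R² on held-out data
    print(f"{name}: R^2 = {model.score(X_test, y_test):.3f}")
```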
Regression models are evaluated using metrics such as Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and R-squared (R²). Each captures different aspects of prediction quality: MSE penalizes large errors heavily, MAE weights all errors in proportion to their magnitude, and R² measures the proportion of variance explained by the model. Which metric matters most depends on the application — in financial forecasting, for example, large errors (outliers) may be catastrophic, making MSE more appropriate.
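These metrics are straightforward to compute; a short sketch using scikit-learn, on small arrays whose values are made up for illustration:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.4, 2.9, 6.1])

mse = mean_squared_error(y_true, y_pred)    # squares each error: outliers dominate
rmse = np.sqrt(mse)                         # same units as the target variable
mae = mean_absolute_error(y_true, y_pred)   # each error weighted linearly
r2 = r2_score(y_true, y_pred)               # fraction of variance explained

print(f"MSE={mse:.3f}  RMSE={rmse:.3f}  MAE={mae:.3f}  R^2={r2:.3f}")
```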
Regression predicts continuous numerical values rather than categories — it is the foundation of price forecasting, demand estimation, and any task requiring quantitative predictions.