
Regression evaluation metrics

1. Mean Absolute Error (MAE)

  • What it measures:
    The average absolute difference between predicted and actual values.

    • Ignores direction (over- or under-prediction).
    • Treats all errors equally.
  • Calculation: MAE = (|2| + |3| + |3| + |1| + |2| + |3|) / 6 = 14/6 ≈ 2.33

    • Interpretation: On average, predictions are off by 2.33 ice creams.
  • When to use:

    • When all errors (small or large) should be weighted equally.
    • Preferred when outliers are not a critical concern.
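
A minimal Python sketch of this calculation; the sales figures below are made up (not from the text) and chosen so the absolute errors come out to 2, 3, 3, 1, 2, 3 as in the worked example:

```python
# Hypothetical ice cream sales, chosen so the absolute errors are
# 2, 3, 3, 1, 2, 3, matching the worked example above.
actual    = [10, 12, 9, 11, 13, 10]
predicted = [12, 9, 12, 10, 11, 13]

# MAE: mean of the absolute differences; direction is ignored.
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
print(mae)  # 2.333... (14/6)
```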

2. Mean Squared Error (MSE)

  • What it measures:
    The average of the squared differences between predicted and actual values.

    • Emphasizes larger errors (since squaring amplifies them).
  • Calculation: MSE = (2² + 3² + 3² + 1² + 2² + 3²) / 6 = (4 + 9 + 9 + 1 + 4 + 9) / 6 = 36/6 = 6

    • Interpretation: The "score" is 6, but this isn’t in ice cream units (due to squaring).
  • When to use:

    • When large errors are particularly undesirable (e.g., in financial models).
    • Used as a loss function in machine learning (optimization).
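
A sketch on the same made-up data as the MAE example, showing how squaring amplifies the larger errors:

```python
# Same hypothetical data as in the MAE sketch.
actual    = [10, 12, 9, 11, 13, 10]
predicted = [12, 9, 12, 10, 11, 13]

# MSE: mean of squared differences; each error of 3 contributes 9,
# so large misses dominate the score.
mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
print(mse)  # 6.0 (36/6)
```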

3. Root Mean Squared Error (RMSE)

  • What it measures:
    The square root of MSE, converting the metric back to the original units (ice creams).

  • Calculation: RMSE = √MSE = √6 ≈ 2.45

  • Interpretation: Predictions are off by ~2.45 ice creams on average, with larger errors weighted more heavily.

  • When to use:

    • When you want MSE’s sensitivity to large errors but need interpretability in original units.
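
A sketch continuing the same made-up data; taking the square root converts the score back to ice cream units:

```python
import math

# Same hypothetical data as in the MAE/MSE sketches.
actual    = [10, 12, 9, 11, 13, 10]
predicted = [12, 9, 12, 10, 11, 13]

mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
rmse = math.sqrt(mse)  # back to original units
print(rmse)  # ≈ 2.449
```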

Key Differences Summary

Metric | Units                       | Outlier Sensitivity | Interpretation
MAE    | Original (e.g., ice creams) | Low                 | "Average error is X units"
MSE    | Squared (e.g., ice creams²) | High                | "Squared error average is X"
RMSE   | Original (e.g., ice creams) | High                | "Error magnitude is ~X units"

Ice Cream Example Recap

  • MAE (2.33): "On average, predictions are off by ~2.3 ice creams."
  • MSE (6): "Large errors are penalized; score is 6 (no unit meaning)."
  • RMSE (2.45): "After squaring and square-rooting, errors average ~2.5 ice creams, with larger mistakes weighted more."

Which to choose?

  • Use MAE for straightforward, outlier-resistant interpretation.
  • Use RMSE if large errors are critical (e.g., safety-critical systems).

When to Use

  • MAE: When all errors should be treated equally
  • MSE/RMSE: When large errors are particularly undesirable
  • RMSE: Preferred when you need interpretability in original units

Coefficient of Determination (R²)

Definition

R² (R-Squared) measures the proportion of variance in the dependent variable (e.g., ice cream sales) that is predictable from the independent variable(s) in a regression model. It quantifies how well the model explains the observed data.


Key Formula

R² = 1 − Σ(yᵢ − ŷᵢ)² / Σ(yᵢ − ȳ)²

  • yᵢ: Actual values
  • ŷᵢ: Predicted values
  • ȳ: Mean of the actual values
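
A minimal sketch of this formula in Python; the values are made up (not from the text) and happen to give R² ≈ 0.91:

```python
# Hypothetical actual sales and reasonably close predictions.
actual    = [10, 12, 9, 11, 13, 10]
predicted = [10.5, 11.5, 9.5, 11, 12.5, 10]

mean_y = sum(actual) / len(actual)
ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
ss_tot = sum((a - mean_y) ** 2 for a in actual)                # variance around the mean
r2 = 1 - ss_res / ss_tot
print(r2)  # ≈ 0.91 for these made-up numbers
```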

Interpretation

  • R² = 1: Perfect fit (100% variance explained).
  • R² = 0: Model explains none of the variance (no better than predicting the mean).
  • R² < 0: Model performs worse than a horizontal line (rare, indicates severe issues).

Ice Cream Example

  • R² = 0.95:
    • The model explains 95% of the variance in ice cream sales.
    • Only 5% of the variance is due to random/unexplained factors (e.g., festivals, weather anomalies).

Comparison with Other Metrics

Metric   | Purpose                          | Scale              | Focus
MAE      | Average absolute error           | Original           | Magnitude of errors
MSE/RMSE | Squared / root-squared error     | Squared / Original | Penalizes large errors
R²       | Proportion of variance explained | 0 to 1             | Model explanatory power

Why It Matters

  1. Contextualizes Error:
    • Unlike MAE/MSE, R² compares errors to the natural variance in the data.
  2. Model Selection:
    • Higher R² indicates a better-fit model (but beware of overfitting!).
  3. Intuitive Scale:
    • 0–1 range simplifies communication of model performance.

Limitations

  • Not for All Models: Misleading for non-linear relationships.
  • Overfitting Risk: Adding variables can artificially inflate R².
  • No Direction: Doesn’t indicate if predictions are biased.

Note: Use R² alongside other metrics (e.g., RMSE) for a complete evaluation.
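
In practice these metrics are usually computed together; a minimal sketch using scikit-learn's metrics, reusing the made-up data from the earlier examples:

```python
import math
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical data reused from the earlier sketches.
actual    = [10, 12, 9, 11, 13, 10]
predicted = [12, 9, 12, 10, 11, 13]

mae  = mean_absolute_error(actual, predicted)
mse  = mean_squared_error(actual, predicted)
rmse = math.sqrt(mse)  # RMSE, portable across sklearn versions
r2   = r2_score(actual, predicted)

# R² is negative here: these made-up predictions are worse than
# simply predicting the mean (see the R² < 0 case above).
print(mae, mse, rmse, r2)  # ≈ 2.33, 6.0, 2.45, -2.32
```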