Harsimranjit Singh

Secrets of Regression Evaluation - A Guide to Common Metrics

Yesterday, we discussed the inner workings of Ordinary Least Squares (OLS) in simple linear regression. Today, we will shift our focus to evaluating the performance of these models.

Today's Focus: Regression Metrics - Evaluating Model Performance

Why Evaluation Metrics Matter

When we build a model, we aim to predict outcomes accurately. Evaluation metrics help us quantify how well our model is performing.

Understanding Regression Metrics

  • The Big Picture: These metrics offer a statistical assessment of the difference between the predicted values (ŷ_i) and the actual values (y_i) in our data. By analyzing these metrics, we gain valuable knowledge about the strengths and weaknesses of our model.

  • Choosing the right metric: There's no "one size fits all" answer. The ideal metric depends on your specific problem and context.

  • Multiple metrics for a holistic view: Often, it's beneficial to use a combination of metrics to get a better view of the model's performance.

Regression Metrics:

Now, let's explore some of the most common regression metrics.

1. Mean Absolute Error (MAE):

  • Formula:

MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|

Interpretation: MAE measures the average absolute difference between the actual values (y_i) and the predicted values (ŷ_i). It gives a direct sense of how wrong the predictions are on average.

Advantage:

  • The error is in the same units as the target variable, which makes it easy to interpret.
  • Robust to outliers (less sensitive to extreme errors than squared metrics).

Disadvantage:

  • The absolute-value loss is not differentiable at zero, which makes it harder to use directly with gradient-based optimization.
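
To make this concrete, here is a minimal sketch of how MAE could be computed directly from the formula with NumPy and checked against scikit-learn. The array values below are made up purely for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical actual and predicted values, for illustration only
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

# MAE from the formula: the average of the absolute errors
mae_manual = np.mean(np.abs(y_true - y_pred))

# The same metric via scikit-learn
mae_sklearn = mean_absolute_error(y_true, y_pred)

print(mae_manual, mae_sklearn)  # both print 0.75
```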

2. Mean Squared Error (MSE):

Formula:

MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2

Interpretation: MSE squares each error before averaging, giving more weight to larger errors. This makes it sensitive to outliers.

Advantages:

  • The squared loss is differentiable everywhere, including at zero, which makes it convenient for gradient-based optimization.
  • Its smooth, convex form is useful for further mathematical analysis.

Disadvantages:

  • Sensitive to outliers, since large errors are squared.
  • The units are the square of the target variable's units, which makes the value harder to interpret.
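
As a quick sketch (reusing the same hypothetical values as in the MAE example), MSE can be computed directly from the formula or with scikit-learn:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical actual and predicted values, for illustration only
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

# MSE from the formula: the average of the squared errors
mse_manual = np.mean((y_true - y_pred) ** 2)

# The same metric via scikit-learn
mse_sklearn = mean_squared_error(y_true, y_pred)

print(mse_manual, mse_sklearn)  # both print 0.875
```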

3. Root Mean Squared Error (RMSE):

Formula:

RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} = \sqrt{\text{MSE}}

Interpretation: RMSE is the square root of MSE, bringing the error back to the original units of the target variable. This makes RMSE easier to interpret than MSE.

Advantages:

  • Easier to interpret than MSE, because its units are the same as those of the target variable.

Disadvantages:

  • Sensitive to outliers.
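
A minimal sketch, taking the square root of the MSE from the previous example (the values are still hypothetical):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical actual and predicted values, for illustration only
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

# RMSE is simply the square root of MSE,
# which brings the error back to the target variable's units
rmse = np.sqrt(mean_squared_error(y_true, y_pred))

print(rmse)  # ~0.935
```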

4. R2 Score:

Formula:

R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = 1 - \frac{SS_{res}}{SS_{tot}}

Interpretation: R-squared is a statistical measure that represents the proportion of variance in the dependent variable (target) that is explained by the independent variables (input) in the model.

Understanding Variance:
Think of it in terms of differences: variance measures how much values spread out from their mean. R2 compares the variation left over after our model's predictions (the residuals) with the total variation in the actual values of the dependent variable.

  • The higher the R2 score, the better the model explains the variance in the target. A score of 1 means a perfect fit, while 0 means the model does no better than always predicting the mean.

Advantages:

  • Easy to interpret, since it represents the proportion of variance explained (typically between 0 and 1).

Disadvantages:

  • Can be misleading for complex models with many input features, because R2 never decreases when more features are added (adjusted R2 is often used to address this).
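
To tie the formula and interpretation together, here is a small sketch computing R2 both from the residual and total sums of squares and with scikit-learn, using the same hypothetical values as before:

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical actual and predicted values, for illustration only
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

# Residual sum of squares: the error left unexplained by the model
ss_res = np.sum((y_true - y_pred) ** 2)

# Total sum of squares: the spread of the target around its mean
ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)

r2_manual = 1 - ss_res / ss_tot

# The same metric via scikit-learn
r2_sklearn = r2_score(y_true, y_pred)

print(r2_manual, r2_sklearn)  # both ~0.724
```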

Conclusion

Today, we talked about different evaluation metrics, equipping ourselves with valuable tools to assess the performance of our linear regression models. Remember, there's no single "perfect" metric – the best choice depends on your problem and data.

Stay tuned for the next topics.
