Machine Learning — Model Evaluation Metrics #3

Ufuk Çolak · Published in Nerd For Tech · 4 min read · May 10, 2021


In my previous article, I gave a short introduction to modeling and examined model validation methods. In this article, I take a look at the metrics we use to evaluate the success of the models we build.

Machine learning problems fall into two main types: Regression and Classification. Let’s look at the metrics used to measure the success of each kind of model.

Model Evaluation Metrics for Regression Models

Mean Squared Error (MSE)

Mean squared error is one of the most common error metrics for regression models. When we build a model, its purpose is to predict the values of the dependent variable. We use the formula below to measure the errors we make while predicting.
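The original post shows this formula as an image; reconstructed in LaTeX from the definitions below, the standard MSE formula is:

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```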

When we examine the formula:

  • n is the number of observations
  • y denotes the actual values
  • ŷ denotes the predicted values

From each actual value y of our dependent variable, we subtract the predicted value ŷ and square the difference. Our aim here is to quantify the errors we make.

For example, let’s assume that the price of a house is $600K for one observation unit. With the model we built, we predicted the price of this house as $610K. Here, we have made a $10K error for the first observation unit. We might observe a $20K error in another prediction and a $5K error in yet another.

We use the mean squared error to find out how large our errors are on average at the end of the day.
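To make this concrete, here is a minimal Python sketch of the house-price example. The $600K/$610K pair comes from the text; the other two actual prices are made up so that the errors are the $20K and $5K mentioned above.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Actual house prices in $K; 600 comes from the article,
# the other two values are hypothetical.
y_true = np.array([600, 450, 320])
# Predictions off by $10K, $20K, and $5K, as in the text.
y_pred = np.array([610, 470, 315])

# MSE averages the squared errors: (10**2 + 20**2 + 5**2) / 3
mse = mean_squared_error(y_true, y_pred)
print(mse)  # 175.0

# Taking the square root (RMSE) brings the error back to $K units.
print(np.sqrt(mse))  # ~13.23
```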

Model Evaluation Metrics for Classification Models

Let’s proceed with an example commonly used to understand the success criteria of classification models: deciding whether an email is spam or not.

Using a machine learning algorithm, we built a model that predicts whether an email is spam. Now we ask this model: given these feature values, do you think the email is spam?

At this point, we compare the model’s predictions with the actual values, which gives the four outcomes below.

  • If the email is spam in the train set and we predict it is spam, it is called a True Positive (TP).
  • If the email is spam in the train set but we predict it is not spam, it is called a False Negative (FN).
  • If the email is not spam in the train set but we predict it is spam, it is called a False Positive (FP).
  • If the email is not spam in the train set and we predict it is not spam, it is called a True Negative (TN).
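These four outcomes are usually arranged in a confusion matrix. Below is a minimal sketch using scikit-learn with small hypothetical label vectors (1 = spam, 0 = not spam).

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = spam, 0 = not spam
y_true = [1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1, 0, 0, 1]

# Rows are actual classes, columns are predicted classes.
# With labels=[1, 0] the layout is: [[TP, FN], [FP, TN]]
cm = confusion_matrix(y_true, y_pred, labels=[1, 0])
print(cm)
# [[3 1]
#  [1 3]]
```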

Usually, we use Accuracy to evaluate the success of a model in classification problems; it is simply the rate of correct classifications.

Accuracy: (TP + TN) / All Observations

Accuracy rate for our example: (100 + 700) / 1000 = 80%

Its complement is the error rate: 1 − Accuracy gives us the error rate.

Error Rate: (FN + FP) / All Observations

Precision: TP / (TP + FP)

Sensitivity (Recall): TP / (TP + FN)
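Putting these formulas together with the article’s example counts: TP = 100 and TN = 700 out of 1000 observations. The FP/FN split is not given in the text, so FP = 150 and FN = 50 are assumed here purely for illustration.

```python
# Counts from the article's example: TP = 100, TN = 700, 1000 total.
# The FP/FN split is not given, so FP = 150 and FN = 50 are assumed.
TP, TN, FP, FN = 100, 700, 150, 50
total = TP + TN + FP + FN  # 1000

accuracy = (TP + TN) / total      # 0.80, matching the 80% above
error_rate = (FN + FP) / total    # 0.20, i.e. 1 - accuracy
precision = TP / (TP + FP)        # 0.40
sensitivity = TP / (TP + FN)      # ~0.67 (also called recall)

print(accuracy, error_rate, precision, round(sensitivity, 2))
```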

ROC Curve

We continue with the ROC Curve, one of the success evaluation criteria for classification models. We can also think of it as a graphical summary of the model’s classification success.

If we look at the graph, the False Positive Rate is on the X-axis and the True Positive Rate is on the Y-axis. The False Positive Rate is the ratio of False Positive (FP) observations to all actual Negatives in the confusion matrix. Likewise, the True Positive Rate is the ratio of True Positive (TP) observations to all actual Positives.

Values from 0 to 1 on the X and Y axes show these False Positive and True Positive rates. The area under the curve is called the Area Under Curve (AUC). If this area is large, the model performs well; if it is small, the model is unsuccessful.

In other words, the more the solid curve bows toward the top-left corner, the higher the prediction success of the model.

The closer this solid curve is to the dashed line in the middle, the lower the prediction success.

The dashed line in the middle represents random guessing: if we had not fit any model and had simply assigned every observation to class 1 or 0 at random, we would be right about 50% of the time. When fitting models, we want the area under the solid ROC curve to be well above this line. In other words, when we plot the ROC curve of a fitted model, its shape gives us information about the model’s success.
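As a sketch of how such a plot is produced, here is a minimal example using scikit-learn and matplotlib with hypothetical predicted probabilities; the dashed diagonal is the 50% random-guess line described above.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

# Hypothetical true labels and predicted spam probabilities
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.5, 0.7])

# Compute FPR/TPR pairs across thresholds, then the area under the curve
fpr, tpr, thresholds = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)

plt.plot(fpr, tpr, label=f"ROC curve (AUC = {roc_auc:.2f})")
plt.plot([0, 1], [0, 1], linestyle="--", label="Random guess (AUC = 0.50)")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.legend()
plt.show()
```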

In the next article, see you for the Bias-Variance Trade-off and Model Tuning methods…
