4.5 Logistic Model Evaluation

To decide whether a logistic regression model is good and useful, we need to consider how well it predicts the outcome. We have already discussed a few measures: accuracy, sensitivity, specificity, the false positive rate, and the false negative rate.

To calculate accuracy, sensitivity, and specificity, you have to choose a threshold: how high the predicted probability of success must be before we predict a success. But what threshold should be used?

The threshold directly impacts the balance of false positives and false negatives.

If we have a high threshold (closer to 1), we set a very high bar for predicting a success. This makes it harder to predict a success, so we will have fewer false positives and more false negatives. In fact, if our threshold is 1, we wouldn’t predict any successes, so we’d have zero false positives but a lot of false negatives.
If we have a low threshold (closer to 0), we set a lower bar for predicting a success. It will be easier to predict a success, so we will have many more false positives and fewer false negatives. If the threshold were 0, we would predict a success for every unit, so the false negative rate would be zero but every unit without a success would be a false positive.
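To make the trade-off concrete, here is a minimal Python sketch. The predicted probabilities and outcomes are made-up illustration values, not data from this text, and `confusion_counts` is a hypothetical helper that predicts a success whenever the predicted probability is at least the threshold.

```python
def confusion_counts(probs, outcomes, threshold):
    """Return (TP, FP, TN, FN), predicting success when prob >= threshold."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, outcomes):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1  # predicted success, truly a success
        elif predicted == 1 and y == 0:
            fp += 1  # predicted success, truly not a success
        elif predicted == 0 and y == 0:
            tn += 1  # predicted no success, truly not a success
        else:
            fn += 1  # predicted no success, truly a success
    return tp, fp, tn, fn


# Hypothetical predicted probabilities and true outcomes (1 = success).
probs = [0.05, 0.20, 0.35, 0.40, 0.55, 0.60, 0.75, 0.90]
outcomes = [0, 0, 1, 0, 0, 1, 1, 1]

for threshold in (0.0, 0.3, 0.5, 0.7, 1.0):
    tp, fp, tn, fn = confusion_counts(probs, outcomes, threshold)
    print(f"threshold={threshold:.1f}  false positives={fp}  false negatives={fn}")
```

With these toy values, raising the threshold from 0 to 1 moves the false positive count from 4 down to 0 while the false negative count moves from 0 up to 4.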

In practice, we have to consider the human consequences of each type of error (false positive and false negative) in order to determine the appropriate balance for the data context.

For example, let’s consider the real consequences of an error in a mammogram. If the procedure indicates abnormal tissue but it ends up not being breast cancer, it’s a false positive; this result might lead to unnecessary tests and procedures and cause stress and anxiety for the patient. A false negative happens when the mammogram looks normal but the patient does, in fact, have breast cancer; this could delay important treatment. How do you weigh the benefits of catching early stages of breast cancer against the costs of unnecessary tests and anxiety?
What about the consequences of a false positive or false negative in advertising and sales? A business wants to predict whether a customer would buy something if it is on sale, but it costs some money to send an advertisement in the mail. A false positive happens when the business predicts a customer would be enticed to make a purchase based on an advertisement but they are not; the money spent on the advertisement is wasted. A false negative is a missed opportunity and missed revenue because a customer who would have bought did not receive the advertisement.
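One way to put numbers on the advertising example is to attach a cost to each type of error and total it up at each threshold. The sketch below is only an illustration: the probabilities, the purchase outcomes, and the two dollar amounts (`COST_FP`, `COST_FN`) are all assumed values, and `total_error_cost` is a hypothetical helper.

```python
def total_error_cost(probs, outcomes, threshold, cost_fp, cost_fn):
    """Total cost of errors when predicting a purchase if prob >= threshold."""
    pairs = list(zip(probs, outcomes))
    # False positive: ad sent (predicted purchase) but no purchase happened.
    fp = sum(1 for p, y in pairs if p >= threshold and y == 0)
    # False negative: no ad sent, but the customer would have purchased.
    fn = sum(1 for p, y in pairs if p < threshold and y == 1)
    return fp * cost_fp + fn * cost_fn


# Hypothetical predicted purchase probabilities and true outcomes (1 = would buy).
probs = [0.05, 0.20, 0.35, 0.40, 0.55, 0.60, 0.75, 0.90]
purchased = [0, 0, 1, 0, 0, 1, 1, 1]

COST_FP = 1.0  # assumed cost of mailing an ad that leads to no sale
COST_FN = 5.0  # assumed profit lost by not mailing a customer who would buy

for threshold in (0.3, 0.5, 0.7):
    cost = total_error_cost(probs, purchased, threshold, COST_FP, COST_FN)
    print(f"threshold={threshold:.1f}  total cost of errors={cost:.2f}")
```

Under these assumed costs, a missed sale is five times as expensive as a wasted mailing, so the lowest of the three thresholds gives the smallest total. Reversing the costs would favor a higher threshold.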

Depending on the context, we may focus on

maximizing the overall accuracy
minimizing the false negative rate (maximizing sensitivity), giving false negatives greater weight than false positives
minimizing the false positive rate (maximizing specificity), giving false positives greater weight than false negatives
balancing the false positive and false negative rates.
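These goals can lead to different thresholds for the same model. The following Python sketch compares a handful of candidate thresholds on made-up data. The data, the candidate list, and the helper `rates` are assumptions for illustration; in practice you would use your own model's predicted probabilities.

```python
def rates(probs, outcomes, threshold):
    """Return (accuracy, false positive rate, false negative rate)."""
    pairs = list(zip(probs, outcomes))
    tp = sum(1 for p, y in pairs if p >= threshold and y == 1)
    fp = sum(1 for p, y in pairs if p >= threshold and y == 0)
    tn = sum(1 for p, y in pairs if p < threshold and y == 0)
    fn = sum(1 for p, y in pairs if p < threshold and y == 1)
    accuracy = (tp + tn) / len(pairs)
    fpr = fp / (fp + tn)  # share of true non-successes predicted as successes
    fnr = fn / (fn + tp)  # share of true successes predicted as non-successes
    return accuracy, fpr, fnr


# Hypothetical predicted probabilities and true outcomes (1 = success).
probs = [0.05, 0.20, 0.35, 0.40, 0.55, 0.60, 0.75, 0.90]
outcomes = [0, 0, 1, 0, 0, 1, 1, 1]
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]

for t in candidates:
    accuracy, fpr, fnr = rates(probs, outcomes, t)
    print(f"threshold={t:.1f}  accuracy={accuracy:.3f}  FPR={fpr:.2f}  FNR={fnr:.2f}")

# Threshold with the highest accuracy (max keeps the first one if there is a tie).
best_accuracy = max(candidates, key=lambda t: rates(probs, outcomes, t)[0])

# Threshold where the false positive and false negative rates are closest.
most_balanced = min(
    candidates,
    key=lambda t: abs(rates(probs, outcomes, t)[1] - rates(probs, outcomes, t)[2]),
)
print("highest accuracy:", best_accuracy, " most balanced:", most_balanced)
```

In this toy data, the thresholds 0.3, 0.5, and 0.7 all reach the same accuracy but split their errors very differently between false positives and false negatives, which is why accuracy alone may not settle the choice.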

There are techniques that attempt to capture the “goodness” of fit without choosing a specific threshold, such as the ROC curve and likelihood measures (AIC and BIC – lower is better), which you’ll learn about in future statistics classes.