Doctor's Diagnosis Dilemma
Suppose we have a medical device that can test for the presence of an imaginary rare disease called Probaphobia in a patient. This device is 99% accurate; that is, it will detect the disease in 99% of sick patients and it will give the all-clear in 99% of healthy patients. Medical studies reveal that Probaphobia occurs uniformly at random in 1% of the world's population.
You go to the doctor, feeling a little under the weather. She administers the test and the device claims that you have the disease. How likely is it that you actually have Probaphobia?
Your guess was too high! The correct answer is that you're only 50% likely to have Probaphobia.
Don't worry, though: you're in good company. In 2010, psychologist Gerd Gigerenzer asked 100 American doctors a similar question about mammograms and breast cancer. Out of these 100 doctors, 95 overestimated the probability by about a factor of eight. OK, maybe you should worry if you're seeing American doctors for rare disease diagnoses.
Your guess was too low! The correct answer is that you're 50% likely to have Probaphobia.
It sounds like you're a little too suspicious of this medical device. At an accuracy of 99%, it's not going to be wrong all that often. Or maybe you're underestimating how rare Probaphobia actually is.
You were right on the money! The correct answer is that you're 50% likely to have Probaphobia.
In fact, you're doing much better than the people that we'd like to be good at this kind of reasoning. In 2010, psychologist Gerd Gigerenzer asked 100 American doctors a similar question about mammograms and breast cancer. Out of these 100 doctors, 95 overestimated the probability by about a factor of eight.
Let's get a handle on why there's only a 50% chance that you have Probaphobia given that the medical device said you tested positive. It can be a little tough to reason about probabilities, so let's run a little experiment.
Let's imagine that we take a sample of 1,600 people. Here, each person is a square. Some of these people are sick (each with probability 1%), but we won't see for sure if someone is sick or healthy for a couple more steps. First, let's test every person using our device.
Here we're coloring in everybody depending on whether or not they tested positive using our fancy medical device. Remember that right now, we don't actually know who is sick and who is healthy: we only know what the imperfect device tells us.
It turns out that XX people tested negative and YY tested positive. Let's group everyone together to make this easier to think about.
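If you'd like to run this step yourself, here's a minimal Python sketch of it. It assumes, as described above, that each person is sick with probability 1% and that the device reports the truth 99% of the time. The function name `simulate_tests`, its parameters, and the fixed seed are made up for illustration; this is not the explainer's actual code, and your counts will differ from the ones shown on the page.

```python
import random


def simulate_tests(n_people=1600, prevalence=0.01, accuracy=0.99, seed=0):
    """Return a list of (is_sick, tested_positive) pairs, one per person.

    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n_people):
        is_sick = rng.random() < prevalence
        # The device reports the truth with probability `accuracy`,
        # for sick and healthy people alike.
        device_correct = rng.random() < accuracy
        tested_positive = is_sick if device_correct else not is_sick
        people.append((is_sick, tested_positive))
    return people


people = simulate_tests()
# At this stage we only look at the test results, not at who is really sick.
positives = sum(1 for _, tested_positive in people if tested_positive)
negatives = len(people) - positives
print(f"{negatives} people tested negative and {positives} tested positive")
```

Change the seed to draw a different sample of 1,600 people; the positives should stay a small minority each time.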
Now that everything's in order, we see that the people who test positive are very strongly in the minority. This is reasonable: only 1% of the global population has Probaphobia, and our device is pretty accurate overall. It should be rare that someone actually tests positive. But we still don't know who's sick! How can we find out, you ask?
Oh! Did I forget to mention? Probaphobia makes you sprout wings after 10 years with the disease. That way, it becomes pretty obvious who among our sample actually had the disease, regardless of what the device said. Let's simulate the passing of a decade now to let the diseased people spread their wings and reveal themselves.
Here we see who's actually sick, independent of who tested positive or negative. Let's break down what we find. We'll start with the people that the device categorized correctly: the true negatives and the true positives.
As we know, the device isn't perfect. Here we've got the false positives: healthy people whom the device wrongly flagged as sick.
Even worse, we have a truly bad outcome: This person was ill, but the device didn't detect it.
Happily, we have no false negatives! People with this outcome would be in a really bad situation: they'd be sick but the device would claim otherwise.
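The four-way breakdown above can be sketched in a few lines of Python. This is only an illustration: the helper `tally_outcomes` and the tiny hand-made `sample` are hypothetical, and each person is assumed to be recorded as an `(is_sick, tested_positive)` pair.

```python
from collections import Counter


def tally_outcomes(people):
    """Count the four outcomes from (is_sick, tested_positive) pairs."""
    labels = {
        (True, True): "true positive",    # sick, and the device said so
        (False, False): "true negative",  # healthy, and the device agreed
        (False, True): "false positive",  # healthy, but flagged as sick
        (True, False): "false negative",  # sick, but given the all-clear
    }
    counts = Counter(labels[pair] for pair in people)
    # Report every category, even the ones with nobody in them.
    return {name: counts.get(name, 0) for name in labels.values()}


# A hand-made sample of eight people, not real experimental data.
sample = [(False, False)] * 6 + [(True, True), (False, True)]
outcome_counts = tally_outcomes(sample)
print(outcome_counts)
```

In this made-up sample nobody lands in the false negative group, which matches what happens in most runs of the experiment.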
Now that we've got everything arranged just so, let's answer the original question: how many people who tested positive actually have Probaphobia? We'll start by looking at just those who tested positive.
Here, we remove from consideration every person that tested negative, regardless of whether or not they had Probaphobia. In most experiments, we will only have true negatives, or healthy people that tested negative for Probaphobia, to remove. False negatives are pretty rare. But their health doesn't actually matter: we're trying to get the probability of having Probaphobia among those who test positive, correctly or not. A person who tested negative, sick or healthy, doesn't tell us anything interesting about what we want to know!
Next up, let's take a closer look at our remaining folks, all of whom tested positive.
Alright, things are coming into focus here. It turns out that when we restrict our attention to those who tested positive and compare the numbers of the healthy and the sick, these groups are not so far off in size, especially compared to the number of device-cleared, disease-free folks! Given that you know someone has tested positive, it's about as likely that they'll be in the healthy group as it is that they'll actually be sick.
To get our experimental result for the probability of having Probaphobia given that the device says you have it, we just need to divide the number of true positives by the total number of positives (true and false positive counts added together). Click on to see our result!
OK, so dividing our true positives by all of our positives, we estimate that you're only roughly 50% likely to actually have Probaphobia! So don't freak out yet, you might not be destined for wings.
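Here's that division as a small Python sketch. The function name and the counts passed to it are hypothetical, chosen only to show the arithmetic; they are not results from the experiment above.

```python
def estimate_sick_given_positive(true_positives, false_positives):
    """Estimate P(sick | tested positive) from experimental counts."""
    total_positives = true_positives + false_positives
    if total_positives == 0:
        # Nobody tested positive, so there is nothing to estimate from.
        return None
    return true_positives / total_positives


# Made-up counts for illustration: 15 true positives and 17 false positives.
estimate = estimate_sick_given_positive(true_positives=15, false_positives=17)
print(f"Estimated chance of being sick given a positive test: {estimate:.1%}")
```

Notice that the true negatives and false negatives never appear in the calculation, which is why we could set them aside earlier.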
The goal of this question was to help you reason about conditional probability: given that you know one thing about a situation, how can you use that information to deduce the likelihood of other outcomes? Here, you knew that you tested positive. This tells us that one of two exceedingly rare things has happened. You could be very unlucky and actually have Probaphobia, in which case the device is almost certainly right about you! Or you could be free of this rare disease like almost everybody else, yet subject to an unfortunate device misfire. With these numbers, we've seen that in a large group of people, about the same number of people fall into each category.
To derive that the true probability is exactly 50%, we'd need Bayes' Theorem. That's a theorem for another explainer! But if you click the button just one more time, I'll let you play around with the device accuracy and disease frequency to see how the probability behaves.
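If you'd rather play with the numbers offline, here's a hedged Python sketch of the same calculation. It applies the standard Bayes' Theorem formula without deriving it, and it assumes (as in our setup) that the device is equally accurate on sick and healthy patients. The function name and parameters are our own choices for illustration, not the code behind the interactive version.

```python
def probability_sick_given_positive(accuracy, prevalence):
    """Exact P(sick | tested positive) for a device with a single accuracy.

    Assumes the device is right with probability `accuracy` for both sick
    and healthy people, and that a fraction `prevalence` of people are sick.
    """
    sick_and_positive = accuracy * prevalence
    healthy_and_positive = (1 - accuracy) * (1 - prevalence)
    all_positive = sick_and_positive + healthy_and_positive
    if all_positive == 0:
        # A positive test can never occur with these settings.
        return None
    return sick_and_positive / all_positive


# Try a few device accuracies and disease frequencies.
for accuracy, prevalence in [(0.99, 0.01), (0.99, 0.001), (0.999, 0.01)]:
    result = probability_sick_given_positive(accuracy, prevalence)
    print(f"accuracy={accuracy}, prevalence={prevalence}: {result:.3f}")
```

With 99% accuracy and 1% prevalence the two products are equal, which is exactly why the answer comes out to 50%. Making the disease rarer pushes the probability down, while making the device more accurate pushes it up.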