Why do many textbooks on Bayes' Theorem include the frequency of the disease in examples on the reliability of medical tests?

A "standard" example of Bayes' Theorem goes something like the following:

In any given year, 1% of the population will get disease X. A particular test will detect the disease in 90% of individuals who have the disease but has a 5% false positive rate. If you have a family history of X, your chances of getting the disease are 10% higher than they would have been otherwise.

Virtually all explanations I've seen of Bayes' Theorem will include all of those facts in their formulation of the probability. It makes perfect sense to me to account for patient-specific factors like family history, and it also makes perfect sense to me to include information on the overall reliability of the test. I'm struggling to understand the relevance of the fact that 1% of the population will get disease X, though. In particular, that fact is presumably true for all patients who receive the test; that being the case, wouldn't Bayes' Theorem imply that the actual probability of a false positive is much higher than 5% (and that one of the numbers is therefore wrong)?

Alternatively, why doesn't the 5% figure already account for that fact? Given that the 5% figure was presumably calculated directly from the data, wouldn't Bayes' Theorem effectively be contradicting the data in this case?


Solution 1:

I believe it's commonly included because the result is counterintuitive. You would expect a positive result from a highly accurate test to be right most of the time, but when the disease is rare this isn't actually the case, and more evidence is needed. I think of it as the "error of one sample" fallacy: you can't run an experiment once and draw strong conclusions, even if the experiment is well designed.

Solution 2:

Further to user856's explanation in the comments, here's a complementary answer.

The way to frame/interpret medical tests in general is to understand them as updating one's level of certainty that the patient has the disease:

  • without a medical-test result, the disease prevalence (a measure of disease frequency) can be taken as the patient's probability of having the disease;
  • however, in the context of a medical-test result, the aforementioned probability has changed: its updated value depends not just on the disease prevalence (as before), but now also on the test's sensitivity (true positive rate) and specificity (true negative rate). In other words, our knowledge of said probability has been refined.

https://i.stack.imgur.com/ZPmMO.png

p:  disease prevalence and other (prior) risk factors
v:  test sensitivity
f:  test specificity
D:  Diseased
H:  Healthy
+:  Positive test result
-:  Negative test result

The abovementioned probabilities are

  1. the positive predictive value, i.e., the probability that the patient is indeed Diseased given a positive test result $$P(D|+)=\frac{P(D+)}{P(D+)+P(H+)}=\frac{pv}{pv+(1-p)(1-f)},$$
  2. the false omission rate, i.e., the probability that the patient is actually Diseased given a negative test result $$P(D|-)=\frac{P(D-)}{P(D-)+P(H-)}=\frac{p(1-v)}{p(1-v)+(1-p)f}.$$
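As a rough numerical check, the two formulas can be evaluated with the numbers from the question. This is only a sketch: the function name `predictive_values` is made up for illustration, the 5% false positive rate is read as a specificity of $f=0.95$, and the family-history case assumes that "10% higher" means a relative increase of the prior from $0.01$ to $0.011$, which the question does not state explicitly.

```python
def predictive_values(p, v, f):
    """Return (P(D|+), P(D|-)) for prior p, sensitivity v, specificity f."""
    true_pos = p * v                # P(D+)
    false_pos = (1 - p) * (1 - f)   # P(H+)
    false_neg = p * (1 - v)         # P(D-)
    true_neg = (1 - p) * f          # P(H-)
    ppv = true_pos / (true_pos + false_pos)
    false_omission = false_neg / (false_neg + true_neg)
    return ppv, false_omission


# Numbers from the question: 1% prevalence, 90% sensitivity,
# 5% false positive rate (so specificity f = 0.95).
ppv, fo = predictive_values(p=0.01, v=0.90, f=0.95)
print(f"P(D|+) = {ppv:.3f}")   # 0.154
print(f"P(D|-) = {fo:.4f}")    # 0.0011

# ASSUMPTION: family history raises the prior by a relative 10%, i.e. p = 0.011.
ppv_family, _ = predictive_values(p=0.011, v=0.90, f=0.95)
print(f"P(D|+) with family history = {ppv_family:.3f}")   # 0.167
```

So even though only 5% of healthy people test positive, roughly 85% of the people who test positive are healthy. The 5% figure is $P(+|H)$ and the 85% figure is $P(H|+)$; they are different conditional probabilities, so neither contradicts the other.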

Thus, a screening test's predictive values $P(D|+)\,$ & $\,P(H|-)$ and overall accuracy $$P(D+)+P(H-)=pv+(1-p)f$$ depend on both its technical characteristics (sensitivity and specificity) and the population that it is being used on (disease prevalence). In particular:

  • unless the test has 100% sensitivity, its number of false-negative results is proportional to the disease prevalence $p;$
  • unless the test has 100% specificity, its number of false-positive results is proportional to $(1-p).$
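The two proportionality statements can be illustrated by counting expected outcomes in a hypothetical group of 100,000 tested people. This is a sketch under assumed values: the helper `expected_counts`, the group size, and the three prevalences are chosen only for illustration, while the sensitivity and specificity are the 90% and 95% from the question.

```python
def expected_counts(n, p, v, f):
    """Expected outcome counts when n people are tested.

    p: prevalence, v: sensitivity, f: specificity.
    """
    return {
        "true_pos": n * p * v,
        "false_neg": n * p * (1 - v),        # proportional to p
        "false_pos": n * (1 - p) * (1 - f),  # proportional to (1 - p)
        "true_neg": n * (1 - p) * f,
    }


n = 100_000  # hypothetical group size
for p in (0.001, 0.01, 0.1):
    c = expected_counts(n, p, v=0.90, f=0.95)
    accuracy = (c["true_pos"] + c["true_neg"]) / n  # equals pv + (1 - p)f
    print(
        f"p={p}: false negatives={c['false_neg']:.0f}, "
        f"false positives={c['false_pos']:.0f}, accuracy={accuracy:.4f}"
    )
```

At 1% prevalence this gives about 100 false negatives but about 4,950 false positives, against only 900 true positives, which is why most positive results come from healthy people when the disease is rare.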

N.B. The OP mentions “test reliability”, but that’s a separate issue, since reliability typically refers to consistency across retakes of a test’s results.

Finally, here is a concrete extended example (based on actual data) to put all this in context (image not reproduced here).

Due to the low disease prevalence,

  • the PCR and rapid tests have a positive predictive value of only $4\%$ and $17\%$ respectively,
  • whereas their negative predictive values are both almost $100\%;$

the tests' overall accuracies are $95\%$ and $99\%$ respectively.