Binary Diagnostic Tests: A Friendly Cheat Sheet
3.1 One-page Concepts
- Sensitivity (Se = TPR): ability to correctly flag diseased people as positive
- Specificity (Sp = TNR): ability to correctly flag healthy people as negative
- PPV: if the test is positive, the chance the person truly has the disease
- NPV: if the test is negative, the chance the person is truly healthy
- Prevalence: fraction of the population that truly has the disease (acts as a prior)
Memory TIP
- Se: “don’t miss cases” (minimize false negatives)
- Sp: “avoid false alarms” (minimize false positives)
- PPV/NPV depend heavily on prevalence (Bayes!)
2×2 Table (Confusion Matrix)
| | True Disease (D) | True No Disease (D^{c}) |
|---|---|---|
| Test Positive (+) | TP = (x_1) | FP = (x_2) |
| Test Negative (-) | FN = (x_3) | TN = (x_4) |
- Sensitivity (Se, TPR): (\displaystyle \frac{x_1}{x_1+x_3}=P(+\mid D))
- Specificity (Sp, TNR): (\displaystyle \frac{x_4}{x_2+x_4}=P(-\mid D^{c}))
- PPV: (\displaystyle \frac{x_1}{x_1+x_2}=P(D\mid +))
- NPV: (\displaystyle \frac{x_4}{x_3+x_4}=P(D^{c}\mid -))
- Prevalence (Prev): (\displaystyle P(D))
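The four definitions above translate directly into code. A minimal sketch follows; the function name and the example counts are made up for illustration:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Se, Sp, PPV, NPV from the four cells of a 2x2 table."""
    se = tp / (tp + fn)    # sensitivity (TPR): x1 / (x1 + x3)
    sp = tn / (tn + fp)    # specificity (TNR): x4 / (x2 + x4)
    ppv = tp / (tp + fp)   # P(D | +): x1 / (x1 + x2)
    npv = tn / (tn + fn)   # P(D^c | -): x4 / (x3 + x4)
    return se, sp, ppv, npv

# Hypothetical counts, purely for illustration
se, sp, ppv, npv = diagnostic_metrics(tp=90, fp=5, fn=10, tn=895)
print(f"Se={se:.3f}, Sp={sp:.3f}, PPV={ppv:.3f}, NPV={npv:.3f}")
```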
Why are PPV/NPV sensitive to prevalence? (Bayesian intuition)
[ PPV=\frac{Se\cdot Prev}{Se\cdot Prev+(1-Sp)(1-Prev)},\qquad NPV=\frac{Sp(1-Prev)}{(1-Se)Prev+Sp(1-Prev)}. ]
- With a rare disease (Prev ↓), false positives dominate → PPV drops
- Conversely NPV gets very high (most people are healthy, so a negative is almost surely true)
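The two bullets above can be checked by coding the Bayes formulas and sweeping prevalence downward (the Se/Sp values are illustrative):

```python
def ppv_npv(se, sp, prev):
    """PPV and NPV from sensitivity, specificity, and prevalence (Bayes)."""
    ppv = se * prev / (se * prev + (1 - sp) * (1 - prev))
    npv = sp * (1 - prev) / ((1 - se) * prev + sp * (1 - prev))
    return ppv, npv

# As the disease gets rarer, PPV collapses while NPV climbs
for prev in (0.1, 0.01, 0.001):
    ppv, npv = ppv_npv(se=0.99, sp=0.98, prev=prev)
    print(f"prev={prev}: PPV={ppv:.3f}, NPV={npv:.5f}")
```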
1-minute numeric feel
Prev (=0.01), Se (=0.99), Sp (=0.98)
[ PPV \approx \frac{0.99\cdot 0.01}{0.99\cdot 0.01+0.02\cdot 0.99} \approx 0.33 ] → Only ~1 in 3 positives are true patients
(NPV \approx 0.9999) → A negative is almost certainly healthy
Bigger picture (population 10,000)
- 100 patients, 9,900 healthy
- TP = 99, FN = 1 (Se 0.99)
- TN = 9,702, FP = 198 (Sp 0.98)
- PPV = (99/(99+198) \approx 0.333)
- NPV = (9702/(9702+1) \approx 0.9999)
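The tally above can be reproduced by pushing the prevalence, Se, and Sp through a population of 10,000 (a sketch; `round` just keeps the counts as whole people):

```python
n = 10_000
prev, se, sp = 0.01, 0.99, 0.98

diseased = round(n * prev)   # 100 people truly have the disease
healthy = n - diseased       # 9,900 do not
tp = round(se * diseased)    # 99 caught by the test
fn = diseased - tp           # 1 missed
tn = round(sp * healthy)     # 9,702 correctly cleared
fp = healthy - tn            # 198 false alarms

print(f"PPV = {tp / (tp + fp):.3f}, NPV = {tn / (tn + fn):.4f}")
```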
Which metric when? (Practical cues)
- Maximize Se when missing a case is costly (cancer screening, infectious-disease triage)
  → Lower the threshold (Se↑), then follow with a high-Sp confirmatory test
- Maximize Sp when false positives are costly (invasive confirmation, expensive therapy)
  → Raise the threshold (Sp↑)
- PPV/NPV are for patient-facing probabilities (“If positive, the chance you truly have it is ~x%”)
  → Recompute with your prevalence via Bayes
- Don’t use accuracy alone: with rare diseases, calling everyone negative looks accurate but is useless
Thresholds and ROC (Receiver Operating Characteristic)
4.1 What’s a threshold?
Many tests/readers output a score (continuous/ordinal).
Classify as positive if score ≥ cutoff; otherwise negative.
- Threshold ↑ → Sp↑, Se↓ (conservative: fewer false alarms, more misses)
- Threshold ↓ → Se↑, Sp↓ (aggressive: fewer misses, more false alarms)
Se and Sp trade off. Fixing a single threshold can be misleading.
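The trade-off is easy to see on a toy score set (the scores below are invented; higher means “more likely diseased”):

```python
# Made-up scores for five healthy and five diseased subjects
healthy = [0.1, 0.2, 0.3, 0.4, 0.6]
disease = [0.5, 0.7, 0.8, 0.9, 0.95]

def se_sp_at(cutoff):
    """Classify as positive when score >= cutoff; return (Se, Sp)."""
    se = sum(s >= cutoff for s in disease) / len(disease)
    sp = sum(s < cutoff for s in healthy) / len(healthy)
    return se, sp

for cutoff in (0.25, 0.45, 0.65):
    se, sp = se_sp_at(cutoff)
    print(f"cutoff={cutoff}: Se={se:.1f}, Sp={sp:.1f}")
```

Raising the cutoff from 0.25 to 0.65 moves (Se, Sp) from (1.0, 0.4) to (0.8, 1.0): fewer false alarms, more misses.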
ROC curve: a performance map over thresholds
- x-axis: FPR (= 1 - Sp) (false-positive rate)
- y-axis: TPR (= Se) (true-positive rate)
- Sweep thresholds, plot ((\mathrm{FPR}, \mathrm{TPR})) → ROC curve
AUC (area under curve)
- 1.0: perfect separation
- 0.5: random guessing (diagonal)
- Interpretation: the probability that a randomly chosen diseased subject scores higher than a randomly chosen healthy subject (assuming higher scores indicate disease)
In practice: threshold-agnostic summary.
Even with similar AUCs, choose by the FPR/TPR region that matters for your use case.
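Both the sweep and the probabilistic interpretation of AUC can be sketched in a few lines; the trapezoidal area under the empirical curve equals the pairwise-comparison probability (toy scores invented for illustration):

```python
def roc_points(healthy, disease):
    """Empirical ROC: sweep every observed score as a cutoff, collect (FPR, TPR)."""
    pts = [(0.0, 0.0)]
    for c in sorted(set(healthy + disease), reverse=True):
        tpr = sum(s >= c for s in disease) / len(disease)
        fpr = sum(s >= c for s in healthy) / len(healthy)
        pts.append((fpr, tpr))
    return pts

def trapezoid_auc(pts):
    """Area under the piecewise-linear ROC curve."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

def pairwise_auc(healthy, disease):
    """AUC as P(random diseased score > random healthy score); ties count half."""
    wins = sum((d > h) + 0.5 * (d == h) for d in disease for h in healthy)
    return wins / (len(disease) * len(healthy))

healthy = [0.1, 0.2, 0.3, 0.4, 0.6]   # made-up scores
disease = [0.5, 0.7, 0.8, 0.9, 0.95]
print(trapezoid_auc(roc_points(healthy, disease)), pairwise_auc(healthy, disease))  # both ≈ 0.96
```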
Empirical ROC vs Parametric ROC
- Empirical ROC: compute (Se, Sp) at each threshold → connect the dots
- Parametric ROC: fit score distributions (e.g., normal for healthy/disease) → can provide AUC uncertainty (CIs)
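For the binormal model mentioned above, the AUC even has a closed form: if healthy scores are N(μ_H, σ_H²) and diseased scores are N(μ_D, σ_D²), then AUC = Φ((μ_D − μ_H)/√(σ_H² + σ_D²)). A sketch with illustrative parameters:

```python
from math import erf, sqrt

def binormal_auc(mu_h, sd_h, mu_d, sd_d):
    """AUC under normal score models: P(N(mu_d, sd_d) > N(mu_h, sd_h))."""
    z = (mu_d - mu_h) / sqrt(sd_h ** 2 + sd_d ** 2)
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z

# Illustrative: diseased scores shifted up by 1.5 standard deviations
print(round(binormal_auc(mu_h=0.0, sd_h=1.0, mu_d=1.5, sd_d=1.0), 3))
```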
Field cheat sheet
- Define your goal: minimize misses (Se↑) or false alarms (Sp↑)?
- Know your prevalence: prior from literature/registry/local data.
- Re-compute PPV/NPV by Bayes using your prevalence for patient counseling.
- Don’t fixate on one cutoff: use ROC/AUC and set a threshold in your operating region.
- Two-stage strategy: 1st screen (Se↑) → 2nd confirm (Sp↑).
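The two-stage strategy can be quantified: if a positive screen must be confirmed by a second test (and, as a simplifying assumption, the tests err independently), net Se is the product of the sensitivities while net Sp improves. A sketch with invented Se/Sp values:

```python
def serial_confirm(se1, sp1, se2, sp2):
    """Net Se/Sp when final positive = positive on both tests
    (assumes the two tests err independently given disease status)."""
    se = se1 * se2                   # must pass both tests to stay positive
    sp = 1 - (1 - sp1) * (1 - sp2)   # a negative on either test clears you
    return se, sp

se, sp = serial_confirm(se1=0.99, sp1=0.90, se2=0.95, sp2=0.99)
print(f"Se={se:.4f}, Sp={sp:.4f}")
```

Confirmation buys specificity (0.90 → 0.999 here) at a small cost in sensitivity, which is why the screen should be the high-Se stage.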
One-liner
- Se/Sp: intrinsic test skill (less affected by prevalence)
- PPV/NPV: “probability it’s real” for patients (strongly prevalence-dependent)
- ROC/AUC: fair, threshold-free comparison of overall performance
Tips
True Positive? False Positive?
On an ROC curve, FPR and TPR are plotted on the x and y axes.
“Positive” means the classifier/test said “yes”.
- True vs False indicates whether that decision was correct.
- True Positive (TP): actually diseased and called positive.
- False Positive (FP): actually healthy but incorrectly called positive.
Relationship between TPR and FPR
Consider the tester (a clinician or classifier):
- Doctor A calls almost everyone positive (very low threshold).
- TPR high (few missed cases) and FPR high (many false alarms).
- Doctor B calls almost everyone negative (very high threshold).
- TPR low and FPR low (misses many true cases, few false alarms).
Great tests aim for high TPR with low FPR.
References
- Source: https://angeloyeo.github.io/2020/08/05/ROC.html