How to (correctly) leverage Bayes' theorem in clinical practice
Imagine yourself as a family medicine physician in a regular outpatient clinic when a man walks in complaining of a headache. Even if you know virtually nothing about medicine, two options should sound familiar - tension headache and migraine. Tension headache is a type of headache where the pain wraps around the head, sort of like a belt pressed too tightly against the skull, and it is far more common than migraine. Without any additional information, tension headache is more likely. But as you can imagine, that is not enough. You need to ask the patient how the pain radiates, whether it is one-sided or two-sided, and whether it is exacerbated by physical activity. If he responds that the headache is one-sided, has lasted for two days in a row, and is exacerbated by physical activity, migraine becomes far more likely. And this is also how doctors diagnose migraine - a few questions will usually suffice. They may also ask about the presence of an aura (visual disturbances or motor and/or sensory deficits) and increased sensitivity to light and sound.
There you go. Whether you know it or not, you have successfully applied Bayes' theorem in a clinical setting! Let's now take a step back and analyze how the mathematics behind it relates to the previous example.
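For reference, Bayes' theorem in its standard textbook form, for two events A and B with P(B) > 0, reads:

```latex
\[
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
\]
```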
The formula above states that the posterior probability of A given B is equal to the conditional probability of B given A (the likelihood) multiplied by the prior probability of A, divided by the marginal probability of B (the evidence). I consider myself fairly proficient in statistical terminology, but this still makes little sense, so let's rewrite it in a much friendlier form:
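One way to write it in terms of the headache example is shown here. This is an assumed rendering, where "diagnosis" stands for A and "findings" stands for B:

```latex
\[
P(\text{diagnosis} \mid \text{findings})
  = \frac{P(\text{findings} \mid \text{diagnosis}) \times P(\text{diagnosis})}{P(\text{findings})}
\]
\[
\text{posterior} \;\propto\; \text{likelihood} \times \text{prior}
\]
```

The denominator is the same for every candidate diagnosis, so when comparing two diagnoses only the product of likelihood and prior matters.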
In other words, the prior probability of tension headache is greater than that of migraine, but if we obtain additional information favoring migraine, the probability that migraine is the right diagnosis increases, while the probability of tension headache decreases.
Though this may seem obvious in hindsight, it is important to formalize this mental process when evaluating additional information. People are remarkably good at estimating such probability swings in their heads, but there are situations where our intuition works against us.
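The update described above can be sketched in a few lines of Python. All numbers here are hypothetical and chosen only to make the arithmetic easy to follow; they are not clinical estimates, and the function name `posterior` is an illustrative choice.

```python
def posterior(priors, likelihoods):
    """Return posterior probabilities for mutually exclusive diagnoses.

    priors:      P(diagnosis) before asking any questions
    likelihoods: P(findings | diagnosis) for the observed findings
    """
    # Numerator of Bayes' theorem for each diagnosis: likelihood x prior.
    unnormalized = {d: priors[d] * likelihoods[d] for d in priors}
    # Denominator: total probability of the findings across all diagnoses.
    evidence = sum(unnormalized.values())
    return {d: value / evidence for d, value in unnormalized.items()}


# Hypothetical priors: tension headache is assumed far more common.
priors = {"tension": 0.8, "migraine": 0.2}

# Hypothetical likelihoods of "one-sided, two days, worse with activity".
likelihoods = {"tension": 0.05, "migraine": 0.60}

post = posterior(priors, likelihoods)
for diagnosis, probability in post.items():
    print(f"{diagnosis}: {probability:.2f}")
```

With these assumed inputs, migraine ends up more probable than tension headache even though it started with the lower prior, which is the swing described in the text.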
Atypical presentations of very common diseases can be more common than typical presentations of rare diseases.
Let's say a patient calls in and says that he is progressively unable to climb stairs or stand up from a sitting position. He also says that one of his thighs seems thinner than usual, almost thinner than the knee. You ask the patient whether the other thigh is also thinner, and he is not sure. Based on this information alone, damage to the femoral nerve on the affected side or diabetic amyotrophy could be the cause. Without additional information, we may be more inclined toward femoral nerve damage, because it fits very neatly and diabetic amyotrophy is not the most common presentation of peripheral nerve damage in diabetics. But because by now you understand Bayes' theorem, you know that this reasoning is not ideal: it is based purely on the likelihood and ignores the prior probability. We are not taking into account that nerve damage in diabetics is far more common than isolated femoral nerve palsy (partial paralysis).
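A small sketch using the odds form of Bayes' theorem shows how a large prior can outweigh a poor fit. The numbers are hypothetical placeholders, not epidemiological data: diabetic nerve damage is assumed to be 20 times more common than isolated femoral nerve palsy, while this presentation is assumed to be nine times more typical of femoral palsy.

```python
def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' theorem: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favor of a hypothesis into a probability."""
    return odds / (1.0 + odds)


# Hypothetical prior odds, diabetic nerve damage : femoral nerve palsy.
prior_odds = 20.0

# Hypothetical chance of this exact presentation under each diagnosis.
p_presentation_given_diabetic = 0.10  # atypical presentation of a common disease
p_presentation_given_femoral = 0.90   # typical presentation of a rare disease

likelihood_ratio = p_presentation_given_diabetic / p_presentation_given_femoral
odds = posterior_odds(prior_odds, likelihood_ratio)
p_diabetic = odds_to_probability(odds)

print(f"posterior odds (diabetic : femoral) = {odds:.2f}")
print(f"P(diabetic nerve damage | presentation) = {p_diabetic:.2f}")
```

Under these assumptions the diabetic cause remains the more probable one, despite the presentation fitting femoral palsy far better.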
A note on incidence and prevalence and the prior probability of visiting the doctor
Incidence and prevalence are two very commonly used epidemiological measures of disease burden in the population. Incidence refers to the number of new cases per 100k people per year, while prevalence refers to the share of the population living with the disease at a given time (the chance that a randomly selected person has it). Using both metrics together can also give us a glimpse into the average survival time. Let's take ALS for example. ALS is a neurodegenerative disorder with an annual incidence of 2 per 100k people and a prevalence of 6 per 100k people. Because the prevalence is only a few times larger than the annual incidence, the average survival time is short - only about 3 years in this case. Assuming the incidence and survival are roughly stable over time, prevalence is approximately incidence multiplied by average disease duration, so we can simply divide the prevalence by the incidence to get the average survival time for an individual from the time of diagnosis.
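The division can be written out as a short Python sketch. It relies on the steady-state approximation that prevalence is roughly incidence times average disease duration, which is an assumption and holds only when both rates are stable over time. The function name is illustrative.

```python
def mean_duration_years(prevalence_per_100k, incidence_per_100k_per_year):
    """Estimate average disease duration as prevalence / incidence.

    Assumes a steady state: incidence and survival do not change over time.
    """
    if incidence_per_100k_per_year <= 0:
        raise ValueError("incidence must be positive")
    return prevalence_per_100k / incidence_per_100k_per_year


# ALS figures quoted in the text: prevalence 6 and incidence 2 per 100k.
als_duration = mean_duration_years(6, 2)
print(f"Estimated average survival from diagnosis: {als_duration:.1f} years")
```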
But how do the incidence and prevalence relate to Bayes' theorem?
Well, to be frank, our original migraine example is not entirely correct. We automatically assumed that when a person comes to the doctor complaining of a headache, the most common cause is a tension headache. That would be correct if we randomly called people at home, performed a survey, and asked whether they are currently experiencing a headache. But let's be honest. How many times have you gone to the doctor because of a simple headache? A paracetamol pill will do just fine. But if you are experiencing a migrainous headache with various other neurological deficits, you are far more likely to go to the doctor, especially if this is your first time. If we connect the dots, we can see that Bayes' theorem applies before the patient even sets foot in our office. Once he is there, we are no longer basing our prior probability solely on the prevalence or incidence of the disease, but also on the likelihood that a patient with the given disease visits the doctor. Eureka!
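This selection effect can be sketched the same way: the prior in the clinic is the population share of each cause reweighted by how likely a person with that cause is to seek care. Every number below is a hypothetical placeholder, and `clinic_prior` is an illustrative name.

```python
def clinic_prior(population_share, visit_probability):
    """Prior probability of each cause among patients who actually show up.

    population_share:  share of each cause among all current headaches
    visit_probability: P(visits the doctor | cause)
    """
    # Weight each cause by how often it leads to a doctor's visit.
    weighted = {
        cause: population_share[cause] * visit_probability[cause]
        for cause in population_share
    }
    total = sum(weighted.values())
    return {cause: weight / total for cause, weight in weighted.items()}


# Hypothetical: 90% of headaches in the population are tension headaches.
population_share = {"tension": 0.90, "migraine": 0.10}

# Hypothetical: migraine sufferers are far more likely to see a doctor.
visit_probability = {"tension": 0.02, "migraine": 0.30}

prior_in_clinic = clinic_prior(population_share, visit_probability)
for cause, probability in prior_in_clinic.items():
    print(f"{cause}: {probability:.3f}")
```

With these assumed inputs, migraine becomes the more probable cause among patients in the waiting room, even though tension headache dominates in the general population.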