One of the most intriguing news stories of the new year claimed that the Epstein-Barr virus (EBV) is the “cause” of Multiple Sclerosis (MS), and suggested that antiviral medications or vaccinations for Epstein-Barr could eliminate MS.
I am not an MD or an epidemiologist. But I do think this article forces us to think about the meaning of “cause.” Although Epstein-Barr isn’t a familiar name, it’s extremely common; a good estimate is that 95% of the population is infected with it. It’s a member of the herpesvirus family; if you’ve ever had mononucleosis, you’ve had it, though most infections are asymptomatic. We hear much more about MS; I’ve had friends who have died from it. But MS is much less common: about 0.036% of the population has it (35.9 per 100,000).
We know that causation isn’t as simple as “if X happens, then Y always happens.” Lots of people smoke; we know that smoking causes lung cancer; but many people who smoke don’t get lung cancer. We’re fine with that; the causal connection has been painstakingly documented in great detail, in part because the tobacco industry went to such great lengths to spread misinformation.
But what does it mean to say that a virus that infects almost everyone causes a disease that affects very few people? The researchers appear to have done their job well. They studied 10 million people in the US military. Five percent of those were negative for Epstein-Barr at the start of their service. Within the cohort, 955 people were eventually diagnosed with MS, and nearly all of them had been infected with EBV before their diagnosis; the study reports that the risk of MS was 32 times higher after EBV infection than for those who remained uninfected.
It is certainly fair to say that Epstein-Barr is implicated in MS, or that it contributes to MS, or some other phrase (that could not unreasonably be called “weasel words”). Is there another trigger that only has an effect when EBV is already present? Or is EBV the sole cause of MS, a cause that just doesn’t take effect in the vast majority of people?
This is where we have to think very carefully about causality, because as important as this research is, it seems like something is missing. An omitted variable, perhaps a genetic predisposition? Some other triggering condition, perhaps environmental? Cigarettes were clearly a “smoking gun”: 10 to 20 percent of smokers develop lung cancer (to say nothing of other diseases). EBV may also be a smoking gun, but one that only goes off rarely.
If there are no other factors, we’re justified in using the word “causes.” But it’s hardly satisfying—and that’s where the more precise language of causal inference runs afoul of human language. Mathematical language is more useful: Perhaps EBV is “necessary” for MS (i.e., EBV is required; you can’t get MS without it), but clearly not “sufficient” (EBV doesn’t necessarily lead to MS). Although once again, the precision of mathematics may be too much.
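To make “necessary but not sufficient” concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the two population figures quoted earlier (95% EBV prevalence, 35.9 MS cases per 100,000) and assumes, purely for illustration, that EBV is strictly necessary for MS, so that every MS case occurs in someone who is infected. The function name is made up for this example, and the result is an order-of-magnitude estimate, not an epidemiological finding.

```python
# Population figures quoted in the text.
ebv_prevalence = 0.95            # share of the population infected with EBV
ms_prevalence = 35.9 / 100_000   # share of the population with MS


def p_effect_given_cause(p_effect: float, p_cause: float) -> float:
    """P(effect | cause), ASSUMING the cause is strictly necessary.

    If the effect never occurs without the cause, then
    P(effect and cause) = P(effect), so
    P(effect | cause) = P(effect) / P(cause).
    """
    if not 0 < p_cause <= 1:
        raise ValueError("p_cause must be in (0, 1]")
    if not 0 <= p_effect <= p_cause:
        raise ValueError("a necessary cause must be at least as common as its effect")
    return p_effect / p_cause


p_ms_given_ebv = p_effect_given_cause(ms_prevalence, ebv_prevalence)

print(f"P(MS | EBV) = {p_ms_given_ebv:.6f}")
print(f"about {p_ms_given_ebv * 100_000:.1f} per 100,000 infected people")
print(f"roughly 1 in {round(1 / p_ms_given_ebv):,}")
```

Under that assumption, the chance that an infected person has MS comes out to roughly 38 per 100,000, barely different from the rate in the population as a whole. A cause can be necessary and still tell you almost nothing about who will be affected.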
Biological systems aren’t necessarily mathematical, and it is possible that there is no “sufficient” condition; EBV just leads to MS in an extraordinarily small number of instances. In turn, we have to take this into account in decision-making. Does it make sense to develop a vaccine against a rare (albeit tragic, disabling, and sometimes fatal) disease? If EBV is implicated in other diseases, possibly. However, vaccines aren’t without risk (or expense), and even though the risk is very small (as it is for all the vaccines we use today), it’s not clear that it makes sense to take that risk for a disease that very few people get. How do you trade off a small risk against a very small reward? Given the anti-vax hysteria around COVID, requiring children to be vaccinated for a rare disease might not merely be poor public health policy; it might be the end of public health policy.
More generally: how do you build software systems that predict rare events? This is another version of the same problem, and unfortunately, the policy decision we are least likely to make is not to create such software. The abuse of such systems is a clear and present danger: for example, AI systems that pretend to predict “criminal behavior” on the basis of everything from crime data to facial images are already being developed. Many are already in use, and in high demand from law enforcement agencies. They will certainly generate far more false positives than true positives, stigmatizing thousands (if not millions) of people in the process. Even with carefully collected, unbiased data (which doesn’t exist), and assuming some kind of causal connection between past history, physical appearance, and future criminal behavior (as in the discredited 19th century pseudoscience of physiognomy), it is very difficult, if not impossible, to reason from a relatively common cause to a very rare effect. Most people don’t become criminals, regardless of their physical appearance. Deciding a priori who will can only be an exercise in applied racism and bias.
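The claim about false positives outnumbering true positives follows from simple base-rate arithmetic, which the sketch below works through. Every number in it is hypothetical and chosen only for illustration: a population of one million, a rare event affecting 1 in 1,000 people, and an implausibly good classifier that is right 99% of the time in both directions. None of these figures describes a real system.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Share of flagged people who are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# All values below are hypothetical, for illustration only.
population = 1_000_000
prevalence = 0.001    # the rare event affects 1 person in 1,000
sensitivity = 0.99    # fraction of real cases the system flags
specificity = 0.99    # fraction of non-cases the system correctly clears

real_cases = population * prevalence
true_positives = sensitivity * real_cases
false_positives = (1 - specificity) * (population - real_cases)
ppv = positive_predictive_value(prevalence, sensitivity, specificity)

print(f"true positives:  {true_positives:,.0f}")
print(f"false positives: {false_positives:,.0f}")
print(f"share of flagged people who are real cases: {ppv:.1%}")
```

Even with these generous assumptions, the system flags about ten innocent people for every real case, so roughly nine out of ten people it flags are false positives. With a rarer event or a less accurate classifier, the ratio only gets worse.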
Virology aside, the Epstein-Barr virus has one thing to teach us. How do we think about a cause that rarely causes anything? That is a question we need to answer.