The Falsifiability of SARS-CoV-2 Origin Theories

The importance of null models & testable hypotheses

Mar 16, 2023

Science is a messy, epistemological warzone with loosely followed ground rules. The difference between our scientific warzone and other social conflicts is that science requires reproducible evidence and testable hypotheses. Deviations from these ground rules, such as relying on evidence nobody can reproduce or making our hypotheses untestable, must be identified as such to bring scientific discourse back into the field of fair play.

SARS-CoV-2 origins is a perfect case study for any philosophers or historians of science. There are two primary theories: the zoonotic theory and the lab-origin theory. Both theories encompass a range of scenarios. While the zoonotic origin theory is predominately focused on a Huanan Seafood Market origin of the outbreak, in principle there could be other scenarios currently lacking evidence (e.g. a person’s cat could’ve eaten a bat), scenarios that would also fall under the “zoonotic” umbrella. Similarly, while lab-origin theory is predominately focused on the DEFUSE grant as a blueprint for making a virus like SARS-CoV-2 at the Wuhan Institute of Virology, there are other scenarios of research-related events, including alternative labs running similar experiments or a lab worker bitten by an animal infected with a natural non-engineered CoV, scenarios that would fall under the “lab-origin” umbrella.

As anyone who has been online can see, the debate over these two theories is acrimonious. Scientists can be a stubborn bunch, for better or for worse. We think deeply about an issue, examine a set of evidence that our disciplinary eyes are best able to see, and, with confidence in our intellect and insights, we state our findings or reasoning in a very public manner that carries the exciting possibility of fame and glory for a bold discovery, alongside the ego-devastating possibility that we could be proven wrong in a similarly public manner, dealing a significant blow to self-esteem of one’s intellect and insight and a fear that one’s colleagues see you as an idiot. Our battle of beliefs can sometimes devolve to outright hostility, even threats of physical violence and accusations of fraud & misconduct, and sometimes the temptation of fame and glory can even lead people to exit the ring of scientific fair play, to manipulate evidence in a manner designed to make their theory seem stronger than it is, their paradigmatic claim to fame more evident.

While the ongoing war between the Zoos and the Labs is a novelty for many members of the public seeing this sausage factory of modern science for the first time, in reality this scuffle is, like the Crips and the Bloods or the Capulets and the Montagues, an age-old story, a replay of history. Thomas Kuhn wrote a seminal book on the structure of these scientific revolutions, detailing the history of paradigm battles of ages past. Historically, paradigms have shifted and new theories - like the theory that the sun is the center of our solar system, or that matter is made up of atoms - have become adopted not at the speed of the evidence, but at the rate at which old and powerful scientists clinging to old theories die.

Unfortunately, for this essential debate with massive contemporary political implications, we can’t wait for scientists to die to resolve our paradigmatic battle. We need this scientific revolution to be different from prior ones. We need to move at the speed of the evidence, and we can do that by paying close attention to how evidence moves beliefs.

Prior to Kuhn, Karl Popper wrote a seminal book in 1934 on The Logic of Scientific Discovery, arguing that any scientific theory must be “falsifiable” or that there must be some test we can run to prove a theory wrong. That, Popper said, is how evidence can move our beliefs: by disproving the untrue beliefs until only true beliefs remain standing. Popper’s treatise led to the widespread incorporation of falsifiability and falsification, null models and hypothesis-testing, as ground rules in science. A theory that is unfalsifiable - like the theory that God created the Universe - is outside the realm of science because there’s no evidence that could possibly disprove it. After all, how would you test that God created the Universe? The Big Bang leaves room for God prior to the Big Bang, but even startling discoveries of time before the Big Bang would leave room for a God before then. While there are fascinating theological arguments for or against the existence of God, there’s no evidence one could collect to put this question to rest. There is no good way to resolve the debate over the existence of a God within the framework of science. That’s okay. It doesn’t mean theological discussions and philosophy aren’t valuable for mankind, it just means they concern matters outside the realm of science and, for the time being, they are not matters for which we can expect an empirical resolution to the disagreements.

Modern epistemology and philosophy of science have grown a lot since Popper’s day, updating the collective beliefs of the field by becoming more “Bayesian”. Broadly speaking, Bayesian statistics & epistemology acknowledge that we have weighted beliefs and that evidence can change how we weight our beliefs. For example: do you believe SARS-CoV-2 came from a lab? Answering “yes” or “no” is not as honest as answering the question as a matter of degree, with your own non-trivial odds that one scenario is true and the other is false. Instead of a “true vs. falsified” dichotomization of scientific beliefs, Bayesians appreciate that belief is a matter of degree, and Popper-esque evidence that is unlikely under one theory but likely under another should pull your belief away from the less-likely theory and towards the more-likely theory. Both Popperian and Bayesian ways of thinking about science rely on our ability to quantify the likelihood of data based on the different hypotheses.

We advance science by quantifying the likelihood of evidence under our theories.

In other words, whether we’re ‘falsifying’ theories or merely updating our Bayesian brains, a core ground-rule of science is that one must be able to estimate the probability of seeing certain features of the data under one’s theory. A refusal to estimate the odds of an inconvenient dataset renders a theory much less credible, as it pulls the theory outside our scientific octagon of testability and throws it into the crowded stands of untestable social or religious melee. When supporters of a theory encounter a claim that pieces of evidence are unlikely to be observed under their theory, the two allowable options are to discard the evidence or provide updated estimates of the likelihood of the evidence. You can’t, however, discard evidence simply because you don’t want to quantify its low likelihood under your theory.

Here’s where we can circle back to the Labs vs. the Zoos. There is a lot of evidence on the table, and we’re arguing over the odds of seeing this evidence. In order to advance this paradigmatic battle at the speed of evidence, we need to estimate the odds of observing the evidence under each theory.

For example, SARS-CoV-2 emerged in Wuhan. What are the chances of that?

Under the Zoo theory, we can estimate the probability of a SARS CoV emerging in Wuhan by using pre-COVID methods to forecast spillover. By looking at the biogeography of SARS-CoVs (where in the world do these things live & tend to spill over?), summing the total populations within the known range of SARS CoVs, and dividing the population of Wuhan by this total SARS-overlapping population, we obtain a crude estimate of a <1% chance of a SARS-CoV pandemic emerging in Wuhan under the zoonotic theory. This estimate is just an estimate - the methods are crude and transparent, and we can debate different methods to estimate these odds.

We can’t, however, say these odds are inestimable.

Under the lab theory’s scenario involving the DEFUSE grant proceeding with the means & opportunities available to those researchers, we can read the grant and see the live virus would be studied in two labs: University of North Carolina and the Wuhan Institute of Virology. At first pass, we might then say there’s a 50% chance of SARS-CoV-2 emerging in Wuhan under the DEFUSE scenario with the remaining 50% chance that it emerges in North Carolina. However, we may improve that estimate by noting the well-documented dilapidated nature of coronavirus research facilities in China relative to those of UNC, including the WIV lacking funds to update critical biosafety infrastructure, hiring contractors to fix HEPA filters (uh oh), and patenting duct-taped approaches for cheaper seals on animal cages. In the debate over how to estimate odds, one might argue that, with this information, the DEFUSE research would have a much higher chance of a lab accident in Wuhan than in North Carolina.

We’ve estimated the likelihood of a Wuhan emergence under our two theories, and we compare the two theories by looking at the ratio of these likelihoods (what’s referred to as a “Bayes Factor”). We have a <1% chance of a Wuhan emergence under the Zoo theory and a >50% chance of a Wuhan emergence under the Lab theory. A Wuhan emergence is >50x more likely, 50x easier to explain, under a lab origin than a zoonotic origin of SARS-CoV-2.
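The ratio above can be written out as a small calculation. What follows is only a sketch: it treats the rough "<1%" and ">50%" bounds from the text as point values, and the function names and the three prior values are assumptions chosen to show how one Bayes factor moves different starting beliefs.

```python
def bayes_factor(p_evidence_given_lab: float, p_evidence_given_zoo: float) -> float:
    """Ratio of the likelihood of the evidence under each theory."""
    return p_evidence_given_lab / p_evidence_given_zoo


def posterior_probability(prior_lab: float, bf: float) -> float:
    """Update a prior probability of a lab origin with a Bayes factor.

    Works in odds form: posterior odds = prior odds * Bayes factor.
    """
    prior_odds = prior_lab / (1.0 - prior_lab)
    posterior_odds = prior_odds * bf
    return posterior_odds / (1.0 + posterior_odds)


# Rough bounds from the text, used here as point values.
P_WUHAN_GIVEN_ZOO = 0.01  # "<1%" under the zoonotic theory
P_WUHAN_GIVEN_LAB = 0.50  # ">50%" under the DEFUSE scenario

bf = bayes_factor(P_WUHAN_GIVEN_LAB, P_WUHAN_GIVEN_ZOO)
print(f"Bayes factor (lab : zoo) = {bf:.0f}")

# Hypothetical priors, not values taken from the article.
for prior in (0.01, 0.10, 0.50):
    print(f"prior {prior:.2f} -> posterior {posterior_probability(prior, bf):.3f}")
```

The point of the odds form is that two readers with very different priors can still agree on the Bayes factor, which is the part the evidence determines.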

Quantifying the odds of evidence under both theories moves science at the speed of the evidence.

However, some zoonotic origin theorists are currently making arguments that remove their theory from our octagon of testability, and we have to note their escape from the ring as the concession that it is. Efforts to flee our octagon of testability are most evident when examining one of the most important pieces of evidence on SARS-CoV-2 origins: the furin cleavage site (FCS).

Prior to the COVID-19 pandemic, we had ~80 SARS-CoV genomes with an evolutionary tree spanning >1,000 years of evolution and there was not a single FCS to be found. Suddenly, SARS-CoV-2 emerged in Wuhan with a well-known genomic insertion of niche virological interest pre-COVID. Many scientists saw this insertion and immediately thought “Oh shit, this looks like it came from a lab.”

Nonetheless, let’s be scientists and work through the exercise of quantifying the odds of an FCS. I was working with a DARPA PREEMPT team pre-COVID to predict spillover back in 2018 when DEFUSE was written, so I’m familiar with state-of-the-art approaches to forecast the genomic features of an emergent virus. In fact, I made some of them. Pre-COVID, here’s how we would predict the odds of whether or not an emergent SARS-CoV would have an FCS:

At most, we might estimate <1/80 chance of finding a SARS CoV with an FCS because we’ve collected 80 SARS CoV genomes and none of them had an FCS. However, there’s actually a bit more information in the evolutionary tree that we need to incorporate. If we have 80 viruses, but 79 of them are close relatives that all differ by 1 mutation acquired in 1 superspreader event 1 day ago, we wouldn’t treat these 79 viruses as independent. Rather, we’d say that these 79 viruses diverged 1 day ago and, together, looking at all the viruses, we’ve only seen 79 days of evolution in this clade, 1 day for each virus since they diverged from their common ancestor. Then, we’d look at the 80th virus - if it diverged from this cluster of 79 viruses 1 year ago, then we’d say we’ve only seen an additional 2 years of evolution (1 year on the 80th virus, the other 1 year of evolution leading to the cluster of 79 viruses). The evolutionary tree is the scaffold on which we make our inferences about how often weird mutations happen, and it provides a more accurate way to estimate the odds of evolution producing an FCS. Rather than count the tips of the evolutionary tree for an estimate of the probability of an FCS in a SARS CoV, we should sum the branch length to estimate the rate at which FCS’s evolve per year of evolution.
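The toy tree in this paragraph can be written down directly. The sketch below is illustrative only: the branch list encodes the hypothetical 80-virus example from the text, not real SARS-CoV data, and all names in it are made up for the illustration.

```python
DAYS_PER_YEAR = 365.0

# Each entry is one branch of the toy tree: (label, length in years).
branches = []

# 79 close relatives that each diverged from a common ancestor 1 day ago.
for i in range(79):
    branches.append((f"cluster_tip_{i}", 1.0 / DAYS_PER_YEAR))

# The 80th virus split from the cluster's lineage 1 year ago:
# 1 year along its own branch, 1 year along the branch leading to the cluster.
branches.append(("virus_80", 1.0))
branches.append(("stem_to_cluster", 1.0))

n_tips = 80
total_years = sum(length for _, length in branches)

print(f"tips counted: {n_tips}")
print(f"evolution observed: {total_years:.2f} branch-years")
```

Counting tips suggests 80 independent observations, while summing branches shows that this toy tree records little more than two years of evolution.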

If we sum the branch length of the SARS-CoV evolutionary tree, it spans over 1,000 years of evolutionary time. If we were sitting at my desk in Montana back in 2018, doing what I was doing then, forecasting spillover, I would use these evolutionary analyses to say the odds of seeing an FCS in any given year are less than 1 in 1,000. This is the conventional, state-of-the-art empirical approach for estimating the rates of evolutionary events, especially evolutionary events that aren’t totally outlandish, unprecedented possibilities, like the odds a virus evolves to become an elephant or a primate lineage evolves wings in a few generations. After all, we’ve seen FCS’s occur in other very distantly related CoVs (like seeing wings in bats but not in primates) but those insertions were not common and many FCS’s evolved by single mutations, not 12 nucleotide insertions.

Recently, I’ve seen some pushback against these odds estimates from the Zoos. Dr. Paul Bieniasz is a virologist at Rockefeller University. He is a smart guy and I want to recreate his argument as faithfully as I can here. Dr. Bieniasz argued that we can’t estimate the odds of an FCS appearing in a SARS-CoV because virus populations are large (often >1 billion viruses in a host!) and genomically heterogeneous. There are so many mutants created in a population of >1 billion viruses that, in his words, “the improbable becomes probable in large populations”. While rhetorically elegant (honestly, rhetorically beautiful, were it not mathematically incorrect), this argument seems to pull the zoonotic origin theory outside the octagon of testability by saying that any genomic improbability becomes probable and therefore anything could happen and there’s no reasonable way to estimate likelihoods of insertions, even if those exact insertions were written in a lab notebook by a researcher at the WIV in 2019. If zoonotic origin theories retreat to a realm where they can’t calculate the odds of seeing the genomic features present in SARS-CoV-2, they can’t be weighed against lab-origin scenarios arguing that the unusual genomic features of SARS-CoV-2, which were never seen in 1,000 years of SARS CoV evolution, were exactly the features researchers proposed to insert in a virus. Under this reasoning, there would be no evidence, not even exquisitely detailed emails proposing to insert an FCS between the S1/S2 subunits of a SARS-CoV at the WIV, that would discriminate between these theories, because their likelihoods are unquantifiable and therefore could be whatever they want them to be to match the likelihoods of the evidence under lab-origin theory.

I’m okay with virus populations having billions of virions that are all mutated versions of one another. That’s how it’s been for the entire 1,000 years of SARS CoV evolution we see on the evolutionary tree. SARS-CoV-1 emerged in humans and infected 8,000 people, yet we never saw an FCS emerge in those 8,000 people each containing billions of SARS CoVs. Admittedly, we didn’t sample 8,000 people, but we did have 18 SARS-CoV genomes separated by months of infections, and none of those genomes had an FCS despite billions of virions in every human and a combination of within-host and between-host selection for the virus to acquire an FCS. While low-probability events become higher-probability events in large populations, they don’t suddenly become probable. Higher probabilities can still be vanishingly small, and so to advance science we have to get rigorous and produce estimates of the odds of seeing each genomic anomaly under natural-origin scenarios. Given viruses have always had large population sizes in wildlife reservoirs without acquiring an FCS, given we’ve seen the virus evolve in humans before without acquiring an FCS, and given people have even serially-passaged a bat SARS CoV in human cells and humanized mice without acquiring an FCS (a procedure that has selected for FCS’s in other viruses), we simply have no evidence to support an upwards revision of the odds of seeing an FCS in an emergent SARS CoV.
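The arithmetic behind "higher-probability is not the same as probable" can be sketched in a few lines. This is an illustration under loud assumptions: each virion is treated as an independent trial, the per-virion probabilities are hypothetical values rather than measured rates, and the calculation covers only whether a variant arises at least once in one host.

```python
import math


def prob_at_least_one(p_per_virion: float, n_virions: float) -> float:
    """P(at least one occurrence) = 1 - (1 - p)^N, assuming independent trials.

    log1p and expm1 keep the result accurate when p is tiny.
    """
    return -math.expm1(n_virions * math.log1p(-p_per_virion))


N = 1e9  # "often >1 billion viruses in a host"

# Hypothetical per-virion probabilities, chosen only for illustration.
for p in (1e-6, 1e-9, 1e-12, 1e-15):
    print(f"p = {p:.0e}: P(at least one in {N:.0e}) = {prob_at_least_one(p, N):.3g}")
```

Whether a billion trials make an event likely depends entirely on the per-trial probability, which is why that probability has to be estimated rather than waved away.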

We’re seeing similar efforts to selectively remove genomic anomalies from the octagon of testability, to say the most important evidence about SARS-CoV-2’s evolutionary origins can’t discriminate between theories.

In response to our finding that the restriction map of SARS-CoV-2 is rare among wild CoVs and consistent with infectious clones as proposed in DEFUSE, many researchers hand-waved to say that other, unquantifiable processes could be at play and because other processes could exist, we just can’t quantify the anomalous nature of SARS-CoV-2 compared to wild coronaviruses. Some said that recombination could explain why some of the restriction sites are in SARS-CoV-2. However, like the large population sizes of viruses, recombination has been happening in CoVs since time immemorial and, with all that recombination over all those years, it was exceedingly rare to see such an Ikea virus, such a ready-made infectious clone looking like a research product of DEFUSE, among wild coronaviruses. In our paper, we quantified the odds of that “Ikea-Virus” rarity using methods that would be agreeable pre-COVID for a virus that didn’t cause a pandemic. We found a <0.07% chance of a wild coronavirus having a restriction map as-or-more idealized for infectious cloning.

Maybe recombination changes those odds, and we can debate that, but only if we stay inside the octagon. Anyone saying “recombination did it” needs to estimate their odds rigorously. First, researchers need to estimate their confidence that a recombination event (a hypothesis) happened at all. That involves quantitatively estimating the rates of recombination in coronaviruses, possibly as a function of the size of recombination chunks, the locations of recombination, and the geographic overlap or evolutionary distance between the viruses being recombined. Quantifying their confidence in recombination events is hard: they’d need to estimate the variable rates of nucleotide evolution across the genome, and the odds of seeing as strong or stronger a signal of recombination under a well-parameterized null model of nucleotide evolution and CoV recombination. They need to apply this procedure rigorously across the entire genome, not selectively at the sites they wish to explain away with recombination. That’s the work required to revise odds estimates. If it were done rigorously, it might increase the odds of seeing a restriction map like SARS-CoV-2, or it might decrease the odds by requiring so many improbable recombination events from disparate viruses at such an unprecedented rate, and it may introduce new restriction sites that need their own explanations. It’s possible that lower odds of this phenomenon under the recombination scenarios will cause Zoo theorists, upon seeing those results, to prefer our 0.07% odds over the lower odds estimated from a recombination scenario.

The same thing is happening with the human-specific codons in the FCS, although here I have more sympathy for the Zoos. The codon CGG is exceedingly rare in SARS-CoVs, appearing in <1.5% of the arginine codons, so ballpark <1/400 chance of seeing two of these codons in a row. In fact, the actual odds of seeing this codon pair, empirically estimated by looking at all CoVs, are closer to 1/11,000. So, the Zoos may prefer the 1/400 estimate. Instead, many argue that one can’t estimate the odds of these codon pairs, or that such estimates are meaningless.

They argue that there are scenarios they can imagine, such as the FCS coming from a pangolin. Pangolins do, in fact, have a perfect match for the FCS insert… but that match to a pangolin sequence was found by searching for a small nucleotide sequence in a massive database of sequences, giving us well-defined probabilities of seeing such a match in such a large database. The probabilities are high that we would find a match for such a small sequence in such a large database. So maybe the FCS came from a pangolin, or maybe the pangolin got implicated by chance.
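The claim that a short sequence is likely to turn up somewhere in a large database can be checked with a back-of-the-envelope model. The sketch below is illustrative and rests on stated assumptions: database bases are treated as independent and uniform over A, C, G and T, only exact matches on one strand are counted, and the database sizes are hypothetical round numbers rather than the size of any real database.

```python
import math


def expected_chance_matches(query_length: int, database_bases: float) -> float:
    """Expected exact matches of one query in random sequence.

    Assumes each base is independent and uniform over A, C, G, T.
    """
    positions = max(database_bases - query_length + 1, 0)
    return positions / 4 ** query_length


def prob_any_match(query_length: int, database_bases: float) -> float:
    """Poisson approximation to P(at least one exact chance match)."""
    return -math.expm1(-expected_chance_matches(query_length, database_bases))


QUERY_LENGTH = 12  # the 12-nucleotide insertion mentioned above

# Hypothetical database sizes in bases, not the size of any real database.
for size in (1e6, 1e9, 1e12):
    expected = expected_chance_matches(QUERY_LENGTH, size)
    chance = prob_any_match(QUERY_LENGTH, size)
    print(f"{size:.0e} bases: expected matches = {expected:.3g}, P(any match) = {chance:.3g}")
```

Real genomes are not uniform random sequence, so this is only a first-pass null model, but it shows why a hit for a 12-nucleotide query in a very large database is weak evidence on its own.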

What if that sequence were found in an E. coli? What if it were found in a fungus? What if it were found in a plant? What if a sequence found in SARS-CoV-2 was also found in HIV? What if the FCS was also in a Moderna patent? Wherever one found that sequence, they would present the hypothesis that maybe a bat had a GI infection and got it from an E. coli or fungus in their gut, or that the bat ate an insect that ate a plant, or that the bat was in a cage next to a pangolin, or whatever. My point here is that we can’t let the existence of some zoonotic origin scenario prevent that theory from estimating the odds of an event. After all, bats and pangolins have been eaten by people for the entire 1,000 years of CoV evolution we witness, CoVs have been recombining all over the place in the entire CoV tree and, still, the CGG-CGG codon pair is exceedingly rare. Unless they have solid evidence to revise our odds estimate, the scientifically sound approach is to use our pre-COVID methods to estimate the odds of evolutionary events, and the ball is in their court to provide clear evidence revising those odds, which must include the odds of the unusual scenario they propose.

Interestingly, SARS-CoV-2 does have some sequences that are similar to sequences in HIV. Zoos discarded that as pure chance, while accepting their pangolin-FCS alignment as a game-changing scenario that renders odds of an FCS and CGG-CGG unquantifiable. They ignore that the FCS found in a pangolin is also found in a Moderna patent - why is the pangolin a game-changer but the Moderna patent cast aside, despite identical alignments? I’m not saying there’s any clear answer to these sequence-similarities. I am saying, however, that we are seeing clear signs of double standards in this scientific debate, double-standards aiming to avoid quantifying the low odds of SARS-CoV-2 genomic anomalies that are all easily explainable by DEFUSE. The existence of some hypothesized zoonotic scenario or possibility does not suddenly allow their theory to retreat from the scientific process of rigorously quantifying the likelihood of evidence under the zoonotic origin theory.

The challenge of estimating the odds of genomic evidence, evidence that is anomalous under the zoonotic origin theory (whose proponents aim to say the odds are unquantifiable) yet exactly the evidence we’d expect under the DEFUSE lab-origin scenario, reminds me of when physicists were turning on the largest particle accelerator in human history (I forget which one, please comment if you remember this!). Some people worried, without much evidence, that the particle accelerator could create a black hole and asked: what are the chances this could create a black hole? One physicist replied along the (butchered, paraphrased) lines of “well, there are two scenarios, two possibilities, and we haven’t done this before so we don’t know, so a first guess must be 50%”. Of course, if that reasoning were true and if physicists believed it, they would never have turned on the particle collider. Physicists knew that particles have been colliding in our solar system since time immemorial, just as coronaviruses have been spilling over, recombining, and evolving with large population sizes since time immemorial. You can’t just add in a hypothetical scenario to science and claim there exists fundamental uncertainty to escape the low likelihood of the evidence under your theory.

Our job as scientists is to rigorously, methodically, and empirically estimate the odds of the evidence under competing theories. I believe that’s the best way to communicate the evidence to managers, members of the public, and each other. Any estimate, like the average height of basketball players, the batting average of a baseball player, the probability of rain in the next 5 days, or the odds of seeing an FCS in a SARS-CoV, is an estimate and subject to revision with more information or better arguments. I encourage zoonotic origin theorists to present rigorous, methodical, and empirically justified revisions of our estimates so we can see their methods & logic and possibly agree with their revised estimates. Saying we can’t estimate the likelihood of genomic anomalies turns our scientific dispute into an unresolvable sectarian one. Estimating the likelihood of the evidence under competing theories, on the other hand, keeps our theories in the realm of science and helps science & society move at the speed of the evidence.

Whether you’re a Zoo or a Lab (or neither), whether you’re a Popperian or Bayesian, I think you can agree that these ground rules of quantifying the odds of evidence under opposing theories are essential rules for our epistemological warzone. Ground rules for science will keep our discussions constructive, allow evidence to shape our beliefs, and help society in this critical investigation of SARS-CoV-2 origins.

P.S. Want to falsify a lab origin? Easy: rule out the most-likely scenario. If the PIs on the DEFUSE grant were to share their databases, lab notebooks, and communications, we could update the likelihood they went forward with the research proposed in DEFUSE. Of course, that same group received NIH/NIAID funding for the grant “Understanding the risk of bat coronavirus emergence”, that grant looked a lot like DEFUSE, and it gave these groups the means and opportunity to follow through with their research intentions stated in DEFUSE. Progress reports from that grant also reveal how research in Wuhan had created a SARS-CoV with heightened transmissibility/virulence, yielding >100x the viral titers in humanized mice compared to the wild type strain. The genome of that research product, a SARS CoV in Wuhan engineered by the same people as DEFUSE to be more human-infectious, has not been released. Transparency about coronavirus research activities and research products could dramatically alter our beliefs about a lab origin, either by making it less likely the research was conducted, or by revealing clear evidence of gain-of-function research on SARS CoVs in Wuhan. Incidentally, we have evidence of the latter that was not shared willingly by the researchers in Wuhan - that evidence, along with all the other evidence, should affect your beliefs.

14 Comments

Walter Sobchak, Esq.

Mar 18, 2023

They are pushing back against you Alex:

New data links Covid-19’s origins to raccoon dogs at Wuhan market | Coronavirus | The Guardian

https://www.theguardian.com/society/2023/mar/17/covid-19-origins-raccoon-dogs-wuhan-market-data

IANAB, but it didn't make much sense to me. YMMV.


Mike Williams

"...If the PI’s on the DEFUSE grant were to share their databases, lab notebooks, and communications, we would be able to rule out the possibility that they went forward with the research proposed in DEFUSE...."

The "speed of science" fans of Zoonotic theories..remain mute on this "minor" quibble :)

Imagine having to say (by implication)..."we dont need to see their data bases/notebooks/emails etc..we just know its the markets.." ..oh dear..


A Biologist's Guide to Life
